GSE17536 Processing Pipeline

Publication

DDX5 promotes oncogene C3 and FABP1 expressions and drives intestinal inflammation and tumorigenesis.

Life Science Alliance (2020), PMID 32817263

Dataset

Metastasis Gene Expression Profile Predicts Recurrence and Death in Colon Cancer Patients (Moffitt Samples)

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.

Processing Steps

Bioconductor's affy package was used for RMA normalization and raw data processing using default settings for background correction and normalization.

R 4.3.x, affy 1.78.0

$ Bash example

# Install R and Bioconductor (if not already installed)
# For Ubuntu/Debian:
# sudo apt update
# sudo apt install r-base
# R -e "install.packages('BiocManager')"
# R -e "BiocManager::install('affy')"
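# R -e "BiocManager::install('hgu133plus2cdf')" # Example CDF (probe layout) package for a common chip (HG-U133 Plus 2.0), replace with the package for your actual chip type
# R -e "BiocManager::install('hgu133plus2.db')" # Optional: matching annotation package for mapping probe IDs to genes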

# Create a dummy directory for CEL files for demonstration
mkdir -p cel_files
# For a real run, place your actual .CEL files into the 'cel_files' directory.
# Example: cp /path/to/your/sample1.CEL cel_files/
# Example: cp /path/to/your/sample2.CEL cel_files/

# R script to perform RMA normalization
Rscript -e '
  library(affy)
  
  # Define the directory containing CEL files
  cel_dir <- "cel_files"
  
  # Read CEL files from the specified directory
  # This assumes all CEL files in the directory belong to the same experiment
  # and share one chip type. By default the chip type is taken from the CEL
  # file headers. To set it explicitly (e.g., with the CDF package
  # hgu133plus2cdf installed), pass cdfname:
  # affybatch <- ReadAffy(celfile.path=cel_dir, cdfname="hgu133plus2")
  if (length(list.celfiles(cel_dir)) == 0) stop("No CEL files found in ", cel_dir)
  affybatch <- ReadAffy(celfile.path=cel_dir)
  
  # Perform RMA normalization with default settings
  # Defaults for rma(): background=TRUE, normalize=TRUE, verbose=TRUE
  eset <- rma(affybatch)
  
  # Extract normalized expression matrix
  normalized_data <- exprs(eset)
  
  # Save the normalized data to a CSV file
  write.csv(normalized_data, "rma_normalized_expression.csv", row.names = TRUE)
  
  # Optionally, save the ExpressionSet object for further analysis
  # save(eset, file="rma_eset.RData")
  
  message("RMA normalization complete. Normalized data saved to rma_normalized_expression.csv")
'
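
The RMA step above expects the CEL files to already be in 'cel_files'. The sketch below shows one way to stage them from GEO. It assumes the usual GEO FTP layout, where a series lives under a directory formed by replacing the last three digits of the accession with "nnn", and it assumes the series provides a combined <accession>_RAW.tar archive. The function names geo_raw_url and stage_cel_files are hypothetical helpers, not part of any tool named in this pipeline.

```shell
# Build the URL of the combined raw-data archive for a GEO series.
# Assumption: GEO groups series by replacing the last three digits of the
# accession with "nnn" (GSE17536 -> GSE17nnn).
geo_raw_url() {
  acc="$1"
  stub="${acc%???}nnn"
  printf 'https://ftp.ncbi.nlm.nih.gov/geo/series/%s/%s/suppl/%s_RAW.tar\n' \
    "$stub" "$acc" "$acc"
}

# Download and unpack the archive into a target directory.
# Requires network access, wget, tar, and gunzip.
stage_cel_files() {
  acc="$1"
  dest="${2:-cel_files}"
  mkdir -p "$dest" || return 1
  wget -q -O "$dest/${acc}_RAW.tar" "$(geo_raw_url "$acc")" || return 1
  tar -xf "$dest/${acc}_RAW.tar" -C "$dest" || return 1
  # Assumption: the archive holds gzip-compressed files named *.CEL.gz
  gunzip -f "$dest"/*.CEL.gz
}

# Print the URL only; nothing is downloaded by this line.
geo_raw_url GSE17536
```

To fetch the data, call `stage_cel_files GSE17536 cel_files` and check the resulting file count against the sample list on the GEO record before running RMA.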

Cox regression hazards model was applied to the processed data using the survival package.

R 4.2.0, survival 3.5-7 (Inferred with models/gemini-2.5-flash)

$ Bash example

# Install R if not already installed (example for Ubuntu)
# sudo apt update
# sudo apt install r-base

# Create an R script for Cox regression
cat << 'EOF' > run_cox_regression.R
# Load the survival package
# If the package is not installed, it will attempt to install it.
if (!requireNamespace("survival", quietly = TRUE)) {
    install.packages("survival", repos='http://cran.us.r-project.org')
}
library(survival)

# Load processed data (replace 'processed_data.csv' with your actual file path)
# This example assumes a CSV file with columns: time, event (0=censored, 1=event), and covariates.
# Adjust column names and data loading based on your specific data format.
# Example data structure:
# time,event,covariate1,covariate2
# 10,1,2.5,A
# 15,0,3.1,B
# 20,1,1.8,A
# ...
# Placeholder for input data. You MUST replace 'processed_data.csv' with your actual input file.
data <- read.csv("processed_data.csv")

# Ensure 'time' and 'event' are numeric
data$time <- as.numeric(data$time)
data$event <- as.numeric(data$event)

# Apply Cox regression hazards model
# Surv(time, event) builds the censored survival response inside the formula,
# so both columns are looked up in the 'data' dataframe.
# Replace 'covariate1 + covariate2' with your actual covariates from the 'data' dataframe.
# Example model: coxph(Surv(time, event) ~ age + sex + treatment, data = data)
cox_model <- coxph(Surv(time, event) ~ covariate1 + covariate2, data = data)

# Print model summary to standard output
summary(cox_model)

# Optionally, save the full summary to a text file
sink("cox_regression_summary.txt")
print(summary(cox_model))
sink()

# Optionally, save the Cox model object for later use (e.g., prediction, plotting)
saveRDS(cox_model, "cox_model.rds")
EOF

# Execute the R script
Rscript run_cox_regression.R
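
The Cox script above fails with a less obvious R error if the input CSV lacks the columns it assumes. A minimal pre-flight check, sketched below, reads only the header line and reports missing columns. The function name check_cox_columns is hypothetical, and the column names are the placeholders from the example, not the real clinical fields of this dataset. The check splits the header on commas, so it does not handle column names that themselves contain commas.

```shell
# Verify that a CSV header contains every required column name.
# Usage: check_cox_columns FILE COLUMN...
# Returns non-zero and lists the missing names on stderr if any are absent.
check_cox_columns() {
  file="$1"
  shift
  # Take the header line and strip carriage returns and double quotes
  header="$(head -n 1 "$file" | tr -d '\r"')"
  missing=0
  for col in "$@"; do
    case ",$header," in
      *",$col,"*) ;;
      *) echo "missing column: $col" >&2; missing=1 ;;
    esac
  done
  return "$missing"
}
```

One possible use is to gate the run: `check_cox_columns processed_data.csv time event covariate1 covariate2 && Rscript run_cox_regression.R`.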

All analyses were performed using R software.

R

$ Bash example

# This step indicates that R software was used for analyses.
# The specific R script and parameters are not provided in the description.
# A placeholder command for running an R script is shown below.
# Replace 'your_analysis_script.R' with the actual script used.
# Append any arguments the script expects after the script name.
Rscript your_analysis_script.R

Raw Source Text

Bioconductor's affy package was used for RMA normalization and raw data processing using default settings for background correction and normalization. Cox regression hazards model was applied to the processed data using the survival package. All analyses were performed using R software.