GSE17538 Processing Pipeline


Publication

DDX5 promotes oncogene C3 and FABP1 expressions and drives intestinal inflammation and tumorigenesis.

Life science alliance (2020) — PMID 32817263

Dataset

Experimentally Derived Metastasis Gene Expression Profile Predicts Recurrence and Death in Colon Cancer Patients

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.

Processing Steps


Bioconductor's affy package was used for RMA normalization and raw data processing using default settings for background correction and normalization.

R v4.3.x (Bioconductor 3.18)

$ Bash example

# Install R and the affy Bioconductor package
# conda create -n affy_env r-base bioconductor-affy -c conda-forge -c bioconda
# conda activate affy_env

# Create an R script for RMA normalization (e.g., run_rma.R)
cat << 'EOF' > run_rma.R
# Load the affy package
library(affy)

# Define the directory containing CEL files
# Replace 'path/to/your/cel_files' with the actual path to your Affymetrix CEL files
cel_files_dir <- "path/to/your/cel_files"

# Read CEL files into an AffyBatch object
# This function automatically detects CEL files in the specified path.
data <- ReadAffy(celfile.path = cel_files_dir)

# Perform RMA normalization with default settings
# The 'rma' function performs background correction, normalization, and summarization
# using default parameters as described in the affy package documentation.
eset <- rma(data)

# Optionally, save the normalized expression matrix to a tab-separated file
# Replace 'rma_normalized_expression.txt' with your desired output file name
write.exprs(eset, file = "rma_normalized_expression.txt")

# You can also save the ExpressionSet object itself for further R analysis
# save(eset, file = "rma_normalized_eset.RData")
EOF

# Execute the R script
Rscript run_rma.R
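
As a quick sanity check on the exported matrix, the sketch below counts probesets and samples in the tab-separated file. It assumes the file has one header line of sample names followed by one row per probeset, with the probe ID in the first column; that layout and the helper name matrix_shape are assumptions for illustration, and the helper is not part of affy or Bioconductor.

```shell
# Print "<rows> <columns>" for a tab-separated expression matrix.
# Rows exclude the header line; columns exclude the probe-ID column.
# Hypothetical helper for illustration only.
matrix_shape() {
    awk -F '\t' '
        NR == 1 { next }                    # skip the header line of sample names
        NR == 2 { ncol = NF - 1 }           # first data row sets the expected width
        (NF - 1) != ncol { bad = 1 }        # flag rows with a different width
        { nrow++ }
        END {
            if (bad) exit 1                 # ragged matrix: report failure, print nothing
            print nrow + 0, ncol + 0
        }
    ' "$1"
}

# Usage (assumes the output file name used in run_rma.R):
# matrix_shape rma_normalized_expression.txt
```

The first number should match the probeset count of the array platform and the second the number of CEL files read; a non-zero exit status means some rows have a different number of fields.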


Cox regression hazards model was applied to the processed data using the survival package.

R v4.3.2, survival package (survival 3.6-17) (Inferred with models/gemini-2.5-flash)

$ Bash example

# Install R if not already installed (example for Ubuntu/Debian)
# sudo apt update
# sudo apt install r-base

# Create an R script for Cox regression (e.g., run_cox_regression.R)
cat << 'EOF' > run_cox_regression.R
# Install the 'survival' package if not already installed
# if (!requireNamespace("survival", quietly = TRUE)) {
#   install.packages("survival")
# }

# Load the 'survival' package
library(survival)

# Load processed data
# Assuming 'processed_data.csv' is a CSV file with columns:
# 'time': survival time (numeric)
# 'event': event indicator (numeric, 0 = censored, 1 = event)
# 'covariate1', 'covariate2', ...: other covariates for the model
data <- read.csv("processed_data.csv")

# Coerce 'event' to numeric and stop early if it is not coded as 0/1
data$event <- as.numeric(data$event)
stopifnot(all(data$event %in% c(0, 1)))

# Fit the Cox proportional hazards model
# Replace 'covariate1 + covariate2' with the actual covariates from your data
# Example with two covariates:
cox_model <- coxph(Surv(time, event) ~ covariate1 + covariate2, data = data)

# Print summary of the model results
summary(cox_model)

# Optionally, save the model object or its summary
# save(cox_model, file = "cox_model.RData")
# sink("cox_model_summary.txt")
# summary(cox_model)
# sink()
EOF

# Execute the R script
Rscript run_cox_regression.R
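
Before fitting the model, it can help to confirm that the input file matches the layout assumed above. The sketch below checks that a CSV has 'time' and 'event' columns and that every event value is literally 0 or 1. It assumes a simple comma-separated file with no embedded commas in fields; check_survival_csv is a hypothetical helper name, not part of the survival package.

```shell
# Verify the 'time' and 'event' columns of a simple CSV file.
# Prints "ok" on success; otherwise prints the problem and returns 1.
# Hypothetical helper for illustration only.
check_survival_csv() {
    awk -F ',' '
        { sub(/\r$/, "") }                      # tolerate Windows line endings
        $0 == "" { next }                       # ignore blank lines
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                gsub(/"/, "", $i)               # tolerate quoted header names
                if ($i == "time")  tcol = i
                if ($i == "event") ecol = i
            }
            if (!tcol || !ecol) { print "missing time/event column"; bad = 1; exit }
            next
        }
        $ecol != "0" && $ecol != "1" {
            print "line " NR ": unexpected event value: " $ecol
            bad = 1
        }
        END { if (bad) exit 1; print "ok" }
    ' "$1"
}

# Usage (assumes the input file name used in the R script):
# check_survival_csv processed_data.csv
```

Event indicators stored as labels (for example text status values) would be reported here and need recoding to 0/1 before the Cox model is fitted.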


All analyses were performed using R software.

R

$ Bash example

# Install R (example using conda)
# conda install -c conda-forge r-base

# The description is generic, so a specific R command cannot be inferred.
# This is a placeholder for a typical R script execution.
# Rscript your_analysis_script.R --input_file data.csv --output_file results.tsv
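
Since this step names only the software, one small thing that can be sketched is a preflight check that the required tools are on the PATH before any script is launched. The helper name require_cmds is hypothetical, and the script name in the usage line is the placeholder from above.

```shell
# Return success if every named command is on PATH; otherwise report the
# first missing one on stderr and return 1.
# Hypothetical helper for illustration only.
require_cmds() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing required command: $cmd" >&2
            return 1
        fi
    done
}

# Usage:
# require_cmds Rscript && Rscript your_analysis_script.R
```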



Raw Source Text

Bioconductor's affy package was used for RMA normalization and raw data processing using default settings for background correction and normalization. Cox regression hazards model was applied to the processed data using the survival package. All analyses were performed using R software.
