GSE17538 Processing Pipeline

Publication

DDX5 promotes oncogene C3 and FABP1 expressions and drives intestinal inflammation and tumorigenesis.

Life Science Alliance (2020), PMID 32817263

Dataset

GSE17538

Experimentally Derived Metastasis Gene Expression Profile Predicts Recurrence and Death in Colon Cancer Patients

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1.

    Raw data were processed and RMA-normalized with Bioconductor's affy package, using the default settings for background correction and normalization.

    R v4.3.x (Bioconductor 3.18)
    $ Bash example
    # Install R and the affy Bioconductor package
    # conda create -n affy_env r-base bioconductor-affy -c conda-forge -c bioconda
    # conda activate affy_env
    
    # Create an R script for RMA normalization (e.g., run_rma.R)
    cat << 'EOF' > run_rma.R
    # Load the affy package
    library(affy)
    
    # Define the directory containing CEL files
    # Replace 'path/to/your/cel_files' with the actual path to your Affymetrix CEL files
    cel_files_dir <- "path/to/your/cel_files"
    
    # Read CEL files into an AffyBatch object
    # This function automatically detects CEL files in the specified path.
    data <- ReadAffy(celfile.path = cel_files_dir)
    
    # Perform RMA normalization with default settings
    # The 'rma' function performs background correction, normalization, and summarization
    # using default parameters as described in the affy package documentation.
    eset <- rma(data)
    
    # Optionally, save the normalized expression matrix to a tab-separated file
    # Replace 'rma_normalized_expression.txt' with your desired output file name
    write.exprs(eset, file = "rma_normalized_expression.txt")
    
    # You can also save the ExpressionSet object itself for further R analysis
    # save(eset, file = "rma_normalized_eset.RData")
    EOF
    
    # Execute the R script
    Rscript run_rma.R
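
    To make the "normalization" part of RMA more concrete, the sketch below applies quantile normalization to a small made-up matrix in base R. It is an illustration only, not the affy implementation: the function name quantile_normalize and the toy intensities are hypothetical, and ties are broken by order of appearance rather than handled the way the package does. For real CEL files, rma() as shown above remains the tool to use.

    ```r
    # Toy illustration of quantile normalization, the step RMA applies between
    # background correction and summarization.
    # NOT the affy implementation; names and data here are hypothetical.

    quantile_normalize <- function(x) {
      # x: numeric matrix, rows = probes, columns = arrays (no missing values)
      ranks  <- apply(x, 2, rank, ties.method = "first")  # rank of each probe within its array
      sorted <- apply(x, 2, sort)                          # each array sorted ascending
      target <- rowMeans(sorted)                           # mean of the k-th smallest values across arrays
      normalized <- apply(ranks, 2, function(r) target[r]) # replace each value by the target at its rank
      dimnames(normalized) <- dimnames(x)
      normalized
    }

    # Made-up intensities for 4 probes on 3 arrays (filled column by column)
    toy <- matrix(c(5, 2, 3, 4,
                    4, 1, 4, 2,
                    3, 4, 6, 8),
                  nrow = 4,
                  dimnames = list(paste0("probe", 1:4), paste0("array", 1:3)))

    toy_qn <- quantile_normalize(toy)
    print(toy_qn)

    # After normalization, every array contains the same set of values
    print(apply(toy_qn, 2, sort))
    ```

    The second print shows identical columns, which is the defining property of quantile normalization: arrays differ only in which probe holds which value.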
  2.

    A Cox proportional hazards regression model was fitted to the processed data using the survival package.

    R v4.3.2 (survival package 3.6-17) (Inferred with models/gemini-2.5-flash)
    $ Bash example
    # Install R if not already installed (example for Ubuntu/Debian)
    # sudo apt update
    # sudo apt install r-base
    
    # Create an R script for the Cox model (e.g., run_cox_regression.R)
    cat << 'EOF' > run_cox_regression.R
    # Install the 'survival' package if not already installed
    # if (!requireNamespace("survival", quietly = TRUE)) {
    #   install.packages("survival")
    # }
    
    # Load the 'survival' package
    library(survival)
    
    # Load processed data
    # Assumes 'processed_data.csv' is a CSV file with columns:
    # 'time': survival time (numeric)
    # 'event': event indicator (numeric, 0 = censored, 1 = event)
    # 'covariate1', 'covariate2', ...: other covariates for the model
    data <- read.csv("processed_data.csv")
    
    # Ensure the 'event' column is numeric (0 or 1)
    data$event <- as.numeric(data$event)
    
    # Fit the Cox proportional hazards model
    # Replace 'covariate1 + covariate2' with the actual covariates from your data
    # Example with two covariates:
    cox_model <- coxph(Surv(time, event) ~ covariate1 + covariate2, data = data)
    
    # Print a summary of the model results
    print(summary(cox_model))
    
    # Optionally, save the model object or its summary
    # save(cox_model, file = "cox_model.RData")
    # sink("cox_model_summary.txt")
    # print(summary(cox_model))
    # sink()
    EOF
    
    # Execute the R script
    Rscript run_cox_regression.R
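
    As a self-contained way to try the Cox model step without the real clinical table, the sketch below fits coxph() to simulated data and reports the hazard ratio with its confidence interval. Everything in it is an assumption made for illustration: the score variable (standing in for something like a per-patient gene signature summary), the simulation rates, and the sample size are hypothetical and do not come from GSE17538 or the publication.

    ```r
    library(survival)

    set.seed(1)  # reproducible simulated data

    n <- 200
    # Hypothetical per-patient score (e.g. a summary of a gene signature)
    score <- rnorm(n)

    # Simulate event times whose hazard increases with the score
    true_log_hr <- 0.7
    event_time  <- rexp(n, rate = 0.05 * exp(true_log_hr * score))
    censor_time <- rexp(n, rate = 0.02)  # independent random censoring

    sim <- data.frame(
      time  = pmin(event_time, censor_time),
      event = as.numeric(event_time <= censor_time),  # 1 = event, 0 = censored
      score = score
    )

    # Fit the Cox proportional hazards model
    fit <- coxph(Surv(time, event) ~ score, data = sim)

    # Hazard ratio per unit of score with its 95% confidence interval
    hr_table <- cbind(HR = exp(coef(fit)), exp(confint(fit)))
    print(hr_table)

    # Check the proportional hazards assumption (scaled Schoenfeld residuals)
    ph_check <- cox.zph(fit)
    print(ph_check)
    ```

    The cox.zph() call is an optional addition that the source text does not mention. It is included because a Cox model's hazard ratios are only interpretable if the proportional hazards assumption is plausible.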
    
  3.

    All analyses were performed using R software.

    $ Bash example
    # Install R (example using conda)
    # conda install -c conda-forge r-base
    
    # The description is generic, so a specific R command cannot be inferred.
    # This is a placeholder for a typical R script execution.
    # Rscript your_analysis_script.R --input_file data.csv --output_file results.tsv

Raw Source Text
Bioconductor's affy package was used for RMA normalization and raw data processing using default settings for background correction and normalization. Cox regression hazards model was applied to the processed data using the survival package. All analyses were performed using R software.