GSE76008 Processing Pipeline
Publication
The splicing factor RBM17 drives leukemic stem cell maintenance by evading nonsense-mediated decay of pro-leukemic factors. Nature Communications (2022), PMID 35781533
Dataset
GSE76008: A 17-Gene Stemness Score for Rapid Identification of High-Risk AML Patients [Illumina]
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
The data were normalized with variance stabilization and quantile normalization using the lumi (v2.16.0) package in R (v3.1.0).
$ Bash example
# Install R (v3.1.0) if not already installed. Old R versions may need a manual
# install or an environment manager such as conda (assuming that version is
# available on your channels).
# For example, using conda:
# conda create -n r3.1 r-base=3.1.0 -y
# conda activate r3.1

# Install the lumi package (v2.16.0) from Bioconductor 2.14, the release paired with R 3.1.
# BiocManager requires R >= 3.5, so on R 3.1.0 the legacy biocLite installer is needed
# (assumption: the legacy script is still reachable at this URL):
# R -e 'source("https://bioconductor.org/biocLite.R"); biocLite("lumi")'

# Create an R script for normalization
cat << 'EOF' > normalize_lumi_data.R
# Load the lumi package
library(lumi)

# --- Input data loading ---
# This script assumes a LumiBatch object named 'lumi_object' is available in
# your R environment or can be loaded from a file. Replace the placeholder
# paths below with your own.
# Example: load a pre-saved LumiBatch object
# load("path/to/your/input_lumi_object.RData")
# If starting from raw data, use lumiR:
# lumi_object <- lumiR("path/to/your/raw_data.txt")

# For demonstration only: create a dummy LumiBatch object if none was loaded.
# Note: VST relies on bead-level standard errors, which this dummy object does
# not carry, so lumiT() may fail or fall back on it. Use real data in practice.
if (!exists("lumi_object")) {
  message("Input 'lumi_object' not found. Creating a dummy object for demonstration.")
  # Simulate expression data (100 probes, 5 samples)
  exprs_data <- matrix(rnorm(100 * 5, mean = 1000, sd = 200), nrow = 100, ncol = 5)
  rownames(exprs_data) <- paste0("Probe", 1:100)
  colnames(exprs_data) <- paste0("Sample", 1:5)
  # Simulate pheno data
  pheno_data <- data.frame(row.names = colnames(exprs_data),
                           Group = c("A", "A", "B", "B", "A"))
  # Create a LumiBatch object
  lumi_object <- new("LumiBatch",
                     exprs = exprs_data,
                     phenoData = new("AnnotatedDataFrame", data = pheno_data))
  message("Dummy LumiBatch object created.")
}

# Variance stabilization: lumiT() with method = "vst"
message("Performing variance stabilization transformation...")
vst_lumi_object <- lumiT(lumi_object, method = "vst")

# Quantile normalization: lumiN() with method = "quantile"
message("Performing quantile normalization...")
normalized_lumi_object <- lumiN(vst_lumi_object, method = "quantile")

# --- Output data saving ---
# Save the normalized expression matrix to a tab-separated file
write.table(exprs(normalized_lumi_object),
            file = "normalized_expression_data.txt",
            sep = "\t", quote = FALSE, row.names = TRUE, col.names = TRUE)
message("Normalized expression data saved to normalized_expression_data.txt")

# Optionally, save the entire normalized LumiBatch object for further R analysis
save(normalized_lumi_object, file = "normalized_lumi_object.RData")
message("Normalized LumiBatch object saved to normalized_lumi_object.RData")
EOF

# Execute the R script with the intended R version.
# Ensure R 3.1.0 is first in your PATH, or call Rscript by its full path.
Rscript normalize_lumi_data.R
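A quick structural check on the exported table can catch a truncated or malformed file before downstream analysis. The sketch below is only an illustration: `check_expression_table` is a hypothetical helper, not part of lumi, and it assumes the layout that R's write.table() produces with row.names = TRUE and col.names = TRUE, where the header line holds only sample names and so has one field fewer than each data row.

```shell
# Hypothetical helper (not part of lumi): verify that a tab-separated
# expression table has a header of sample names and that every data row
# has exactly one more field (the probe ID) than the header.
check_expression_table() {
  awk -F '\t' '
    NR == 1 { n = NF; next }                 # header: sample names only
    NF != n + 1 {                            # data row: probe ID + n values
      printf "row %d has %d fields, expected %d\n", NR, NF, n + 1
      bad = 1
    }
    END {
      if (NR < 2) { print "no data rows found"; exit 1 }
      if (bad) exit 1
      printf "OK: %d probes x %d samples\n", NR - 1, n
    }
  ' "$1"
}

# Demo on a tiny synthetic table (made-up values, for illustration only)
demo=$(mktemp)
printf 'Sample1\tSample2\nProbe1\t7.1\t7.3\nProbe2\t8.0\t8.2\n' > "$demo"
check_expression_table "$demo"
rm -f "$demo"
```

After the pipeline has run, the same function could be pointed at the real output, for example `check_expression_table normalized_expression_data.txt`. It returns a non-zero status when a row is malformed, so it can gate later steps in a script.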
Tools Used
Raw Source Text
The data were normalized with variance stabilization and quantile normalization using the lumi (v2.16.0) package in R (v3.1.0).