GSE28359 Processing Pipeline — Yeo Lab Publications

Publication

Musashi-2 attenuates AHR signalling to expand human haematopoietic stem cells.

Nature (2016) — PMID 27121842

Dataset

Aryl hydrocarbon receptor antagonists promote the expansion of human hematopoietic stem cells

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.

Processing Steps


1. gcRMA

gcrma (R package), version not specified (inferred with models/gemini-2.5-flash)

Bash example

# Install R and Bioconductor if not already installed.
# For example, using conda (the bioconductor-* packages come from the bioconda channel):
# conda create -n r_env -c conda-forge -c bioconda r-base bioconductor-affy bioconductor-gcrma -y
# conda activate r_env

# Create an R script to perform gcRMA normalization
cat << 'EOF' > run_gcrma.R
# Load necessary libraries
library(affy)
library(gcrma)

# Define input directory containing .CEL files
# Replace 'path/to/your/cel_files' with the actual path to your raw microarray data.
cel_files_dir <- "path/to/your/cel_files"

# List the .CEL files in the specified directory.
# This assumes all .CEL files in the directory are part of the same experiment
# and should be normalized together. list.celfiles() from affy also matches
# lower-case and gzip-compressed names (.cel, .CEL.gz).
cel_files <- list.celfiles(cel_files_dir, full.names = TRUE)
if (length(cel_files) == 0) {
  stop(paste("No .CEL files found in the specified directory:", cel_files_dir))
}
raw_data <- ReadAffy(filenames = cel_files)

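# Perform gcRMA normalization.
# gcrma() applies a GC-content-aware background correction based on probe
# sequences, then quantile normalization and median-polish summarization,
# and returns log2-scale expression values per probe set.
# Note: it needs the CDF and probe-sequence annotation packages for the array
# type and may attempt to download them if they are missing.
normalized_data <- gcrma(raw_data)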

# Extract the normalized expression matrix.
expression_matrix <- exprs(normalized_data)

# Define output file path.
# Replace 'normalized_gcrma_expression.tsv' with your desired output file name and path.
output_file <- "normalized_gcrma_expression.tsv"

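# Save the normalized expression matrix to a TSV file.
# row.names = TRUE includes probe set IDs; col.names = NA adds an empty header
# cell above them so the header lines up with the data columns.
write.table(expression_matrix, file = output_file, sep = "\t", quote = FALSE, row.names = TRUE, col.names = NA)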

message(paste("gcRMA normalization complete. Normalized data saved to:", output_file))
EOF

# Execute the R script to perform gcRMA normalization.
Rscript run_gcrma.R

# Example usage:
# 1. Place your .CEL files into a directory, e.g., 'raw_cel_data'.
# 2. Edit 'run_gcrma.R' to set 'cel_files_dir' to 'raw_cel_data' and 'output_file' to your desired output path.
#    e.g., cel_files_dir <- "raw_cel_data"
#    e.g., output_file <- "results/normalized_gcrma_expression.tsv"
# 3. Ensure the output directory exists: mkdir -p results
# 4. Run the script: Rscript run_gcrma.R
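
After the script has run, a quick structural check of the output can catch truncated or misaligned files before any downstream analysis. The sketch below is not part of the original pipeline: the function name `check_expression_tsv` is hypothetical, and it assumes only that the output is a tab-separated matrix with one header line and a row identifier in the first column. It verifies that the file is non-empty and rectangular, and says nothing about whether the expression values themselves are sensible.

```shell
# Hypothetical helper (an assumption, not part of the original pipeline):
# checks that a tab-separated expression matrix is non-empty and rectangular.
check_expression_tsv() {
  file="$1"
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  awk -F '\t' '
    NR == 1 { header = NF; next }   # remember the header width
    NR == 2 { width = NF }          # first data row defines the expected width
    NF != width {
      printf("line %d has %d fields, expected %d\n", NR, NF, width) > "/dev/stderr"
      bad = 1
    }
    END {
      if (NR < 2) { print "no data rows" > "/dev/stderr"; exit 1 }
      # write.table omits the row-name header cell unless col.names = NA,
      # so the header may be one field shorter than the data rows.
      if (header != width && header != width - 1) {
        print "header width does not match data rows" > "/dev/stderr"
        exit 1
      }
      if (bad) exit 1
      printf("%d probes x %d samples\n", NR - 1, width - 1)
    }
  ' "$file"
}
```

With the default file name used above, it would be called as `check_expression_tsv normalized_gcrma_expression.tsv`. It prints the matrix dimensions on success and returns a non-zero status otherwise.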
