GSE86462 Processing Pipeline
3 steps
Publication
Protein-RNA Networks Regulated by Normal and ALS-Associated Mutant HNRNPA2B1 in the Nervous System. Neuron (2016), PMID 27773581
Dataset
GSE86462: HNRNPA2B1 regulates alternative RNA processing in the nervous system and accumulates in granules in ALS IPSC-derived motor neurons [hnRNPA2B1_Arrays_…
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
Data processed using Affymetrix package (Affy Power Tools) apt-probeset-summarize.
Microarray, version not specified (inferred with models/gemini-2.5-flash)
Bash example
# Install Affymetrix Power Tools (APT) via Bioconda.
# The package name below is an assumption; confirm it with
# `conda search -c bioconda apt-probeset-summarize` before installing.
# conda install -c bioconda apt-probeset-summarize

# Example usage of apt-probeset-summarize.
# This command processes raw Affymetrix CEL files to generate probeset summaries.
# Replace the <placeholders> with actual paths and file names.
# The library files (.pgf, .clf, and optionally .qcc and .mps) are specific to
# the Affymetrix array used.
# The summarization analysis is selected with -a (--analysis). The source text
# for this dataset names iter-plier, so that string is used here. Whether it is
# accepted on its own or needs a fuller analysis chain depends on the APT
# version; check `apt-probeset-summarize --help`.
apt-probeset-summarize \
    -a iter-plier \
    --analysis-files-path <path/to/library_files_directory> \
    -p <array_type>.pgf \
    -c <array_type>.clf \
    -o <output_directory> \
    --log-file apt_summarize.log \
    <path/to/cel_file1.CEL> <path/to/cel_file2.CEL>

# Or, if using a list of CEL files (a text file whose first line is the header
# "cel_files", followed by one CEL path per line), replace the positional CEL
# arguments with:
#     --cel-files <path/to/cel_file_list.txt>

# Note: <path/to/library_files_directory> should contain the library files for
# your array type.
# The output typically includes a tab-delimited summary text file with
# expression values; CHP output is optional and must be requested explicitly.
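The CEL list file mentioned in the example above can be generated rather than written by hand. The following is a minimal sketch, assuming the list format APT expects is a `cel_files` header line followed by one path per line; the function name `make_cel_list` and its two arguments are hypothetical and not part of APT.

```shell
# make_cel_list DIR OUT
# Write an APT-style CEL list file: a "cel_files" header, then one path per line.
# Hypothetical helper, written in POSIX sh so it also runs under bash.
make_cel_list() {
    cel_dir="$1"
    out_file="$2"
    tmp_file="${out_file}.tmp"
    count=0

    printf 'cel_files\n' > "${tmp_file}"
    for f in "${cel_dir}"/*; do
        case "$f" in
            *.CEL|*.cel)
                # Skip anything that is not a regular file
                [ -f "$f" ] || continue
                printf '%s\n' "$f" >> "${tmp_file}"
                count=$((count + 1))
                ;;
        esac
    done

    # Fail loudly instead of leaving a list that contains only the header
    if [ "${count}" -eq 0 ]; then
        rm -f "${tmp_file}"
        echo "No CEL files found in ${cel_dir}" >&2
        return 1
    fi
    mv "${tmp_file}" "${out_file}"
}
```

A call such as `make_cel_list path/to/your/cel_files cel_file_list.txt` would then produce a file that can be passed to `--cel-files`. Writing to a temporary file first means a failed run does not leave a partial list behind.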
2
Iter-plier algorithm used to quantify probesets.
affy (R package) v1.80.0 (tool and version inferred with models/gemini-2.5-flash)
Bash example
# Install R and Bioconductor packages if not already installed.
# For Bioconductor 3.19 (R 4.4):
# R -e 'if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")'
# R -e 'BiocManager::install(c("affy", "plier"))'

# Input directory containing .CEL files, and the output table
CEL_DIR="path/to/your/cel_files"
OUTPUT_FILE="probeset_quantification.txt"

# Create an R script to perform PLIER quantification.
# Caveats: justPlier() is provided by the Bioconductor 'plier' package, not by
# 'affy', and it implements PLIER rather than the iterative variant (iter-plier)
# named in the source text, so this only approximates the original step.
# ReadAffy() also requires a CDF environment for the array type.
# The heredoc is unquoted so that ${CEL_DIR} and ${OUTPUT_FILE} are expanded by
# the shell; the literal dollar sign in the regex is therefore escaped.
cat <<EOF > run_plier.R
library(affy)
library(plier)

# Get list of CEL files
cel_files <- list.files("${CEL_DIR}", pattern = "[.]CEL\$", full.names = TRUE)
if (length(cel_files) == 0) {
  stop("No .CEL files found in ${CEL_DIR}")
}

# Read CEL files into an AffyBatch
raw_data <- ReadAffy(filenames = cel_files)

# Perform PLIER summarization; normalize = TRUE requests normalization
# before summarization (it is off by default)
eset_plier <- justPlier(raw_data, normalize = TRUE)

# Extract the expression matrix and write it as a tab-separated file
expression_matrix <- exprs(eset_plier)
write.table(expression_matrix, file = "${OUTPUT_FILE}", sep = "\t", quote = FALSE, col.names = NA)
message("Probeset quantification completed and saved to ${OUTPUT_FILE}")
EOF

# Execute the R script
Rscript run_plier.R

# Clean up the R script
rm run_plier.R
3
HTA-2_0.r3.pgf
Affymetrix Power Tools (APT) (inferred with models/gemini-2.5-flash), version not explicitly stated
Bash example
# Install Affymetrix Power Tools (APT) if not already installed.
# APT binaries can be downloaded from the Thermo Fisher Scientific website.
# Conceptual example only; the URL and version below are unverified and the
# exact steps vary by OS and APT version:
# wget https://assets.thermofisher.com/TFS-Assets/LSG/software/APT_2.11.2_Linux_x86_64.zip
# unzip APT_2.11.2_Linux_x86_64.zip
# export PATH=$PATH:/path/to/APT_binaries

# Define input and output files/directories
PGF_FILE="HTA-2_0.r3.pgf"   # Probe Group File for Human Transcriptome Array 2.0 (named in the source text)
CLF_FILE="HTA-2_0.r3.clf"   # Assumed companion CEL layout file; its name is not given in the source
CEL_FILES=(sample1.CEL sample2.CEL sample3.CEL)   # Placeholders for the raw CEL files
OUTPUT_DIR="summarized_hta_data"
ANALYSIS="iter-plier"       # Named in the source text; confirm the exact analysis string with --help

# Create output directory if it doesn't exist
mkdir -p "${OUTPUT_DIR}"

# Summarize probe-level data from the CEL files into probeset expression values.
# A PGF file is paired with a CLF file (a CDF file is the alternative for older
# array types, not an addition), and CEL files are passed as positional arguments.
apt-probeset-summarize \
    -a "${ANALYSIS}" \
    -p "${PGF_FILE}" \
    -c "${CLF_FILE}" \
    -o "${OUTPUT_DIR}" \
    --log-file "${OUTPUT_DIR}/apt_summarize.log" \
    "${CEL_FILES[@]}"
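After summarization it can help to confirm that the output table has the expected shape before any downstream analysis. The sketch below assumes the summary is a tab-delimited text file in which metadata lines start with `#`, the first remaining row is a header beginning with `probeset_id`, and every later row is one probeset. The function name `describe_summary` and the file name in the usage note are hypothetical.

```shell
# describe_summary FILE
# Print the number of sample columns and probeset rows in an APT-style
# summary table. Hypothetical helper; the layout assumptions are stated above.
describe_summary() {
    awk -F '\t' '
        /^#/ { next }                    # skip metadata/comment lines
        !seen_header {                   # first remaining row is the column header
            samples = NF - 1             # all columns except probeset_id
            seen_header = 1
            next
        }
        { probesets++ }                  # every later row is one probeset
        END { printf "samples=%d probesets=%d\n", samples, probesets }
    ' "$1"
}
```

For example, `describe_summary summarized_hta_data/iter-plier.summary.txt` would report the counts for that file, if APT writes its output under that name. A sample count that differs from the number of CEL files supplied is a sign that a file was skipped or failed to load.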
Raw Source Text
Data processed using Affymetrix package (Affy Power Tools) apt-probeset-summarize. Iter-plier algorithm used to quantify probesets. HTA-2_0.r3.pgf