GSE86224 Processing Pipeline
3 steps
Publication
Protein-RNA Networks Regulated by Normal and ALS-Associated Mutant HNRNPA2B1 in the Nervous System. Neuron (2016), PMID 27773581
Dataset
GSE86224: HNRNPA2B1 regulates alternative RNA processing in the nervous system and accumulates in granules in ALS IPSC-derived motor neurons [hnRNPA2B1_Arrays_…
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
Data were processed with apt-probeset-summarize from the Affymetrix Power Tools (APT) package.
Microarray (inferred with models/gemini-2.5-flash)
Bash example
# Affymetrix Power Tools (APT) is distributed by Thermo Fisher Scientific.
# If you install it through a package manager such as Bioconda, confirm the package name first.

# Summarize probe-level data from CEL files into a probeset-level expression matrix.
# The source does not give the exact command; every path and the analysis name below are placeholders.
# The source cites a PGF library file (mjay.pgf); PGF-based arrays also need the matching CLF file.
# For CDF-based arrays, pass --cdf-file instead of --pgf-file and --clf-file.
# Run `apt-probeset-summarize --help` to list the analysis names your APT version accepts.
ANALYSIS="iter-plier"   # assumed analysis name, taken from the Iter-plier step of this pipeline

apt-probeset-summarize \
  --analysis "${ANALYSIS}" \
  --pgf-file path/to/mjay.pgf \
  --clf-file path/to/library_file.clf \
  --out-dir output_summary \
  --log-file apt_probeset_summarize.log \
  input_sample1.CEL input_sample2.CEL
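Instead of listing CEL files on the command line, `apt-probeset-summarize` can read them from a manifest passed to `--cel-files`: a text file whose first line is the header `cel_files`, followed by one CEL path per line. The sketch below builds such a manifest. The helper name `write_cel_manifest` and the example paths are hypothetical and do not come from the source.

```shell
#!/usr/bin/env bash
# write_cel_manifest is a hypothetical helper, not part of APT.
# Usage: write_cel_manifest <cel_dir> <manifest_path>
write_cel_manifest() {
  local cel_dir="$1" manifest="$2"

  # List CEL files (case-insensitive extension), sorted for reproducibility.
  find "$cel_dir" -maxdepth 1 -type f -iname '*.cel' | sort > "${manifest}.tmp"

  # Stop if the listing is empty, so APT is never given a header-only manifest.
  if [ ! -s "${manifest}.tmp" ]; then
    rm -f "${manifest}.tmp"
    echo "No CEL files found in ${cel_dir}" >&2
    return 1
  fi

  # Header line first, then one CEL path per line.
  { printf 'cel_files\n'; cat "${manifest}.tmp"; } > "$manifest"
  rm -f "${manifest}.tmp"
}

# Example (placeholder paths):
# write_cel_manifest data/raw_cel_files cel_files.txt
# apt-probeset-summarize ... --cel-files cel_files.txt
```

A manifest keeps the exact set of input arrays on record next to the results, which is useful when the same directory holds CEL files from more than one experiment.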
2
The Iter-PLIER algorithm was used to quantify probesets.
Iter-PLIER (tool and version inferred; the source names only the algorithm)
Bash example
#!/bin/bash
# Iter-PLIER is run here through apt-probeset-summarize, the tool named in the source,
# rather than through an R package. The exact analysis string, library files, and paths
# are not given in the source, so the values below are placeholders to adapt.
set -euo pipefail

# Directory containing the raw Affymetrix .CEL files (placeholder)
CEL_FILES_DIR="data/raw_cel_files"
# Directory where the quantified probesets will be written (placeholder)
OUTPUT_DIR="results/quantification"
# Library files for the array (placeholders; the source cites mjay.pgf)
PGF_FILE="path/to/mjay.pgf"
CLF_FILE="path/to/library_file.clf"
# Assumed analysis name; check `apt-probeset-summarize --help` for the names your APT version accepts
ANALYSIS="iter-plier"

# Create the output directory if it doesn't exist
mkdir -p "${OUTPUT_DIR}"

# Collect the CEL files (case-insensitive extension) and stop early if none are found
shopt -s nullglob nocaseglob
cel_files=("${CEL_FILES_DIR}"/*.cel)
if [ "${#cel_files[@]}" -eq 0 ]; then
  echo "Error: no .CEL files found in ${CEL_FILES_DIR}" >&2
  exit 1
fi

# Quantify probesets; APT writes the summary table into OUTPUT_DIR
apt-probeset-summarize \
  --analysis "${ANALYSIS}" \
  --pgf-file "${PGF_FILE}" \
  --clf-file "${CLF_FILE}" \
  --out-dir "${OUTPUT_DIR}" \
  "${cel_files[@]}"

echo "Iter-PLIER quantification finished. Results are in ${OUTPUT_DIR}."
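APT writes each analysis result as a tab-separated summary table (a file ending in `.summary.txt`) that starts with `#%` metadata lines, followed by a header row whose first column is `probeset_id`. As a quick sanity check on the quantified probesets, the sketch below strips the metadata and reports the table dimensions. The helper name `summary_dims` and the example path are hypothetical.

```shell
#!/usr/bin/env bash
# summary_dims is a hypothetical helper, not part of APT.
# Prints "<probesets> <samples>" for an APT-style summary table.
summary_dims() {
  local summary="$1"
  # Drop '#' metadata lines; the first remaining row is the column header.
  grep -v '^#' "$summary" | awk -F '\t' '
    NR == 1 { samples = NF - 1; next }   # first column is probeset_id
    NF > 0  { probesets++ }              # count non-empty data rows
    END     { print probesets + 0, samples + 0 }'
}

# Example (placeholder path; the file name depends on the analysis string used):
# summary_dims results/quantification/iter-plier.summary.txt
```

The sample count should match the number of CEL files passed to the summarization step; a mismatch suggests that some arrays were skipped or that the wrong output file was read.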
3
http://exon.ucsc.edu/documentation/mjay_library/mjay.pgf
Unknown (inferred with models/gemini-2.5-flash), version N/A
Bash example
# The description 'http://exon.ucsc.edu/documentation/mjay_library/mjay.pgf' is a URL to a .pgf file.
# In Affymetrix Power Tools, a PGF (probe group file) is a library file that groups probes into probesets,
# so this entry most likely identifies the array library file used by apt-probeset-summarize rather than a separate processing step.
# The source gives no command for it, so none is inferred here; the line below only reports that.
echo "Step 3 names a library file (mjay.pgf), not a tool; no separate command is inferred."
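Affymetrix library files such as a PGF usually begin with `#%key=value` metadata lines (for example `#%chip_type=...`). Assuming `mjay.pgf` follows that convention, a small sketch like the one below can confirm which array a downloaded library file describes before it is passed to APT. The helper name `pgf_header_value` and the local file name are hypothetical.

```shell
#!/usr/bin/env bash
# pgf_header_value is a hypothetical helper, not part of APT.
# Prints the value(s) of a '#%key=value' header field from a library file.
# Usage: pgf_header_value <file> <key>
pgf_header_value() {
  local file="$1" key="$2"
  # Match the metadata line for this key and keep everything after the first '='.
  grep "^#%${key}=" "$file" | cut -d '=' -f 2-
}

# Example (placeholder path):
# pgf_header_value mjay.pgf chip_type
```

An empty result means the key is absent, in which case the file's header should be inspected by hand.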
Tools Used
Raw Source Text
Data processed using Affymetrix package (Affy Power Tools) apt-probeset-summarize. Iter-plier algorithm used to quantify probesets. http://exon.ucsc.edu/documentation/mjay_library/mjay.pgf