GSE86464 Processing Pipeline
3 steps
Publication
Protein-RNA Networks Regulated by Normal and ALS-Associated Mutant HNRNPA2B1 in the Nervous System. Neuron (2016), PMID 27773581
Dataset
GSE86464: HNRNPA2B1 regulates alternative RNA processing in the nervous system and accumulates in granules in ALS IPSC-derived motor neurons
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
Data processed using Affymetrix package (Affy Power Tools) apt-probeset-summarize.
Microarray, version not specified (Inferred with models/gemini-2.5-flash)
$ Bash example
# Install Affymetrix Power Tools (APT).
# The Bioconda package name below is unverified; confirm it before use.
# conda install -c bioconda affy-power-tools

# Example usage of apt-probeset-summarize.
# This command summarizes Affymetrix CEL files into probe set level values.
# Replace 'path/to/library.cdf', the CEL paths, and 'path/to/output_dir' with actual paths.
# -a (--analysis) selects the summarization method. rma is shown only as an example;
#   the source text for this dataset reports Iter-PLIER.
# -d (--cdf-file) names a CDF library; arrays described by PGF/CLF library files use -p and -c instead.
# -o (--out-dir) is the directory where output files are written.
# CEL files are passed as positional arguments; --cel-files expects a text file listing them.
apt-probeset-summarize \
  -a rma \
  -d path/to/library.cdf \
  -o path/to/output_dir \
  path/to/input_cel_file_1.CEL path/to/input_cel_file_2.CEL
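When many arrays are processed, apt-probeset-summarize can read the CEL paths from a text file passed with `--cel-files`: the first line is the header `cel_files`, followed by one path per line. The sketch below builds such a list from a directory. The function name `write_cel_list` and the demo file names are hypothetical, and the placeholder files are empty, so they only illustrate the layout of the list.

```shell
#!/usr/bin/env bash
set -eu

# Write a CEL list file: a "cel_files" header line, then one CEL path per line.
# $1 = directory containing CEL files, $2 = list file to create.
write_cel_list() {
  cel_dir="$1"
  list_file="$2"
  if [ ! -d "$cel_dir" ]; then
    echo "CEL directory not found: $cel_dir" >&2
    return 1
  fi
  printf 'cel_files\n' > "$list_file"
  # Case-insensitive match so both .CEL and .cel are picked up.
  find "$cel_dir" -maxdepth 1 -type f -iname '*.cel' | sort >> "$list_file"
}

# Demo on empty placeholder files (hypothetical names, not real array data).
demo_dir=$(mktemp -d)
touch "$demo_dir/sample_A.CEL" "$demo_dir/sample_B.CEL"
write_cel_list "$demo_dir" "$demo_dir/cel_list.txt"
cat "$demo_dir/cel_list.txt"
```

The resulting file could then replace the positional CEL arguments, for example `apt-probeset-summarize -a rma -d path/to/library.cdf -o path/to/output_dir --cel-files cel_list.txt`.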
2
Iter-plier algorithm used to quantify probesets.
$ Bash example
# Install R and Bioconductor if not already present
# R -e "install.packages('BiocManager')"
# R -e "BiocManager::install('affy')"

# Create an R script that quantifies probesets.
# Note: the 'affy' package does not implement Iter-PLIER. RMA is used below as a
# stand-in, so the values will differ from the Iter-PLIER values reported for this
# dataset, which were produced with Affymetrix Power Tools.
cat << 'EOF' > quantify_probesets.R
library(affy)

# Define input and output directories.
# Assumes raw Affymetrix .CEL files are located in the 'raw_cel_files' directory.
input_dir <- "raw_cel_files"
output_dir <- "quantified_results"
dir.create(output_dir, showWarnings = FALSE)

# Check that the input directory exists
if (!dir.exists(input_dir)) {
  stop("Input directory '", input_dir, "' not found. Please create it and place .CEL files inside.")
}

# List all .CEL files in the input directory
cel_files <- list.celfiles(path = input_dir, full.names = TRUE)
if (length(cel_files) == 0) {
  stop("No .CEL files found in the input directory: ", input_dir)
}
print(paste("Found", length(cel_files), ".CEL files for quantification."))

# Read the raw probe intensities into an AffyBatch object.
# ReadAffy needs a CDF annotation package for the array type.
raw_data <- ReadAffy(filenames = cel_files)

# RMA (Robust Multi-array Average): background correction, quantile
# normalization, and median-polish summarization to one value per probeset.
eset <- rma(raw_data)

# Extract the expression matrix and attach probeset IDs
expression_matrix <- exprs(eset)
quantified_data <- data.frame(ProbesetID = featureNames(eset),
                              expression_matrix,
                              check.names = FALSE)

# Write the quantified probeset data to a CSV file
output_file <- file.path(output_dir, "probeset_quantification_rma.csv")
write.csv(quantified_data, output_file, row.names = FALSE)
print(paste("Probeset quantification complete. Results saved to:", output_file))
EOF

# Create the input directory for CEL files (if it doesn't exist)
mkdir -p raw_cel_files

# Execute the R script; it stops with an error until real .CEL files are present.
# Empty placeholder files are not valid input for ReadAffy.
Rscript quantify_probesets.R
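The source text attributes quantification to Iter-PLIER within Affymetrix Power Tools, so the summarization was presumably run through apt-probeset-summarize itself. The sketch below only assembles and prints such a call (a dry run) so it can be reviewed before execution. The analysis string `iter-plier`, the CLF file name, and all paths are assumptions, not values from the source; `apt-probeset-summarize --explain iter-plier` can be used to check the method name in a local installation.

```shell
#!/usr/bin/env bash
set -eu

# Print (do not run) an apt-probeset-summarize call for a PGF/CLF-based array.
# $1 = analysis string, $2 = PGF file, $3 = CLF file,
# $4 = CEL list file (header "cel_files"), $5 = output directory.
build_apt_command() {
  printf 'apt-probeset-summarize -a %s -p %s -c %s -o %s --cel-files %s\n' \
    "$1" "$2" "$3" "$5" "$4"
}

# Placeholder values: only hjay.pgf is named in the source text; the CLF name,
# list file, output directory, and analysis string are assumptions.
build_apt_command "iter-plier" "hjay.pgf" "hjay.clf" "cel_list.txt" "apt_out"
```

Printing the command first keeps the sketch runnable without APT installed; the printed line can be pasted into a shell once the library files and the analysis name have been verified.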
3
http://exon.ucsc.edu/documentation/mjay_library/hjay.pgf
clipper (Inferred with models/gemini-2.5-flash), version not specified (Inferred with models/gemini-2.5-flash)
$ Bash example
# Note: the source text for this step is only the URL of a PGF array library file (hjay.pgf).
# clipper is an inferred tool and is not named in the source.

# Install clipper (if not already installed)
# git clone https://github.com/yeolab/clipper.git
# cd clipper
# python setup.py install

# Example clipper usage for CLIP-seq peak calling.
# Replace with the actual BAM path and species.
INPUT_BAM="input.bam"
OUTPUT_PREFIX="peaks"
SPECIES="hg38"  # Placeholder; check which species names the installed clipper supports

# Run clipper (see 'clipper --help' for the options available in your version)
clipper \
  -b "${INPUT_BAM}" \
  -s "${SPECIES}" \
  -o "${OUTPUT_PREFIX}.bed" \
  --bonferroni
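The URL in step 3 points to a PGF (probe group file), the text library format that apt-probeset-summarize reads with `-p`. A PGF starts with `#%key=value` header lines, followed by probeset rows at column 0, atom rows indented by one tab, and probe rows indented by two tabs. The sketch below prints the header metadata and counts probesets so a downloaded library can be sanity-checked before summarization. The function names and the tiny demo file are hypothetical; the demo is not the real hjay.pgf.

```shell
#!/usr/bin/env bash
set -eu

# Print the "#%key=value" header lines of a PGF file as "key<TAB>value".
pgf_header() {
  awk '/^#%/ {
    line = substr($0, 3)
    i = index(line, "=")
    if (i > 0) print substr(line, 1, i - 1) "\t" substr(line, i + 1)
  }' "$1"
}

# Count probeset rows: non-empty lines that are neither comments nor tab-indented.
pgf_probeset_count() {
  awk '!/^#/ && !/^\t/ && NF { n++ } END { print n + 0 }' "$1"
}

# Build a tiny synthetic PGF (hypothetical content) to demonstrate the layout.
demo_pgf=$(mktemp)
{
  printf '#%%chip_type=ExampleArray\n'
  printf '#%%pgf_format_version=1.0\n'
  printf '#%%header0=probeset_id\ttype\n'
  printf '1001\tmain\n'
  printf '\t1\n'
  printf '\t\t5001\tpm:st\n'
  printf '1002\tmain\n'
  printf '\t2\n'
  printf '\t\t5002\tpm:st\n'
} > "$demo_pgf"

pgf_header "$demo_pgf"
echo "probesets: $(pgf_probeset_count "$demo_pgf")"
```

On a real library file, the `chip_type` header value should match the array type recorded in the CEL files being summarized.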
Tools Used
Raw Source Text
Data processed using Affymetrix package (Affy Power Tools) apt-probeset-summarize. Iter-plier algorithm used to quantify probesets. http://exon.ucsc.edu/documentation/mjay_library/hjay.pgf