GSE86462 Processing Pipeline

Publication

Protein-RNA Networks Regulated by Normal and ALS-Associated Mutant HNRNPA2B1 in the Nervous System.

Neuron (2016) — PMID 27773581

Dataset

GSE86462

HNRNPA2B1 regulates alternative RNA processing in the nervous system and accumulates in granules in ALS IPSC-derived motor neurons [hnRNPA2B1_Arrays_…

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1. Data processed with apt-probeset-summarize from the Affymetrix package (Affymetrix Power Tools, APT).

    Microarray; version not specified (inferred with models/gemini-2.5-flash)
    $ Bash example
    # Install Affymetrix Power Tools (APT). The Bioconda package name is not confirmed here,
    # so search for it first and substitute the name you find:
    # conda search -c bioconda "*apt*"
    # conda install -c bioconda <package_name>
    
    # Example usage of apt-probeset-summarize
    # This command processes raw Affymetrix CEL files to generate probeset summaries.
    # Replace <output_directory>, the library file paths, and the CEL paths with actual paths.
    # The library files (.pgf, .clf, and optionally .qcc and .mps) are specific to the Affymetrix array used.
    # The summarization method is chosen with -a/--analysis, for example rma or iter-plier.
    # The source reports iter-plier; rma is shown here only as a generic example.
    apt-probeset-summarize \
      -a rma \
      -p <path/to/array.pgf> \
      -c <path/to/array.clf> \
      -o <output_directory> \
      --log-file apt_summarize.log \
      <path/to/cel_file1.CEL> <path/to/cel_file2.CEL>
    # Or, instead of positional CEL paths, pass a list file with --cel-files.
    # The list file has the header line "cel_files" followed by one CEL path per line:
    # apt-probeset-summarize -a rma -p <path/to/array.pgf> -c <path/to/array.clf> -o <output_directory> --log-file apt_summarize.log --cel-files <path/to/cel_file_list.txt>
    # The output is typically a tab-delimited <analysis>.summary.txt file with expression values, plus a report file.
  2. Iter-PLIER algorithm used to quantify probesets.

    affy (R package), v1.80.0 (both inferred with models/gemini-2.5-flash)
    $ Bash example
    # Install R and Bioconductor packages if not already installed.
    # justPlier() comes from the Bioconductor 'plier' package, not from 'affy';
    # check that 'plier' is available for your Bioconductor release.
    # R -e 'if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")'
    # R -e 'BiocManager::install(c("affy", "plier"))'
    
    # Define input directory containing .CEL files
    CEL_DIR="path/to/your/cel_files"
    OUTPUT_FILE="probeset_quantification.txt"
    
    # Create an R script to perform PLIER quantification.
    # The heredoc delimiter is quoted so the shell leaves "\\." and "$" untouched;
    # the paths are passed to the script as command-line arguments instead.
    cat <<'EOF' > run_plier.R
    library(affy)
    library(plier)
    
    args <- commandArgs(trailingOnly = TRUE)
    cel_dir <- args[1]
    output_file <- args[2]
    
    # Get list of CEL files
    cel_files <- list.files(cel_dir, pattern = "\\.CEL$", full.names = TRUE, ignore.case = TRUE)
    
    if (length(cel_files) == 0) {
      stop("No .CEL files found in ", cel_dir)
    }
    
    # Read CEL files. ReadAffy() needs a CDF environment for the array type;
    # one may not be available for HTA 2.0, in which case use APT (steps 1 and 3).
    raw_data <- ReadAffy(filenames = cel_files)
    
    # Perform PLIER summarization with quantile normalization.
    # This is standard PLIER, not the iter-PLIER variant named in the source.
    eset_plier <- justPlier(raw_data, normalize = TRUE)
    
    # Extract expression matrix
    expression_matrix <- exprs(eset_plier)
    
    # Write the expression matrix to a tab-separated file
    write.table(expression_matrix, file = output_file, sep = "\t", quote = FALSE, col.names = NA)
    
    message("Probeset quantification completed and saved to ", output_file)
    EOF
    
    # Execute the R script
    Rscript run_plier.R "${CEL_DIR}" "${OUTPUT_FILE}"
    
    # Clean up the R script
    rm run_plier.R
  3. Library file: HTA-2_0.r3.pgf

    Affymetrix Power Tools (APT) (inferred with models/gemini-2.5-flash); version not explicitly stated.
    $ Bash example
    # Install Affymetrix Power Tools (APT) if not already installed.
    # APT binaries can be downloaded from the Thermo Fisher Scientific website.
    # Example installation (conceptual; the archive name depends on OS and APT version):
    # unzip <APT_archive>.zip
    # export PATH="$PATH:/path/to/APT_binaries"
    
    # Define input and output files/directories
    PGF_FILE="HTA-2_0.r3.pgf" # Probe Group File for Human Transcriptome Array 2.0
    CLF_FILE="HTA-2_0.r3.clf" # Assumed name of the matching chip layout file (a PGF pairs with a CLF, not a CDF)
    CEL_FILES=(sample1.CEL sample2.CEL sample3.CEL) # Placeholder names for the raw CEL files from the array
    OUTPUT_DIR="summarized_hta_data"
    # Method named in the source. Confirm the exact analysis string for your APT version,
    # for example with: apt-probeset-summarize --explain iter-plier
    ANALYSIS="iter-plier"
    
    # Create output directory if it doesn't exist
    mkdir -p "${OUTPUT_DIR}"
    
    # Execute the probeset summarization using APT's apt-probeset-summarize command.
    # This summarizes probe-level data from the CEL files into probeset expression values.
    # CEL files are passed as positional arguments.
    apt-probeset-summarize \
      --analysis "${ANALYSIS}" \
      --pgf-file "${PGF_FILE}" \
      --clf-file "${CLF_FILE}" \
      --out-dir "${OUTPUT_DIR}" \
      --log-file "${OUTPUT_DIR}/apt_summarize.log" \
      "${CEL_FILES[@]}"
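
When there are many arrays, listing every CEL path on the command line gets unwieldy. The sketch below builds a CEL list file instead. It assumes the layout that apt-probeset-summarize is understood to accept for its `--cel-files` option (a header line `cel_files`, then one path per line). The helper name `write_cel_list` and the demo file names are hypothetical, and the demo uses empty placeholder files rather than real array data.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: write a CEL list file with a "cel_files" header
# followed by one CEL path per line (assumed --cel-files layout).
write_cel_list() {
  cel_dir="$1"
  list_file="$2"
  printf 'cel_files\n' > "$list_file"
  # Match .CEL and .cel; sort so the sample order is reproducible.
  find "$cel_dir" -maxdepth 1 -type f -iname '*.cel' | sort >> "$list_file"
  # Every line after the header is one CEL file.
  n_cel=$(($(wc -l < "$list_file") - 1))
  if [ "$n_cel" -eq 0 ]; then
    echo "No CEL files found in $cel_dir" >&2
    return 1
  fi
  echo "Listed $n_cel CEL file(s) in $list_file" >&2
}

# Demo on empty placeholder files (not real array data).
demo_dir="$(mktemp -d)"
touch "$demo_dir/sample1.CEL" "$demo_dir/sample2.CEL"
write_cel_list "$demo_dir" "$demo_dir/cel_list.txt"
cat "$demo_dir/cel_list.txt"
```

The resulting file would then be passed to apt-probeset-summarize in place of the positional CEL arguments. Check the option against the help output of your APT version before relying on it.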
    

Tools Used

Raw Source Text
Data processed using Affymetrix package (Affy Power Tools) apt-probeset-summarize. Iter-plier algorithm used to quantify probesets.
HTA-2_0.r3.pgf
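
The description above ends with probeset-level quantification, so a quick sanity check on the resulting table can be useful. The following is a minimal sketch, assuming the output is a tab-delimited summary file with `#`-prefixed metadata lines, a header row whose first column is `probeset_id`, and one row per probeset. The helper name `summary_dims` and all values in the demo table are made up for illustration and do not come from GSE86462.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: report the dimensions of an APT-style summary table.
summary_dims() {
  awk -F '\t' '
    /^#/ { next }                                            # skip metadata lines
    !seen_header { samples = NF - 1; seen_header = 1; next } # header row
    { probesets++ }                                          # one row per probeset
    END { printf "%d probesets x %d samples\n", probesets, samples }
  ' "$1"
}

# Synthetic demo table (made-up identifiers and values).
demo_file="$(mktemp)"
{
  printf '#%%demo-key=demo-value\n'
  printf 'probeset_id\tsample1.CEL\tsample2.CEL\n'
  printf 'PS_0001\t5.10\t5.32\n'
  printf 'PS_0002\t7.85\t7.41\n'
  printf 'PS_0003\t3.02\t2.97\n'
} > "$demo_file"

dims="$(summary_dims "$demo_file")"
echo "$dims"
```

If the sample count differs from the number of CEL files supplied, some arrays were likely skipped or failed to load, and the APT log file is the place to look.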