GSE39855 Processing Pipeline

Publication

LIN28 binds messenger RNAs at GGAGA motifs and regulates splicing factor abundance.

Molecular Cell (2012), PMID 22959275

Dataset

GSE39855

LIN28 binds messenger RNAs at GGAGA motifs and regulates splicing factor abundance (splicing array)

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1. Data processed using the Affymetrix Power Tools (APT) program apt-probeset-summarize.

    Microarray (inferred with models/gemini-2.5-flash; no version given)
    $ Bash example
    # Affymetrix Power Tools (APT) is distributed through the Thermo Fisher Scientific website.
    # Example (version, URL and path are illustrative and should be checked against the current release):
    # wget https://assets.thermofisher.com/TFS-Assets/LSG/software/APT_2.11.2_Linux_x86_64.zip
    # unzip APT_2.11.2_Linux_x86_64.zip
    # export PATH=$PATH:/path/to/APT/bin
    
    # input_cel_files.txt is a text file whose first line is the header "cel_files",
    # followed by one .CEL file path per line.
    # PGF-based arrays need their library files: -p (.pgf) plus the matching -c (.clf).
    # CDF-based arrays take -d (.cdf) instead.
    # The source text names HJAY_r2.pgf; the .clf path and output directory are placeholders.
    # "-a iter-plier" is an assumed analysis string based on the source text
    # ("Iter-plier algorithm"); the authors' exact command is not given.
    
    apt-probeset-summarize \
      -a iter-plier \
      -p /path/to/library/HJAY_r2.pgf \
      -c /path/to/library/file.clf \
      --cel-files input_cel_files.txt \
      -o ./summarized_output_directory \
      --log-file apt_summarize.log
  2. Iter-PLIER algorithm used to quantify probesets.

    Affymetrix Power Tools (apt-probeset-summarize), as named in the source text; version not stated
    $ Bash example
    # Iter-PLIER is a quantification method of apt-probeset-summarize, selected through
    # the analysis string given to -a. The Bioconductor affy package provides no
    # Iter-PLIER routine, so this step is sketched as an APT call.
    #
    # Assumptions: "-a iter-plier" is one plausible analysis string; any normalization
    # or background-correction stages chained before it are not stated in the source text.
    # The .clf path, CEL list and output directory are placeholders.
    # input_cel_files.txt starts with the header line "cel_files", then one .CEL path per line.
    
    apt-probeset-summarize \
      -a iter-plier \
      -p /path/to/library/HJAY_r2.pgf \
      -c /path/to/library/file.clf \
      --cel-files input_cel_files.txt \
      -o ./iter_plier_output
    
    # APT writes a tab-delimited summary table to the output directory:
    # one row per probeset, one column per CEL file, preceded by "#%" metadata lines.
    ls ./iter_plier_output
  3. HJAY_r2.pgf

    Affymetrix probe group file (PGF) library file, presumably for the Affymetrix Human Junction Array (HJAY); no software version applies
    $ Bash example
    # HJAY_r2.pgf is an Affymetrix library file (PGF, probe group file), not a graphics file.
    # It tells apt-probeset-summarize which probes belong to which probeset and is an
    # input to the summarization, supplied with -p alongside a matching .clf file (-c)
    # that maps probe IDs to array coordinates.
    # The path below is a placeholder; obtain the file from the array's library package.
    
    pgf=/path/to/library/HJAY_r2.pgf
    
    # Fail early if the library file is missing or unreadable.
    [ -r "$pgf" ] || { echo "PGF not found: $pgf" >&2; exit 1; }
    
    # PGF files are plain text; the metadata lines at the top start with "#%".
    grep '^#%' "$pgf" | head
    
    # The file is then passed to the summarization as: apt-probeset-summarize -p "$pgf" -c <matching .clf>
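
The CEL list that step 1 passes to --cel-files can be generated rather than typed by hand. The following is a minimal sketch, assuming all CEL files sit in a single directory. The function name write_cel_list, the temporary demo directory and the sample file names are hypothetical, and the touched files are empty placeholders, not real CEL data. APT expects the first line of the list to be the header cel_files.

```shell
set -eu

# write_cel_list DIR OUT
# Write an APT-style CEL list: the header line "cel_files", then one path per line.
write_cel_list() {
  dir=$1
  out=$2
  printf 'cel_files\n' > "$out"
  # Accept both .CEL and .cel; sort so the sample order is reproducible.
  find "$dir" -maxdepth 1 -type f \( -name '*.CEL' -o -name '*.cel' \) | sort >> "$out"
}

# Demonstration on empty placeholder files (hypothetical names, not real data).
demo_dir=$(mktemp -d)
touch "$demo_dir/sample1.CEL" "$demo_dir/sample2.CEL" "$demo_dir/notes.txt"
write_cel_list "$demo_dir" "$demo_dir/input_cel_files.txt"
cat "$demo_dir/input_cel_files.txt"
```

Pointing write_cel_list at the directory that holds the downloaded CEL files yields a list that can be handed to apt-probeset-summarize through --cel-files.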

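After summarization, the probeset table usually needs its metadata lines removed before it is loaded into downstream tools. The sketch below assumes the APT summary layout: leading lines starting with "#%", then a tab-delimited table whose first column is probeset_id. The file name iter-plier.summary.txt is an assumption about how APT names the output for that analysis, and the table contents are synthetic values made up for the demonstration.

```shell
set -eu

# Synthetic stand-in for an APT summary file (values are made up).
demo_dir=$(mktemp -d)
summary="$demo_dir/iter-plier.summary.txt"
printf '#%%example_key=example_value\n#%%another_key=another_value\n' > "$summary"
printf 'probeset_id\tsample1.CEL\tsample2.CEL\n' >> "$summary"
printf '1001\t5.2\t6.1\n1002\t7.9\t7.4\n' >> "$summary"

# Drop the metadata lines; what remains is a plain tab-delimited table.
table="$demo_dir/summary.table.tsv"
grep -v '^#' "$summary" > "$table"

# Basic sanity checks: number of probesets (rows minus header) and samples (columns minus ID).
n_probesets=$(( $(wc -l < "$table") - 1 ))
n_samples=$(head -n 1 "$table" | awk -F'\t' '{print NF - 1}')
echo "probesets: $n_probesets, samples: $n_samples"
```

Comparing the sample count against the number of CEL files in the list is a quick way to confirm that every array was summarized.
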
Tools Used

Raw Source Text
Data processed using Affymetrix package (Affy Power Tools) apt-probeset-summarize. Iter-plier algorithm used to quantify probesets.
HJAY_r2.pgf