GSE39855 Processing Pipeline
Publication
LIN28 binds messenger RNAs at GGAGA motifs and regulates splicing factor abundance. Molecular Cell (2012), PMID 22959275
Dataset
GSE39855: LIN28 binds messenger RNAs at GGAGA motifs and regulates splicing factor abundance (splicing array)
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
Data were processed with apt-probeset-summarize from Affymetrix Power Tools (APT).
Microarray
Inferred with models/gemini-2.5-flash
$ Bash example
# Install Affymetrix Power Tools (APT), typically downloaded from the Thermo Fisher Scientific website.
# Example installation (adjust version and path as needed):
# wget https://assets.thermofisher.com/TFS-Assets/LSG/software/APT_2.11.2_Linux_x86_64.zip
# unzip APT_2.11.2_Linux_x86_64.zip
# export PATH=$PATH:/path/to/APT/bin

# Placeholders for the input CEL files and the output directory.
# 'input_cel_files.txt' is a text file whose first line is the header 'cel_files',
# followed by one .CEL file path per line.
# The library files for your array type (.pgf and .clf, or a .cdf passed with -d) are required
# and can be downloaded from the Thermo Fisher Scientific website.
# -a selects the analysis; 'rma' here is only a placeholder (the dataset description
# names Iter-PLIER as the quantification method).

apt-probeset-summarize \
  -a rma \
  -p /path/to/library/file.pgf \
  -c /path/to/library/file.clf \
  --cel-files input_cel_files.txt \
  -o ./summarized_output_directory \
  --log-file apt_summarize.log
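The --cel-files argument expects a text file with the column header cel_files on its first line, followed by one CEL path per line. A minimal sketch for building that list, assuming the CEL files sit together in one directory; the function name make_cel_list and the paths in the usage line are hypothetical, not part of the dataset description:

```shell
# make_cel_list DIR OUT
# Writes an APT-style file list to OUT: a "cel_files" header line,
# then the sorted paths of all *.CEL / *.cel files directly inside DIR.
make_cel_list() {
  local dir="$1" out="$2"
  printf 'cel_files\n' > "$out"
  find "$dir" -maxdepth 1 -type f \( -name '*.CEL' -o -name '*.cel' \) | sort >> "$out"
}

# Example usage (assumed paths):
# make_cel_list cel_files input_cel_files.txt
```

Sorting keeps the sample order stable between runs, which makes the column order of the output table reproducible.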
2
Probesets were quantified with the Iter-PLIER algorithm.
$ Bash example
# Iter-PLIER is one of the quantification methods built into Affymetrix Power Tools,
# so this step is most likely the analysis (-a) setting of the apt-probeset-summarize
# run from step 1 rather than a separate program.
# The exact analysis string used for this dataset is not given; "iter-plier" below is
# an assumption. Check the method names your APT version accepts with:
#   apt-probeset-summarize --explain iter-plier
# Library file paths are placeholders for the files matching your array type.
# 'input_cel_files.txt' starts with the header line 'cel_files', then one CEL path per line.

apt-probeset-summarize \
  -a iter-plier \
  -p /path/to/library/file.pgf \
  -c /path/to/library/file.clf \
  --cel-files input_cel_files.txt \
  -o ./iter_plier_output \
  --log-file apt_iter_plier.log
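When quantification is run through apt-probeset-summarize, each analysis is written as a tab-delimited summary table: leading metadata lines that start with "#%", a header row that starts with probeset_id, then one row per probeset with one column per CEL file. A minimal sketch for sanity-checking such a table, assuming it is named something like iter-plier.summary.txt in the output directory; the file name and the helper name summary_dims are assumptions, so check what your own run produced:

```shell
# summary_dims FILE
# Prints "<probesets> <samples>" for an APT summary table, skipping the
# leading "#" metadata lines and the probeset_id header row.
summary_dims() {
  grep -v '^#' "$1" | awk -F '\t' '
    NR == 1 { samples = NF - 1; next }   # header row: probeset_id plus one column per CEL
    NF > 0  { probesets++ }              # each non-empty row after it is one probeset
    END     { print probesets + 0, samples + 0 }'
}

# Example usage (assumed path):
# summary_dims ./iter_plier_output/iter-plier.summary.txt
```

The sample count should match the number of CEL files listed in the --cel-files input; a mismatch usually means a file was skipped or the list was malformed.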
3
Library file: HJAY_r2.pgf (the Affymetrix probe group file, PGF, for the HJAY splicing array).
$ Bash example
# HJAY_r2.pgf is an Affymetrix probe group file (PGF): the library file that tells APT
# which probes belong to which probeset on the HJAY array. It is an input to
# apt-probeset-summarize (-p), not a TeX/PGF graphics file or a pipeline output.
# A PGF is normally paired with a matching CLF file (-c). Only the PGF is named in the
# dataset description, so the CLF name and all directory paths below are placeholders.

# Inspect the file header (PGF files are plain text with "#%" metadata lines):
grep -m 10 '^#%' /path/to/library/HJAY_r2.pgf

# Pass it to apt-probeset-summarize together with the matching CLF.
# The analysis string "iter-plier" is an assumption based on step 2.
apt-probeset-summarize \
  -a iter-plier \
  -p /path/to/library/HJAY_r2.pgf \
  -c /path/to/library/file.clf \
  --cel-files input_cel_files.txt \
  -o ./summarized_output_directory
Tools Used
Raw Source Text
Data processed using Affymetrix package (Affy Power Tools) apt-probeset-summarize. Iter-plier algorithm used to quantify probesets. HJAY_r2.pgf