GSE25421 Processing Pipeline — Yeo Lab Publications

Publication

Inhibition of RNA splicing triggers CHMP7 nuclear entry, impacting TDP-43 function and leading to the onset of ALS cellular phenotypes.

Neuron (2024) — PMID 39486415

Dataset

GSE25421

BA-Sulfate vs Sulfate starvation

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.

Processing Steps

1

Normalization: limma R package, normalizeWithinArrays with the loess method. Statistical analysis: Anapuce R package.

limma (inferred with models/gemini-2.5-flash)
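For orientation, the within-array loess normalization named in this step can be written out as follows. This is a sketch of the standard two-color definitions that limma's normalizeWithinArrays works with, not a description of the exact settings used for this dataset. The symbols $R$, $G$ (red and green spot intensities), $j$ (array index) and $\hat{c}_j$ (the fitted loess curve) are notation introduced here for illustration.

```latex
\begin{align}
  M &= \log_2 R - \log_2 G, &
  A &= \tfrac{1}{2}\left(\log_2 R + \log_2 G\right), \\
  M^{\mathrm{norm}} &= M - \hat{c}_j(A),
\end{align}
% \hat{c}_j is the loess curve fitted to M as a function of A on array j
```

After this step the log-ratios $M^{\mathrm{norm}}$ on each array should be centered near zero across the intensity range $A$, which is what the printed M values in the example script reflect.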

$ Bash example

# Install R first if it is not already present.
# The commented lines below are R commands; run them in an R session.
# limma and Biobase are Bioconductor packages:
# if (!requireNamespace("BiocManager", quietly = TRUE))
#     install.packages("BiocManager")
# BiocManager::install(c("limma", "Biobase")) # Biobase provides ExpressionSet objects

# Anapuce (the package name used by R is lowercase: anapuce):
# install.packages("anapuce")
# If CRAN no longer offers it, install it from the CRAN archive instead.

# Create an R script for the analysis
cat << 'EOF' > run_limma_anapuce_analysis.R
# Load necessary packages
library(limma)
library(anapuce) # Package names are case-sensitive; the package is named anapuce
library(Biobase) # Provides the ExpressionSet class used below

# --- Normalization using limma ---
# In a real scenario, you would load your raw two-color microarray data, e.g.:
# targets <- readTargets("targets.txt") # A file describing your arrays
# RG <- read.maimages(targets, source="agilent") # "agilent" is an assumed format; match it to your image-analysis output

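# For demonstration, create a dummy RGList object
set.seed(1) # Make the simulated intensities reproducible
num_genes <- 100
num_arrays <- 5
probe_ids <- paste0("Probe_", 1:num_genes)
array_ids <- paste0("array", 1:num_arrays)
# Name rows and columns so that feature and sample names line up in the ExpressionSet below
dn <- list(probe_ids, array_ids)
RG <- new("RGList", list(
  R=matrix(rnorm(num_genes * num_arrays, mean=1000, sd=200), num_genes, num_arrays, dimnames=dn),
  G=matrix(rnorm(num_genes * num_arrays, mean=800, sd=150), num_genes, num_arrays, dimnames=dn),
  targets=data.frame(FileName=paste0(array_ids, ".txt"),
                     Group=rep(c("Control", "Treated"), length.out=num_arrays),
                     row.names=array_ids)
))
RG$genes <- data.frame(ProbeID=probe_ids, row.names=probe_ids)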

print("Performing within-array normalization with loess method using limma...")
MA <- normalizeWithinArrays(RG, method="loess")
print("Normalization complete. First few normalized M values (log-ratio):")
print(head(MA$M))

# --- Statistical analysis using Anapuce ---
# The input format Anapuce expects is not stated in the source; check the package
# documentation before passing data to it.
# For demonstration, we'll create a basic ExpressionSet from the normalized data.
# In a full pipeline, you might also perform background correction and between-array
# normalization with limma before the statistical analysis.

# Create phenoData
pData <- MA$targets
rownames(pData) <- colnames(MA$M)
phenoData <- new("AnnotatedDataFrame", data=pData)

# Create featureData
fData <- MA$genes
rownames(fData) <- rownames(MA$M)
featureData <- new("AnnotatedDataFrame", data=fData)

# Create ExpressionSet (using M-values as a placeholder for expression data)
# Note: Anapuce functions may expect a different input, such as log-intensities
# rather than M-values; this is a simplified example to show package usage.
eset <- new("ExpressionSet", exprs=MA$M, phenoData=phenoData, featureData=featureData)

print("Performing statistical analysis using Anapuce...")
# Specific Anapuce functions would be called here, e.g., for differential expression,
# quality control, or visualization.
# Hypothetical call for illustration only, not a real function signature; look up the
# actual function names in the package manual:
# anapuce_results <- anapuce(eset, design_matrix, contrast_matrix)
print(paste("Anapuce package version:", packageVersion("anapuce")))
print("Anapuce analysis placeholder executed. Replace with specific Anapuce functions.")

# Save results (example)
# write.csv(MA$M, "limma_normalized_M_values.csv")
# write.csv(exprs(eset), "anapuce_input_expression.csv")
EOF

Rscript run_limma_anapuce_analysis.R
