GSE34995 Processing Pipeline

RNA-Seq, 2 steps

Publication

Integrative genome-wide analysis reveals cooperative regulation of alternative splicing by hnRNP proteins.

Cell Reports (2012), PMID 22574288

Dataset

Integrative genome-wide analysis reveals cooperative regulation of alternative splicing by hnRNP proteins (RNA-Seq)

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.

Processing Steps

Strand-specific RNA-seq reads from the control siRNA treatment and each hnRNP depletion experiment were processed as previously described (Polymenidou et al., 2011).

STAR v2.7.10a (Inferred with models/gemini-2.5-flash)

$ Bash example

# Install STAR (if not already installed)
# conda install -c bioconda star

# Define variables
# Placeholder index path. The organism and assembly are not stated in the source text;
# confirm them on the GEO record (GSE34995) before choosing a genome.
GENOME_DIR="/path/to/STAR_genome_index"
READ1="control_siRNA_R1.fastq.gz" # Example input for control siRNA treatment
READ2="control_siRNA_R2.fastq.gz" # Example input for control siRNA treatment (paired-end assumed)
OUTPUT_PREFIX="control_siRNA_aligned"
THREADS=8 # Number of CPU threads to use

# Create genome index (run once for a given genome, uncomment and adjust paths if needed)
# The FASTA and GTF must come from the same assembly (for example, GENCODE vM25 is built on
# GRCm38 and does not match a GRCm39 FASTA).
# STAR --runMode genomeGenerate \
#      --genomeDir "${GENOME_DIR}" \
#      --genomeFastaFiles /path/to/genome.primary_assembly.fasta \
#      --sjdbGTFfile /path/to/matching_annotation.gtf \
#      --runThreadN "${THREADS}"

# Align strand-specific RNA-seq reads using STAR
# Strandedness is not set at alignment time: --outSAMstrandField accepts only None or
# intronMotif (there is no "reverse" value). The library orientation (a dUTP-based,
# reverse-stranded prep is assumed here) is instead passed to the downstream counting tool.
STAR --genomeDir "${GENOME_DIR}" \
     --readFilesIn "${READ1}" "${READ2}" \
     --runThreadN "${THREADS}" \
     --outFileNamePrefix "${OUTPUT_PREFIX}_" \
     --outSAMtype BAM SortedByCoordinate \
     --outFilterMultimapNmax 20 \
     --outFilterMismatchNmax 999 \
     --outFilterMismatchNoverLmax 0.1 \
     --alignIntronMin 20 \
     --alignIntronMax 1000000 \
     --alignMatesGapMax 1000000 \
     --readFilesCommand zcat
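
The same alignment is applied to the control siRNA sample and to each hnRNP depletion sample, so the command can be generated once per sample. The sketch below is a dry run under stated assumptions: the sample labels, the FASTQ naming pattern, and the index path are hypothetical placeholders, paired-end input is assumed as in the example above, and the helper `star_cmd` only prints the STAR command so it can be reviewed before anything is executed.

```shell
# Hypothetical placeholders: adjust to your own index location and thread count.
GENOME_DIR="/path/to/STAR_genome_index"
THREADS=8

# Print (do not run) the STAR command for one sample.
# Assumes FASTQ files named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
star_cmd() {
    printf 'STAR --genomeDir %s --readFilesIn %s %s --runThreadN %s --outFileNamePrefix %s --outSAMtype BAM SortedByCoordinate --readFilesCommand zcat\n' \
        "$GENOME_DIR" "${1}_R1.fastq.gz" "${1}_R2.fastq.gz" "$THREADS" "${1}_aligned_"
}

# Sample labels are illustrative only; replace them with the actual sample names.
for sample in control_siRNA hnRNP_KD_1 hnRNP_KD_2; do
    star_cmd "$sample"
done
```

Once the printed commands and paths look right, the loop output can be piped to `sh` to run them.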

An average of 70% of reads mapped uniquely to our gene structure database, using Bowtie (version 0.12.2, with parameters -l 20 -m 5 -k 5 --best --un --max -q).

Bowtie v0.12.2

$ Bash example

# Install Bowtie 0.12.2
# conda install -c bioconda bowtie=0.12.2

# Note: 'gene_structure_index' refers to a pre-built Bowtie index for the gene structure database.
# 'input_reads.fastq' is the input FASTQ file containing the reads.
# 'aligned_reads.sam' will contain the aligned reads in SAM format.
# 'unaligned_reads.fastq' will contain reads that did not align.
# 'multimapped_reads.fastq' will contain reads suppressed because they exceed the -m 5 limit.
# --un and --max each take an output file name; the source text does not give these names,
# so the two used here are placeholders.

bowtie -q \
       -l 20 \
       -m 5 \
       -k 5 \
       --best \
       --un unaligned_reads.fastq \
       --max multimapped_reads.fastq \
       -S \
       gene_structure_index \
       input_reads.fastq \
       aligned_reads.sam
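
The source reports the fraction of reads that mapped uniquely. With `-k 5`, a read can occupy up to five lines in the SAM output, so a comparable figure has to be counted per read name and not per line. The sketch below is one way to do that and is not the authors' method. It assumes single-end reads and assumes that unaligned reads appear in the SAM file with the 0x4 flag set; if they do not, the denominator should come from the input read count instead. The function name `unique_mapping_rate` is hypothetical.

```shell
# Summarize reads with exactly one reported alignment in a SAM file.
# Usage: unique_mapping_rate aligned_reads.sam
unique_mapping_rate() {
    awk -F'\t' '
        $1 ~ /^@(HD|SQ|RG|PG|CO)$/ { next }      # skip SAM header lines
        { seen[$1] = 1 }                         # record every read name
        int($2 / 4) % 2 == 0 { hits[$1]++ }      # 0x4 bit unset: line is an alignment
        END {
            for (r in seen) {
                total++
                if (hits[r] == 1) unique++       # exactly one alignment reported
            }
            if (total == 0) { print "total=0 unique=0 pct=NA"; exit }
            printf "total=%d unique=%d pct=%.1f\n", total, unique, 100 * unique / total
        }
    ' "$1"
}
```

The flag test uses integer arithmetic instead of a bitwise function so that it also runs on awk implementations without `and()`.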

Tools Used

STAR (inferred), Bowtie

Raw Source Text

Strand-specific RNA-seq reads from control siRNA treatment, and each hnRNP depletion experiment were processed as previously described (Polymenidou et al., 2011). An average of 70% of reads mapped uniquely to our gene structure database, using Bowtie (version 0.12.2, with parameters -l 20 -m 5 -k 5 --best --un --max -q).
