GSE277082 Processing Pipeline
Publication
Neuronal aging causes mislocalization of splicing proteins and unchecked cellular stress. Nature Neuroscience (2025), PMID 40456907
Dataset
GSE277082: Aging-linked deterioration of RNA metabolism destabilizes the stress response of neurons [RNASeq, RiboSeq]
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
Riboseq data was analyzed as described previously; see Tirosh et al.
$ Bash example
# Install Bowtie
# conda install -c bioconda bowtie=1.1.1
# Install samtools (required for BAM conversion and sorting)
# conda install -c bioconda samtools

# Define paths and files
# NOTE: hg19 is used because the GEO record lists "Assembly: hg19".
GENOME_DIR="genome/hg19"
GENOME_FA="${GENOME_DIR}/hg19.fa"
GENOME_INDEX_PREFIX="${GENOME_DIR}/hg19_index"
RRNA_FA="${GENOME_DIR}/hg19_rRNA.fa"              # Placeholder for rRNA sequences
RRNA_INDEX_PREFIX="${GENOME_DIR}/hg19_rRNA_index"
INPUT_FASTQ="riboseq_reads.fastq"                 # Replace with actual input file
OUTPUT_FASTQ_GENOME="riboseq_rRNA_unmapped.fastq"
OUTPUT_SAM_GENOME="riboseq_genome_mapped.sam"
OUTPUT_BAM_GENOME="riboseq_genome_mapped.bam"
OUTPUT_SORTED_BAM="riboseq_genome_mapped.sorted.bam"
OUTPUT_P_SITE_COUNTS="riboseq_p_site_counts.tsv"  # Example output for custom script

# --- Reference Data Preparation ---
# Download the human genome (hg19) from UCSC
# mkdir -p "${GENOME_DIR}"
# wget -P "${GENOME_DIR}" http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz
# gunzip "${GENOME_FA}.gz"

# Obtain human rRNA sequences (e.g., from NCBI or from genome annotations)
# and save them as FASTA in ${RRNA_FA}.
# Example: echo ">rRNA_sequence" > "${RRNA_FA}"
# Example: echo "AGCTAGCT..." >> "${RRNA_FA}"

# Build Bowtie indices for genome and rRNA
# bowtie-build "${GENOME_FA}" "${GENOME_INDEX_PREFIX}"
# bowtie-build "${RRNA_FA}" "${RRNA_INDEX_PREFIX}"

# --- Riboseq Data Analysis Pipeline (sketch following Tirosh et al. 2015; not verified against the paper) ---

# 1. Align reads to rRNA and keep only the unmapped reads.
#    Reads mapping to rRNA are discarded.
#    -S:   output SAM format
#    -v 2: allow up to 2 mismatches (assumed here for both rRNA and genome)
#    --un: write reads that did not align to a file
bowtie -S -v 2 --un "${OUTPUT_FASTQ_GENOME}" "${RRNA_INDEX_PREFIX}" "${INPUT_FASTQ}" > /dev/null  # Discard rRNA-mapped SAM

# 2. Align rRNA-unmapped reads to the genome
#    -S:   output SAM format
#    -v 2: allow up to 2 mismatches
bowtie -S -v 2 "${GENOME_INDEX_PREFIX}" "${OUTPUT_FASTQ_GENOME}" "${OUTPUT_SAM_GENOME}"

# 3. Convert SAM to BAM, sort, and index
samtools view -bS "${OUTPUT_SAM_GENOME}" > "${OUTPUT_BAM_GENOME}"
samtools sort "${OUTPUT_BAM_GENOME}" -o "${OUTPUT_SORTED_BAM}"
samtools index "${OUTPUT_SORTED_BAM}"

# 4. P-site inference and ORF assignment (custom script)
#    The original snippet attributes the following rule to Tirosh et al.;
#    check it against the paper before use:
#      - reads 28-30 nt long: P-site = 5' end + 15 nt
#      - reads 31-33 nt long: P-site = 5' end + 16 nt
#      - reads are then assigned to ORFs by their P-site position
#    This step requires custom scripting (e.g., Python, R, Perl) to parse the BAM file,
#    filter reads by length, shift 5' end coordinates to the P-site, and intersect
#    the P-sites with a gene/ORF annotation file (e.g., GTF/GFF).
#
#    Example placeholder for a custom script execution:
# python custom_riboseq_p_site_analysis.py \
#   --input_bam "${OUTPUT_SORTED_BAM}" \
#   --orf_annotation /path/to/hg19_annotation.gtf \
#   --output_file "${OUTPUT_P_SITE_COUNTS}"
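The P-site inference in step 4 is left to a custom script. As a rough illustration only, the sketch below applies the offsets quoted in that step (15 nt for 28-30 nt reads, 16 nt for 31-33 nt reads) to SAM records read from standard input. The function name `psite_from_sam` and the output layout are hypothetical, and the code assumes ungapped alignments (as Bowtie 1 produces), so the aligned length equals the read length. It does not perform ORF assignment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: infer P-site positions from ungapped SAM records on stdin.
# Output columns (tab-separated): chromosome, P-site position (1-based), strand, read length.
# Assumption: offsets are those quoted in step 4; verify them against the cited paper.
psite_from_sam() {
  awk 'BEGIN { FS = OFS = "\t" }
    /^@/ { next }                                  # skip SAM header lines
    {
      flag = $2; chrom = $3; pos = $4; len = length($10)
      if (int(flag / 4) % 2 == 1) { next }         # skip unmapped reads (flag bit 0x4)
      if (len >= 28 && len <= 30) {
        off = 15
      } else if (len >= 31 && len <= 33) {
        off = 16
      } else {
        next                                       # discard other read lengths
      }
      if (int(flag / 16) % 2 == 1) {               # reverse strand (flag bit 0x10)
        strand = "-"
        psite = pos + len - 1 - off                # 5-prime end is the rightmost base
      } else {
        strand = "+"
        psite = pos + off                          # 5-prime end is the leftmost base
      }
      print chrom, psite, strand, len
    }'
}

# Possible usage (file names are placeholders):
# samtools view riboseq_genome_mapped.sorted.bam | psite_from_sam > riboseq_p_sites.tsv
```

The resulting table could then be intersected with an ORF annotation (for example with a GTF-aware tool) to obtain per-ORF counts; that part is not shown here.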
2
Tirosh et al. 2015, doi:10.1371/journal.ppat.1005288 (the reference cited in step 1)
$ Bash example
# Install STAR (example using conda)
# conda install -c bioconda star=2.5.2b

# Define variables
FASTQ_FILE="sample_R1.fastq.gz"   # Placeholder for input FASTQ file
OUTPUT_PREFIX="sample_aligned"    # Prefix for output files
NUM_THREADS=8                     # Number of threads to use

# Reference genome directory (hg19, matching "Assembly: hg19" in the GEO record)
# This directory should contain the STAR genome index files (SA, SAindex, Genome, etc.)
# To generate a STAR genome index:
# STAR --runThreadN ${NUM_THREADS} --runMode genomeGenerate \
#   --genomeDir /path/to/STAR_INDEX_DIR/hg19 \
#   --genomeFastaFiles /path/to/hg19.fa \
#   --sjdbGTFfile /path/to/hg19.gtf \
#   --sjdbOverhang 100   # Ideally set to (maximum read length - 1)
STAR_INDEX_DIR="/path/to/STAR_INDEX_DIR/hg19"   # Placeholder for STAR genome index directory

# Run STAR alignment (single-end, gzip-compressed FASTQ)
STAR --runThreadN "${NUM_THREADS}" \
  --genomeDir "${STAR_INDEX_DIR}" \
  --readFilesIn "${FASTQ_FILE}" \
  --outFileNamePrefix "${OUTPUT_PREFIX}" \
  --outSAMtype BAM SortedByCoordinate \
  --outSAMattributes All \
  --outSAMunmapped Within \
  --outFilterMultimapNmax 1 \
  --outFilterMismatchNmax 3 \
  --outFilterMismatchNoverLmax 0.1 \
  --alignIntronMin 20 \
  --alignIntronMax 1000000 \
  --alignMatesGapMax 1000000 \
  --alignSJoverhangMin 8 \
  --alignSJDBoverhangMin 1 \
  --sjdbScore 1 \
  --readFilesCommand zcat
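The index-generation comment above notes that `--sjdbOverhang` should follow from the read length. A minimal sketch of how that value might be derived from the data is given below; the helper name `max_read_length` and the default of sampling 10000 reads are assumptions made for illustration, not part of the source pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report the longest read among the first N reads
# of an uncompressed FASTQ stream on stdin (N defaults to 10000).
max_read_length() {
  awk -v n="${1:-10000}" '
    NR % 4 == 2 {                         # sequence lines are every 4th line, starting at line 2
      if (length($0) > max) { max = length($0) }
      if (++seen >= n) { exit }           # stop after sampling n reads
    }
    END { print max + 0 }                 # prints 0 for empty input
  '
}

# Possible usage (file name is a placeholder):
# SJDB_OVERHANG=$(( $(zcat sample_R1.fastq.gz | max_read_length) - 1 ))
# echo "Suggested --sjdbOverhang: ${SJDB_OVERHANG}"
```

Sampling only the first reads keeps this fast on large files, at the cost of possibly missing a longer read later in the file when reads have been trimmed to variable lengths.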
Raw Source Text
Riboseq data was analyzed as described previously; see Tirosh et al. 2015 doi:10.1371/journal.ppat.1005288
Assembly: hg19
Supplementary files format and content: RNA-seq and footprinting reads for each sample in duplicate