GSE220459 Processing Pipeline

RNA-Seq, 2 steps

Publication

Epistatic interactions between NMD and TRP53 control progenitor cell maintenance and brain size.

Neuron (2024) — PMID 38697111

Dataset

GSE220459

Epistatic interactions between NMD and TRP53 control progenitor cell maintenance and brain size (RNA-seq NPC)

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1. The raw data was mapped using STAR.

    STAR v2.7.10a (version inferred with models/gemini-2.5-flash)
    $ Bash example
    # Install STAR (example using conda)
    # conda create -n star_env star -y
    # conda activate star_env
    
    # Build STAR genome index (run once per genome; this dataset lists assembly mm10)
    # All paths and file names below are placeholders.
    # mkdir -p /path/to/genome_index/mm10
    # STAR --runThreadN 8 \
    #      --runMode genomeGenerate \
    #      --genomeDir /path/to/genome_index/mm10 \
    #      --genomeFastaFiles /path/to/fasta/mm10.fa \
    #      --sjdbGTFfile /path/to/gtf/mm10.annotation.gtf \
    #      --sjdbOverhang 100 # Recommended: (read_length - 1)
    
    # Map raw data using STAR (example assumes paired-end reads; pass a single file to
    # --readFilesIn for single-end data). The filtering and intron-size settings below are
    # illustrative; the source text does not report the parameters actually used.
    STAR --runThreadN 8 \
         --genomeDir /path/to/genome_index/mm10 \
         --readFilesIn sample_R1.fastq.gz sample_R2.fastq.gz \
         --readFilesCommand zcat \
         --outFileNamePrefix sample_ \
         --outSAMtype BAM SortedByCoordinate \
         --outBAMcompression 6 \
         --outFilterMultimapNmax 20 \
         --outFilterMismatchNmax 3 \
         --alignIntronMax 1000000 \
         --alignMatesGapMax 1000000 \
         --quantMode GeneCounts # Optional: for gene-level quantification
  2. We calculated the gene-level read counts and identified differentially expressed genes by in-house script.

    In-house script (version not available)
    $ Bash example
    # This script calculates gene-level read counts and identifies differentially expressed genes.
    # The specific implementation details are within the "in-house script".
    #
    # Pre-requisite: Gene-level read counts (e.g., from featureCounts, HTSeq-count, or RSEM).
    # Example for generating counts (assuming aligned BAM files in 'aligned_bams/'):
    # conda install -c bioconda subread
    # featureCounts -a mm10.annotation.gtf -o gene_counts.tsv -F GTF -t exon -g gene_id aligned_bams/*.bam
    #
    # Input for DE analysis: gene_counts.tsv (gene-level read counts), sample_metadata.tsv (experimental design)
    # Output: diff_exp_results.tsv (table of differentially expressed genes)
    #
    # Reference datasets (placeholders):
    # - Gene annotation: mm10.annotation.gtf (a mouse annotation matching the mm10 assembly; use the same GTF as for the STAR index)
    
    # Execute the in-house script for differential expression analysis.
    # The actual command, script name, and parameters will depend on the specific in-house implementation.
    # This is a placeholder command demonstrating typical inputs and outputs.
    Rscript run_inhouse_de_script.R \
      --counts_file gene_counts.tsv \
      --metadata_file sample_metadata.tsv \
      --output_file diff_exp_results.tsv \
      --design_formula "~ condition + batch" \
      --min_reads_per_gene 10 \
      --fdr_threshold 0.05 \
      --log2fc_threshold 1.0
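
The source does not say how the gene-level counts were produced. As one possible route, if STAR is run with --quantMode GeneCounts as in the step 1 example, it writes a ReadsPerGene.out.tab file per sample, and these can be merged into a single count matrix. The sketch below assumes that route. The function name merge_star_counts and the file names are hypothetical, and the column to use depends on library strandedness, which the source does not state.

```shell
# merge_star_counts COLUMN FILE...
# Merge STAR ReadsPerGene.out.tab files into one tab-separated count matrix.
# COLUMN selects which count column to keep:
#   2 = unstranded, 3 = read 1 on the gene strand, 4 = read 2 on the gene strand.
# Sample names are taken from the file names (hypothetical naming convention:
# <sample>_ReadsPerGene.out.tab).
merge_star_counts() {
  local col="$1"
  shift
  awk -v col="$col" '
    BEGIN { OFS = "\t" }
    # First line of each input file: register a new sample column.
    FNR == 1 {
      nfiles++
      name = FILENAME
      sub(/^.*\//, "", name)                       # drop directory part
      sub(/_?ReadsPerGene\.out\.tab$/, "", name)   # drop STAR suffix
      sample[nfiles] = name
    }
    # Skip the four summary rows at the top of each file.
    $1 ~ /^N_(unmapped|multimapping|noFeature|ambiguous)$/ { next }
    {
      if (!($1 in seen)) { seen[$1] = 1; gene[++ngenes] = $1 }
      count[$1, nfiles] = $col
    }
    END {
      line = "gene_id"
      for (i = 1; i <= nfiles; i++) line = line OFS sample[i]
      print line
      for (g = 1; g <= ngenes; g++) {
        line = gene[g]
        for (i = 1; i <= nfiles; i++) {
          # Genes missing from a file are reported as 0.
          value = ((gene[g], i) in count) ? count[gene[g], i] : 0
          line = line OFS value
        }
        print line
      }
    }
  ' "$@"
}

# Example call (file names are placeholders):
# merge_star_counts 2 sampleA_ReadsPerGene.out.tab sampleB_ReadsPerGene.out.tab > gene_counts.tsv
```

The resulting gene_counts.tsv has one row per gene and one column per sample, which is the shape most differential expression tools expect as input. Genes are listed in the order they first appear in the input files.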

Raw Source Text
The raw data was mapped using STAR.
We calculated the gene-level read counts and identified differentially expressed genes by in-house script.
Assembly: mm10