GSE220462 Processing Pipeline

Publication

Epistatic interactions between NMD and TRP53 control progenitor cell maintenance and brain size.

Neuron (2024) — PMID 38697111

Dataset

GSE220462

Epistatic interactions between NMD and TRP53 control progenitor cell maintenance and brain size

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1.

    The raw data was mapped using STAR.

    STAR v2.7.9a (version inferred with models/gemini-2.5-flash)
    $ Bash example
    # Install STAR (example using conda)
    # conda install -c bioconda star
    
    # Create STAR genome index (if not already done)
    # Replace /path/to/genome_fasta.fa with your mm10 reference genome FASTA file
    # Replace /path/to/annotations.gtf with a matching mm10 gene annotation GTF file
    # Replace /path/to/STAR_index_mm10 with your desired index directory
    # STAR --runThreadN 8 --runMode genomeGenerate --genomeDir /path/to/STAR_index_mm10 --genomeFastaFiles /path/to/genome_fasta.fa --sjdbGTFfile /path/to/annotations.gtf --sjdbOverhang 100
    
    # Define variables
    GENOME_DIR="/path/to/STAR_index_mm10" # Placeholder: Replace with path to your STAR genome index (the source lists assembly mm10)
    READ1="sample_R1.fastq.gz" # Placeholder: Replace with your R1 FASTQ file
    READ2="sample_R2.fastq.gz" # Placeholder: Replace with your R2 FASTQ file (remove if single-end)
    OUTPUT_PREFIX="sample_aligned_"
    THREADS=8 # Number of threads to use
    
    # Run STAR alignment
    # Note: the source states only that STAR was used. The filtering and seeding
    # values below are illustrative and are not taken from the publication.
    # --readFilesCommand zcat is needed because the FASTQ files are gzip-compressed.
    STAR \
      --genomeDir "${GENOME_DIR}" \
      --readFilesIn "${READ1}" "${READ2}" \
      --readFilesCommand zcat \
      --runThreadN "${THREADS}" \
      --outFileNamePrefix "${OUTPUT_PREFIX}" \
      --outSAMtype BAM SortedByCoordinate \
      --outSAMattributes All \
      --outFilterMismatchNmax 3 \
      --outFilterScoreMinOverLread 0.66 \
      --outFilterMatchNminOverLread 0.66 \
      --alignIntronMin 20 \
      --alignIntronMax 1000000 \
      --alignMatesGapMax 1000000 \
      --outReadsUnmapped Fastx \
      --outFilterType BySJout \
      --outFilterMultimapNmax 20 \
      --outFilterMultimapScoreRange 1 \
      --outFilterScoreMin 10 \
      --outFilterMatchNmin 10 \
      --limitSjdbInsertNsj 1200000 \
      --sjdbScore 1 \
      --seedSearchStartLmax 30 \
      --seedPerReadNmax 1000 \
      --seedPerWindowNmax 50 \
      --alignTranscriptsPerReadNmax 10000 \
      --alignTranscriptsPerWindowNmax 1000
    
    # The output BAM file will be named sample_aligned_Aligned.sortedByCoord.out.bam
    # Other output files (e.g., Log.final.out, SJ.out.tab) will also be generated.
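
The Log.final.out file that STAR writes alongside the BAM gives a quick check on alignment quality before counting. The following is a minimal sketch, not part of the described pipeline: the helper name star_unique_rate is hypothetical, and it assumes the usual "label |<TAB>value" layout of Log.final.out.

```shell
# Print the "Uniquely mapped reads %" value from a STAR Log.final.out file.
# Lines in that file are assumed to have the form "<label> |<TAB><value>".
star_unique_rate() {
  log_file="$1"
  awk -F'|' '/Uniquely mapped reads %/ {
    gsub(/[ \t%]/, "", $2)   # drop whitespace and the percent sign
    print $2
  }' "$log_file"
}

# Hypothetical usage with the output prefix from the alignment step:
# star_unique_rate sample_aligned_Log.final.out
```

A low value here usually points to a mismatched reference (for example, a human index used for mouse reads) or to adapter or quality problems in the FASTQ files.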
  2.

    We calculated the gene-level read counts and identified differentially expressed genes using an in-house script.

    in-house script vN/A
    $ Bash example
    # Installation of common tools that might be wrapped by an in-house script
    # conda install -c bioconda subread # For featureCounts
    # conda install -c conda-forge r-base # For R-based DE analysis (DESeq2, edgeR)
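    # DESeq2 and edgeR are Bioconductor packages, so they are installed through BiocManager rather than install.packages alone
    # R -e "install.packages('BiocManager', repos='https://cloud.r-project.org'); BiocManager::install(c('DESeq2', 'edgeR'))"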
    
    # Placeholder for input BAM files (replace with actual paths)
    BAM_FILES="sample1_rep1.bam sample1_rep2.bam sample2_rep1.bam sample2_rep2.bam"
    
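    # Placeholder for gene annotation GTF file (replace with actual path or download)
    # Example download for GRCm38 (mm10, the assembly listed in the source); Ensembl release 102 is the last GRCm38 release:
    # wget -O Mus_musculus.GRCm38.102.gtf.gz "https://ftp.ensembl.org/pub/release-102/gtf/mus_musculus/Mus_musculus.GRCm38.102.gtf.gz"
    # gunzip Mus_musculus.GRCm38.102.gtf.gz
    # Note: chromosome names in the GTF must match those in the BAM files (Ensembl uses "1", UCSC mm10 uses "chr1").
    GENE_ANNOTATION="Mus_musculus.GRCm38.102.gtf"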
    
    # Placeholder for experimental design file (e.g., tab-separated file with sample_id, condition)
    # Example design.tsv:
    # sample_id    condition
    # sample1_rep1    treated
    # sample1_rep2    treated
    # sample2_rep1    control
    # sample2_rep2    control
    DESIGN_FILE="design.tsv"
    
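    # Run a hypothetical wrapper standing in for the in-house script.
    # The script name and every option below are placeholders; the actual command
    # depends on how the in-house script was implemented, which the source does not describe.
    # Such a script typically takes aligned BAM files, a gene annotation GTF, and a design file,
    # and performs gene quantification followed by differential expression analysis.
    # ${BAM_FILES} is left unquoted on purpose so that each BAM path is passed as a separate argument.
    ./in_house_expression_pipeline.sh \
      --bams ${BAM_FILES} \
      --gtf "${GENE_ANNOTATION}" \
      --design "${DESIGN_FILE}" \
      --output_dir ./results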
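
The source does not say which tool produced the gene-level counts. Assuming featureCounts from Subread (the tool named in the installation comments above, not confirmed by the authors), the counting half of such a script could be sketched as below. The helper name counts_to_matrix and the file name counts.txt are hypothetical, and the paired-end flags should be dropped for single-end libraries.

```shell
# Assumed counting step (requires Subread; not confirmed by the source):
# featureCounts -T 8 -p --countReadPairs -a annotation.gtf -o counts.txt sample1_rep1.bam sample1_rep2.bam

# Reduce featureCounts output to a gene-by-sample count matrix:
# drop the leading "#" command line and the Chr/Start/End/Strand/Length
# columns (columns 2-6), keeping Geneid plus one column per BAM file.
counts_to_matrix() {
  grep -v '^#' "$1" | cut -f1,7-
}

# Hypothetical usage:
# counts_to_matrix counts.txt > count_matrix.tsv
```

The resulting tab-separated matrix is the kind of input that R packages such as DESeq2 or edgeR expect, together with the design file.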

Raw Source Text
The raw data was mapped using STAR.
We calculated the gene-level read counts and identified differentially expressed genes by in-house script.
Assembly: mm10