GSE220462 Processing Pipeline
2 steps
Publication
Epistatic interactions between NMD and TRP53 control progenitor cell maintenance and brain size. Neuron (2024), PMID 38697111
Dataset
GSE220462: Epistatic interactions between NMD and TRP53 control progenitor cell maintenance and brain size
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
The raw data was mapped using STAR.
$ Bash example
# Install STAR (example using conda)
# conda install -c bioconda star

# Create a STAR genome index (if not already done).
# The source reports assembly mm10, so the index should be built from a mouse mm10 reference.
# Replace /path/to/genome_fasta.fa with your mm10 reference genome FASTA file
# Replace /path/to/annotations.gtf with your gene annotation GTF file
# Replace /path/to/STAR_index_mm10 with your desired index directory
# --sjdbOverhang is ideally (read length - 1); 100 is a placeholder
# STAR --runThreadN 8 --runMode genomeGenerate --genomeDir /path/to/STAR_index_mm10 --genomeFastaFiles /path/to/genome_fasta.fa --sjdbGTFfile /path/to/annotations.gtf --sjdbOverhang 100

# Define variables
GENOME_DIR="/path/to/STAR_index_mm10"  # Placeholder: replace with the path to your STAR genome index for mm10
READ1="sample_R1.fastq.gz"             # Placeholder: replace with your R1 FASTQ file
READ2="sample_R2.fastq.gz"             # Placeholder: replace with your R2 FASTQ file (remove if single-end)
OUTPUT_PREFIX="sample_aligned_"
THREADS=8                              # Number of threads to use

# Run STAR alignment.
# The source states only that STAR was used; every parameter value below is an assumption.
# --readFilesCommand zcat is required because the FASTQ inputs are gzip-compressed.
STAR \
  --genomeDir "${GENOME_DIR}" \
  --readFilesIn "${READ1}" "${READ2}" \
  --readFilesCommand zcat \
  --runThreadN "${THREADS}" \
  --outFileNamePrefix "${OUTPUT_PREFIX}" \
  --outSAMtype BAM SortedByCoordinate \
  --outSAMattributes All \
  --outFilterMismatchNmax 3 \
  --outFilterScoreMinOverLread 0.66 \
  --outFilterMatchNminOverLread 0.66 \
  --alignIntronMin 20 \
  --alignIntronMax 1000000 \
  --alignMatesGapMax 1000000 \
  --outReadsUnmapped Fastx \
  --outFilterType BySJout \
  --outFilterMultimapNmax 20 \
  --outFilterMultimapScoreRange 1 \
  --outFilterScoreMin 10 \
  --outFilterMatchNmin 10 \
  --limitSjdbInsertNsj 1200000 \
  --sjdbScore 1 \
  --seedSearchStartLmax 30 \
  --seedPerReadNmax 1000 \
  --seedPerWindowNmax 50 \
  --alignTranscriptsPerReadNmax 10000 \
  --alignTranscriptsPerWindowNmax 1000

# The output BAM file will be named sample_aligned_Aligned.sortedByCoord.out.bam
# Other output files (e.g., sample_aligned_Log.final.out, sample_aligned_SJ.out.tab) will also be generated.
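Before moving on to quantification, it may help to confirm that the alignment succeeded. The sketch below reads the "Uniquely mapped reads %" line that STAR writes to Log.final.out. The function names `star_unique_rate` and `star_passes_qc` and the threshold are hypothetical and not part of STAR or of the published pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of STAR): print the uniquely mapped read
# percentage recorded in a STAR Log.final.out file, without the "%" sign.
star_unique_rate() {
    local log_file="$1"
    awk -F'|' '
        /Uniquely mapped reads %/ {
            value = $2
            gsub(/[ \t%]/, "", value)   # strip whitespace and the percent sign
            print value
            exit
        }
    ' "$log_file"
}

# Hypothetical QC gate: succeed only if the rate meets a minimum threshold.
# The threshold is the caller's choice; the source does not specify one.
star_passes_qc() {
    local log_file="$1" min_rate="$2"
    local rate
    rate="$(star_unique_rate "$log_file")"
    [ -n "$rate" ] && awk -v r="$rate" -v m="$min_rate" 'BEGIN { exit !(r + 0 >= m + 0) }'
}
```

A possible use, with an arbitrary illustrative threshold of 70: `star_passes_qc sample_aligned_Log.final.out 70 || echo "low unique mapping rate" >&2`.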
2
We calculated the gene-level read counts and identified differentially expressed genes by in-house script.
in-house script vN/A
$ Bash example
# Installation of common tools that an in-house script might wrap
# conda install -c bioconda subread      # For featureCounts
# conda install -c conda-forge r-base    # For R-based DE analysis (DESeq2, edgeR)
# DESeq2 and edgeR are Bioconductor packages, so they are installed through BiocManager:
# R -e "install.packages('BiocManager', repos='https://cloud.r-project.org'); BiocManager::install(c('DESeq2', 'edgeR'))"

# Placeholder for input BAM files (replace with actual paths)
BAM_FILES="sample1_rep1.bam sample1_rep2.bam sample2_rep1.bam sample2_rep2.bam"

# Placeholder for gene annotation GTF file (replace with actual path or download)
# The source reports assembly mm10 (GRCm38); Ensembl release 102 is the last GRCm38 release.
# Ensembl names chromosomes "1", "2", ... while UCSC mm10 uses "chr1", "chr2", ...;
# the GTF must use the same naming as the FASTA used to build the STAR index.
# Example download for GRCm38:
# wget -O Mus_musculus.GRCm38.102.gtf.gz "https://ftp.ensembl.org/pub/release-102/gtf/mus_musculus/Mus_musculus.GRCm38.102.gtf.gz"
# gunzip Mus_musculus.GRCm38.102.gtf.gz
GENE_ANNOTATION="Mus_musculus.GRCm38.102.gtf"

# Placeholder for experimental design file (e.g., tab-separated file with sample_id, condition)
# Example design.tsv (sample names and conditions are illustrative, not taken from the dataset):
# sample_id	condition
# sample1_rep1	treated
# sample1_rep2	treated
# sample2_rep1	control
# sample2_rep2	control
DESIGN_FILE="design.tsv"

# Execute the conceptual in-house script.
# The script name and its flags are hypothetical; the actual in-house script is not published in the source.
# Such a script typically takes aligned BAM files, a gene annotation GTF, and a design file,
# then performs gene quantification and differential expression analysis.
# ${BAM_FILES} is left unquoted on purpose so that it expands to one argument per BAM file.
./in_house_expression_pipeline.sh \
  --bams ${BAM_FILES} \
  --gtf "${GENE_ANNOTATION}" \
  --design "${DESIGN_FILE}" \
  --output_dir ./results
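The source does not say how the in-house script computes or organizes gene-level counts. As one illustration of the step between quantification and differential expression, the sketch below merges per-sample count files into a single gene-by-sample matrix. It assumes each sample has a two-column, tab-separated file (gene ID, count) with no header; that layout and the function name `merge_counts` are assumptions, not the authors' method.

```shell
#!/usr/bin/env bash
# Hypothetical helper: merge per-sample two-column count files
# (gene_id <TAB> count, no header) into one gene-by-sample matrix on stdout.
# Column names are the input file names with directory and extension removed.
# Genes missing from a sample are reported with a count of 0.
merge_counts() {
    awk -F'\t' -v OFS='\t' '
        FNR == 1 {                      # first record of each input file
            n++
            name = FILENAME
            sub(/.*\//, "", name)       # drop the directory part
            sub(/\.[^.]*$/, "", name)   # drop the file extension
            sample[n] = name
        }
        {
            if (!($1 in seen)) { seen[$1] = 1; genes[++g] = $1 }
            count[$1, n] = $2
        }
        END {
            header = "gene_id"
            for (i = 1; i <= n; i++) header = header OFS sample[i]
            print header
            for (j = 1; j <= g; j++) {
                row = genes[j]
                for (i = 1; i <= n; i++)
                    row = row OFS (((genes[j], i) in count) ? count[genes[j], i] : 0)
                print row
            }
        }
    ' "$@"
}
```

A possible use, with placeholder file names: `merge_counts counts/sample1_rep1.tsv counts/sample2_rep1.tsv > count_matrix.tsv`. The resulting matrix is the kind of input that count-based tools such as DESeq2 or edgeR expect, although which tool the in-house script uses is not stated.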
Tools Used
Raw Source Text
The raw data was mapped using STAR. We calculated the gene-level read counts and identified differentially expressed genes by in-house script. Assembly: mm10