GSE106493 Processing Pipeline
RIP-Seq
4 steps
Publication
Transcriptome regulation by PARP13 in basal and antiviral states in human cells. iScience (2024), PMID 38495826
Dataset
GSE106493: Live-cell mapping of organelle-associated RNAs via proximity biotinylation combined with protein-RNA crosslinking
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
Reads were mapped to the human genome (hg19) using TopHat v2.1.1 with the flags "--no-coverage-search" and "--GTF gencode.v19.annotation.gtf".
$ Bash example
# Install TopHat (example using conda)
# conda install -c bioconda tophat=2.1.1

# Define variables for reference files and input/output
# Replace with actual paths to your reference genome index and GTF file
GENOME_INDEX="/path/to/human_genome/hg19/bowtie2_index/hg19"  # TopHat uses Bowtie2 index
GTF_FILE="/path/to/gencode.v19.annotation.gtf"
READS_FILE="reads.fastq"  # Replace with your input FASTQ file(s)
OUTPUT_DIR="tophat_output"

# Create output directory if it doesn't exist
mkdir -p "${OUTPUT_DIR}"

# Run TopHat mapping
tophat2 \
  --no-coverage-search \
  --GTF "${GTF_FILE}" \
  -o "${OUTPUT_DIR}" \
  "${GENOME_INDEX}" \
  "${READS_FILE}"
2
Gene expression was quantified against the Gencode v19 reference transcriptome (gencode.v19.annotation.gtf, gencodegenes.org) with Cufflinks v2.2.1.
$ Bash example
# Install Cufflinks (example using conda)
# conda install -c bioconda cufflinks=2.2.1

# Define reference transcriptome
GENCODE_GTF="gencode.v19.annotation.gtf"  # Source: gencodegenes.org

# Assume input BAM file and output directory
INPUT_BAM="aligned_reads.bam"  # Placeholder for your aligned reads (TopHat writes accepted_hits.bam to its output directory)
OUTPUT_DIR="cufflinks_output"

# Create output directory if it doesn't exist
mkdir -p "${OUTPUT_DIR}"

# Run Cufflinks for gene expression quantification
# -o: output directory
# -G: quantify against the reference annotation only, without assembling novel transcripts
#     (assumed here because the source describes quantification against Gencode v19;
#     lowercase -g would instead use the GTF as a guide for transcript assembly)
cufflinks -o "${OUTPUT_DIR}" -G "${GENCODE_GTF}" "${INPUT_BAM}"
3
The statistical significance of differential expression was assessed using CuffDiff2 with the flags "--dispersion-method per-condition" and "--seed 42".
$ Bash example
# Install Cufflinks (which includes CuffDiff2; the executable is named cuffdiff)
# conda install -c bioconda cufflinks=2.2.1

# Define input files and output directory
GTF_FILE="path/to/annotation.gtf"  # Placeholder: e.g., Gencode, Ensembl GTF file
# Placeholders: comma-separated BAM files for the replicates of each condition
CONDITION_A_BAMS="path/to/conditionA_rep1.bam,path/to/conditionA_rep2.bam"
CONDITION_B_BAMS="path/to/conditionB_rep1.bam,path/to/conditionB_rep2.bam"
OUTPUT_DIR="cuffdiff_output"

# Create output directory if it doesn't exist
mkdir -p "${OUTPUT_DIR}"

# Run CuffDiff2 for differential expression analysis
cuffdiff \
  --dispersion-method per-condition \
  --seed 42 \
  -o "${OUTPUT_DIR}" \
  "${GTF_FILE}" \
  "${CONDITION_A_BAMS}" \
  "${CONDITION_B_BAMS}"
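CuffDiff writes its gene-level test results to gene_exp.diff in the output directory. The source does not say how significant genes were extracted, so the following is only a minimal sketch of one way to list them. The function name significant_genes and the toy rows (GENEA, GENEB, GENEC) are hypothetical. Columns are looked up by header name rather than by position.

```shell
#!/usr/bin/env bash
set -euo pipefail

# significant_genes DIFF_FILE
# Prints "gene<TAB>log2(fold_change)<TAB>q_value" for every row whose
# status is OK and whose "significant" column is "yes".
significant_genes() {
  awk -F'\t' '
    BEGIN { OFS = "\t" }
    # Map each header name to its column index.
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    $(col["status"]) == "OK" && $(col["significant"]) == "yes" {
      print $(col["gene"]), $(col["log2(fold_change)"]), $(col["q_value"])
    }
  ' "$1"
}

# Toy demo: a made-up gene_exp.diff with the standard CuffDiff header.
diff_demo="$(mktemp -d)"
{
  echo "test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant"
  echo "XLOC_1 XLOC_1 GENEA chr1:1-100 A B OK 10 40 2 3.1 0.0001 0.001 yes"
  echo "XLOC_2 XLOC_2 GENEB chr1:200-300 A B OK 10 11 0.1375 0.2 0.8 0.9 no"
  echo "XLOC_3 XLOC_3 GENEC chr2:1-100 A B NOTEST 0 0 0 0 1 1 no"
} | tr ' ' '\t' > "${diff_demo}/gene_exp.diff"

significant_genes "${diff_demo}/gene_exp.diff"
```

For a real run, point the function at cuffdiff_output/gene_exp.diff. The "significant" flag reflects CuffDiff's own multiple-testing threshold, so apply any stricter cutoff on q_value separately.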
4
ROC analysis was performed in Microsoft Excel, using true positive and false positive gene lists culled from the literature.
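The source performed this step in Excel and does not give the spreadsheet or the gene lists. As a rough scripted equivalent, the sketch below ranks genes by a score, walks down the ranking, and reports the false positive rate and true positive rate at each threshold, followed by a trapezoidal AUC. It is not the authors' procedure: the function name roc_auc, the two-column score file, and the toy genes g1 to g6 are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# roc_auc SCORES TRUE_POS FALSE_POS
#   SCORES:    tab-separated "gene<TAB>score", higher score = stronger call (hypothetical format)
#   TRUE_POS:  one known-positive gene per line
#   FALSE_POS: one known-negative gene per line
# Prints "FPR<TAB>TPR" points from the strictest to the loosest threshold,
# then a final "AUC<TAB>value" line (trapezoidal rule).
roc_auc() {
  sort -t $'\t' -k2,2gr "$1" | awk -F'\t' -v tpf="$2" -v fpf="$3" '
    BEGIN {
      while ((getline g < tpf) > 0) pos[g] = 1
      while ((getline g < fpf) > 0) neg[g] = 1
    }
    # Keep only genes that appear in one of the two reference lists.
    ($1 in pos) { n++; score[n] = $2; label[n] = 1; P++; next }
    ($1 in neg) { n++; score[n] = $2; label[n] = 0; N++ }
    END {
      if (P == 0 || N == 0) {
        print "need at least one scored gene from each list" > "/dev/stderr"
        exit 1
      }
      print "0.0000\t0.0000"
      for (i = 1; i <= n; i++) {
        if (label[i]) tp++; else fp++
        # Emit a point only when the score changes, so ties share one threshold.
        if (i == n || score[i + 1] != score[i]) {
          tpr = tp / P; fpr = fp / N
          auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
          printf "%.4f\t%.4f\n", fpr, tpr
          prev_fpr = fpr; prev_tpr = tpr
        }
      }
      printf "AUC\t%.4f\n", auc
    }'
}

# Toy demo with made-up gene names and scores; the last line printed is "AUC<TAB>0.8889".
roc_demo="$(mktemp -d)"
printf 'g1\t9\ng2\t8\ng3\t7\ng4\t6\ng5\t5\ng6\t4\n' > "${roc_demo}/scores.tsv"
printf 'g1\ng2\ng4\n' > "${roc_demo}/true_positives.txt"
printf 'g3\ng5\ng6\n' > "${roc_demo}/false_positives.txt"
roc_auc "${roc_demo}/scores.tsv" "${roc_demo}/true_positives.txt" "${roc_demo}/false_positives.txt"
```

Genes absent from both lists are ignored, and a gene present in both is counted as a true positive. Which score to rank by (for example, a fold change or test statistic from the previous step) is left open because the source does not specify it.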
Raw Source Text
Reads were mapped to human genome using TopHat v2.1.1 with the flags, "--no-coverage-search" and "--GTF gencode.v19.annotation.gtf". Gene expression was quantified against the Gencode v19 reference transcriptome (gencode.v19.annotation.gtf, genecodegenes.org) with Cufflinks v2.2.1. The statistical significance of differential expression was assessed using CuffDiff2, with the flags, "--dispersion-method per-condition" and "--seed 42". ROC analysis was perfomed in Microsoft Excel, using true positive and false positive gene lists culled from the literature. Genome_build: hg19