GSE185373 Processing Pipeline

RNA-Seq, 4 steps

Publication

Proteomic discovery of chemical probes that perturb protein complexes in human cells.

Molecular Cell (2023), PMID 37084731

Dataset

Function-first proteomic strategies for chemical probe discovery in human cells

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.

Processing Steps

FASTQ files were first trimmed using Trim_galore (v0.6.4) to remove sequencing adapters and low quality (Q<15) reads.

Trim Galore v0.6.4

$ Bash example

# Install Trim Galore (and its dependencies like Cutadapt and FastQC)
# conda install -c bioconda trim-galore

# Define input and output paths
INPUT_FASTQ="input.fastq.gz" # Placeholder for your input FASTQ file
OUTPUT_DIR="trimmed_fastq"

# Create output directory if it doesn't exist
mkdir -p "${OUTPUT_DIR}"

# Run Trim Galore to remove sequencing adapters and low quality (Q<15) reads
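# --quality 15: trims low-quality bases (Phred score below 15) from the 3' end of each read, in addition to adapter removal
# Reads that become too short after trimming (default: under 20 bp) are discarded.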
# --output_dir: Specifies the directory for output files
# Trim Galore automatically detects and removes common sequencing adapters.
# For paired-end data, use: trim_galore --paired --quality 15 --output_dir "${OUTPUT_DIR}" input_R1.fastq.gz input_R2.fastq.gz
trim_galore --quality 15 --output_dir "${OUTPUT_DIR}" "${INPUT_FASTQ}"

Trimmed sequencing reads were aligned to the human Hg19 reference genome (GENCODE, GRCh37.p13) using STAR (v2.7.5).

STAR v2.7.5

$ Bash example

# Install STAR (if not already installed)
# conda install -c bioconda star=2.7.5

# Define variables
STAR_VERSION="2.7.5"
# Placeholder for the STAR genome index for human Hg19 (GRCh37.p13) with GENCODE annotations.
# This index would typically be pre-built using a command like:
# STAR --runThreadN <threads> --runMode genomeGenerate \
#      --genomeDir /path/to/STAR_index_hg19_gencode \
#      --genomeFastaFiles /path/to/GRCh37.p13.genome.fa \
#      --sjdbGTFfile /path/to/gencode.v19.annotation.gtf \
#      --sjdbOverhang 100 # Adjust based on read length
GENOME_DIR="/path/to/STAR_index_hg19_gencode_GRCh37.p13"
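READS_FILE="trimmed_fastq/input_trimmed.fq.gz" # Placeholder: Trim Galore output for input.fastq.gz
OUTPUT_PREFIX="aligned_reads"
THREADS=8 # Assumed value; the source does not state a thread count

# Align reads using STAR
# The alignment options are not given in the source; those below are assumptions.
# STAR writes unsorted SAM by default (here: aligned_reads_Aligned.out.sam), which
# matches the later SAM-to-BAM conversion with samtools.
STAR --runThreadN "${THREADS}" \
     --genomeDir "${GENOME_DIR}" \
     --readFilesIn "${READS_FILE}" \
     --readFilesCommand zcat \
     --outFileNamePrefix "${OUTPUT_PREFIX}_" \
     --outSAMtype SAM \
     --outSAMunmapped Within \
     --outSAMattributes All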

SAM files were subsequently converted to BAM files, sorted, and indexed using samtools (v1.9).

samtools v1.9

$ Bash example

# Install samtools if not already installed
# conda install -c bioconda samtools=1.9

# Placeholder for input SAM file
INPUT_SAM="input.sam"
# Placeholder for output sorted BAM file
OUTPUT_BAM_SORTED="output_sorted.bam"

# Convert SAM to BAM and sort in one step
# samtools sort accepts SAM input; with -o, the output format is inferred from the .bam extension
samtools sort -o "${OUTPUT_BAM_SORTED}" "${INPUT_SAM}"

# Index the sorted BAM file
samtools index "${OUTPUT_BAM_SORTED}"

BAM files were used to generate bigwig files using bamCoverage (part of the Deeptools package; v3.3.1).

deepTools v3.3.1

$ Bash example

# Install deepTools (if not already installed)
# conda install -c bioconda deeptools=3.3.1

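# Example usage: generate a bigWig file from a sorted, indexed BAM file
# Replace 'input.bam' with your actual BAM file path
# Replace 'output.bw' with your desired output bigWig file path
# --binSize 10 and the processor count are assumptions; the source does not state them.
# --numberOfProcessors accepts an integer, "max", or "max/2".
bamCoverage -b input.bam -o output.bw --binSize 10 --numberOfProcessors max/2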
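
The four steps above can be chained per sample roughly as follows. This is a sketch under stated assumptions, not the authors' script: it assumes single-end reads, Trim Galore's `_trimmed.fq.gz` output naming, and STAR's default `Aligned.out.sam` output. The helper names `run`, `sample_name`, and `process_sample`, and the `DRY_RUN`, `THREADS`, and `GENOME_DIR` variables, are hypothetical. With `DRY_RUN=1` (the default here) the commands are only printed, which makes it easy to review them before running anything.

```shell
#!/usr/bin/env bash
# Sketch: chain trimming, alignment, sorting/indexing, and bigWig generation
# for one single-end sample. All paths and settings below are placeholders.
set -eu

GENOME_DIR="${GENOME_DIR:-/path/to/STAR_index_hg19_gencode_GRCh37.p13}"
THREADS="${THREADS:-8}"      # assumed; not stated in the source
DRY_RUN="${DRY_RUN:-1}"      # 1 = print commands only; 0 = print and execute

# Print a command and execute it unless DRY_RUN=1.
run() {
    printf '%s\n' "$*"
    if [ "${DRY_RUN}" != "1" ]; then
        "$@"
    fi
}

# Derive a sample name from a FASTQ path,
# e.g. data/sampleA.fastq.gz becomes sampleA.
sample_name() {
    local base
    base="$(basename "$1")"
    base="${base%.gz}"
    base="${base%.fastq}"
    base="${base%.fq}"
    printf '%s\n' "${base}"
}

# Run (or print) all four steps for one FASTQ file.
process_sample() {
    local fastq="$1"
    local outdir="$2"
    local sample
    sample="$(sample_name "${fastq}")"
    local trimmed="${outdir}/${sample}_trimmed.fq.gz"   # Trim Galore output name
    local sam="${outdir}/${sample}_Aligned.out.sam"     # STAR default output name
    local bam="${outdir}/${sample}.sorted.bam"

    run mkdir -p "${outdir}"
    run trim_galore --quality 15 --output_dir "${outdir}" "${fastq}"
    run STAR --runThreadN "${THREADS}" --genomeDir "${GENOME_DIR}" \
        --readFilesIn "${trimmed}" --readFilesCommand zcat \
        --outFileNamePrefix "${outdir}/${sample}_"
    run samtools sort -o "${bam}" "${sam}"
    run samtools index "${bam}"
    run bamCoverage -b "${bam}" -o "${outdir}/${sample}.bw"
}

# Example: review the commands for one placeholder input file.
process_sample "input.fastq.gz" "results"
```

Setting `DRY_RUN=0` in the environment executes the same commands instead of only printing them. Looping `process_sample` over several FASTQ files is one way to extend this to a whole dataset.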

Tools Used

Trim Galore, STAR, samtools, deepTools
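
Because every processing step calls an external program, a quick pre-flight check can catch a missing installation before a long run. The sketch below only tests whether each executable is on `PATH`; it does not verify versions. The function name `missing_tools` is hypothetical, and checking `cutadapt` is included on the assumption that Trim Galore needs it at run time.

```shell
#!/usr/bin/env sh
# Sketch: report which pipeline executables are missing from PATH.

# Print the name of every argument that is not an available command.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "${tool}" >/dev/null 2>&1; then
            printf '%s\n' "${tool}"
        fi
    done
}

missing="$(missing_tools trim_galore cutadapt STAR samtools bamCoverage)"
if [ -n "${missing}" ]; then
    printf 'Missing from PATH:\n%s\n' "${missing}" >&2
else
    printf 'All pipeline tools found.\n'
fi
```

Version checks would need per-tool parsing of each program's version output, so they are left out of this sketch.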

Raw Source Text

FASTQ files were first trimmed using Trim_galore (v0.6.4) to remove sequencing adapters and low quality (Q<15) reads.
Trimmed sequencing reads were aligned to the human Hg19 reference genome (GENCODE, GRCh37.p13) using STAR (v2.7.5).
SAM files were subsequently converted to BAM files, sorted, and indexed using samtools (v1.9).
BAM files were used to generate bigwig files using bamCoverage (part of the Deeptools package; v3.3.1).
Genome_build: HG19
Supplementary_files_format_and_content: bigWig
