GSE61947 Processing Pipeline

ChIP-Seq · 3 steps

Publication

A Gene Regulatory Network Cooperatively Controlled by Pdx1 and Sox9 Governs Lineage Allocation of Foregut Progenitor Cells.

Cell Reports (2015), PMID 26440894

Dataset

GSE61947

SOX9 targets in hESC-derived pancreatic progenitors

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.

Processing Steps

Fastq files were mapped to hg18 genome using Bowtie2 version 2.1.0

Bowtie2 v2.1.0

$ Bash example

# Install Bowtie2
# conda install -c bioconda bowtie2=2.1.0

# Define variables
FASTQ_R1="input_R1.fastq.gz" # Placeholder for forward reads
FASTQ_R2="input_R2.fastq.gz" # Placeholder for reverse reads
GENOME_INDEX="/path/to/bowtie2_indexes/hg18" # Placeholder for hg18 Bowtie2 index
OUTPUT_BAM="mapped_reads.bam"
THREADS=8 # Number of threads to use

# Ensure samtools is available for BAM conversion and sorting
# conda install -c bioconda samtools

# Map reads to the hg18 genome using Bowtie2
# -x: Bowtie2 index prefix
# -1, -2: paired-end FASTQ files (assumption: the source does not state the library layout;
#         for single-end reads use -U "${FASTQ_R1}" instead)
# -p: number of threads
# --very-sensitive: optional end-to-end preset; the source does not mention any preset
# | samtools view -bS - : convert the SAM stream on stdin to BAM
# | samtools sort -o "${OUTPUT_BAM}" - : coordinate-sort the BAM stream and write the output file
# Note: the source passes SAM files to HOMER; to keep SAM output, drop the pipe and add -S mapped_reads.sam
bowtie2 -x "${GENOME_INDEX}" -1 "${FASTQ_R1}" -2 "${FASTQ_R2}" \
        -p "${THREADS}" \
        --very-sensitive \
        | samtools view -bS - \
        | samtools sort -o "${OUTPUT_BAM}" -
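
The source does not say whether the libraries were single-end or paired-end, so the read-input flags are the part of this command most likely to need changing. The sketch below is a dry run under that assumption: `bowtie2_read_args` is a hypothetical helper (not part of Bowtie2), and the file names and index path are placeholders. It only prints the command, and it writes SAM with `-S` because the next step takes SAM files as input.

```bash
# Hypothetical helper (not part of Bowtie2): print the read-input flags
# for single-end (-U) or paired-end (-1/-2) FASTQ files.
# It relies on word splitting at the call site, so paths must not contain spaces.
bowtie2_read_args() {
    if [ -n "${2:-}" ]; then
        printf '%s %s %s %s\n' -1 "$1" -2 "$2"
    else
        printf '%s %s\n' -U "$1"
    fi
}

# Dry run: print the single-end command instead of executing it.
echo bowtie2 -x /path/to/bowtie2_indexes/hg18 $(bowtie2_read_args input.fastq.gz) -S mapped_reads.sam
```

Passing a second file name to the helper switches the output to the paired-end form, so the same call site can serve both layouts.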

SAM files were processed using HOMER into Tag directories

HOMER v4.11.1

$ Bash example

# Install HOMER (if not already installed)
# conda install -c bioconda homer

# Example: Process a SAM file into a HOMER Tag Directory
# Replace 'input.sam' with your actual SAM file path.
# Replace 'output_tag_directory' with your desired output directory name.
# '-tbp 1' keeps at most one tag per base-pair position, which removes likely PCR duplicates (not stated in the source).
# '-fragLength <#>' can override HOMER's automatic fragment-length estimate if needed.
makeTagDirectory output_tag_directory input.sam -format sam -tbp 1

Peaks were called using HOMER tag directories and normalized using the input file

HOMER v4.11.1

$ Bash example

# Install HOMER (if not already installed)
# conda install -c bioconda homer

# Create tag directories from SAM files (example commands, assuming 'sample.sam' and 'input.sam' exist)
# makeTagDirectory sample_tag_dir sample.sam
# makeTagDirectory input_tag_dir input.sam

# Call peaks using HOMER findPeaks
# The source only states that HOMER was used with the input as control; the thresholds below are illustrative.
# -i input_tag_dir: control/input tag directory used as background.
# -style factor: peak-calling mode for transcription factors (fixed-width peaks).
# -size 200: peak width in bp (HOMER estimates this automatically if omitted).
# -fdr 0.001: false discovery rate threshold (HOMER's default).
# -L 2: required fold enrichment over the local background (default 4; a fold change, not log2).
# -tbp 1: maximum number of tags counted per bp, which limits the effect of duplicate reads.
findPeaks sample_tag_dir -o sample_peaks.txt -i input_tag_dir -style factor -size 200 -fdr 0.001 -L 2 -tbp 1

# Reference genome (not directly used by findPeaks, but needed for downstream annotation)
# hg18, per the Genome_build field in the source
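
As a quick sanity check on the peak calls, the sketch below counts peaks per chromosome. It assumes the usual HOMER peak file layout (header lines starting with `#`, then tab-delimited rows with the chromosome in column 2). `count_peaks_per_chrom` is a hypothetical helper, and `sample_peaks.txt` is the placeholder output name from the example.

```bash
# Hypothetical helper: count peaks per chromosome in a HOMER peak file.
# Assumes '#' header lines and tab-delimited rows: PeakID, chr, start, end, strand, ...
count_peaks_per_chrom() {
    awk -F'\t' '!/^#/ && NF >= 5 { n[$2]++ }
                END { for (c in n) print c "\t" n[c] }' "$1" \
        | LC_ALL=C sort -k1,1
}

# Run only if the placeholder peak file exists
if [ -f sample_peaks.txt ]; then
    count_peaks_per_chrom sample_peaks.txt
fi
```

An empty result, or peaks confined to a handful of chromosomes, usually points to a problem upstream such as a mismatched index or an empty tag directory.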

Tools Used

Bowtie2, HOMER

Raw Source Text

Fastq files were mapped to hg18 genome using Bowtie2 version 2.1.0
SAM files were processed using HOMER into Tag directories
Peaks were called using HOMER tag directories and normalized using the input file
Genome_build: hg18
Supplementary_files_format_and_content: peak calls using HOMER
