GSE113193 Processing Pipeline

OTHER code_examples 32 steps

Publication

Tissue-selective restriction of RNA editing of CaV1.3 by splicing factor SRSF9.

Nucleic acids research (2018) — PMID 29733375

Dataset

Tissue-selective restriction of RNA editing of CaV1.3 by splicing factor SRSF9

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.

Processing Steps

Generate Jupyter Notebook

Takes output from raw files.

FastQC (Inferred with models/gemini-2.5-flash) v0.11.9 (Inferred with models/gemini-2.5-flash) GitHub

$ Bash example

# Define input raw sequencing files (example)
RAW_FILES="sample_R1.fastq.gz sample_R2.fastq.gz"

# Define output directory for quality control reports
OUTPUT_DIR="fastqc_reports"

# Create output directory if it doesn't exist
mkdir -p "${OUTPUT_DIR}"

# Install FastQC (example using conda)
# conda install -c bioconda fastqc

# Run FastQC to generate quality control reports from raw files
fastqc ${RAW_FILES} -o "${OUTPUT_DIR}"

View on GitHub

Run to trim off both 5â and 3â adapters on both reads.

fastp (Inferred with models/gemini-2.5-flash) v0.23.2 GitHub

$ Bash example

# Install fastp (if not already installed)
# conda install -c bioconda fastp=0.23.2

# Define input and output file paths
INPUT_READ1="read1.fastq.gz"
INPUT_READ2="read2.fastq.gz"
OUTPUT_READ1="trimmed_read1.fastq.gz"
OUTPUT_READ2="trimmed_read2.fastq.gz"
REPORT_JSON="fastp_report.json"
REPORT_HTML="fastp_report.html"

# Run fastp to trim adapters and perform quality filtering
# -i: input read 1
# -o: output read 1
# -I: input read 2
# -O: output read 2
# --detect_adapter_for_pe: automatically detect and trim adapters for paired-end reads
# --qualified_quality_phred 15: filter reads with quality score < 15
# --length_required 20: discard reads shorter than 20 bp after trimming
# --trim_poly_g: trim polyG tails (common in Illumina NextSeq/NovaSeq)
# --trim_poly_x: trim polyX tails
# -j 8: use 8 threads
# -h: HTML report output
# -j: JSON report output
fastp \
  -i "${INPUT_READ1}" \
  -o "${OUTPUT_READ1}" \
  -I "${INPUT_READ2}" \
  -O "${OUTPUT_READ2}" \
  --detect_adapter_for_pe \
  --qualified_quality_phred 15 \
  --length_required 20 \
  --trim_poly_g \
  --trim_poly_x \
  -j "${REPORT_JSON}" \
  -h "${REPORT_HTML}" \
  -w 8

View on GitHub

Command: quality-cutoff 6 -m 18 -a NNNNNAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -g CTTCCGATCTACAAGTT -g CTTCCGATCTTGGTCCT -A AACTTGTAGATCGGA -A AGGACCAAGATCGGA -A ACTTGTAGATCGGAA -A GGACCAAGATCGGAA -A CTTGT AGATCGGAAG -A GACCAAGATCGGAAG -A TTGTAGATCGGAAGA -A ACCAAGATCGGAAGA -A TGTAGATCGGAAGAG -A CCAAGATCGGAAGAG -A GTAGATCGGAAGAGC -A CAAGATCGGAAGAGC -A TAGATCGGAAGAGCG -A AAGATCGGAAGAGCG -A AGATCGGAAGAGCGT -A GATCGGAAGAGCGTC -A ATCGGAAGAGCGTCG -A TCGGAAGAGCGTCGT -A CGGAAGAGCGTCGTG -A GGAAGAGCGTCGTGT -o /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.fastq.gz -p /full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.fastq.gz /full/path/to/files/file_R1.C01.fastq.gz /full/path/to/files/file_R2.C01.fastq.gz > /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.metrics

cutadapt (Inferred with models/gemini-2.5-flash) v4.x (Inferred with models/gemini-2.5-flash) GitHub

$ Bash example

# Install cutadapt (example using conda)
# conda install -c bioconda cutadapt

# Define input and output file paths
INPUT_R1="/full/path/to/files/file_R1.C01.fastq.gz"
INPUT_R2="/full/path/to/files/file_R2.C01.fastq.gz"
OUTPUT_R1="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.fastq.gz"
OUTPUT_R2="/full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.fastq.gz"
METRICS_FILE="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.metrics"

# Define adapter sequences
ADAPTER_R1="NNNNNAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
ADAPTER_G1="CTTCCGATCTACAAGTT"
ADAPTER_G2="CTTCCGATCTTGGTCCT"
ADAPTER_A_LIST=(
    "AACTTGTAGATCGGA"
    "AGGACCAAGATCGGA"
    "ACTTGTAGATCGGAA"
    "GGACCAAGATCGGAA"
    "CTTGT AGATCGGAAG" # Note: space in original command, likely a typo or specific sequence
    "GACCAAGATCGGAAG"
    "TTGTAGATCGGAAGA"
    "ACCAAGATCGGAAGA"
    "TGTAGATCGGAAGAG"
    "CCAAGATCGGAAGAG"
    "GTAGATCGGAAGAGC"
    "CAAGATCGGAAGAGC"
    "TAGATCGGAAGAGCG"
    "AAGATCGGAAGAGCG"
    "AGATCGGAAGAGCGT"
    "GATCGGAAGAGCGTC"
    "ATCGGAAGAGCGTCG"
    "TCGGAAGAGCGTCGT"
    "CGGAAGAGCGTCGTG"
    "GGAAGAGCGTCGTGT"
)

# Construct the -A options string
ADAPTER_A_OPTIONS=""
for adapter in "${ADAPTER_A_LIST[@]}"; do
    ADAPTER_A_OPTIONS+=" -A \"${adapter}\""
done

# Execute cutadapt for adapter and quality trimming
# The original command 'quality-cutoff 6' is interpreted as 'cutadapt -q 6' for quality trimming.
# The multiple -g and -A options are handled by cutadapt.
cutadapt -q 6 \
         -m 18 \
         -a "${ADAPTER_R1}" \
         -g "${ADAPTER_G1}" \
         -g "${ADAPTER_G2}" \
         ${ADAPTER_A_OPTIONS} \
         -o "${OUTPUT_R1}" \
         -p "${OUTPUT_R2}" \
         "${INPUT_R1}" \
         "${INPUT_R2}" > "${METRICS_FILE}"

View on GitHub

Takes output from cutadapt round 1.

cutadapt v1.18 GitHub

$ Bash example

# Install cutadapt if not already available
# conda install -c bioconda cutadapt=1.18

# Define input and output files
INPUT_FASTQ="reads_from_cutadapt_round1.fastq.gz"
OUTPUT_FASTQ="reads_trimmed_round2.fastq.gz"

# Execute cutadapt for quality, length, and poly-A trimming
# This step assumes a previous cutadapt round (round 1) for primary adapter trimming.
# -a A{100}: Trim poly-A tails (up to 100 'A's) from the 3' end
# -q 20,20: Trim low-quality bases from both ends (Phred score < 20)
# --minimum-length 18: Discard reads shorter than 18 bp after trimming
# -o: Output file
cutadapt -a A{100} -q 20,20 --minimum-length 18 -o "${OUTPUT_FASTQ}" "${INPUT_FASTQ}"

View on GitHub

Run to trim off the 3â adapters on read 2, to control for double ligation events.

cutadapt (Inferred with models/gemini-2.5-flash) v4.0 GitHub

$ Bash example

# Install cutadapt (example using conda)
# conda install -c bioconda cutadapt=4.0

# Define variables
INPUT_R1="input_R1.fastq.gz"
INPUT_R2="input_R2.fastq.gz"
OUTPUT_R1="trimmed_R1.fastq.gz"
OUTPUT_R2="trimmed_R2.fastq.gz"

# Adapter sequence for the 3' end of Read 2.
# For eCLIP assays, this is often the rP5 adapter or a variant, 
# which is ligated to the 3' end of the RNA fragment.
# Example sequence from yeolab/skipper eCLIP pipeline:
ADAPTER_R2_SEQUENCE="AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT"

# Run cutadapt to trim 3' adapters from Read 2
# -A ADAPTER_R2_SEQUENCE: Trims the given adapter from the 3' end of Read 2.
#                         This is crucial for controlling double ligation events 
#                         where the adapter might ligate to itself or another adapter.
# -q 20: Trims low-quality bases from the 3' end (Phred score < 20).
# -m 18: Discard reads shorter than 18 bp after trimming.
# -o: Output file for Read 1
# -p: Output file for Read 2
cutadapt -A "${ADAPTER_R2_SEQUENCE}" \
         -q 20 \
         -m 18 \
         -o "${OUTPUT_R1}" \
         -p "${OUTPUT_R2}" \
         "${INPUT_R1}" \
         "${INPUT_R2}"

View on GitHub

6
Command: cutadapt -f fastq --match-read-wildcards --times 1 -e 0.1 -O 5 --quality-cutoff 6 -m 18 -A AACTTGTAGATCGGA -A AGGACCAAGATCGGA -A ACTTGTAGATCGGAA -A GGACCAAGATCGGAA -A CTTGTAGATCGGAAG -A GACCAAGATCGGAAG -A TTGTAGATCGGAAGA -A ACCAAGATCGGAAGA -A TGTAGATCGGAAGAG -A CCAAGATCGGAAGAG -A GTAGATCGGAAGAGC -A CAAGATCGGAAGAGC -A TAGATCGGAAGAGCG -A AAGATCGGAAGAGCG -A AGATCGGAAGAGCGT -A GATCGGAAGAGCGTC -A ATCGGAAGAGCGTCG -A TCGGAAGAGCGTCGT -A CGGAAGAGCGTCGTG -A GGAAGAGCGTCGTGT -o /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.fastq.gz -p /full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.round2.fastq.gz /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.fastq.gz /full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.fastq.gz > /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.metrics

cutadapt v4.0 (Inferred with models/gemini-2.5-flash) GitHub
$ Bash example
```
# Install cutadapt (e.g., using conda)
# conda install -c bioconda cutadapt

cutadapt -f fastq --match-read-wildcards --times 1 -e 0.1 -O 5 --quality-cutoff 6 -m 18 -A AACTTGTAGATCGGA -A AGGACCAAGATCGGA -A ACTTGTAGATCGGAA -A GGACCAAGATCGGAA -A CTTGTAGATCGGAAG -A GACCAAGATCGGAAG -A TTGTAGATCGGAAGA -A ACCAAGATCGGAAGA -A TGTAGATCGGAAGAG -A CCAAGATCGGAAGAG -A GTAGATCGGAAGAGC -A CAAGATCGGAAGAGC -A TAGATCGGAAGAGCG -A AAGATCGGAAGAGCG -A AGATCGGAAGAGCGT -A GATCGGAAGAGCGTC -A ATCGGAAGAGCGTCG -A TCGGAAGAGCGTCGT -A CGGAAGAGCGTCGTG -A GGAAGAGCGTCGTGT -o /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.fastq.gz -p /full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.round2.fastq.gz /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.fastq.gz /full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.fastq.gz > /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.metrics
```
View on GitHub

Takes output from cutadapt round 2.

cutadapt v4.0 GitHub

$ Bash example

# Install cutadapt (if not already installed)
# conda install -c bioconda cutadapt=4.0

# Define input and output files
INPUT_FASTQ="input_from_cutadapt_round2.fastq.gz"
OUTPUT_FASTQ="trimmed_round3.fastq.gz"
LOG_FILE="cutadapt_round3.log"

# Define parameters (these are placeholders and should be adjusted based on specific experimental design)
# This step assumes further trimming is needed after an initial cutadapt round.
# It could involve removing a different adapter, more stringent quality trimming, or length filtering.
# Example: Trim remaining 3' adapters (e.g., poly-A or a different library adapter)
# Example: Quality trim from 3' end with a cutoff of 20
# Example: Discard reads shorter than 18 bp
ADAPTER_SEQUENCE="AGATCGGAAGAGCACACGTCTGAACTCCAGTCA" # Placeholder for a common Illumina adapter or specific library adapter
MIN_LENGTH=18
QUALITY_CUTOFF=20

# Execute cutadapt
cutadapt \
    -a "${ADAPTER_SEQUENCE}" \
    --quality-cutoff="${QUALITY_CUTOFF}" \
    --minimum-length="${MIN_LENGTH}" \
    -o "${OUTPUT_FASTQ}" \
    "${INPUT_FASTQ}" \
    > "${LOG_FILE}" 2>&1

View on GitHub

Maps to human specific version of RepBase used to remove repetitive elements, helps control for spurious artifacts from rRNA (& other) repetitive reads.

bbduk (Inferred with models/gemini-2.5-flash) v38.x GitHub

$ Bash example

# Install bbtools (if not already installed)
# conda install -c bioconda bbtools

# Download and prepare a human-specific RepBase library for bbduk.
# This typically involves downloading RepBase sequences (e.g., from GIRI RepBase website),
# filtering for human-specific elements, and compiling them into a single FASTA file.
# Example placeholder for the RepBase reference file. The actual download and processing
# of RepBase data may require registration or specific scripts.
# For instance, one might download 'Homo_sapiens.fa' from RepBase and use it as the reference.
REPBASE_REF="path/to/human_repbase.fasta" # Placeholder for the actual path to the RepBase reference file

# Input FASTQ file (e.g., raw reads or adapter-trimmed reads)
INPUT_FASTQ="input_reads.fastq.gz"

# Output FASTQ file with repetitive reads removed
OUTPUT_FASTQ="filtered_non_repetitive_reads.fastq.gz"

# Run bbduk to remove reads mapping to human repetitive elements
# k=31: kmer size for matching (default for contaminant removal)
# hdist=1: allow 1 mismatch in kmer
# stats: output statistics about removed reads
# -Xmx: memory allocation
bbduk.sh \
    in="$INPUT_FASTQ" \
    out="$OUTPUT_FASTQ" \
    ref="$REPBASE_REF" \
    k=31 \
    hdist=1 \
    stats=bbduk_repbase_filter_stats.txt \
    -Xmx4g

View on GitHub

Command: STAR --runMode alignReads --runThreadN 16 --genomeDir /path/to/RepBase_human_database_file --genomeLoad LoadAndRemove --readFilesIn /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.fastq.gz /full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.round2.fastq.gz --outSAMunmapped Within --outFilterMultimapNmax 30 --outFilterMultimapScoreRange 1 --outFileNamePrefix /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bam --outSAMattributes All --readFilesCommand zcat --outStd BAM_Unsorted --outSAMtype BAM Unsorted --outFilterType BySJout --outReadsUnmapped Fastx --outFilterScoreMin 10 --outSAMattrRGline ID:foo --alignEndsType EndToEnd > /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bam

STAR v2.7.10a GitHub

$ Bash example

# Install STAR (example using conda)
# conda install -c bioconda star

# Define variables for clarity
GENOME_DIR="/path/to/RepBase_human_database_file"
READ_FILE_R1="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.fastq.gz"
READ_FILE_R2="/full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.round2.fastq.gz"
OUTPUT_PREFIX="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bam"
FINAL_OUTPUT_BAM="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bam"

# Execute STAR alignment
STAR --runMode alignReads \
     --runThreadN 16 \
     --genomeDir "${GENOME_DIR}" \
     --genomeLoad LoadAndRemove \
     --readFilesIn "${READ_FILE_R1}" "${READ_FILE_R2}" \
     --outSAMunmapped Within \
     --outFilterMultimapNmax 30 \
     --outFilterMultimapScoreRange 1 \
     --outFileNamePrefix "${OUTPUT_PREFIX}" \
     --outSAMattributes All \
     --readFilesCommand zcat \
     --outStd BAM_Unsorted \
     --outSAMtype BAM Unsorted \
     --outFilterType BySJout \
     --outReadsUnmapped Fastx \
     --outFilterScoreMin 10 \
     --outSAMattrRGline ID:foo \
     --alignEndsType EndToEnd > "${FINAL_OUTPUT_BAM}"

View on GitHub

Takes output from STAR rmRep.

STAR v2.7.3a GitHub

$ Bash example

# Install STAR (if not already installed)
# conda install -c bioconda star

# Define variables
GENOME_DIR="/path/to/STAR_genome_index/hg38" # Placeholder for STAR genome index directory (e.g., human hg38)
READ1_FASTQ="input_read1.fastq.gz" # Placeholder for input FASTQ file (read 1)
READ2_FASTQ="input_read2.fastq.gz" # Placeholder for input FASTQ file (read 2, remove if single-end)
OUTPUT_PREFIX="aligned_reads" # Prefix for output files
THREADS=8 # Number of threads to use
RG_ID="sample_id" # Read group ID
RG_SM="sample_name" # Sample name
RG_LB="library_name" # Library name
RG_PU="platform_unit" # Platform unit (e.g., flowcell-lane)

# Create output directory if it doesn't exist
mkdir -p star_output

# Run STAR alignment (parameters adapted from Yeo lab eCLIP workflow)
STAR \
  --runThreadN "${THREADS}" \
  --genomeDir "${GENOME_DIR}" \
  --readFilesIn "${READ1_FASTQ}" "${READ2_FASTQ}" \
  --readFilesCommand zcat \
  --outFileNamePrefix "star_output/${OUTPUT_PREFIX}_" \
  --outSAMtype BAM SortedByCoordinate \
  --outSAMattributes All \
  --outFilterMultimapNmax 1 \
  --outFilterMismatchNmax 999 \
  --outFilterMismatchNoverLmax 0.05 \
  --alignIntronMin 20 \
  --alignIntronMax 1000000 \
  --alignMatesGapMax 1000000 \
  --alignSJoverhangMin 8 \
  --alignSJDBoverhangMin 1 \
  --sjdbScore 1 \
  --limitSjdbInsertNsj 2000000 \
  --outFilterType BySJout \
  --outFilterScoreMinOverLread 0.3 \
  --outFilterMatchNoverLread 0.3 \
  --winAnchorMultimapNmax 50 \
  --seedSearchStartLmax 30 \
  --seedPerReadNmax 100000 \
  --seedPerWindowNmax 1000 \
  --alignTranscriptsPerReadNmax 100000 \
  --alignTranscriptsPerWindowNmax 10000 \
  --quantMode GeneCounts \
  --outSAMattrRGline ID:"${RG_ID}" SM:"${RG_SM}" LB:"${RG_LB}" PL:ILLUMINA PU:"${RG_PU}"

View on GitHub

Maps unique reads to the human genome.

BWA-MEM (Inferred with models/gemini-2.5-flash) v0.7.17 GitHub

$ Bash example

# Install BWA and Samtools if not already available
# conda install -c bioconda bwa samtools

# Placeholder for human genome reference (e.g., hg38)
# Ensure the reference genome is indexed with bwa index
# bwa index human_genome.fa

# Align reads to the human genome and filter for uniquely mapped reads
# -t: number of threads
# human_genome.fa: path to the indexed human reference genome
# reads.fastq.gz: path to the input FASTQ file
# | samtools view -bS -F 4 -q 20 - : pipe output to samtools to filter for mapped reads with MAPQ >= 20
# | samtools sort -o aligned_unique_reads.bam - : pipe output to samtools to sort and save as BAM
bwa mem -t 8 human_genome.fa reads.fastq.gz | \
samtools view -bS -F 4 -q 20 - | \
samtools sort -o aligned_unique_reads.bam -

# Index the sorted BAM file
samtools index aligned_unique_reads.bam

View on GitHub

Command: STAR --runMode alignReads --runThreadN 16 --genomeDir /path/to/STAR_database_file --genomeLoad LoadAndRemove --readFilesIn /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bamUnmapped.out.mate1 /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bamUnmapped.out.mate2 --outSAMunmapped Within --outFilterMultimapNmax 1 --outFilterMultimapScoreRange 1 --outFileNamePrefix /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.bam --outSAMattributes All --outStd BAM_Unsorted --outSAMtype BAM Unsorted --outFilterType BySJout --outReadsUnmapped Fastx --outFilterScoreMin 10 --outSAMattrRGline ID:foo --alignEndsType EndToEnd > /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.bam

STAR v2.7.10a GitHub

$ Bash example

# Install STAR (example using Conda):
# conda install -c bioconda star

# Define variables
STAR_GENOME_DIR="/path/to/star_index/GRCh38" # Placeholder: Replace with the actual path to your STAR genome index directory (e.g., for human hg38)
INPUT_R1="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bamUnmapped.out.mate1" # Replace with actual path to mate1 FASTQ/FASTA file
INPUT_R2="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bamUnmapped.out.mate2" # Replace with actual path to mate2 FASTQ/FASTA file
OUTPUT_BAM_PATH="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.bam" # The final aligned BAM file name and prefix for auxiliary files

# Ensure the output directory exists
mkdir -p "$(dirname "${OUTPUT_BAM_PATH}")"

# Run STAR alignment
STAR \
  --runMode alignReads \
  --runThreadN 16 \
  --genomeDir "${STAR_GENOME_DIR}" \
  --genomeLoad LoadAndRemove \
  --readFilesIn "${INPUT_R1}" "${INPUT_R2}" \
  --outSAMunmapped Within \
  --outFilterMultimapNmax 1 \
  --outFilterMultimapScoreRange 1 \
  --outFileNamePrefix "${OUTPUT_BAM_PATH}" \
  --outSAMattributes All \
  --outStd BAM_Unsorted \
  --outSAMtype BAM Unsorted \
  --outFilterType BySJout \
  --outReadsUnmapped Fastx \
  --outFilterScoreMin 10 \
  --outSAMattrRGline ID:foo \
  --alignEndsType EndToEnd > "${OUTPUT_BAM_PATH}"

View on GitHub

Takes output from STAR genome mapping.

STAR v2.7.10a (Inferred from common usage for RNA-seq alignment) GitHub

$ Bash example

# Install STAR (example using conda)
# conda install -c bioconda star

# Define variables
# Placeholder for STAR genome index (e.g., hg38). 
# This directory should contain files generated by STAR --runMode genomeGenerate.
GENOME_DIR="/path/to/STAR_index_hg38"
READ1_FASTQ="sample_R1.fastq.gz"
READ2_FASTQ="sample_R2.fastq.gz" # Comment out or remove if single-end
OUTPUT_PREFIX="sample_"
THREADS=8 # Number of threads to use

# Run STAR genome mapping
STAR \
  --genomeDir "${GENOME_DIR}" \
  --readFilesIn "${READ1_FASTQ}" "${READ2_FASTQ}" \
  --runThreadN "${THREADS}" \
  --outFileNamePrefix "${OUTPUT_PREFIX}" \
  --outSAMtype BAM SortedByCoordinate \
  --outSAMunmapped Within \
  --outSAMattributes Standard \
  --readFilesCommand zcat \
  --limitBAMsortRAM 30000000000 # Adjust based on available RAM, ~30GB recommended for large genomes

View on GitHub

Custom random-mer-aware script for PCR duplicate removal.

umi_tools dedup (Inferred with models/gemini-2.5-flash) v1.1.2 GitHub

$ Bash example

# Install umi_tools (example using conda)
# conda install -c bioconda umi-tools

# Example usage for PCR duplicate removal with umi_tools dedup
# This assumes UMIs have already been extracted and appended to read names
# (e.g., using 'umi_tools extract' prior to alignment).
# The '--method directional' is commonly used for UMI-based deduplication.
# The '--paired' flag is used for paired-end sequencing data.

umi_tools dedup \
    -I input_aligned_reads.bam \
    -S deduplicated_reads.bam \
    --method directional \
    --paired \
    --log deduplication.log

View on GitHub

Command: barcode_collapse_pe.py --bam /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.bam --out_file /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.bam --metrics_file /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.metrics

barcode_collapse_pe.py vNot specified, associated with yeolab/eclip project GitHub

$ Bash example

# Install dependencies (if not already installed)
# This script requires pysam
# conda install -c bioconda pysam

# If barcode_collapse_pe.py is not in your PATH, you might need to clone the repository
# and add its scripts directory to PATH, or call it directly:
# git clone https://github.com/yeolab/eclip.git
# # export PATH=$PATH:$(pwd)/eclip/scripts
# # python eclip/scripts/barcode_collapse_pe.py ...

# Define input and output file paths as they appear in the description
INPUT_BAM="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.bam"
OUTPUT_BAM="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.bam"
METRICS_FILE="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.metrics"

# Execute the command
barcode_collapse_pe.py --bam "${INPUT_BAM}" --out_file "${OUTPUT_BAM}" --metrics_file "${METRICS_FILE}"

View on GitHub

Takes output from barcode collapse PE.

STAR (Inferred with models/gemini-2.5-flash) v2.7.10a GitHub

$ Bash example

# Install STAR (if not already installed)
# conda install -c bioconda star=2.7.10a

# Define variables
# INPUT_FASTQ: Placeholder for the gzipped FASTQ file containing deduplicated paired-end reads
# from the barcode collapse PE step.
INPUT_FASTQ="deduplicated_reads_PE.fastq.gz"

# OUTPUT_PREFIX: Prefix for output files (e.g., aligned BAM, log files)
OUTPUT_PREFIX="aligned_reads"

# GENOME_DIR: Path to the pre-built STAR genome index directory.
# For eCLIP, this is typically a human reference genome like hg38.
# Replace with the actual path to your genome index.
GENOME_DIR="/path/to/STAR_genome_index/hg38"

# THREADS: Number of CPU threads to use for alignment.
THREADS=8

# RAM_LIMIT: Maximum RAM (in bytes) to be used for sorting BAM. Adjust based on available system RAM.
# Example: 30GB = 30 * 1024^3 bytes
RAM_LIMIT=30000000000

# --- STAR Alignment Command ---
# This command aligns the deduplicated paired-end reads to the reference genome.
# It outputs a sorted BAM file and gene counts.
STAR --runThreadN ${THREADS} \
     --genomeDir ${GENOME_DIR} \
     --readFilesIn ${INPUT_FASTQ} \
     --readFilesCommand zcat \
     --outFileNamePrefix ${OUTPUT_PREFIX}_ \
     --outSAMtype BAM SortedByCoordinate \
     --outSAMattributes All \
     --quantMode GeneCounts \
     --outFilterMultimapNmax 20 \
     --outFilterMismatchNmax 3 \
     --alignIntronMin 20 \
     --alignIntronMax 1000000 \
     --alignMatesGapMax 1000000 \
     --limitBAMsortRAM ${RAM_LIMIT}

View on GitHub

Sorts resulting bam file for use downstream.

samtools (Inferred with models/gemini-2.5-flash) v1.10 GitHub

$ Bash example

# Install samtools (if not already installed)
# conda install -c bioconda samtools=1.10

# Sort the BAM file by coordinate
# Replace 'input.bam' with your actual input BAM file
# Replace 'output.sorted.bam' with your desired output sorted BAM file name
# Adjust '-@ 8' for the number of threads/CPUs you want to use
samtools sort -@ 8 -o output.sorted.bam input.bam

View on GitHub

Command: java -Xmx2048m -XX:+UseParallelOldGC -XX:ParallelGCThreads=4 -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Djava.io.tmpdir=/full/path/to/files/.queue/tmp -cp /path/to/gatk/dist/Queue.jar net.sf.picard.sam.SortSam INPUT=/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.bam TMP_DIR=/full/path/to/files/.queue/tmp OUTPUT=/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam VALIDATION_STRINGENCY=SILENT SO=coordinate CREATE_INDEX=true

Picard v2.x (Inferred with models/gemini-2.5-flash) GitHub

$ Bash example

# Install Picard (e.g., via Conda or by downloading the JAR)
# conda install -c bioconda picard
# Or download the picard.jar from https://github.com/broadinstitute/picard/releases (for standalone versions)

# Define variables for paths and files
# NOTE: The original command uses Queue.jar from a GATK distribution for the classpath.
# This implies that the Picard classes were bundled within that specific GATK Queue.jar,
# or that Queue.jar was part of a larger GATK distribution where Picard classes were accessible.
# Adjust GATK_DIST_PATH to your specific GATK distribution path where Queue.jar resides.
GATK_DIST_PATH="/path/to/gatk/dist" # Adjust this to your GATK distribution path
INPUT_BAM="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.bam" # Input BAM file
OUTPUT_BAM="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam" # Output sorted BAM file
TMP_DIR="/full/path/to/files/.queue/tmp" # Temporary directory

# Create the temporary directory if it doesn't exist
mkdir -p "${TMP_DIR}"

# Execute Picard SortSam command
java -Xmx2048m \
  -XX:+UseParallelOldGC \
  -XX:ParallelGCThreads=4 \
  -XX:GCTimeLimit=50 \
  -XX:GCHeapFreeLimit=10 \
  -Djava.io.tmpdir="${TMP_DIR}" \
  -cp "${GATK_DIST_PATH}/Queue.jar" \
  net.sf.picard.sam.SortSam \
  INPUT="${INPUT_BAM}" \
  TMP_DIR="${TMP_DIR}" \
  OUTPUT="${OUTPUT_BAM}" \
  VALIDATION_STRINGENCY=SILENT \
  SO=coordinate \
  CREATE_INDEX=true

View on GitHub

Takes output from sortSam, makes bam index for use downstream.

samtools index (Inferred with models/gemini-2.5-flash) v1.19 GitHub

$ Bash example

# Install samtools if not already installed
# conda install -c bioconda samtools

# Assuming 'sorted.bam' is the output from sortSam
INPUT_BAM="sorted.bam"

samtools index "${INPUT_BAM}"

View on GitHub

Command: samtools index /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam.bai

samtools v1.10 GitHub

$ Bash example

# Install samtools if not already installed
# conda install -c bioconda samtools=1.10

# Create an index for the sorted BAM file
samtools index /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam.bai

View on GitHub

Takes inputs from multiple final bam files.

samtools (Inferred with models/gemini-2.5-flash) v1.19 GitHub

$ Bash example

bash
# Install samtools if not already available
# conda install -c bioconda samtools

# Merges multiple sorted BAM files into a single sorted BAM file.
# Replace input_1.bam, input_2.bam, etc., with your actual input BAM files.
# Replace merged_output.bam with your desired output file name.
samtools merge merged_output.bam input_1.bam input_2.bam input_3.bam

View on GitHub

Merges the two technical replicates for further downstream analysis.

samtools merge (Inferred with models/gemini-2.5-flash) v1.10 GitHub

$ Bash example

# Install samtools if not already installed
# conda install -c bioconda samtools

# Merge technical replicates (BAM files)
# Replace replicate1.bam and replicate2.bam with your actual input BAM files.
# Replace merged_replicates.bam with your desired output merged BAM file name.
samtools merge merged_replicates.bam replicate1.bam replicate2.bam

View on GitHub

Command: samtools merge /full/path/to/files/CombinedID.merged.bam /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam /full/path/to/files/file_R1.D08.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam

samtools v1.10 GitHub

$ Bash example

# Install samtools (e.g., using conda)
# conda install -c bioconda samtools

# Define input and output files
OUTPUT_BAM="/full/path/to/files/CombinedID.merged.bam"
INPUT_BAM_1="/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam"
INPUT_BAM_2="/full/path/to/files/file_R1.D08.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam"

# Execute samtools merge
samtools merge "${OUTPUT_BAM}" "${INPUT_BAM_1}" "${INPUT_BAM_2}"

View on GitHub

24
Takes output from sortSam, makes bam index for use downstream.

samtools (Inferred with models/gemini-2.5-flash) v1.10 (Inferred with models/gemini-2.5-flash) GitHub
$ Bash example
```
# Install samtools if not already installed
# conda install -c bioconda samtools=1.10

# Assuming 'sorted.bam' is the output from sortSam
samtools index sorted.bam
```
View on GitHub
25
Command: samtools index /full/path/to/files/CombinedID.merged.bam /full/path/to/files/CombinedID.merged.bam.bai

samtools v(Inferred with models/gemini-2.5-flash) GitHub
$ Bash example
```
# Install samtools if not already available
# conda install -c bioconda samtools

samtools index /full/path/to/files/CombinedID.merged.bam /full/path/to/files/CombinedID.merged.bam.bai
```
View on GitHub

Takes output from sortSam.

samtools index (Inferred with models/gemini-2.5-flash) v1.19 GitHub

$ Bash example

# Install samtools if not already installed
# conda install -c bioconda samtools

# This command creates an index file (.bai) for a sorted BAM file,
# which is essential for many downstream applications and visualization.
# Replace 'sorted.bam' with the actual output file from sortSam.
samtools index sorted.bam

View on GitHub

Only outputs the second read in each pair for use with single stranded peak caller.

reformat.sh (Inferred with models/gemini-2.5-flash) v38.90 (Inferred with models/gemini-2.5-flash) GitHub

$ Bash example

# Install BBMap suite (contains reformat.sh)
# conda install -c bioconda bbmap

# Define input and output file names
INPUT_R1="input_R1.fastq.gz"
INPUT_R2="input_R2.fastq.gz"
OUTPUT_R2_ONLY="output_R2_only.fastq.gz"

# Only outputs the second read in each pair for use with single stranded peak caller.
# 'in1' and 'in2' specify the paired-end input files.
# 'out1=null' discards the first read.
# 'out2' specifies the output file for the second read.
reformat.sh in1="${INPUT_R1}" in2="${INPUT_R2}" out1=null out2="${OUTPUT_R2_ONLY}"

View on GitHub

This is the final bam file to perform analysis on.

samtools (Inferred with models/gemini-2.5-flash) v1.19 GitHub

$ Bash example

# Install samtools if not already available
# conda install -c bioconda samtools

# Placeholder for the input BAM file (e.g., from an alignment step)
INPUT_BAM="input.bam"

# Define the output sorted BAM file name
OUTPUT_SORTED_BAM="final.sorted.bam"

# Sort the BAM file to prepare it for indexing and downstream analysis
samtools sort -o "${OUTPUT_SORTED_BAM}" "${INPUT_BAM}"

# Index the sorted BAM file for quick access to specific regions
samtools index "${OUTPUT_SORTED_BAM}"

echo "Final BAM file '${OUTPUT_SORTED_BAM}' and its index '${OUTPUT_SORTED_BAM}.bai' are ready for analysis."

View on GitHub

Command: samtools view -hb -f 128 /full/path/to/files/CombinedID.merged.bam > /full/path/to/files/CombinedID.merged.r2.bam

samtools v1.10+ GitHub

$ Bash example

# Install samtools (if not already installed)
# conda install -c bioconda samtools

# Define input and output file paths
INPUT_BAM="/full/path/to/files/CombinedID.merged.bam"
OUTPUT_BAM="/full/path/to/files/CombinedID.merged.r2.bam"

# Filter BAM file to extract reads that are the second in a pair
# -h: Include header in the output
# -b: Output in BAM format
# -f 128: Only output reads with the FLAG bit 128 set (second in pair)
samtools view -hb -f 128 "${INPUT_BAM}" > "${OUTPUT_BAM}"

View on GitHub

Takes results from samtools view.

samtools v1.17 GitHub

$ Bash example

# Install samtools if not already available
# conda install -c bioconda samtools

# This command takes an input BAM file and outputs a SAM file, including the header.
# Replace 'input.bam' with your actual input file and 'output.sam' with your desired output file.
samtools view -h input.bam > output.sam

View on GitHub

Calls peaks on those files.

clipper (Inferred with models/gemini-2.5-flash) vlatest (Inferred with models/gemini-2.5-flash) GitHub

$ Bash example

# Install clipper (if not already installed)
# git clone https://github.com/yeolab/clipper.git
# cd clipper
# pip install -r requirements.txt # Check if requirements.txt exists, if not, assume basic python dependencies

# For demonstration, let's assume clipper.py is in the current PATH or directory
# And placeholder BAM files exist: input_ip.bam, input_control.bam

# Download hg38 chromosome sizes if not available (using UCSC as source)
# wget -qO- http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes > hg38.chrom.sizes

# Call peaks using clipper
# Assuming input_ip.bam and input_control.bam are the input files
# And hg38.chrom.sizes is the genome size file (placeholder for latest assembly)
python clipper.py -b input_ip.bam -c input_control.bam -s hg38.chrom.sizes -o peaks_output

View on GitHub

Command: clipper -b /full/path/to/files/CombinedID.merged.r2.bam -s hg19 -o /full/path/to/files/CombinedID.merged.r2.peaks.bed --bonferroni --superlocal --threshold-method binomial --save-pickle

CLIPper vNot specified GitHub

$ Bash example

# Install CLIPper (example, adjust as needed)
# CLIPper is a Python-based tool. It can often be run directly after cloning or installed via pip.
# git clone https://github.com/yeolab/clipper.git
# cd clipper
# python setup.py install

# Ensure required Python dependencies are installed (e.g., numpy, scipy, pysam)
# conda install -c bioconda numpy scipy pysam

# Define input and output paths
INPUT_BAM="/full/path/to/files/CombinedID.merged.r2.bam"
OUTPUT_BED="/full/path/to/files/CombinedID.merged.r2.peaks.bed"
GENOME_ASSEMBLY="hg19" # Reference genome assembly for CLIPper

# Execute CLIPper command for peak calling
clipper -b "${INPUT_BAM}" -s "${GENOME_ASSEMBLY}" -o "${OUTPUT_BED}" --bonferroni --superlocal --threshold-method binomial --save-pickle

View on GitHub

Tools Used

STAR

Raw Source Text

Takes output from raw files. Run to trim off both 5â and 3â adapters on both reads. Command: quality-cutoff 6 -m 18 -a NNNNNAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -g CTTCCGATCTACAAGTT -g CTTCCGATCTTGGTCCT -A AACTTGTAGATCGGA -A AGGACCAAGATCGGA -A ACTTGTAGATCGGAA -A GGACCAAGATCGGAA -A CTTGT AGATCGGAAG -A GACCAAGATCGGAAG -A TTGTAGATCGGAAGA -A ACCAAGATCGGAAGA -A TGTAGATCGGAAGAG -A CCAAGATCGGAAGAG -A GTAGATCGGAAGAGC -A CAAGATCGGAAGAGC -A TAGATCGGAAGAGCG -A AAGATCGGAAGAGCG -A AGATCGGAAGAGCGT -A GATCGGAAGAGCGTC -A ATCGGAAGAGCGTCG -A TCGGAAGAGCGTCGT -A CGGAAGAGCGTCGTG -A GGAAGAGCGTCGTGT -o /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.fastq.gz -p /full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.fastq.gz /full/path/to/files/file_R1.C01.fastq.gz /full/path/to/files/file_R2.C01.fastq.gz > /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.metrics
Takes output from cutadapt round 1. Run to trim off the 3â adapters on read 2, to control for double ligation events. Command: cutadapt -f fastq --match-read-wildcards --times 1 -e 0.1 -O 5 --quality-cutoff 6 -m 18 -A AACTTGTAGATCGGA -A AGGACCAAGATCGGA -A ACTTGTAGATCGGAA -A GGACCAAGATCGGAA -A CTTGTAGATCGGAAG -A GACCAAGATCGGAAG -A TTGTAGATCGGAAGA -A ACCAAGATCGGAAGA -A TGTAGATCGGAAGAG -A CCAAGATCGGAAGAG -A GTAGATCGGAAGAGC -A CAAGATCGGAAGAGC -A TAGATCGGAAGAGCG -A AAGATCGGAAGAGCG -A AGATCGGAAGAGCGT -A GATCGGAAGAGCGTC -A ATCGGAAGAGCGTCG -A TCGGAAGAGCGTCGT -A CGGAAGAGCGTCGTG -A GGAAGAGCGTCGTGT -o /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.fastq.gz -p /full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.round2.fastq.gz /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.fastq.gz /full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.fastq.gz > /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.metrics
Takes output from cutadapt round 2. Maps to human specific version of RepBase used to remove repetitive elements, helps control for spurious artifacts from rRNA (& other) repetitive reads. Command: STAR --runMode alignReads --runThreadN 16 --genomeDir /path/to/RepBase_human_database_file --genomeLoad LoadAndRemove --readFilesIn /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.fastq.gz /full/path/to/files/file_R2.C01.fastq.gz.adapterTrim.round2.fastq.gz --outSAMunmapped Within --outFilterMultimapNmax 30 --outFilterMultimapScoreRange 1 --outFileNamePrefix /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bam --outSAMattributes All --readFilesCommand zcat --outStd BAM_Unsorted --outSAMtype BAM Unsorted --outFilterType BySJout --outReadsUnmapped Fastx --outFilterScoreMin 10 --outSAMattrRGline ID:foo --alignEndsType EndToEnd > /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bam
Takes output from STAR rmRep. Maps unique reads to the human genome. Command: STAR --runMode alignReads --runThreadN 16 --genomeDir /path/to/STAR_database_file --genomeLoad LoadAndRemove --readFilesIn /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bamUnmapped.out.mate1 /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rep.bamUnmapped.out.mate2 --outSAMunmapped Within --outFilterMultimapNmax 1 --outFilterMultimapScoreRange 1 --outFileNamePrefix /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.bam --outSAMattributes All --outStd BAM_Unsorted --outSAMtype BAM Unsorted --outFilterType BySJout --outReadsUnmapped Fastx --outFilterScoreMin 10 --outSAMattrRGline ID:foo --alignEndsType EndToEnd > /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.bam
Takes output from STAR genome mapping. Custom random-mer-aware script for PCR duplicate removal. Command: barcode_collapse_pe.py --bam /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.bam --out_file /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.bam --metrics_file /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.metrics
Takes output from barcode collapse PE. Sorts resulting bam file for use downstream. Command: java -Xmx2048m -XX:+UseParallelOldGC -XX:ParallelGCThreads=4 -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Djava.io.tmpdir=/full/path/to/files/.queue/tmp -cp /path/to/gatk/dist/Queue.jar net.sf.picard.sam.SortSam INPUT=/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.bam TMP_DIR=/full/path/to/files/.queue/tmp OUTPUT=/full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam VALIDATION_STRINGENCY=SILENT SO=coordinate CREATE_INDEX=true
Takes output from sortSam, makes bam index for use downstream. Command: samtools index /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam.bai
Takes inputs from multiple final bam files. Merges the two technical replicates for further downstream analysis. Command: samtools merge /full/path/to/files/CombinedID.merged.bam /full/path/to/files/file_R1.C01.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam /full/path/to/files/file_R1.D08.fastq.gz.adapterTrim.round2.rmRep.rmDup.sorted.bam
Takes output from sortSam, makes bam index for use downstream. Command: samtools index /full/path/to/files/CombinedID.merged.bam /full/path/to/files/CombinedID.merged.bam.bai
Takes output from sortSam. Only outputs the second read in each pair for use with single stranded peak caller. This is the final bam file to perform analysis on. Command: samtools view -hb -f 128 /full/path/to/files/CombinedID.merged.bam > /full/path/to/files/CombinedID.merged.r2.bam
Takes results from samtools view. Calls peaks on those files. Command: clipper -b /full/path/to/files/CombinedID.merged.r2.bam -s hg19 -o /full/path/to/files/CombinedID.merged.r2.peaks.bed --bonferroni --superlocal --threshold-method binomial --save-pickle
Genome_build: hg19
Supplementary_files_format_and_content: bigWig format. Maps the raw reads to both hg19 and mm10 respectively. Describes the mapped reads from the eCLIP experiments.
Supplementary_files_format_and_content: bed format, contains clusters of predicted RBP binding (on hg19 and mm10 respectively); column 4 contains -log10(p-value) and column 5 contains log2(fold-enrichment) in eCLIP versus paired size-matched input

← Back to Analysis