GSE193134 Processing Pipeline

RIP-Seq code_examples 18 steps

Publication

LARP4 is an RNA-binding protein that binds nuclear-encoded mitochondrial mRNAs to promote mitochondrial function.

RNA (New York, N.Y.) (2024) — PMID 38164626

Dataset

GSE193134

LARP4 Is an RNA-Binding Protein That Binds Nuclear-Encoded Mitochondrial mRNAs To Promote Mitochondrial Function

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1.

    Sequenced reads were reformatted to include randomers in read headers with umi_tools (1.0.0).

    UMI-tools v1.0.0 GitHub
    $ Bash example
    # Install UMI-tools (e.g., via conda)
    # conda install -c bioconda umi-tools=1.0.0
    
    # Example: extract a 10-nt UMI (randomer) from the start of Read 1 and append it to the headers of both reads.
    # Replace 'input_read1.fastq.gz' and 'input_read2.fastq.gz' with your actual input files.
    # Adjust '--bc-pattern' if the UMI length or position differs (one N per UMI base).
    # For single-end data, drop --read2-in and --read2-out. If the UMI is on Read 2 only, pass Read 2 via --stdin and Read 1 via --read2-in.
    
    umi_tools extract \
        --stdin input_read1.fastq.gz \
        --stdout output_read1_umi.fastq.gz \
        --read2-in input_read2.fastq.gz \
        --read2-out output_read2_umi.fastq.gz \
        --extract-method=string \
        --bc-pattern=NNNNNNNNNN \
        --log umi_tools_extract.log
  2.

    Args: --random-seed 1 --bc-pattern NNNNNNNNNN

    umi_tools extract (Inferred with models/gemini-2.5-flash) v1.1.2
    $ Bash example
    # Install umi_tools if not already installed
    # conda install -c bioconda umi-tools
    
    # Example usage of umi_tools extract with the provided arguments.
    # This command assumes input FASTQ files (read1.fastq.gz, read2.fastq.gz)
    # and outputs UMI-extracted FASTQ files (umi_extracted_read1.fastq.gz, umi_extracted_read2.fastq.gz).
    # Adjust input/output filenames and which read contains the UMI (--stdin) as per your library preparation.
    
    umi_tools extract \
      --random-seed 1 \
      --bc-pattern NNNNNNNNNN \
      --stdin read1.fastq.gz \
      --read2-in read2.fastq.gz \
      --stdout umi_extracted_read1.fastq.gz \
      --read2-out umi_extracted_read2.fastq.gz
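To make the reformatting step concrete, the sketch below mimics on a single FASTQ record what `umi_tools extract` does with `--bc-pattern NNNNNNNNNN`: the first 10 bases are removed from the read (along with their qualities) and appended to the read name after an underscore. This is an illustration in plain awk, not a replacement for UMI-tools; the function name `extract_umi` and the toy record are hypothetical.

```shell
# Hypothetical helper: move the first N bases of every read into the read name.
extract_umi() {
    local umi_len="${1:-10}"
    awk -v n="$umi_len" '
        NR % 4 == 1 { header = $0; next }              # hold the header until the UMI is known
        NR % 4 == 2 {
            umi = substr($0, 1, n)                     # first n bases are the randomer
            split(header, parts, " ")
            rest = substr(header, length(parts[1]) + 1) # keep any comment after the read name
            print parts[1] "_" umi rest
            print substr($0, n + 1)                    # sequence without the UMI
            next
        }
        NR % 4 == 0 { print substr($0, n + 1); next }  # trim qualities to match
        { print }                                      # "+" separator line
    '
}

# Toy record: 10-nt randomer followed by 8 nt of insert
printf '@read1 1:N:0\nACGTACGTACTTTTGGGG\n+\nIIIIIIIIIIJJJJJJJJ\n' | extract_umi 10
```

Keeping the UMI in the read name is what lets the later deduplication step recover it from the aligned BAM.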
  3.

    Reads were then trimmed with cutadapt (1.14).

    cutadapt v1.14 GitHub
    $ Bash example
    # Install cutadapt if not already installed
    # conda install -c bioconda cutadapt=1.14
    
    # Example command for trimming single-end reads with cutadapt.
    # The adapter shown is a common Illumina TruSeq 3' adapter, used here as a placeholder; adjust it to the library preparation kit.
    # -q 20,20 trims low-quality bases from both ends and -m 20 discards reads shorter than 20 nt after trimming; both values are illustrative.
    # Replace 'input.fastq.gz' with your actual input file and 'trimmed.fastq.gz' with your desired output file.
    # For paired-end reads, add -A <read 2 adapter> and -p <read 2 output>, and pass both input files.
    cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA -q 20,20 -m 20 -o trimmed.fastq.gz input.fastq.gz
  4.

    Args: --match-read-wildcards -O 1 --times 1 -e 0.1 --quality-cutoff 6 -m 18 -a InvRNA*.fasta (fasta sequences can be found at: https://github.com/YeoLab/eclip/tree/master/example/inputs/)

    $ Bash example
    # Install cutadapt
    # conda install -c bioconda cutadapt=1.14
    
    # Obtain the adapter FASTA file(s) matching InvRNA*.fasta from
    # https://github.com/YeoLab/eclip/tree/master/example/inputs/
    # 'InvRNA.fasta' below is a placeholder for the downloaded file name.
    
    # Replace 'input_reads.fastq.gz' with your actual input file
    # and 'trimmed_reads.fastq.gz' with your desired output file.
    
    cutadapt \
      --match-read-wildcards \
      -O 1 \
      --times 1 \
      -e 0.1 \
      --quality-cutoff 6 \
      -m 18 \
      -a file:InvRNA.fasta \
      -o trimmed_reads.fastq.gz \
      input_reads.fastq.gz
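The `-a InvRNA*.fasta` argument is a file pattern, which suggests that more than one adapter FASTA may apply. One way to handle that with plain cutadapt is to pass each file as its own `-a file:<fasta>` option. The sketch below only builds and prints such a command; the helper name `adapter_args_for` and the `adapters` directory are hypothetical, and whether the original pipeline combined the files this way is an assumption.

```shell
# Hypothetical helper: emit one "-a file:<fasta>" pair per adapter FASTA found in a directory.
adapter_args_for() {
    local dir="$1" fasta
    for fasta in "$dir"/InvRNA*.fasta; do
        [ -e "$fasta" ] || continue   # the pattern matched nothing
        printf -- '-a file:%s ' "$fasta"
    done
}

# Preview the resulting command instead of running it.
# The unquoted expansion is deliberate and assumes paths without spaces.
echo cutadapt --match-read-wildcards -O 1 --times 1 -e 0.1 --quality-cutoff 6 -m 18 \
    $(adapter_args_for adapters) \
    -o trimmed_reads.fastq.gz input_reads.fastq.gz
```

Printing the command first makes it easy to confirm which adapter files were picked up before any reads are trimmed.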
  5.

    Reads were then trimmed again with cutadapt (1.14) to remove double-ligation events.

    cutadapt v1.14 GitHub
    $ Bash example
    # Install cutadapt (e.g., using conda)
    # conda install -c bioconda cutadapt=1.14
    
    # Define input and output file names
    INPUT_FASTQ="input.fastq.gz"
    OUTPUT_FASTQ="trimmed_output.fastq.gz"
    
    # Placeholder for the adapter sequence that causes double-ligation events.
    # This sequence would be specific to the library preparation protocol.
    # Example: A common Illumina TruSeq adapter sequence is used here.
    ADAPTER_SEQUENCE="AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"
    
    # Trim reads with cutadapt to remove double-ligation events
    # -a ADAPTER_SEQUENCE: Trims the 3' adapter sequence
    # -o: Specifies the output file
    cutadapt -a "${ADAPTER_SEQUENCE}" -o "${OUTPUT_FASTQ}" "${INPUT_FASTQ}"
  6.

    Args: --match-read-wildcards -O 5 --times 1 -e 0.1 --quality-cutoff 6 -m 18 -a Ril19.fasta (fasta sequences can be found at: https://github.com/YeoLab/eclip/tree/master/example/inputs/)

    eCLIP v1.0.1
    $ Bash example
    # Install cutadapt if not already installed
    # conda install -c bioconda cutadapt=1.14
    
    # Obtain the adapter file (Ril19.fasta) if not locally available, from
    # https://github.com/YeoLab/eclip/tree/master/example/inputs/
    
    # Second cutadapt pass to remove double-ligation events.
    # Replace 'trimmed_reads.fastq.gz' with the output of the first trimming pass
    # and 'trimmed_twice.fastq.gz' with your desired output file.
    cutadapt \
      --match-read-wildcards \
      -O 5 \
      --times 1 \
      -e 0.1 \
      --quality-cutoff 6 \
      -m 18 \
      -a file:Ril19.fasta \
      -o trimmed_twice.fastq.gz \
      trimmed_reads.fastq.gz
  7.

    Trimmed and filtered reads were then mapped with STAR (2.7.6a) against a repeat element database (RepBase 18.05).

    $ Bash example
    # Install STAR (example using conda)
    # conda create -n star_env star=2.7.6a -c bioconda -y
    # conda activate star_env
    
    # Define variables
    STAR_VERSION="2.7.6a"
    REPBASE_FASTA="RepBase18.05.fasta" # Placeholder: Replace with the actual path to the RepBase 18.05 FASTA file
    GENOME_DIR="star_repbase_index"
    READS_R1="trimmed_filtered_R1.fastq.gz" # Placeholder: Replace with the actual path to your trimmed and filtered R1 reads
    READS_R2="trimmed_filtered_R2.fastq.gz" # Placeholder: Replace with the actual path to your trimmed and filtered R2 reads (remove if single-end)
    OUTPUT_PREFIX="repbase_aligned"
    THREADS=8 # Adjust based on available CPU cores
    
    # 1. Build STAR index for RepBase (if not already built)
    # This step assumes you have the RepBase FASTA file.
    # For smaller reference sequences like repeat databases, 'genomeSAindexNbases' might need to be adjusted from the default 14.
    # A value of 10 is often suitable for smaller references.
    echo "Building STAR index for RepBase..."
    mkdir -p "${GENOME_DIR}"
    STAR --runMode genomeGenerate \
         --genomeDir "${GENOME_DIR}" \
         --genomeFastaFiles "${REPBASE_FASTA}" \
         --runThreadN "${THREADS}" \
         --genomeSAindexNbases 10 # Recommended for smaller reference sequences like repeat databases
    
    # 2. Map reads with STAR
    # Assuming paired-end reads. For single-end reads, remove the second file from --readFilesIn.
    echo "Mapping reads with STAR..."
    STAR --version # Confirm STAR version
    STAR --runThreadN "${THREADS}" \
         --genomeDir "${GENOME_DIR}" \
         --readFilesIn "${READS_R1}" "${READS_R2}" \
         --readFilesCommand zcat \
         --outFileNamePrefix "${OUTPUT_PREFIX}." \
         --outSAMtype BAM SortedByCoordinate \
         --outSAMattributes All \
         --outFilterMultimapNmax 100 \
         --alignIntronMax 1 \
         --alignSJDBoverhangMin 1 \
         --outReadsUnmapped Fastx # Output unmapped reads to a separate file
  8.

    Args: --runThreadN 16 --genomeDir human_repbase --readFilesIn path/to/read1 --outFileNamePrefix out_prefix --outReadsUnmapped Fastx --outSAMtype BAM Unsorted --outSAMattributes All --outSAMunmapped Within --outSAMattrRGline ID:foo --outFilterType BySJout --outFilterMultimapNmax 30 --outFilterMultimapScoreRange 1 --outFilterScoreMin 10 --alignEndsType EndToEnd

    STAR (Inferred with models/gemini-2.5-flash) v2.7.10a GitHub
    $ Bash example
    # Install STAR (if not already installed)
    # conda install -c bioconda star
    
    # Note: 'human_repbase' refers to a pre-built STAR index directory.
    # Reads in this step are mapped against a repeat element database
    # (RepBase 18.05), so the index is presumably built from human repeat
    # sequences rather than from a full genome assembly.
    # Example for index generation (adjust paths as needed; the FASTA file
    # name is a placeholder, and small references need a reduced
    # --genomeSAindexNbases, for which 10 is an illustrative value):
    # STAR --runThreadN 16 \
    #      --runMode genomeGenerate \
    #      --genomeDir human_repbase \
    #      --genomeFastaFiles /path/to/repbase_sequences.fa \
    #      --genomeSAindexNbases 10
    
    STAR \
        --runThreadN 16 \
        --genomeDir human_repbase \
        --readFilesIn path/to/read1 \
        --outFileNamePrefix out_prefix \
        --outReadsUnmapped Fastx \
        --outSAMtype BAM Unsorted \
        --outSAMattributes All \
        --outSAMunmapped Within \
        --outSAMattrRGline ID:foo \
        --outFilterType BySJout \
        --outFilterMultimapNmax 30 \
        --outFilterMultimapScoreRange 1 \
        --outFilterScoreMin 10 \
        --alignEndsType EndToEnd
  9.

    Unmapped reads filtered of repeat elements were then mapped with STAR (2.7.6a) against a human genome (hg19).

    $ Bash example
    # Install STAR (version 2.7.6a)
    # conda install -c bioconda star=2.7.6a
    
    # Reference Genome: Human genome (hg19)
    # Download hg19 FASTA from UCSC: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz
    # Download hg19 GTF from GENCODE (e.g., v19): https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_19/gencode.v19.annotation.gtf.gz
    
    # Define input and output paths
    INPUT_READS="filtered_unmapped_reads.fastq.gz" # Placeholder for input FASTQ file
    STAR_INDEX_DIR="path/to/STAR_hg19_index"       # Directory containing STAR index for hg19
    OUTPUT_PREFIX="mapped_reads_hg19_"             # Prefix for output files
    
    # Run STAR alignment
    STAR --runMode alignReads \
         --genomeDir "${STAR_INDEX_DIR}" \
         --readFilesIn "${INPUT_READS}" \
         --outFileNamePrefix "${OUTPUT_PREFIX}" \
         --outSAMtype BAM SortedByCoordinate \
         --readFilesCommand zcat \
         --outFilterMultimapNmax 20 \
         --outFilterMismatchNmax 10 \
         --outFilterScoreMinOverLread 0.66 \
         --outFilterMatchNminOverLread 0.66 \
         --limitBAMsortRAM 30000000000 # Example: 30GB RAM for sorting, adjust as needed
  10.

    Args: --runThreadN 16 --genomeDir genomedir --readFilesIn /path/to/read1 --outFileNamePrefix out_prefix --outReadsUnmapped Fastx --outSAMtype BAM Unsorted --outSAMattributes All --outSAMunmapped Within --outSAMattrRGline ID:foo --outFilterType BySJout --outFilterMultimapNmax 1 --outFilterMultimapScoreRange 1 --outFilterScoreMin 10 --alignEndsType EndToEnd

    STAR (Inferred with models/gemini-2.5-flash) GitHub
    $ Bash example
    # Install STAR (example using conda)
    # conda install -c bioconda star
    
    STAR \
        --runThreadN 16 \
        --genomeDir /path/to/STAR_genome_index \
        --readFilesIn /path/to/read1.fastq.gz \
        --outFileNamePrefix my_output_prefix \
        --outReadsUnmapped Fastx \
        --outSAMtype BAM Unsorted \
        --outSAMattributes All \
        --outSAMunmapped Within \
        --outSAMattrRGline ID:foo \
        --outFilterType BySJout \
        --outFilterMultimapNmax 1 \
        --outFilterMultimapScoreRange 1 \
        --outFilterScoreMin 10 \
        --alignEndsType EndToEnd
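The two STAR passes are chained through the unmapped reads: with `--outReadsUnmapped Fastx`, STAR writes reads that did not align to `<outFileNamePrefix>Unmapped.out.mate1`, and that file becomes the input of the genome pass. A minimal sketch of the hand-off, assuming single-end reads; the prefixes and the helper name `unmapped_fastx` are hypothetical, and the command is printed rather than executed.

```shell
# Hypothetical helper: name of the unmapped-read file STAR writes for a given prefix.
unmapped_fastx() {
    # STAR appends "Unmapped.out.mate1" directly to the prefix, without a separator
    printf '%s%s\n' "$1" "Unmapped.out.mate1"
}

REPEAT_PREFIX="sample.repeat-mapped."
GENOME_PREFIX="sample.genome-mapped."
GENOME_INPUT="$(unmapped_fastx "${REPEAT_PREFIX}")"

# Print the second-pass command for inspection instead of running it.
# The unmapped file is plain FASTQ, so no --readFilesCommand zcat is needed.
echo STAR --runThreadN 16 --genomeDir genomedir \
     --readFilesIn "${GENOME_INPUT}" \
     --outFileNamePrefix "${GENOME_PREFIX}"
```

Ending each prefix with a dot keeps the output names readable, since STAR concatenates the prefix and its own file names as-is.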
  11.

    Aligned reads were sorted with samtools (1.6)

    samtools v1.6 GitHub
    $ Bash example
    # Install samtools if not already installed
    # conda install -c bioconda samtools=1.6
    
    # Sort aligned reads
    # Replace 'aligned_reads.bam' with your actual input BAM file
    # Replace 'sorted_reads.bam' with your desired output sorted BAM file
    samtools sort -o sorted_reads.bam aligned_reads.bam
  12.

    Sorted reads were collapsed with umi_tools (1.0.0).

    UMI-tools v1.0.0 GitHub
    $ Bash example
    # Install umi_tools (if not already installed)
    # conda install -c bioconda umi-tools
    
    # Define input and output file names
    INPUT_BAM="sorted_reads.bam" # Placeholder for the sorted input BAM file
    OUTPUT_BAM="collapsed_reads.bam" # Placeholder for the output collapsed BAM file
    
    # umi_tools dedup needs a coordinate-sorted, indexed BAM
    samtools index "${INPUT_BAM}"
    
    # Collapse reads using umi_tools dedup
    # This command assumes UMIs are present in the read IDs (e.g., after umi_tools extract)
    # If UMIs are in a SAM tag (e.g., 'XN'), use --extract-umi-method=tag --umi-tag=XN
    # If reads are paired-end, add --paired
    # For spliced alignments (e.g., RNA-seq), consider --spliced-is-unique
    umi_tools dedup \
        --stdin="${INPUT_BAM}" \
        --stdout="${OUTPUT_BAM}" \
        --extract-umi-method=read_id
  13.

    Args: --random-seed 1 --method unique

    UMI-tools v1.0.0
    $ Bash example
    # These arguments belong to the umi_tools dedup call that collapses the sorted reads.
    # --method unique treats reads as duplicates only if they share position and an identical UMI;
    # --random-seed 1 fixes the random seed so repeated runs give the same result.
    # File names are placeholders.
    umi_tools dedup \
        --random-seed 1 \
        --method unique \
        --stdin sorted_reads.bam \
        --stdout collapsed_reads.bam
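With `--method unique`, reads count as duplicates only when they share a mapping position and carry exactly the same UMI; UMIs that differ by a single base are kept apart. The awk sketch below applies that rule to a toy table. The tab-separated layout (chromosome, strand, position, UMI, read name) and the function name `dedup_unique` are assumptions, the first read of each group is kept for simplicity, and BAM-specific details such as soft clipping are left out.

```shell
# Hypothetical helper: keep the first read for every (chrom, strand, position, UMI) combination.
dedup_unique() {
    awk -F'\t' '
        {
            key = $1 FS $2 FS $3 FS $4
            if (!(key in seen)) { seen[key] = 1; print }
        }
    '
}

# r2 duplicates r1; r3 differs by one UMI base; r4 is on the other strand
printf 'chr1\t+\t100\tAAAAAAAAAA\tr1\nchr1\t+\t100\tAAAAAAAAAA\tr2\nchr1\t+\t100\tAAAAAAAAAC\tr3\nchr1\t-\t100\tAAAAAAAAAA\tr4\n' \
    | dedup_unique
```

Only r2 is dropped in this toy input, which shows how strict the `unique` method is compared with error-tolerant grouping.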
  14.

    BAM files were used to identify peak clusters with Clipper (1.2.2).

    CLIPper v1.2.2 GitHub
    $ Bash example
    # Install CLIPper and its dependencies (e.g., pysam, numpy, scipy, pybedtools)
    # git clone https://github.com/yeolab/clipper.git
    # cd clipper
    # Check out the release corresponding to version 1.2.2 and install it as described in the repository
    
    # Placeholders for input and output files
    INPUT_BAM="sample_ip.bam"        # Deduplicated, sorted and indexed BAM
    OUTPUT_PEAKS="sample_peaks.bed"  # Peak clusters in BED format
    
    # CLIPper calls peak clusters on one BAM at a time; normalization against
    # the INPUT sample happens in a later step. The genome build of this
    # dataset is hg19.
    clipper \
        --species hg19 \
        --bam "${INPUT_BAM}" \
        --outfile "${OUTPUT_PEAKS}"
  15.

    Args: --species (hg19) --bam path/to/input.bam --timeout 3600000 --maxgenes 1000000 --save-pickle --outfile path/to/output.bam

    CLIPper v1.2.2
    $ Bash example
    # These arguments are the options of the CLIPper peak-calling run (Clipper 1.2.2).
    # Check the exact meaning of --timeout, --maxgenes and --save-pickle with 'clipper --help'.
    # Paths are placeholders copied from the source; the output holds peak clusters,
    # so a .bed extension may be more appropriate than .bam.
    clipper \
        --species hg19 \
        --bam path/to/input.bam \
        --timeout 3600000 \
        --maxgenes 1000000 \
        --save-pickle \
        --outfile path/to/output.bam
  16.

    Peak clusters were normalized using BAM files for IP against BAM files for INPUT with peaksnormalize.pl (overlap_peakfi_with_bam_PE.pl), included in eclip 0.1.5+.

    $ Bash example
    # Clone the eclip repository if not already available
    # git clone https://github.com/yeolab/eclip.git
    # Make sure the script is on your PATH, or call it by its full path.
    
    # Placeholder variables
    PEAK_CLUSTER_FILE="your_peak_clusters.bed"     # CLIPper output
    IP_BAM_FILE="your_ip_sample.bam"
    INPUT_BAM_FILE="your_input_sample.bam"
    IP_READNUM_FILE="ip_mapped_readnum.txt"        # File holding the mapped read count of the IP BAM
    INPUT_READNUM_FILE="input_mapped_readnum.txt"  # File holding the mapped read count of the INPUT BAM
    NORMALIZED_PEAKS="normalized_peaks.bed"
    
    # Run the normalization script (overlap_peakfi_with_bam_PE.pl).
    # The positional argument order below is an assumption, not taken from the
    # source description; check the usage notes in the script header before running.
    overlap_peakfi_with_bam_PE.pl \
      "${IP_BAM_FILE}" \
      "${INPUT_BAM_FILE}" \
      "${PEAK_CLUSTER_FILE}" \
      "${IP_READNUM_FILE}" \
      "${INPUT_READNUM_FILE}" \
      "${NORMALIZED_PEAKS}"
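Normalizing IP against INPUT amounts to comparing, for each peak, the fraction of IP reads that fall in the peak with the fraction of INPUT reads in the same interval. The sketch below computes a log2 fold enrichment from read counts. The pseudocount of 1, the column layout (peak name, IP reads, INPUT reads), and the function name `l2fc_per_peak` are assumptions for illustration and are not taken from the eclip script.

```shell
# Hypothetical helper: log2 fold enrichment of IP over INPUT for each peak.
# $1 = total usable IP reads, $2 = total usable INPUT reads
l2fc_per_peak() {
    awk -F'\t' -v ip_total="$1" -v in_total="$2" '
        {
            ip_frac = ($2 + 1) / ip_total   # pseudocount avoids division by zero
            in_frac = ($3 + 1) / in_total
            printf "%s\t%.3f\n", $1, log(ip_frac / in_frac) / log(2)
        }
    '
}

# Columns: peak name, reads in IP, reads in INPUT
printf 'peakA\t79\t9\npeakB\t9\t9\n' | l2fc_per_peak 1000000 1000000
```

Dividing by library size first is what makes the ratio comparable when the IP and INPUT libraries were sequenced to different depths.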
  17.

    Overlapping normalized peak regions were merged with compress_l2foldenrpeakfi_for_replicate_overlapping_bedformat.pl, included within eclip-0.1.5+
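As an illustration of what merging overlapping peak regions involves, the sketch below sorts BED-like intervals and collapses any that overlap. How the Perl script resolves an overlap is not stated in the description; this sketch takes the union of the intervals and keeps the highest score, which is an assumption, as are the four-column layout (chromosome, start, end, score), the lack of strand handling, and the function name `merge_overlaps`.

```shell
# Hypothetical helper: merge overlapping intervals, keeping the highest score.
merge_overlaps() {
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n | awk -F'\t' -v OFS='\t' '
        NR > 1 && $1 == chrom && ($2 + 0) < (cur_end + 0) {
            if (($3 + 0) > (cur_end + 0)) cur_end = $3   # extend the open interval
            if (($4 + 0) > (score + 0)) score = $4       # keep the strongest score
            next
        }
        NR > 1 { print chrom, start, cur_end, score }    # previous interval is finished
        { chrom = $1; start = $2; cur_end = $3; score = $4 }
        END { if (NR > 0) print chrom, start, cur_end, score }
    '
}

printf 'chr1\t150\t250\t4.0\nchr1\t100\t200\t2.5\nchr1\t300\t400\t1.0\nchr2\t100\t200\t3.0\n' \
    | merge_overlaps
```

Because BED coordinates are half-open, intervals that merely touch (one ends where the next starts) are left separate here.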

    $ Bash example
    # compress_l2foldenrpeakfi_for_replicate_overlapping_bedformat.pl ships with the Yeo lab eCLIP tooling
    # (eclip-0.1.5+ per the description; it is also used in the merge_peaks workflow).
    
    # Installation (commented out, as the script is usually obtained with the pipeline):
    # git clone https://github.com/yeolab/merge_peaks.git
    # cd merge_peaks
    
    # The script is run once per replicate: it merges overlapping regions within one
    # normalized peak file and produces a *.compressed.bed file, as the supplementary
    # file names of this dataset indicate. File names below are placeholders.
    INPUT_NORMED_PEAKS="rep1.peakClusters.normed.bed"
    OUTPUT_COMPRESSED_PEAKS="rep1.peakClusters.normed.compressed.bed"
    
    # Execute the Perl script.
    # The two positional arguments (input file, output file) are an assumption;
    # check the script's usage notes before running.
    # Adjust the path to the script if it's not in the current directory or in your PATH.
    perl compress_l2foldenrpeakfi_for_replicate_overlapping_bedformat.pl \
        "${INPUT_NORMED_PEAKS}" \
        "${OUTPUT_COMPRESSED_PEAKS}"
    
  18.

    Normalized peak (compressed.bed) files were ranked by entropy score (make_informationcontent_from_peaks.pl included within the merge_peaks pipeline) and used as inputs to IDR (2.0.2) to determine reproducible peaks.

    IDR v2.0.2 GitHub
    $ Bash example
    # Install IDR (if not already installed)
    # conda install -c bioconda idr=2.0.2
    
    # Install merge_peaks pipeline (which includes make_informationcontent_from_peaks.pl)
    # git clone https://github.com/yeolab/merge_peaks.git
    # export PATH=$PATH:$(pwd)/merge_peaks/bin
    
    # Define input peak files (e.g., from CLIPper or similar peak caller)
    # These are placeholders; replace with actual file paths.
    INPUT_PEAKS_REP1="rep1.compressed.bed"
    INPUT_PEAKS_REP2="rep2.compressed.bed"
    
    # Define output files for entropy-ranked peaks
    ENTROPY_RANKED_REP1="rep1.entropy_ranked.bed"
    ENTROPY_RANKED_REP2="rep2.entropy_ranked.bed"
    
    # Define output prefix for IDR results
    IDR_OUTPUT_PREFIX="reproducible_peaks"
    
    # Step 1: Rank normalized peak files by entropy score using make_informationcontent_from_peaks.pl
    # The -i/-o options shown here are placeholders, not the script's verified interface;
    # check its usage notes for the real arguments before running.
    make_informationcontent_from_peaks.pl -i "${INPUT_PEAKS_REP1}" -o "${ENTROPY_RANKED_REP1}"
    make_informationcontent_from_peaks.pl -i "${INPUT_PEAKS_REP2}" -o "${ENTROPY_RANKED_REP2}"
    
    # Step 2: Run IDR (2.0.2) using the entropy-ranked peaks as inputs.
    # --input-file-type bed is needed because IDR expects narrowPeak input by default.
    # With BED input, --rank score ranks peaks by the score column (column 5).
    idr --samples "${ENTROPY_RANKED_REP1}" "${ENTROPY_RANKED_REP2}" \
        --input-file-type bed \
        --output-file "${IDR_OUTPUT_PREFIX}" \
        --rank score \
        --plot \
        --log-output-file "${IDR_OUTPUT_PREFIX}.log"
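Ranking by entropy can be pictured as scoring each peak with an information-content term, p_ip × log2(p_ip / p_input), where p_ip and p_input are the fractions of IP and INPUT reads that fall in the peak, and then sorting in descending order. The formula as written here, the column layout, and the function name `rank_by_entropy` are assumptions for illustration; `make_informationcontent_from_peaks.pl` remains the reference for the actual calculation.

```shell
# Hypothetical helper: score peaks by p_ip * log2(p_ip / p_input) and sort, best first.
# $1 = total IP reads, $2 = total INPUT reads; assumes non-zero counts in every peak.
rank_by_entropy() {
    awk -F'\t' -v ip_total="$1" -v in_total="$2" '
        {
            p = $2 / ip_total
            q = $3 / in_total
            printf "%s\t%.6f\n", $1, p * log(p / q) / log(2)
        }
    ' | LC_ALL=C sort -t "$(printf '\t')" -k2,2nr
}

# Columns: peak name, reads in IP, reads in INPUT
printf 'peakC\t10\t10\npeakB\t200\t100\npeakA\t80\t10\n' | rank_by_entropy 1000 1000
```

Unlike a plain fold change, this score also rewards peaks that hold a large share of the IP reads, so a strongly enriched but tiny peak does not automatically rank first.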

Raw Source Text
Sequenced reads were reformatted to include randomers in read headers with umi_tools (1.0.0). Args: --random-seed 1 --bc-pattern NNNNNNNNNN
Reads were then trimmed with cutadapt (1.14). Args: --match-read-wildcards -O 1 --times 1 -e 0.1 --quality-cutoff 6 -m 18 -a InvRNA*.fasta (fasta sequences can be found at: https://github.com/YeoLab/eclip/tree/master/example/inputs/)
Reads were then trimmed again with cutadapt (1.14) to remove double-ligation events. Args: --match-read-wildcards -O 5 --times 1 -e 0.1 --quality-cutoff 6 -m 18  -a Ril19.fasta (fasta sequences can be found at: https://github.com/YeoLab/eclip/tree/master/example/inputs/)
Trimmed and filtered reads were then mapped with STAR (2.7.6a) against a repeat element database (RepBase 18.05). Args: --runThreadN 16 \  --genomeDir human_repbase \  --readFilesIn path/to/read1 \  --outFileNamePrefix out_prefix \  --outReadsUnmapped Fastx \  --outSAMtype BAM Unsorted \  --outSAMattributes All \  --outSAMunmapped Within \  --outSAMattrRGline ID:foo \  --outFilterType BySJout \  --outFilterMultimapNmax 30 \  --outFilterMultimapScoreRange 1 \  --outFilterScoreMin 10 \  --alignEndsType EndToEnd
Unmapped reads filtered of repeat elements were then mapped with STAR (2.7.6a) against a human genome (hg19). Args: --runThreadN 16 \  --genomeDir genomedir \  --readFilesIn /path/to/read1 \  --outFileNamePrefix out_prefix \  --outReadsUnmapped Fastx \  --outSAMtype BAM   Unsorted \  --outSAMattributes All \  --outSAMunmapped Within \  --outSAMattrRGline ID:foo \  --outFilterType BySJout \  --outFilterMultimapNmax 1 \  --outFilterMultimapScoreRange 1 \  --outFilterScoreMin 10 \  --alignEndsType EndToEnd
Aligned reads were sorted with samtools (1.6)
Sorted reads were collapsed with umi_tools (1.0.0). Args: --random-seed 1 --method unique
BAM files were used to identify peak clusters with Clipper (1.2.2). Args: --species (hg19) --bam path/to/input.bam --timeout 3600000 --maxgenes 1000000 --save-pickle --outfile path/to/output.bam
Peak clusters were normalized using BAM files for IP against BAM files for INPUT with peaksnormalize.pl (overlap_peakfi_with_bam_PE.pl), included in eclip 0.1.5+.
Overlapping normalized peak regions were merged with compress_l2foldenrpeakfi_for_replicate_overlapping_bedformat.pl, included within eclip-0.1.5+
Normalized peak (compressed.bed) files were ranked by entropy score (make_informationcontent_from_peaks.pl included within the merge_peaks pipeline) and used as inputs to IDR (2.0.2) to determine reproducible peaks.
Genome_build: hg19
Supplementary_files_format_and_content: Hek293_WT_P_rep_1.umi.r1.fq.genome-mappedSoSo.rmDupSo.peakClusters.normed.compressed.bed and Hek293_WT_P_rep_2.umi.r1.fq.genome-mappedSoSo.rmDupSo.peakClusters.normed.compressed.bed contain size matched input normalized eCLIP peaks from rep1 and rep2
Supplementary_files_format_and_content: LARP4_hek_WT.bed contains reproducible total eCLIP peaks across replicates (IDR peaks)
Supplementary_files_format_and_content: BigWig files contain RPM-normalized read densities
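
RPM normalization scales raw coverage by one million divided by the number of mapped reads, so that tracks from libraries of different depth become comparable. A minimal sketch of that scaling applied to bedGraph-style lines; the function names are hypothetical, the read count would come from the final BAM (for example with `samtools view -c -F 4`), and strand handling and the conversion to BigWig are outside the sketch.

```shell
# Hypothetical helper: reads-per-million scaling factor for a given mapped read count.
rpm_scale() {
    awk -v n="$1" 'BEGIN { printf "%.6f\n", 1000000 / n }'
}

# Hypothetical helper: multiply the coverage column (4th) of bedGraph lines by a factor.
scale_bedgraph() {
    awk -F'\t' -v OFS='\t' -v s="$1" '{ $4 = sprintf("%.4f", $4 * s); print }'
}

SCALE="$(rpm_scale 2500000)"
printf 'chr1\t0\t10\t5\n' | scale_bedgraph "${SCALE}"
```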