GSE77633
Enhanced CLIP (eCLIP) enables robust and scalable transcriptome-wide discovery and characterization of RNA binding protein binding sites [iCLIP]
Summary
RNA binding proteins (RBPs) play essential roles in cellular physiology by interacting with target RNAs. As defects in protein-RNA recognition lead to human disease, UV-crosslinking and immunoprecipitation (CLIP) of ribonucleoprotein complexes followed by deep sequencing (-seq) is critical in constructing protein-RNA maps to expand our understanding of RBP function. However, current CLIP protocols are technically demanding and produce low-complexity libraries, so that much of the sequencing is squandered on PCR duplicates and experimental failure rates are high. To enable truly large-scale implementation of CLIP-seq, we have developed an enhanced CLIP methodology (eCLIP) that requires ~10 fewer cycles of amplification, with a concomitant >60% decrease in discarded PCR duplicate reads, while maintaining the ability to identify RNA binding with single-nucleotide resolution. By simplifying the generation of paired IgG and size-matched input controls, eCLIP also dramatically improves specificity in discovery of authentic binding sites. To demonstrate that eCLIP enables large-scale and robust profiling of RBPs, 102 eCLIP experiments in biological duplicate for a diverse collection of 74 RBPs in HepG2 and K562 cells were completed (available at https://www.encodeproject.org). We establish that eCLIP is comparable in amplification and sample requirements to ChIP-seq, and enables integrative analysis of diverse RBPs to reveal factor-specific profiles, common artifacts for CLIP experiments and RNA-centric perspectives of RBP activity.
Overall Design
iCLIP-seq against RBFOX2 in 293T Cells
Analysis (7 steps)
- Sequencing reads from CLIP-seq libraries were first trimmed of polyA tails, adapters, and low quality ends using cutadapt with parameters --match-read-wildcards --times 2 -e 0 -O 5 --quality-cutoff 6 -m 18 -b TCGTATGCCGTCTTCTGCTTG -b ATCTCGTATGCCGTCTTCTGCTTG -b CGACAGGTTCAGAGTTCTACAGTCCGACGATC -b TGGAATTCTCGGGTGCCAAGG -b AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA -b TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT.
- Reads were then mapped against a database of repetitive elements derived from Repbase 18.05.
- Bowtie version 1.0.0 with parameters -S -q -p 16 -e 100 -l 20 was used to align reads against an index generated from Repbase sequences (Langmead et al., 2009).
- Reads not mapped to Repbase sequences were aligned to the hg19 human genome (UCSC assembly) using STAR (Dobin et al., 2013) version 2.3.0e with parameters --outSAMunmapped Within --outFilterMultimapNmax 1 --outFilterMultimapScoreRange 1.
- Reads that were PCR duplicates were removed from each CLIP-seq library using a custom script.
- Briefly, one read with a unique barcode was kept at each nucleotide position when more than one read with the same barcode was mapped to the same location.
- Clusters were then assigned using the CLIPper software with parameters --bonferroni --superlocal --threshold- software (Lovci et al., 2013).
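
The duplicate-removal step above can be illustrated with a minimal Python sketch. It is not the authors' custom script, which is not shown in this record. The `Read` record, its field names, the inclusion of strand in the key, and the choice to keep the first read seen are all assumptions made for illustration.

```python
from typing import Iterable, List, NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal record for one aligned read."""
    name: str
    chrom: str
    start: int     # mapped nucleotide position
    strand: str    # "+" or "-" (assumed to be part of the location)
    barcode: str   # random barcode sequence attached to the read


def collapse_pcr_duplicates(reads: Iterable[Read]) -> List[Read]:
    """Keep one read per (location, barcode) combination.

    Reads that share both the mapped location and the barcode are
    treated as PCR duplicates; only the first one encountered is kept.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.start, read.strand, read.barcode)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


# Toy input: r2 duplicates r1; r3 differs by barcode; r4 differs by position.
reads = [
    Read("r1", "chr1", 100, "+", "AAC"),
    Read("r2", "chr1", 100, "+", "AAC"),
    Read("r3", "chr1", 100, "+", "GGT"),
    Read("r4", "chr1", 101, "+", "AAC"),
]
kept = collapse_pcr_duplicates(reads)
kept_names = [r.name for r in kept]
```

In this toy input only `r2` is dropped, because it shares both location and barcode with `r1`. Reads at the same position with different barcodes are kept as independent molecules.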
Supplementary Files (1)
GEO Samples (1)
Dataset Citations (1)
SRA Experiments (1) and Runs (4)
Total: 8560 MB
Sample attributes
Original files (1)
Runs (4)
| Run | Spots | Bases | Size (MB) | Files | Link |
|---|---|---|---|---|---|
| SRR3147674 | 21874724 | 1071861476 | 691.08 | H2_NoIndex_L001_R1.R10_lowRNAse_H_2.randomer.fastq.gz, SRR3147674, SR… | SRA |
| SRR3147675 | 112120785 | 5493918465 | 3443.55 | H2_NoIndex_L001_R1.R12_lowRNAse_H_1.randomer.fastq.gz, SRR3147675, SR… | SRA |
| SRR3147676 | 22648473 | 1109775177 | 696.56 | M2_NoIndex_L002_R1.R10_lowRNAse_M_2.randomer.fastq.gz, SRR3147676, SR… | SRA |
| SRR3147677 | 124320492 | 6091704108 | 3728.54 | M2_NoIndex_L002_R1.R12_lowRNAse_M_1.randomer.fastq.gz, SRR3147677, SR… | SRA |
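
As a quick consistency check on the run table, the sketch below divides bases by spots for each run and sums the listed sizes. The numbers are copied from the table; the variable names are illustrative only, and treating bases per spot as the read length assumes single-end reads of uniform length.

```python
# (spots, bases, size in MB) per run, copied from the table above.
runs = {
    "SRR3147674": (21874724, 1071861476, 691.08),
    "SRR3147675": (112120785, 5493918465, 3443.55),
    "SRR3147676": (22648473, 1109775177, 696.56),
    "SRR3147677": (124320492, 6091704108, 3728.54),
}

# Bases per spot; a zero remainder means every read has the same length.
read_lengths = {}
for run, (spots, bases, _size_mb) in runs.items():
    length, remainder = divmod(bases, spots)
    read_lengths[run] = (length, remainder)

# Sum of the per-run sizes, rounded to a whole number of MB.
total_mb = round(sum(size_mb for _, _, size_mb in runs.values()))

for run, (length, remainder) in read_lengths.items():
    print(run, length, remainder)
print("total MB:", total_mb)
```

Under those assumptions each run works out to 49 bases per spot with no remainder, and the sizes sum to the stated total of 8560 MB.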
Linked Publications (1)
Data Files (8)
| Accession | File Name | Stored Type | Output Type | Mapping Assembly | Size | Download |
|---|---|---|---|---|---|---|
| — | H2_NoIndex_L001_R1.R10_lowRNAse_H_2.randomer.fastq.gz | RIP-Seq | | | 691.1 MB | link |
| — | H2_NoIndex_L001_R1.R10_lowRNAse_H_2.randomer.fastq.gz | RIP-Seq | | | 691.1 MB | link |
| — | H2_NoIndex_L001_R1.R12_lowRNAse_H_1.randomer.fastq.gz | RIP-Seq | | | 3.4 GB | link |
| — | H2_NoIndex_L001_R1.R12_lowRNAse_H_1.randomer.fastq.gz | RIP-Seq | | | 3.4 GB | link |
| — | M2_NoIndex_L002_R1.R10_lowRNAse_M_2.randomer.fastq.gz | RIP-Seq | | | 696.6 MB | link |
| — | M2_NoIndex_L002_R1.R10_lowRNAse_M_2.randomer.fastq.gz | RIP-Seq | | | 696.6 MB | link |
| — | M2_NoIndex_L002_R1.R12_lowRNAse_M_1.randomer.fastq.gz | RIP-Seq | | | 3.6 GB | link |
| — | M2_NoIndex_L002_R1.R12_lowRNAse_M_1.randomer.fastq.gz | RIP-Seq | | | 3.6 GB | link |