GSE135300 Processing Pipeline
RNA-Seq
2 steps
Publication
An in vivo genome-wide CRISPR screen identifies the RNA-binding protein Staufen2 as a key regulator of myeloid leukemia. Nature Cancer (2020), PMID 34109316
Dataset
GSE135300 Next Generation Sequencing: in vivo genome-wide CRISPR sgRNA screen in primary cancer-initiating and propagating bcCML stem cells
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
Reads were counted by first searching the primary read file for the CACCG sequence, which appears in the vector 5′ to all sgRNA inserts.
grep (inferred with models/gemini-2.5-flash), version N/A
$ Bash example
# Count the lines in the primary read file that contain the sequence "CACCG".
# The '-c' option tells grep to print only the number of matching lines.
# Note: in a FASTQ file this can also match quality lines, so treat the count as approximate.
# Replace 'primary_read_file.fastq' with the actual path to your read file.
grep -c "CACCG" primary_read_file.fastq > cac_sequence_count.txt
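Finding the CACCG anchor is only useful if the bases after it are captured, since the pipeline description treats the next 20 nt as the sgRNA insert. The following is a minimal sketch of that extraction, not the authors' script. It assumes 4-line FASTQ records and uses the first occurrence of the anchor in each read; the function name `extract_sgrna_inserts` and the file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the 20 nt that follow the first CACCG anchor
# in each read and write them as FASTQ records for downstream mapping.
# Reads without the anchor, or with fewer than 20 nt after it, are skipped.
# Assumption: every FASTQ record spans exactly 4 lines.
extract_sgrna_inserts() {
    local in_fastq="$1"   # e.g. primary_read_file.fastq (placeholder name)
    local out_fastq="$2"  # e.g. sgRNA_reads.fastq (placeholder name)
    awk -v anchor="CACCG" -v len=20 '
        NR % 4 == 1 { header = $0 }            # read name line
        NR % 4 == 2 { seq = $0 }               # sequence line
        NR % 4 == 0 {                          # quality line: record is complete
            pos = index(seq, anchor)           # 1-based position of the anchor, 0 if absent
            if (pos > 0) {
                start = pos + length(anchor)   # first base after the anchor
                if (length(seq) - start + 1 >= len) {
                    print header
                    print substr(seq, start, len)
                    print "+"
                    print substr($0, start, len)   # matching slice of the quality string
                }
            }
        }
    ' "$in_fastq" > "$out_fastq"
}

# Example call (file names are placeholders):
# extract_sgrna_inserts primary_read_file.fastq sgRNA_reads.fastq
```

Restricting the search to the sequence line avoids accidental matches in quality strings, which a plain grep over the whole file cannot rule out.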
2
The next 20 nts are the sgRNA insert, which was then mapped to a reference file of all possible sgRNAs present in the library.
$ Bash example
# Install bowtie2 if not already installed
# conda install -c bioconda bowtie2

# Define input and output files
# Assuming 'sgRNA_reads.fastq' contains the extracted 20nt sgRNA inserts
SGRNA_READS="sgRNA_reads.fastq"
# 'sgRNA_library.fasta' is the reference file of all possible sgRNAs
SGRNA_LIBRARY_FASTA="sgRNA_library.fasta"
INDEX_PREFIX="sgRNA_library_index"
OUTPUT_SAM="sgRNA_mapped.sam"
THREADS=8 # Example number of threads

# 1. Build Bowtie2 index for the sgRNA library
# This step only needs to be run once for a given reference library
bowtie2-build "${SGRNA_LIBRARY_FASTA}" "${INDEX_PREFIX}"

# 2. Map the sgRNA inserts to the indexed library
# -U for single-end reads
# -S for SAM output
# -p for number of threads
bowtie2 -x "${INDEX_PREFIX}" -U "${SGRNA_READS}" -S "${OUTPUT_SAM}" -p "${THREADS}"

# Optional: Convert SAM to BAM, sort, and index
# samtools view -bS "${OUTPUT_SAM}" > "${OUTPUT_SAM%.sam}.bam"
# samtools sort "${OUTPUT_SAM%.sam}.bam" -o "${OUTPUT_SAM%.sam}.sorted.bam"
# samtools index "${OUTPUT_SAM%.sam}.sorted.bam"
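The source describes the deposited output as a tab-delimited text file with an sgRNA code and a count. One way to get from a SAM file to that shape is sketched below, assuming each reference sequence name is the sgRNA code. The function name `count_sgrna_hits` and the output file name are hypothetical. The sketch counts every aligned read regardless of strand or mapping quality, which may be looser than what the authors did.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tally aligned reads per reference sgRNA from a SAM file
# and write a two-column, tab-delimited table (sgRNA code, count).
# Assumption: the reference name in SAM column 3 is the sgRNA code.
count_sgrna_hits() {
    local in_sam="$1"      # e.g. sgRNA_mapped.sam
    local out_table="$2"   # e.g. sgRNA_counts.txt (placeholder name)
    awk -F '\t' '
        /^@/ { next }                    # skip SAM header lines
        int($2 / 4) % 2 == 1 { next }    # FLAG bit 0x4 set: read is unmapped
        { counts[$3]++ }                 # column 3 (RNAME) is the sgRNA that was hit
        END { for (id in counts) printf "%s\t%d\n", id, counts[id] }
    ' "$in_sam" | sort -k1,1 > "$out_table"
}

# Example call (file names are placeholders):
# count_sgrna_hits sgRNA_mapped.sam sgRNA_counts.txt
```

sgRNAs that receive no reads do not appear in the table. If zero counts are needed, the table would have to be joined against the full library list.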
Raw Source Text
Reads were counted by first searching for the CACCG sequence in the primary read file that appears in the vector 5′ to all sgRNA inserts. The next 20 nts are the sgRNA insert, which was then mapped to a reference file of all possible sgRNAs present in the library.
Supplementary_files_format_and_content: tab-delimited text file, includes sgRNA code and count