GSE139815 Processing Pipeline

Publication

Pooled CRISPR screens with imaging on microraft arrays reveals stress granule-regulatory factors.

Nature Methods (2020), PMID 32393832

Dataset

Pooled CRISPR screens with imaging on microRaft arrays reveals stress granule-regulatory factors

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.

Processing Steps

MAGeCK was used to process the lentiCRISPR bulk samples and quantify sgRNA abundances.

MAGeCK v0.5.9

$ Bash example

# Install MAGeCK (if not already installed)
# conda install -c bioconda mageck

# Define input and output files
# Replace 'sample.fastq.gz' with the FASTQ file for your lentiCRISPR bulk sample.
# Replace 'lentiCRISPR_sgRNA_library.txt' with your sgRNA library file.
# MAGeCK expects three columns in the library file: sgRNA ID, sgRNA sequence, target gene.
INPUT_FASTQ="sample.fastq.gz"
SGRNA_LIBRARY="lentiCRISPR_sgRNA_library.txt"
OUTPUT_PREFIX="lentiCRISPR_sgRNA_counts"

# Run MAGeCK to quantify sgRNA abundances.
# Writes ${OUTPUT_PREFIX}.count.txt (raw counts) and ${OUTPUT_PREFIX}.count_normalized.txt.
mageck count -l "${SGRNA_LIBRARY}" -n "${OUTPUT_PREFIX}" --fastq "${INPUT_FASTQ}"
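
Once counts exist, a quick per-sample check helps catch empty or badly skewed libraries before any downstream analysis. The sketch below is not part of the published pipeline. It assumes a tab-separated count table laid out like the `<prefix>.count.txt` file that `mageck count` writes (sgRNA, Gene, then one column per sample) and uses a made-up three-guide table so that it runs on its own. The sgRNA IDs, gene names, sample names, and counts are all hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy stand-in for a MAGeCK count table (tab-separated: sgRNA, Gene, one column per sample).
# All identifiers and counts below are invented for illustration.
COUNT_TABLE="$(mktemp)"
printf 'sgRNA\tGene\tsampleA\tsampleB\n'  > "${COUNT_TABLE}"
printf 'sg1\tGENE1\t10\t0\n'             >> "${COUNT_TABLE}"
printf 'sg2\tGENE1\t20\t5\n'             >> "${COUNT_TABLE}"
printf 'sg3\tGENE2\t0\t15\n'             >> "${COUNT_TABLE}"

# For every sample column, report total reads and the number of sgRNAs with zero counts.
SUMMARY="$(awk -F'\t' '
    NR == 1 { for (i = 3; i <= NF; i++) name[i] = $i; ncol = NF; next }
    { for (i = 3; i <= ncol; i++) { total[i] += $i; if ($i == 0) zero[i]++ } }
    END {
        for (i = 3; i <= ncol; i++)
            printf "%s\ttotal=%d\tzero_count_sgRNAs=%d\n", name[i], total[i], zero[i]
    }
' "${COUNT_TABLE}")"

echo "${SUMMARY}"
rm -f "${COUNT_TABLE}"
```

To use it on real output, point `COUNT_TABLE` at the count file and drop the `printf` and `rm` lines. MAGeCK also writes its own `<prefix>.countsummary.txt`, which should be treated as the authoritative QC report.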

Targeted microRaft data were processed with the CRaftID software (https://github.com/YeoLab/CRaftID).

CRaftID (version not specified)

$ Bash example

# Install CRaftID (if not already installed)
# CRaftID is distributed through its GitHub repository.
# git clone https://github.com/YeoLab/CRaftID.git
# cd CRaftID

# Define input and output paths
# Replace with the actual file paths for your targeted microRaft data.
INPUT_FASTQ="targeted_microraft_data.fastq" # Input FASTQ file containing microRaft reads
OUTPUT_DIR="CRaftID_processed_results" # Directory for CRaftID output
REFERENCE_FASTA="hg19.fasta" # Placeholder reference FASTA; the source metadata lists Genome_build: hg19
TARGET_REGIONS_BED="target_regions.bed" # Placeholder for a BED file defining the targeted regions

# Create the output directory if it does not exist
mkdir -p "${OUTPUT_DIR}"

# Execute CRaftID
# NOTE: the script name CRaftID.py and the -i/-o/-r/-t flags are unverified placeholders,
# not a documented interface. Check the repository README for the actual entry point,
# required inputs, and arguments before running.
python CRaftID.py \
    -i "${INPUT_FASTQ}" \
    -o "${OUTPUT_DIR}" \
    -r "${REFERENCE_FASTA}" \
    -t "${TARGET_REGIONS_BED}"

Raw Source Text

MaGeCk was used to process data for lentiCRISPR bulk samples to quantify sgRNA abundances.
Targeted microRaft data was processed with CRaftID software (https://github.com/YeoLab/CRaftID)
Genome_build: hg19
Supplementary_files_format_and_content: [all.count_normalized.csv]  Table of normalized sgRNA counts (one column per sample) for bulk samples
Supplementary_files_format_and_content: [microRaft_processed.csv]   Read counts and sgRNA insert identified for each microRaft library. Libraries where no sgRNA insert were detected (sequencing failure) are not included in this file.
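
Before working with either supplementary CSV, it is worth checking which columns it actually contains. The following is a minimal sketch using a made-up comma-separated table. The header names and values are hypothetical, and the real column layout of `all.count_normalized.csv` and `microRaft_processed.csv` is not described beyond the notes above. Splitting on commas like this does not handle quoted fields.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy stand-in for a supplementary CSV; header names and values are invented.
CSV_FILE="$(mktemp)"
printf 'sgRNA,sample1,sample2\n'  > "${CSV_FILE}"
printf 'sg1,1.5,2.0\n'           >> "${CSV_FILE}"
printf 'sg2,0.0,3.5\n'           >> "${CSV_FILE}"

# Column names, one per line (naive split: assumes no quoted commas in the header).
COLUMN_NAMES="$(head -n 1 "${CSV_FILE}" | tr ',' '\n')"

# Number of data rows, excluding the header line.
N_ROWS="$(tail -n +2 "${CSV_FILE}" | wc -l | tr -d ' ')"

echo "Columns:"
echo "${COLUMN_NAMES}"
echo "Data rows: ${N_ROWS}"
rm -f "${CSV_FILE}"
```

To inspect a downloaded file, set `CSV_FILE` to its path and remove the `printf` and `rm` lines.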
