GSE139815 Processing Pipeline


Publication

Pooled CRISPR screens with imaging on microraft arrays reveals stress granule-regulatory factors.

Nature Methods (2020), PMID 32393832

Dataset

GSE139815

Pooled CRISPR screens with imaging on microRaft arrays reveals stress granule-regulatory factors

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1.

    MAGeCK was used to process the lentiCRISPR bulk samples and quantify sgRNA abundances.

    MAGeCK v0.5.9
    $ Bash example
    # Install MAGeCK (if not already installed)
    # conda install -c bioconda mageck
    
    # Define input and output files
    # Replace 'sample.fastq.gz' with your actual lentiCRISPR bulk sample FASTQ file.
    # Replace 'lentiCRISPR_sgRNA_library.txt' with your actual sgRNA library file.
    # The library file lists each sgRNA ID, its sequence, and its target gene.
    INPUT_FASTQ="sample.fastq.gz"
    SGRNA_LIBRARY="lentiCRISPR_sgRNA_library.txt"
    OUTPUT_PREFIX="lentiCRISPR_sgRNA_counts"
    
    # Run MAGeCK to quantify sgRNA abundances (variables quoted so paths with spaces work)
    mageck count -l "${SGRNA_LIBRARY}" -n "${OUTPUT_PREFIX}" --fastq "${INPUT_FASTQ}"
  2.

    Targeted microRaft data was processed with CRaftID software (https://github.com/YeoLab/CRaftID)

    CRaftID (version not specified; source on GitHub)
    $ Bash example
    # Obtain CRaftID (if not already available) by cloning its repository.
    # git clone https://github.com/YeoLab/CRaftID.git
    # cd CRaftID
    
    # Define input and output paths
    # Replace these placeholders with the actual paths for your targeted microRaft data.
    INPUT_FASTQ="targeted_microraft_data.fastq" # Input FASTQ file containing microRaft reads
    OUTPUT_DIR="CRaftID_processed_results" # Directory for CRaftID output
    REFERENCE_FASTA="hg19.fa" # Placeholder reference FASTA; the GEO record lists hg19 as the genome build
    TARGET_REGIONS_BED="target_regions.bed" # Placeholder BED file of targeted regions (assumed input, not stated in the source)
    
    # Create the output directory if it does not exist
    mkdir -p "${OUTPUT_DIR}"
    
    # Run CRaftID.
    # NOTE: the script name CRaftID.py and the -i/-o/-r/-t flags are unverified placeholders;
    # check the repository README for the actual entry point and options before running.
    python CRaftID.py \
        -i "${INPUT_FASTQ}" \
        -o "${OUTPUT_DIR}" \
        -r "${REFERENCE_FASTA}" \
        -t "${TARGET_REGIONS_BED}"
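
Step 1 shows a single FASTQ file, but a bulk screen normally has several samples that are counted together. `mageck count` accepts several files after `--fastq` and a comma-separated `--sample-label` list. The sketch below only prints the command it would run (a dry run), so it works without MAGeCK installed. The helper names, the sample file names, and the `all` output prefix are hypothetical and are not taken from the dataset record.

```shell
#!/usr/bin/env bash
# Sketch: build a multi-sample `mageck count` command (dry run).
# All file names and the library path are hypothetical placeholders.

# Derive a comma-separated label list from FASTQ paths by stripping
# the directory and the .fastq / .fastq.gz suffix.
labels_from_fastqs() {
    local labels="" f base
    for f in "$@"; do
        base="$(basename "$f")"
        base="${base%.gz}"
        base="${base%.fastq}"
        labels="${labels:+${labels},}${base}"
    done
    printf '%s\n' "$labels"
}

# Print the mageck command for a library, an output prefix, and FASTQ files.
build_mageck_count_cmd() {
    local library="$1" prefix="$2"
    shift 2
    printf 'mageck count -l %s -n %s --sample-label %s --fastq %s\n' \
        "$library" "$prefix" "$(labels_from_fastqs "$@")" "$*"
}

build_mageck_count_cmd lentiCRISPR_sgRNA_library.txt all \
    data/plasmid.fastq.gz data/unsorted.fastq.gz
```

After checking the printed command, it can be executed by pasting it into the shell. Paths containing spaces would need quoting, which this sketch does not handle.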
Raw Source Text
MaGeCk was used to process data for lentiCRISPR bulk samples to quantify sgRNA abundances.
Targeted microRaft data was processed with CRaftID software (https://github.com/YeoLab/CRaftID)
Genome_build: hg19
Supplementary_files_format_and_content: [all.count_normalized.csv]  Table of normalized sgRNA counts (one column per sample) for bulk samples
Supplementary_files_format_and_content: [microRaft_processed.csv]   Read counts and sgRNA insert identified for each microRaft library. Libraries where no sgRNA insert were detected (sequencing failure) are not included in this file.
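
The normalized count table is deposited as CSV, whereas `mageck count` writes a tab-delimited `<prefix>.count_normalized.txt`. One plausible way to bridge the two is a plain delimiter swap. This is a minimal sketch, assuming no field contains a comma, tab, or quote; the source does not say how the authors produced their CSV, and the function name and the example table are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: convert a tab-delimited count table to CSV.
# Assumes no field contains a comma, tab, or quote character.
tsv_to_csv() {
    tr '\t' ','
}

# Tiny made-up table in the sgRNA / Gene / per-sample column layout.
printf 'sgRNA\tGene\tsampleA\tsampleB\nsg1\tGENE1\t10.5\t20\n' | tsv_to_csv
```

With a real file, the call would look like `tsv_to_csv < all.count_normalized.txt > all.count_normalized.csv`, where the input name assumes `all` was used as the MAGeCK output prefix.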