GSE201898 Processing Pipeline


Publication

MECP2-related pathways are dysregulated in a cortical organoid model of myotonic dystrophy.

Science Translational Medicine (2022), PMID 35767654

Dataset

GSE201898

MECP2-related pathways are dysregulated in a cortical organoid model of Myotonic dystrophy

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1.

    Alignment, cell barcode processing, UMI processing, and abundance measurement: cellranger count (version 3.0.2)

    Cell Ranger v3.0.2
    $ Bash example
    cellranger_version="3.0.2"
    
    # Cell Ranger is typically installed by downloading the tarball from 10x Genomics and adding it to your PATH.
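    # Example (adjust paths as needed; 10x Genomics download links are typically time-limited, so copy a current one from its downloads page):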
    # wget https://cf.10xgenomics.com/releases/cell-exp/cellranger-3.0.2.tar.gz
    # tar -xzf cellranger-3.0.2.tar.gz
    # export PATH=/path/to/cellranger-3.0.2:$PATH
    
    # Define variables for the run
    SAMPLE_ID="my_sample_id" # A unique ID for this run
    FASTQ_DIR="/path/to/your/fastqs" # Directory containing FASTQ files (e.g., from bcl2fastq or mkfastq)
    SAMPLE_NAME="sample_1" # The sample name prefix for your FASTQ files (e.g., sample_1_S1_L001_R1_001.fastq.gz)
    TRANSCRIPTOME_REF="/path/to/refdata-cellranger-hg19-3.0.0" # Path to a Cell Ranger-compatible transcriptome reference; hg19 here to match the dataset's stated assembly and the reference used in step 2
    
    # Execute cellranger count
    cellranger count \
        --id="${SAMPLE_ID}" \
        --transcriptome="${TRANSCRIPTOME_REF}" \
        --fastqs="${FASTQ_DIR}" \
        --sample="${SAMPLE_NAME}" \
        --expect-cells=3000 # Optional: expected number of cells; adjust as needed
    
  2.

    MD tags were added to alignments with samtools calmd --threads 15 -rb possorted_genome_bam.bam refdata-cellranger-hg19-3.0.0/fasta/genome.fa > possorted_genome_bam_MD.bam

    samtools (version not stated in the source); reference: refdata-cellranger-hg19-3.0.0
    $ Bash example
    # Install samtools if not already available
    # conda install -c bioconda samtools
    
    # Add MD tags to alignments (-b writes BAM output; -r also computes the BQ tag)
    samtools calmd --threads 15 -rb possorted_genome_bam.bam refdata-cellranger-hg19-3.0.0/fasta/genome.fa > possorted_genome_bam_MD.bam
  3.

    Reads were split based on the CB:Z tag, resulting in one BAM file per barcode.

    samtools split (assumed; the source does not name the tool used for this step)
    $ Bash example
    # Install samtools if not already installed; the -d option of samtools split needs a
    # reasonably recent release (check `samtools split --help`)
    # conda install -c bioconda samtools
    
    # Define input BAM file (presumably the MD-tagged BAM from step 2; adjust as needed)
    INPUT_BAM="possorted_genome_bam_MD.bam"
    
    # Define output prefix for split BAM files
    OUTPUT_PREFIX="barcode_split_"
    
    # Split the BAM file on the CB tag (CB:Z is the CB tag with string type), writing one BAM file per barcode.
    # -d CB splits by tag value; in the -f format string, %! expands to the tag value and %. to the file extension.
    # -M -1 lifts the default cap on the number of output files; the system's open-file limit still applies.
    # Reads without a CB tag are not written unless an -u FILE is given.
    samtools split -d CB -M -1 --output-fmt BAM -f "${OUTPUT_PREFIX}%!.%." "${INPUT_BAM}"
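
The source states only that reads were split on the CB:Z tag and does not name the tool used. As a tool-independent illustration of step 3, here is a minimal sketch in plain shell and awk that works on SAM text rather than BAM. The function name `split_sam_by_cb`, the one-file-per-barcode naming, and the choice to skip reads that lack a CB tag are all assumptions made for this example, not details taken from the dataset record.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the source): split SAM text into one SAM file
# per cell barcode, using the CB:Z tag.
# Usage: split_sam_by_cb INPUT.sam OUTPUT_DIR
split_sam_by_cb() {
    local input_sam="$1" out_dir="$2"
    mkdir -p "${out_dir}"
    awk -v dir="${out_dir}" '
        BEGIN { FS = "\t" }
        # Buffer header lines so every per-barcode file starts with the full header.
        /^@/ { header = header $0 "\n"; next }
        {
            barcode = ""
            # Optional tags start at column 12 of a SAM record.
            for (i = 12; i <= NF; i++) {
                if ($i ~ /^CB:Z:/) { barcode = substr($i, 6); break }
            }
            # Assumption: reads without a CB tag are skipped.
            if (barcode == "") next
            out = dir "/" barcode ".sam"
            if (!(barcode in seen)) {
                # First read for this barcode: create the file and write the header.
                printf "%s", header > out
                close(out)
                seen[barcode] = 1
            }
            # Append the read, then close so that thousands of barcodes
            # do not exhaust the open-file limit.
            print >> out
            close(out)
        }
    ' "${input_sam}"
}
```

To try it on real data, the BAM would first be converted to SAM text with `samtools view -h possorted_genome_bam_MD.bam > possorted_genome_bam_MD.sam`, and each per-barcode SAM file could be converted back with `samtools view -b`. Closing the output file after every read keeps the sketch safe with many barcodes but makes it slow on large files, so it is meant for illustrating the logic rather than for production use.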
Raw Source Text
Alignment, cell barcode processing, umis processing, abundance measurements: cellranger count (version 3.0.2)
MD tags were added to alignments with samtools calmd --threads 15 -rb possorted_genome_bam.bam refdata-cellranger-hg19-3.0.0/fasta/genome.fa > possorted_genome_bam_MD.bam
Reads were split based on the CB:Z tag, resulting in one BAM file per barcode.
Assembly: hg19
Supplementary files format and content: Tab-separated values files and matrix files