GSE203091 Processing Pipeline


Publication

The long noncoding RNA Malat1 regulates CD8+ T cell differentiation by mediating epigenetic repression.

The Journal of experimental medicine (2022) — PMID 35593887

Dataset

GSE203091

The long noncoding RNA Malat1 regulates CD8+ T cell differentiation by mediating epigenetic repression (CUT&RUN)

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.

Processing Steps


Reads were mapped either to the mouse or yeast genome with bowtie2 -p 8 --local --very-sensitive-local --no-unal --no-mixed --no-discordant --phred33 -I 10 -X 700 (v1.2.2).

Bowtie2 v1.2.2 GitHub

$ Bash example

# Install Bowtie2 (if not already installed)
# conda install -c bioconda bowtie2

# Placeholder for genome index path and prefix
# Replace 'path/to/genome_index/index_prefix' with the actual path to your chosen genome index (e.g., mouse or yeast).
# For example, for mouse (mm10), it might be 'path/to/mm10_index/mm10'
# For yeast (sacCer3), it might be 'path/to/sacCer3_index/sacCer3'
GENOME_INDEX="path/to/genome_index/index_prefix"

# Placeholder for input FASTQ files (assuming paired-end based on -I and -X parameters)
# Replace 'reads_R1.fastq' and 'reads_R2.fastq' with your actual input read files.
READS_R1="reads_R1.fastq"
READS_R2="reads_R2.fastq"

# Placeholder for output SAM file
OUTPUT_SAM="mapped_reads.sam"

bowtie2 \
  -p 8 \
  --local \
  --very-sensitive-local \
  --no-unal \
  --no-mixed \
  --no-discordant \
  --phred33 \
  -I 10 \
  -X 700 \
  -x "${GENOME_INDEX}" \
  -1 "${READS_R1}" \
  -2 "${READS_R2}" \
  -S "${OUTPUT_SAM}"

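The spike-in step further down needs the number of reads mapped to the yeast genome. The following is a minimal sketch of one way to get that number from the SAM output, using only awk. The file name yeast_demo.sam and its records are made up for illustration, and counting aligned pairs (rather than individual mates) is an assumption; the source does not say which was used.

```shell
# Hypothetical demo input: a SAM header line plus two aligned pairs (made-up records).
SAM="yeast_demo.sam"
printf '@HD\tVN:1.0\tSO:unsorted\n' > "$SAM"
printf 'r1\t99\tchrI\t100\t42\t50M\t=\t200\t150\t*\t*\n' >> "$SAM"
printf 'r1\t147\tchrI\t200\t42\t50M\t=\t100\t-150\t*\t*\n' >> "$SAM"
printf 'r2\t83\tchrII\t500\t42\t50M\t=\t400\t-150\t*\t*\n' >> "$SAM"
printf 'r2\t163\tchrII\t400\t42\t50M\t=\t500\t150\t*\t*\n' >> "$SAM"

# Count aligned pairs: skip header lines, then keep records that are mapped
# (FLAG bit 4 unset) and are the first mate (FLAG bit 64 set).
# Integer arithmetic is used so the command also works in awk variants without bit functions.
YEAST_PAIRS=$(awk -F'\t' '!/^@/ && int($2 / 4) % 2 == 0 && int($2 / 64) % 2 == 1 { n++ } END { print n + 0 }' "$SAM")

# Store the count where the scale-factor example expects it.
echo "$YEAST_PAIRS" > yeast_mapped_reads.txt
echo "Mapped yeast read pairs: $YEAST_PAIRS"
```

If samtools is available, `samtools view -c -f 64 -F 4` on the same file should give the same count.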

Mapped reads were then converted into bed files.

bedtools (Inferred with models/gemini-2.5-flash) v2.31.0 GitHub

$ Bash example

# Install bedtools (if not already installed)
# conda install -c bioconda bedtools

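# Convert mapped reads (BAM format) to BED format.
# bowtie2 -S writes SAM, so convert it to BAM first, e.g. with samtools
# (an assumption; the source does not name the conversion tool):
# samtools view -b mapped_reads.sam > input.bam
# Replace 'input.bam' with the path to your mapped reads BAM file.
# Replace 'output.bed' with the desired name for your output BED file.
bedtools bamtobed -i input.bam > output.bed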

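bedtools genomecov expects BED input that is grouped by chromosome, so the converted BED file usually needs sorting before the normalization step. A minimal sketch, assuming a BED6 file like the one bamtobed writes; the file names and records below are made up for illustration.

```shell
# Hypothetical unsorted BED6 records (chrom, start, end, read name, MAPQ, strand).
BED="demo_unsorted.bed"
printf 'chr2\t300\t450\tr3/1\t42\t+\n' > "$BED"
printf 'chr1\t500\t650\tr2/1\t42\t-\n' >> "$BED"
printf 'chr1\t100\t250\tr1/1\t42\t+\n' >> "$BED"

# Sort by chromosome name, then numerically by start coordinate.
# LC_ALL=C makes the chromosome ordering independent of the system locale.
LC_ALL=C sort -k1,1 -k2,2n "$BED" > demo_sorted.bed
cat demo_sorted.bed
```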

A scale factor for spike-in normalization was calculated for each sample by dividing 1000 by the sequencing depth of mapped yeast reads.

awk (Inferred with models/gemini-2.5-flash) vN/A GitHub

$ Bash example

# Example: Assume yeast_mapped_reads.txt contains the total count of mapped yeast reads.
# This file would typically be generated by a previous step, such as mapping reads to a yeast spike-in genome and counting them.
# Example content of yeast_mapped_reads.txt:
# 12345678

# Read the sequencing depth of mapped yeast reads from a file or variable
# For demonstration, we assume it's in 'yeast_mapped_reads.txt'
YEAST_READ_DEPTH=$(cat yeast_mapped_reads.txt)

# Calculate the scale factor as 1000 / (mapped yeast reads).
# Assumption: the submitter's wording ("dividing 1000 from the sequencing depth") is read as
# constant / depth, the usual spike-in convention, so samples with more yeast reads are scaled down.
# awk handles the floating-point division; printf avoids scientific notation in the result.
SCALE_FACTOR=$(awk -v depth="$YEAST_READ_DEPTH" 'BEGIN { if (depth > 0) printf "%.10f\n", 1000 / depth; else exit 1 }')

# Output the calculated scale factor (e.g., to a file or stdout for subsequent steps)
echo "$SCALE_FACTOR" > spike_in_scale_factor.txt

echo "Calculated Spike-In Normalization Scale Factor: $SCALE_FACTOR"

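The scale factor is computed per sample, so a small table is often more convenient than a single file. The sketch below works through the arithmetic for two samples, assuming the factor is the constant 1000 divided by the number of mapped yeast reads (the usual spike-in convention; the submitter's wording is ambiguous on the direction of the division). Sample names and read counts are made up.

```shell
# Hypothetical per-sample counts of mapped yeast reads (made-up numbers).
printf 'sampleA\t20000\n' > yeast_depths.tsv
printf 'sampleB\t50000\n' >> yeast_depths.tsv

# Scale factor = constant / depth; the constant 1000 comes from the source text.
# Samples with zero mapped yeast reads are skipped to avoid division by zero.
awk -F'\t' -v c=1000 '$2 > 0 { printf "%s\t%.4f\n", $1, c / $2 }' yeast_depths.tsv > scale_factors.tsv
cat scale_factors.tsv
```

Under this reading, the sample with more yeast reads (sampleB) gets the smaller factor, 0.0200 versus 0.0500 for sampleA.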

Bed files were normalized using bedtools genomecov (v2.29.2) with the respective scaling factor calculated as described above.

bedtools v2.29.2 GitHub

$ Bash example

# Install bedtools (if not already installed)
# conda install -c bioconda bedtools=2.29.2

# Define input and output file paths and the scaling factor
INPUT_BED="input.bed" # Must be grouped by chromosome (e.g., sort -k1,1 -k2,2n)
GENOME_SIZES="mm10.chrom.sizes" # Placeholder: chrom.sizes file for the assembly used (mm10 per the submitter)
SCALING_FACTOR="1.0" # Placeholder: Replace with the calculated scaling factor
OUTPUT_BEDGRAPH="normalized_coverage.bedgraph"

# Normalize bed file coverage using bedtools genomecov with the specified scaling factor
# The -bg flag outputs coverage in bedGraph format; -scale multiplies each coverage value by the factor.
bedtools genomecov -i "${INPUT_BED}" -g "${GENOME_SIZES}" -bg -scale "${SCALING_FACTOR}" > "${OUTPUT_BEDGRAPH}"

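To make the effect of the scaling factor concrete: the -scale option multiplies every coverage value by the given factor. The sketch below mimics that on a tiny unscaled bedGraph with awk. It is an illustration of the arithmetic only, not a replacement for bedtools; the file names, coverage values, and the factor 0.05 are made up.

```shell
# Hypothetical unscaled bedGraph (chrom, start, end, coverage).
printf 'chr1\t100\t250\t3\n' > raw_demo.bedgraph
printf 'chr1\t250\t400\t10\n' >> raw_demo.bedgraph

# Hypothetical spike-in scale factor for this sample.
SCALE=0.05

# Multiply the coverage column by the scale factor, keeping tab-separated output.
awk -F'\t' -v OFS='\t' -v s="$SCALE" '{ $4 = $4 * s; print }' raw_demo.bedgraph > scaled_demo.bedgraph
cat scaled_demo.bedgraph
```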

Peak calls were determined using macs2 callpeak --broad --broad-cutoff 0.1 -B --nomodel --keep-dup all -f BED (v2.1.2).

MACS2 v2.1.2 GitHub

$ Bash example

# Install MACS2 if not already installed
# conda install -c bioconda macs2

# Example usage of macs2 callpeak
# Replace 'treatment.bed', 'control.bed', and 'output_prefix' with actual values.
# -g is the effective genome size: 'mm' for mouse (mm10 here), 'hs' for human, or a specific number (e.g., 2.7e9 for human).
# The source does not say whether a control was used; drop '-c control.bed' if there is none.
macs2 callpeak -t treatment.bed -c control.bed -f BED -g mm -n output_prefix --broad --broad-cutoff 0.1 -B --nomodel --keep-dup all

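In broad mode, MACS2 writes its peaks to a file named after the -n prefix with the suffix _peaks.broadPeak, in a nine-column BED-like format. As a quick sanity check after peak calling, the sketch below counts the peaks and sums their lengths. The file demo_peaks.broadPeak and its two records are made up for illustration.

```shell
# Hypothetical broadPeak records: chrom, start, end, name, score, strand,
# signal value, -log10(p-value), -log10(q-value).
PEAKS="demo_peaks.broadPeak"
printf 'chr1\t1000\t3000\tdemo_peak_1\t45\t.\t3.2\t6.1\t4.5\n' > "$PEAKS"
printf 'chr2\t500\t1200\tdemo_peak_2\t12\t.\t2.1\t2.4\t1.2\n' >> "$PEAKS"

# Count peaks and sum their lengths (end minus start) in base pairs.
SUMMARY=$(awk -F'\t' '{ n++; bp += $3 - $2 } END { printf "peaks=%d total_bp=%d", n, bp }' "$PEAKS")
echo "$SUMMARY"
```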

6

none provided by the submitter

(Inferred with models/gemini-2.5-flash)

Tools Used

Bowtie2

Raw Source Text

Reads were mapped either to the mouse or yeast genome with bowtie2 -p 8 --local --very-sensitive-local --no-unal --no-mixed --no-discordant --phred33 -I 10 -X 700 (v1.2.2). Mapped reads were then converted into bed files. A scale factor for Spike-In normalization was calculated for each sample by dividing 1000 from the sequencing depth of mapped yeast reads. Bed files were normalized using bedtools genomcov (v2.29.2) with the respective scaling factor calculated as describe above. Peaks calls were determined using macs2 macs2 callpeak -broad --broad-cutoff 0.1 -B --nomodel --keep-dup all -f BED (v2.1.2).
none provided by the submitter
Assembly: mm10
Supplementary files format and content: Normalized bedgraph files
