GSE86479 Processing Pipeline

RNA-Seq · 2 steps

Publication

Pseudotemporal Ordering of Single Cells Reveals Metabolic Control of Postnatal β Cell Proliferation.

Cell Metabolism (2017), PMID 28467932

Dataset

Pseudotemporal ordering of single cells reveals metabolic control of postnatal beta cell proliferation

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.

Processing Steps

Reads were aligned to the mm9 genome with STAR using the parameters --outSAMstrandField intronMotif --outFilterMultimapNmax 1 --runThreadN 5. Only reads that aligned uniquely to one genomic location were retained for subsequent analysis.

STAR v2.5.x GitHub

$ Bash example

# Install STAR (if not already installed)
# conda install -c bioconda star

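# Define variables for reference genome and annotation
# NOTE: The source states only the genome build (mm9); the annotation below is an assumption.
GENOME_DIR="star_mm9_index"
GENOME_FASTA="mm9.fa"
GTF_FILE="Mus_musculus.NCBIM37.67.gtf" # Ensembl release 67 for mm9/NCBIM37
# Ensembl names chromosomes "1", "MT", ...; the UCSC FASTA uses "chr1", "chrM", ...
# Make the names match before indexing. STAR's --sjdbGTFchrPrefix chr adds the prefix,
# but "MT" versus "chrM" would still differ.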

# Download reference files (example using curl/wget)
# mkdir -p ref_data
# cd ref_data
# curl -O http://hgdownload.soe.ucsc.edu/goldenPath/mm9/bigZips/mm9.fa.gz
# gunzip mm9.fa.gz
# curl -O ftp://ftp.ensembl.org/pub/release-67/gtf/mus_musculus/Mus_musculus.NCBIM37.67.gtf.gz
# gunzip Mus_musculus.NCBIM37.67.gtf.gz
# cd ..

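# Create STAR genome index (run once per genome)
# --sjdbOverhang is ideally (read length - 1). The source does not state the read length,
# so 100 (STAR's default) is an assumption.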
# mkdir -p "${GENOME_DIR}"
# STAR --runMode genomeGenerate \
#      --genomeDir "${GENOME_DIR}" \
#      --genomeFastaFiles "ref_data/${GENOME_FASTA}" \
#      --sjdbGTFfile "ref_data/${GTF_FILE}" \
#      --sjdbOverhang 100 \
#      --runThreadN 5 # Use the same number of threads as alignment if possible

# Define input and output files
READ1="sample_R1.fastq.gz"
READ2="sample_R2.fastq.gz" # Assuming paired-end reads
OUTPUT_PREFIX="aligned_reads/"

# Create output directory
mkdir -p "${OUTPUT_PREFIX}"

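# Align reads with STAR
# Stated in the source: --outSAMstrandField intronMotif, --outFilterMultimapNmax 1, --runThreadN 5.
# All other options below (BAM sorting, SAM attributes, MAPQ, filtering and intron limits)
# are assumptions, not taken from the source. Adjust or remove them as needed.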
STAR --genomeDir "${GENOME_DIR}" \
     --readFilesIn "${READ1}" "${READ2}" \
     --readFilesCommand zcat \
     --outFileNamePrefix "${OUTPUT_PREFIX}" \
     --runThreadN 5 \
     --outSAMstrandField intronMotif \
     --outFilterMultimapNmax 1 \
     --outSAMtype BAM SortedByCoordinate \
     --outSAMattributes NH HI AS NM MD \
     --outSAMmapqUnique 60 \
     --outFilterType BySJout \
     --outFilterMismatchNmax 999 \
     --outFilterMismatchNoverLmax 0.04 \
     --alignIntronMin 20 \
     --alignIntronMax 1000000 \
     --alignMatesGapMax 1000000 \
     --limitBAMsortRAM 30000000000 # Adjust based on available RAM (e.g., 30GB)
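
Because the source keeps only uniquely aligned reads, it can help to check what fraction of reads passed that filter. The sketch below reads the "Uniquely mapped reads %" line from the Log.final.out summary that STAR writes under the output prefix. The function name star_unique_pct and the example path are illustrative assumptions, not part of the source.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the source): print the "Uniquely mapped reads %"
# value from a STAR Log.final.out file, without the trailing percent sign.
star_unique_pct() {
    local log_file="$1"
    awk -F'|' '/Uniquely mapped reads %/ {
        gsub(/[ \t%]/, "", $2)   # strip whitespace and the percent sign
        print $2
    }' "${log_file}"
}

# Example usage, assuming the OUTPUT_PREFIX from the alignment step:
# star_unique_pct "aligned_reads/Log.final.out"
```

With --outFilterMultimapNmax 1, reads that map to more than one location should be counted under "% of reads mapped to too many loci" in the same file, so the two figures can be read together.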

HOMER's makeTagDirectory and makeUCSCfile were used to generate bigWig files.

HOMER (version not stated; inferred with models/gemini-2.5-flash) GitHub

$ Bash example

# Install HOMER (if not already installed)
# conda install -c bioconda homer

# Define variables (replace with actual paths and names)
INPUT_BAM="aligned_reads/Aligned.sortedByCoord.out.bam" # STAR output from the previous step
TAG_DIRECTORY="sample_tag_directory"
OUTPUT_BIGWIG="sample_signal.bigWig"
CHROM_SIZES="mm9.chrom.sizes" # Assumed file name: chromosome sizes for mm9, the build stated in the source

# NOTE: The source names only the two HOMER programs; every option below is an assumption.

# Step 1: Create a tag directory from the alignment file
# HOMER reads BAM input by calling samtools, which must be on the PATH.
# -fragLength given: use the aligned read lengths rather than an estimated fragment length (common for RNA-seq).
makeTagDirectory "${TAG_DIRECTORY}" "${INPUT_BAM}" -fragLength given

# Step 2: Generate a bigWig file from the tag directory
# -o: output bigWig file name.
# -bigWig <chrom.sizes file>: write bigWig output; requires UCSC bedGraphToBigWig on the PATH.
# -fsize 1e20: raise the output size limit so that the resolution is not reduced.
# -res 10: 10 bp resolution.
# By default the signal is normalized to 10 million total tags (-norm 1e7).
makeUCSCfile "${TAG_DIRECTORY}" -o "${OUTPUT_BIGWIG}" -bigWig "${CHROM_SIZES}" -fsize 1e20 -res 10
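
HOMER's bigWig output depends on a chromosome sizes file for the genome, which the source does not mention. One possible way to produce it, sketched below, is from the FASTA index written by samtools faidx, whose first two columns are the sequence name and its length. The function name fai_to_chrom_sizes and the file names are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the source): derive a two-column
# chromosome sizes table (name<TAB>length) from a samtools FASTA index.
fai_to_chrom_sizes() {
    local fai_file="$1"
    cut -f1,2 "${fai_file}"
}

# Example usage, assuming ref_data/mm9.fa from the alignment step:
# samtools faidx ref_data/mm9.fa                     # writes ref_data/mm9.fa.fai
# fai_to_chrom_sizes ref_data/mm9.fa.fai > mm9.chrom.sizes
```

Deriving the sizes from the same FASTA that was used for the STAR index keeps chromosome names consistent between the BAM file and the bigWig track.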

Tools Used

STAR HOMER

Raw Source Text

Reads were aligned to the mm9 by STAR with parameters: --outSAMstrandField intronMotif --outFilterMultimapNmax 1 --runThreadN 5.Only the reads aligned uniquely to one genomic location were retained for subsequent analysis.
makeTagDirectory and makeUCSCfile in homer was used to generate bigWig file
Genome_build: mm9
Supplementary_files_format_and_content: bigWig file
