GSE86479 Processing Pipeline

RNA-Seq, 2 steps

Publication

Pseudotemporal Ordering of Single Cells Reveals Metabolic Control of Postnatal β Cell Proliferation.

Cell Metabolism (2017), PMID 28467932

Dataset

GSE86479

Pseudotemporal ordering of single cells reveals metabolic control of postnatal beta cell proliferation

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1.

    Reads were aligned to the mm9 genome with STAR using the parameters --outSAMstrandField intronMotif --outFilterMultimapNmax 1 --runThreadN 5. Only reads that aligned uniquely to one genomic location were retained for subsequent analysis.

    $ Bash example
    # Install STAR (if not already installed)
    # conda install -c bioconda star
    
    # Define variables for reference genome and annotation
    GENOME_DIR="star_mm9_index"
    GENOME_FASTA="mm9.fa"
    GTF_FILE="Mus_musculus.NCBIM37.67.gtf" # Ensembl release 67 for mm9/NCBIM37
    
    # Download reference files (example using curl/wget)
    # mkdir -p ref_data
    # cd ref_data
    # curl -O http://hgdownload.soe.ucsc.edu/goldenPath/mm9/bigZips/mm9.fa.gz
    # gunzip mm9.fa.gz
    # curl -O ftp://ftp.ensembl.org/pub/release-67/gtf/mus_musculus/Mus_musculus.NCBIM37.67.gtf.gz
    # gunzip Mus_musculus.NCBIM37.67.gtf.gz
    # cd ..
    
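    # Create STAR genome index (run once per genome)
    # The source does not state which annotation was used; the GTF here is an assumption.
    # The UCSC FASTA names chromosomes "chr1" while the Ensembl GTF uses "1";
    # --sjdbGTFchrPrefix chr reconciles most names (MT/chrM and scaffolds still differ).
    # --sjdbOverhang is ideally read length minus 1; 100 is a placeholder.
    # mkdir -p "${GENOME_DIR}"
    # STAR --runMode genomeGenerate \
    #      --genomeDir "${GENOME_DIR}" \
    #      --genomeFastaFiles "ref_data/${GENOME_FASTA}" \
    #      --sjdbGTFfile "ref_data/${GTF_FILE}" \
    #      --sjdbGTFchrPrefix chr \
    #      --sjdbOverhang 100 \
    #      --runThreadN 5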
    
    # Define input and output files
    READ1="sample_R1.fastq.gz"
    READ2="sample_R2.fastq.gz" # Assuming paired-end reads
    OUTPUT_PREFIX="aligned_reads/"
    
    # Create output directory
    mkdir -p "${OUTPUT_PREFIX}"
    
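    # Align reads with STAR
    # From the source: --outSAMstrandField intronMotif, --outFilterMultimapNmax 1, --runThreadN 5.
    # All other options below are assumptions added for illustration, not stated in the source.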
    STAR --genomeDir "${GENOME_DIR}" \
         --readFilesIn "${READ1}" "${READ2}" \
         --readFilesCommand zcat \
         --outFileNamePrefix "${OUTPUT_PREFIX}" \
         --runThreadN 5 \
         --outSAMstrandField intronMotif \
         --outFilterMultimapNmax 1 \
         --outSAMtype BAM SortedByCoordinate \
         --outSAMattributes NH HI AS NM MD \
         --outSAMmapqUnique 60 \
         --outFilterType BySJout \
         --outFilterMismatchNmax 999 \
         --outFilterMismatchNoverLmax 0.04 \
         --alignIntronMin 20 \
         --alignIntronMax 1000000 \
         --alignMatesGapMax 1000000 \
         --limitBAMsortRAM 30000000000 # Adjust based on available RAM (e.g., 30GB)
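
Because the source states that only uniquely aligned reads were retained, it can be worth confirming this on the STAR output. The sketch below is not part of the published pipeline: it defines a hypothetical helper, count_multimapped, that reads SAM-format text on standard input and counts alignment records whose NH tag (number of reported alignments for the read) is greater than 1. With --outFilterMultimapNmax 1 the expected count is zero. The BAM path in the usage comment assumes the output prefix used in the example above.

```shell
# Hypothetical helper (not from the source): count SAM records whose NH tag
# reports more than one alignment. Header lines (starting with "@") are skipped.
count_multimapped() {
    awk -F '\t' '
        /^@/ { next }                       # skip SAM header lines
        {
            for (i = 12; i <= NF; i++) {    # optional tags start at column 12
                if ($i ~ /^NH:i:/) {
                    split($i, tag, ":")
                    if (tag[3] + 0 > 1) n++
                    break
                }
            }
        }
        END { print n + 0 }
    '
}

# Example usage (assumes samtools is installed):
#   samtools view -h aligned_reads/Aligned.sortedByCoord.out.bam | count_multimapped
```

A non-zero result would suggest that the BAM was produced with a different multimapper setting than the one the source describes.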
  2.

    makeTagDirectory and makeUCSCfile from HOMER were used to generate bigWig files.

    HOMER (inferred with models/gemini-2.5-flash)
    $ Bash example
    # Install HOMER (if not already installed)
    # conda install -c bioconda homer
    
    # Define variables (replace with actual paths and names)
    INPUT_BAM="input_sample.bam"
    TAG_DIRECTORY="sample_tag_directory"
    OUTPUT_BIGWIG="sample_signal.bigWig"
    GENOME_ASSEMBLY="mm9" # Genome build stated in the source
    
    # Step 1: Create a tag directory from the alignment file
    # HOMER detects the input format itself; reading BAM requires samtools on the PATH.
    # Only the tool name comes from the source; every option below is an assumption.
    # -genome: optional; names a genome installed in HOMER, used for sequence-bias checks such as -checkGC.
    # -tbp 1: keeps at most one tag per base-pair position, discarding reads at duplicate positions (HOMER keeps all tags by default).
    # -fragLength given: uses each read's aligned length as the fragment length instead of estimating it.
    makeTagDirectory "${TAG_DIRECTORY}" "${INPUT_BAM}" -genome "${GENOME_ASSEMBLY}" -tbp 1 -fragLength given
    
    # Step 2: Generate a bigWig file from the tag directory
    # Only the tool name comes from the source; every option below is an assumption.
    # -bigWig takes a chromosome sizes file and needs UCSC's bedGraphToBigWig on the PATH.
    CHROM_SIZES="mm9.chrom.sizes" # Placeholder path: use the sizes file matching your genome build
    # -o: output bigWig file name.
    # -fsize 1e20: lifts HOMER's output-size cap so that no signal is dropped.
    # -norm 1e7: scales the signal to 10 million total tags (HOMER's default).
    # -res 10: reports signal at 10 bp resolution.
    # The UCSC track line is written to stdout, so it is redirected to a file.
    makeUCSCfile "${TAG_DIRECTORY}" -o "${OUTPUT_BIGWIG}" -bigWig "${CHROM_SIZES}" -fsize 1e20 -norm 1e7 -res 10 > trackInfo.txt
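
HOMER's makeUCSCfile -bigWig option expects a chromosome sizes file as its argument. One way to obtain such a file, sketched here as an assumption rather than the authors' method, is to derive it from a FASTA index (.fai, as written by samtools faidx), whose first two columns are the sequence name and its length. The function name fai_to_chrom_sizes and the file names in the usage comment are hypothetical.

```shell
# Hypothetical helper (not from the source): write a two-column chrom.sizes
# file (name<TAB>length) from a FASTA index (.fai).
# Arguments: $1 = path to the .fai file, $2 = output path.
fai_to_chrom_sizes() {
    if [ ! -s "$1" ]; then
        echo "FASTA index not found or empty: $1" >&2
        return 1
    fi
    # Columns 1 and 2 of a .fai file are sequence name and sequence length.
    cut -f1,2 "$1" > "$2"
}

# Example usage (assumes samtools is installed; paths are placeholders):
#   samtools faidx ref_data/mm9.fa
#   fai_to_chrom_sizes ref_data/mm9.fa.fai mm9.chrom.sizes
```

Deriving the sizes from the same FASTA that was used for alignment keeps chromosome names consistent between the BAM file and the bigWig.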

Raw Source Text
Reads were aligned to the mm9 by STAR with parameters: --outSAMstrandField intronMotif --outFilterMultimapNmax 1 --runThreadN 5.Only the reads aligned uniquely to one genomic location were retained for subsequent analysis.
makeTagDirectory and makeUCSCfile in homer was used to generate bigWig file
Genome_build: mm9
Supplementary_files_format_and_content: bigWig file