GSE69889 Processing Pipeline

RNA-Seq · 5 steps

Publication

A role for alternative splicing in circadian control of exocytosis and glucose homeostasis.

Genes & development (2020) — PMID 32616519

Dataset

GSE69889

Genome-wide Circadian Control of Transcription at Active Enhancers Regulates Insulin Secretion and Diabetes Risk

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1.

    The data was analyzed by RNA Express v1.0 in BaseSpace

    RNA Express v1.0
    $ Bash example
    # The analysis was performed using "RNA Express v1.0" within the Illumina BaseSpace platform.
    # This is an application typically run via a graphical user interface or API, not a direct command-line tool.
    # Specific command-line parameters and an executable bash command cannot be generated from the provided description.
    
    # Placeholder for the reference used in this dataset (Genome_build: mm10):
    # REF_GENOME="mm10"
    # REF_GTF="/path/to/mm10.gtf"
  2.

    Reads were aligned to mm10 using STAR with the following parameters: readFilesCommand zcat outFilterType BySJout outSJfilterCountUniqueMin -1 2 2 2 outSJfilterCountTotalMin -1 2 2 2 outFilterIntronMotifs RemoveNoncanonical genomeLoad LoadAndKeep

    STAR v2.7.10a (Inferred with models/gemini-2.5-flash) GitHub
    $ Bash example
    # Install STAR (example using conda)
    # conda install -c bioconda star
    
    # Create a genome directory (if not already present)
    # mkdir -p /path/to/STAR_genome_index/mm10
    
    # Build STAR genome index (if not already present, using mm10 as reference)
    # This step is usually done once per genome assembly
    # STAR --runMode genomeGenerate \
    #      --genomeDir /path/to/STAR_genome_index/mm10 \
    #      --genomeFastaFiles /path/to/mm10.fa \
    #      --sjdbGTFfile /path/to/mm10.gtf \
    #      --runThreadN <number_of_threads>
    
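    # Align reads using STAR.
    # Flags taken from the source description: --readFilesCommand, --outFilterType,
    # --outSJfilterCountUniqueMin, --outSJfilterCountTotalMin, --outFilterIntronMotifs,
    # and --genomeLoad. All other flags and paths are assumptions.
    # --limitBAMsortRAM must be set explicitly when coordinate-sorted BAM output is
    # combined with --genomeLoad LoadAndKeep; the 10 GB value here is an assumption.
    THREADS=8  # assumption: adjust to the available cores
    STAR --genomeDir /path/to/STAR_genome_index/mm10 \
         --readFilesIn input_reads.fastq.gz \
         --readFilesCommand zcat \
         --outFileNamePrefix output_aligned_prefix_ \
         --outFilterType BySJout \
         --outSJfilterCountUniqueMin -1 2 2 2 \
         --outSJfilterCountTotalMin -1 2 2 2 \
         --outFilterIntronMotifs RemoveNoncanonical \
         --genomeLoad LoadAndKeep \
         --limitBAMsortRAM 10000000000 \
         --outSAMtype BAM SortedByCoordinate \
         --outSAMunmapped Within \
         --outSAMattributes Standard \
         --runThreadN "$THREADS"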
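
When an alignment like the one above finishes, STAR writes a run summary to `<prefix>Log.final.out`. The following is a minimal sketch for reading the uniquely mapped percentage from that file as a quick sanity check. The helper name `star_unique_pct` is hypothetical and not part of STAR, and the file name in the usage comment assumes the `output_aligned_prefix_` prefix from the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of STAR): print the "Uniquely mapped reads %"
# value from a STAR Log.final.out file, without the trailing percent sign.
star_unique_pct() {
    local log_file="$1"
    awk -F'|' '/Uniquely mapped reads %/ {
        gsub(/[ \t%]/, "", $2)   # strip whitespace and the percent sign
        print $2
    }' "$log_file"
}

# Example usage (assumes the output prefix used in the alignment example):
# star_unique_pct output_aligned_prefix_Log.final.out
```

Because the source parameters include `--genomeLoad LoadAndKeep`, the index stays in shared memory after the run. Running `STAR --genomeDir /path/to/STAR_genome_index/mm10 --genomeLoad Remove` once all samples are aligned should release it.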
  3.

    Gene expression levels were computed by a proprietary algorithm in BaseSpace

    Proprietary algorithm in BaseSpace (Inferred with models/gemini-2.5-flash), version not available
    $ Bash example
    # Gene expression levels were computed using a proprietary algorithm within Illumina BaseSpace.
    # Specific tool and parameters are not disclosed.
    # Hypothetical placeholder for a generic quantification step against the mm10 reference
    # (proprietary_expression_tool is not a real command):
    # proprietary_expression_tool --input_reads reads.fastq.gz --reference_genome mm10.fa --annotation_gtf mm10.gtf --output_expression_matrix expression_levels.tsv
  4.

    Differential expression was performed with DESeq2 using default parameters

    DESeq2 v1.44.0 (Inferred with models/gemini-2.5-flash) GitHub
    $ Bash example
    # Install R and Bioconductor if not already present
    # sudo apt-get update && sudo apt-get install -y r-base
    # R -e 'if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")'
    # R -e 'BiocManager::install("DESeq2")'
    
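    # Create dummy count data (replace with your actual counts.tsv).
    # printf '%b\n' expands the \t escapes and writes one row per argument.
    # Note: a four-gene toy matrix may be too small for DESeq2's dispersion fit.
    printf '%b\n' \
        'gene_id\tsampleA\tsampleB\tsampleC\tsampleD' \
        'gene1\t100\t120\t50\t60' \
        'gene2\t50\t60\t100\t110' \
        'gene3\t200\t210\t180\t190' \
        'gene4\t10\t12\t20\t22' > counts.tsv
    
    # Create dummy sample information (replace with your actual sample_info.tsv)
    printf '%b\n' \
        'sample\tcondition' \
        'sampleA\tcontrol' \
        'sampleB\tcontrol' \
        'sampleC\ttreated' \
        'sampleD\ttreated' > sample_info.tsv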
    
    # Create an R script for DESeq2 analysis
    cat << 'EOF' > run_deseq2.R
    library(DESeq2)
    
    # Load count data
    count_data <- read.table("counts.tsv", header = TRUE, row.names = 1, sep = "\t")
    
    # Load sample information
    sample_info <- read.table("sample_info.tsv", header = TRUE, row.names = 1, sep = "\t")
    
    # Ensure sample order matches count data columns
    sample_info <- sample_info[colnames(count_data), , drop = FALSE]
    
    # Create DESeqDataSet object
    dds <- DESeqDataSetFromMatrix(countData = count_data,
                                  colData = sample_info,
                                  design = ~ condition)
    
    # Run DESeq2 analysis with default parameters
    dds <- DESeq(dds)
    
    # Get results (e.g., comparing 'treated' vs 'control')
    res <- results(dds, contrast = c("condition", "treated", "control"))
    
    # Order results by adjusted p-value
    res <- res[order(res$padj), ]
    
    # Write results to a CSV file
    write.csv(as.data.frame(res), file = "deseq2_results.csv")
    
    message("DESeq2 analysis complete. Results saved to deseq2_results.csv")
    EOF
    
    # Execute the R script
    Rscript run_deseq2.R
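
The results file written by the script above can be screened from the shell. The following is a minimal sketch that lists gene IDs whose adjusted p-value falls below a cutoff. The helper name `deseq2_significant` and the 0.05 default are assumptions, not something the source specifies. It expects the CSV layout produced by `write.csv(as.data.frame(res))` and gene IDs that contain no commas.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print gene IDs with padj below a cutoff from a CSV
# written by write.csv(as.data.frame(res)). The 0.05 default is an assumption.
deseq2_significant() {
    local csv="$1" alpha="${2:-0.05}"
    awk -F',' -v alpha="$alpha" '
        NR == 1 {
            # Locate the padj column by name; write.csv quotes header fields.
            for (i = 1; i <= NF; i++) {
                name = $i
                gsub(/"/, "", name)
                if (name == "padj") col = i
            }
            next
        }
        # Skip NA values, which DESeq2 reports for filtered or outlier genes.
        col && $col != "NA" && ($col + 0) < (alpha + 0) {
            gene = $1
            gsub(/"/, "", gene)
            print gene
        }
    ' "$csv"
}

# Example usage:
# deseq2_significant deseq2_results.csv 0.05 > significant_genes.txt
```

Looking the column up by name rather than by position keeps the helper working if extra columns are added to the results table.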
  5.

    Gene cycling data was computed with eJTK_Cycle

    eJTK_Cycle (version not specified)
    $ Bash example
    # Obtain eJTK_Cycle if not already present. The repository URL below was inferred
    # and has not been verified; confirm it before cloning.
    # git clone https://github.com/alan-yeo/eJTK_Cycle.git
    # cd eJTK_Cycle
    
    # Run eJTK_Cycle to compute gene cycling data.
    # The script name and flags below are illustrative placeholders, not a confirmed
    # command-line interface; check the tool's README for the actual arguments.
    # Assumed input: 'gene_expression_data.tsv' (genes as rows, time points as columns).
    # Assumed output: 'cycling_results.tsv' with the cycling analysis results.
    # -p: intended here as the period to test (e.g., 24 for circadian rhythms).
    # -a: intended here as a significance level (e.g., 0.05).
    python eJTK_Cycle.py -i gene_expression_data.tsv -o cycling_results.tsv -p 24 -a 0.05
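
Rhythmicity tools generally expect a rectangular matrix, so it can help to check the input before running a step like the one above. The following is a minimal sketch, assuming a tab-delimited file with a header row, genes as rows, and time points as columns. The helper name `check_matrix_shape` is hypothetical and not part of eJTK_Cycle.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that every row of a tab-delimited matrix has the
# same number of fields as the header. Prints each offending line and returns
# a non-zero status if any row is ragged.
check_matrix_shape() {
    local tsv="$1"
    awk -F'\t' '
        NR == 1 { expected = NF; next }
        NF != expected {
            printf "line %d: %d fields, expected %d\n", NR, NF, expected
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$tsv"
}

# Example usage:
# check_matrix_shape gene_expression_data.tsv && echo "matrix looks rectangular"
```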

Raw Source Text
The data was analyzed by RNA Express v1.0 in BaseSpace
Reads were aligned to mm10 using STAR with the following parameters: readFilesCommand zcat outFilterType BySJout outSJfilterCountUniqueMin -1 2 2 2 outSJfilterCountTotalMin -1 2 2 2 outFilterIntronMotifs RemoveNoncanonical genomeLoad LoadAndKeep
Gene expression levels were computed by a proprietary alogorithm in BaseSpace
Differential expression was performed with DESeq2 using default parameters
Gene cycling data was computed with eJTK_Cycle
Genome_build: mm10
Supplementary_files_format_and_content: tab-delimited text files showing gene cycling data