GSE69889 Processing Pipeline
RNA-Seq
5 steps
Publication
A role for alternative splicing in circadian control of exocytosis and glucose homeostasis. Genes & Development (2020), PMID 32616519
Dataset
GSE69889: Genome-wide Circadian Control of Transcription at Active Enhancers Regulates Insulin Secretion and Diabetes Risk
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
The data was analyzed by RNA Express v1.0 in BaseSpace
RNA Express v1.0
$ Bash example
# The analysis was performed using "RNA Express v1.0" within the Illumina BaseSpace platform.
# This is an application typically run via a graphical user interface or API, not a direct command-line tool.
# Specific command-line parameters and an executable bash command cannot be generated from the provided description.
# Placeholders for the reference, matching the mm10 genome build reported for this dataset:
# REF_GENOME="mm10"
# REF_GTF="/path/to/mm10.gtf"
2
Reads were aligned to mm10 using STAR with the following parameters: --readFilesCommand zcat, --outFilterType BySJout, --outSJfilterCountUniqueMin -1 2 2 2, --outSJfilterCountTotalMin -1 2 2 2, --outFilterIntronMotifs RemoveNoncanonical, --genomeLoad LoadAndKeep
$ Bash example
# Install STAR (example using conda)
# conda install -c bioconda star

# Create a genome directory (if not already present)
# mkdir -p /path/to/STAR_genome_index/mm10

# Build STAR genome index (if not already present, using mm10 as reference)
# This step is usually done once per genome assembly
# STAR --runMode genomeGenerate \
#   --genomeDir /path/to/STAR_genome_index/mm10 \
#   --genomeFastaFiles /path/to/mm10.fa \
#   --sjdbGTFfile /path/to/mm10.gtf \
#   --runThreadN "$THREADS"

# Number of threads (placeholder; set to match your machine)
THREADS=8

# Align reads using STAR.
# Options from --outSAMtype onward are not listed in the step description and are illustrative.
# --limitBAMsortRAM must be set explicitly when --genomeLoad LoadAndKeep is combined with
# coordinate-sorted BAM output; the 10 GB value here is a placeholder.
STAR --genomeDir /path/to/STAR_genome_index/mm10 \
  --readFilesIn input_reads.fastq.gz \
  --readFilesCommand zcat \
  --outFileNamePrefix output_aligned_prefix_ \
  --outFilterType BySJout \
  --outSJfilterCountUniqueMin -1 2 2 2 \
  --outSJfilterCountTotalMin -1 2 2 2 \
  --outFilterIntronMotifs RemoveNoncanonical \
  --genomeLoad LoadAndKeep \
  --outSAMtype BAM SortedByCoordinate \
  --outSAMunmapped Within \
  --outSAMattributes Standard \
  --limitBAMsortRAM 10000000000 \
  --runThreadN "$THREADS"
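The two `outSJfilterCount*Min` settings each take four values, one per intron-motif class (non-canonical, GT/AG, GC/AG, AT/AC). A value of -1 suppresses unannotated junctions of that class. Other unannotated junctions are reported if they meet either the unique-read or the total-read minimum. The sketch below re-applies that rule to an existing `SJ.out.tab` file to make the thresholds concrete. The function name `filter_junctions` is hypothetical, and the sketch ignores STAR's other junction filters (overhang, distance to neighbouring junctions, intron length), so it is an illustration rather than a reimplementation of STAR.

```bash
#!/usr/bin/env bash
# Sketch only: re-apply per-motif read-count thresholds to a STAR SJ.out.tab file.
# SJ.out.tab columns: 1 chrom, 2 start, 3 end, 4 strand, 5 motif code,
#   6 annotated (0/1), 7 unique reads, 8 multi-mapping reads, 9 max overhang.
# Motif codes: 0 non-canonical; 1,2 GT/AG; 3,4 GC/AG; 5,6 AT/AC.

filter_junctions() {
  # $1 = SJ.out.tab path
  # $2 = four unique-read minimums, e.g. "-1 2 2 2"
  # $3 = four total-read minimums,  e.g. "-1 2 2 2"
  awk -F'\t' -v umins="$2" -v tmins="$3" '
    BEGIN { split(umins, u, " "); split(tmins, t, " ") }
    {
      # Annotated junctions are not subject to the count filters
      if ($6 + 0 == 1) { print; next }

      # Map motif code (0..6) to threshold index (1..4)
      cls = ($5 + 0 == 0) ? 1 : int(($5 + 1) / 2) + 1

      umin = u[cls] + 0
      tmin = t[cls] + 0
      n_uniq  = $7 + 0
      n_total = $7 + $8

      # A negative minimum disables output for that motif class
      pass_u = (umin >= 0 && n_uniq  >= umin)
      pass_t = (tmin >= 0 && n_total >= tmin)
      if (pass_u || pass_t) print
    }
  ' "$1"
}

# Example call (file name is a placeholder):
# filter_junctions output_aligned_prefix_SJ.out.tab "-1 2 2 2" "-1 2 2 2"
```

With the values used in this step, an unannotated canonical junction needs at least two supporting reads, and unannotated non-canonical junctions are never reported.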
3
Gene expression levels were computed by a proprietary algorithm in BaseSpace
$ Bash example
# Gene expression levels were computed using a proprietary algorithm within Illumina BaseSpace.
# Specific tool and parameters are not disclosed.
# Placeholder for a generic gene expression quantification step, using the mm10 mouse reference
# reported for this dataset (the tool name and flags are hypothetical):
# proprietary_expression_tool --input_reads reads.fastq.gz --reference_genome mm10.fa --annotation_gtf mm10.gtf --output_expression_matrix expression_levels.tsv
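Because the BaseSpace quantification is not disclosed, it cannot be reproduced directly. One open substitute, which is not the authors' method, is to rerun STAR with `--quantMode GeneCounts` and merge the per-sample `ReadsPerGene.out.tab` files into a single count matrix that a differential expression tool can read. The sketch below assumes that substitution. The function name `merge_gene_counts` is hypothetical, and the correct column depends on library strandedness, which the source does not state.

```bash
#!/usr/bin/env bash
# Sketch only: merge STAR ReadsPerGene.out.tab files into one tab-delimited count matrix.
# ReadsPerGene.out.tab columns: 1 gene ID, 2 unstranded counts,
#   3 counts for reads on the same strand as the gene, 4 counts for the opposite strand.
# The first four rows (N_unmapped, N_multimapping, N_noFeature, N_ambiguous) are summaries.

merge_gene_counts() {
  # $1 = count column to use (2, 3, or 4)
  # remaining arguments = one ReadsPerGene.out.tab file per sample
  col="$1"
  shift
  awk -F'\t' -v OFS='\t' -v col="$col" '
    FNR == 1 {
      # New file: derive a sample name from the file name
      nfiles++
      name = FILENAME
      sub(/.*\//, "", name)
      sub(/_?ReadsPerGene\.out\.tab$/, "", name)
      header = header OFS name
    }
    /^N_/ { next }                      # skip the summary rows
    {
      if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }
      counts[$1, nfiles] = $col
    }
    END {
      print "gene_id" header
      for (i = 1; i <= n; i++) {
        row = order[i]
        for (j = 1; j <= nfiles; j++)
          row = row OFS (((order[i], j) in counts) ? counts[order[i], j] : 0)
        print row
      }
    }
  ' "$@"
}

# Example call (file names are placeholders):
# merge_gene_counts 2 sampleA_ReadsPerGene.out.tab sampleB_ReadsPerGene.out.tab > counts.tsv
```

The output keeps genes in the order they first appear and fills in 0 for any gene missing from a sample, so every row has one value per input file.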
4
Differential expression was performed with DESeq2 using default parameters
$ Bash example
# Install R and Bioconductor if not already present
# sudo apt-get update && sudo apt-get install -y r-base
# R -e 'if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")'
# R -e 'BiocManager::install("DESeq2")'

# Create dummy count data (replace with your actual counts.tsv).
# printf expands \t and \n, so the file is genuinely tab-delimited.
# Note: with so few genes, DESeq2's dispersion fit may fail; the toy data only shows the file layout.
printf 'gene_id\tsampleA\tsampleB\tsampleC\tsampleD\ngene1\t100\t120\t50\t60\ngene2\t50\t60\t100\t110\ngene3\t200\t210\t180\t190\ngene4\t10\t12\t20\t22\n' > counts.tsv

# Create dummy sample information (replace with your actual sample_info.tsv)
printf 'sample\tcondition\nsampleA\tcontrol\nsampleB\tcontrol\nsampleC\ttreated\nsampleD\ttreated\n' > sample_info.tsv

# Create an R script for DESeq2 analysis
cat << 'EOF' > run_deseq2.R
library(DESeq2)

# Load count data (genes as rows, samples as columns)
count_data <- read.table("counts.tsv", header = TRUE, row.names = 1, sep = "\t")

# Load sample information
sample_info <- read.table("sample_info.tsv", header = TRUE, row.names = 1, sep = "\t")

# Ensure sample order matches count data columns
sample_info <- sample_info[colnames(count_data), , drop = FALSE]

# Make the condition a factor with 'control' as the reference level
sample_info$condition <- factor(sample_info$condition, levels = c("control", "treated"))

# Create DESeqDataSet object
dds <- DESeqDataSetFromMatrix(countData = count_data,
                              colData = sample_info,
                              design = ~ condition)

# Run DESeq2 analysis with default parameters
dds <- DESeq(dds)

# Get results (e.g., comparing 'treated' vs 'control')
res <- results(dds, contrast = c("condition", "treated", "control"))

# Order results by adjusted p-value
res <- res[order(res$padj), ]

# Write results to a CSV file
write.csv(as.data.frame(res), file = "deseq2_results.csv")
message("DESeq2 analysis complete. Results saved to deseq2_results.csv")
EOF

# Execute the R script
Rscript run_deseq2.R
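A quick way to check a DESeq2 run is to count how many genes pass an adjusted p-value cutoff in each direction. The sketch below does this for a CSV written with `write.csv(as.data.frame(res))`, whose header includes `log2FoldChange` and `padj`. The function name `count_significant` is hypothetical. The default cutoff of 0.1 mirrors the default `alpha` of DESeq2's `results()`; the source does not state which cutoff the authors applied.

```bash
#!/usr/bin/env bash
# Sketch only: count up- and down-regulated genes in a DESeq2 results CSV.

count_significant() {
  # $1 = CSV produced by write.csv(as.data.frame(res))
  # $2 = adjusted p-value cutoff (optional, defaults to 0.1)
  awk -F',' -v alpha="${2:-0.1}" '
    NR == 1 {
      # Locate the needed columns by name; write.csv quotes the header fields
      for (i = 1; i <= NF; i++) {
        gsub(/"/, "", $i)
        if ($i == "log2FoldChange") lfc = i
        if ($i == "padj") padj = i
      }
      next
    }
    $padj == "NA" || $padj == "" { next }   # padj is NA for genes removed by filtering
    $padj + 0 < alpha + 0 {
      if ($lfc + 0 > 0) up++
      else down++
    }
    END { printf "up\t%d\ndown\t%d\n", up, down }
  ' "$1"
}

# Example call (file name matches the output of the R script for this step):
# count_significant deseq2_results.csv 0.05
```

Genes with an `NA` adjusted p-value are skipped rather than counted as non-significant, which matches how DESeq2 marks genes removed by independent filtering or flagged as outliers.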
5
Gene cycling data was computed with eJTK_Cycle
eJTK_Cycle (version not specified)
$ Bash example
# NOTE: the repository URL, script name, and flags below are illustrative and have not been
# verified against the eJTK_Cycle distribution. Consult the tool's own README before running.

# Clone the eJTK_Cycle repository if not already present
# git clone https://github.com/alan-yeo/eJTK_Cycle.git
# cd eJTK_Cycle

# Run eJTK_Cycle to compute gene cycling data.
# This command assumes 'gene_expression_data.tsv' is the input file containing gene expression values
# (e.g., genes as rows, time points as columns).
# 'cycling_results.tsv' will be the output file with cycling analysis results.
# -p: assumed to set the period to test for (e.g., 24 for circadian rhythms).
# -a: assumed to set the alpha (significance) level for p-value correction (e.g., 0.05).
python eJTK_Cycle.py -i gene_expression_data.tsv -o cycling_results.tsv -p 24 -a 0.05
Raw Source Text
The data was analyzed by RNA Express v1.0 in BaseSpace
Reads were aligned to mm10 using STAR with the following parameters: readFilesCommand zcat outFilterType BySJout outSJfilterCountUniqueMin -1 2 2 2 outSJfilterCountTotalMin -1 2 2 2 outFilterIntronMotifs RemoveNoncanonical genomeLoad LoadAndKeep
Gene expression levels were computed by a proprietary alogorithm in BaseSpace
Differential expression was performed with DESeq2 using default parameters
Gene cycling data was computed with eJTK_Cycle
Genome_build: mm10
Supplementary_files_format_and_content: tab-delimited text files showing gene cycling data