GSE104502 Processing Pipeline
2 steps
Publication
Short poly(A) tails are a conserved feature of highly expressed genes. Nature Structural & Molecular Biology (2017), PMID 29106412
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
Read counts were quantified using kallisto.
$ Bash example
kallisto -h
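The help call above does not show a quantification run. As a minimal sketch of what the kallisto step might look like, the following builds a transcriptome index and quantifies one paired-end sample. All file names, paths, and the thread count are hypothetical placeholders, and the source does not state which transcript FASTA or which options the authors used. The `run` helper is an illustrative addition: it prints each command and executes it only when `DRY_RUN=0`.

```shell
#!/usr/bin/env bash
# Sketch only: every path and file name below is a hypothetical placeholder,
# not a value taken from the GSE104502 record.
set -euo pipefail

TRANSCRIPTS_FA="${TRANSCRIPTS_FA:-/path/to/WS247_transcripts.fa}"  # assumed transcript FASTA
INDEX="${INDEX:-WS247_transcripts.idx}"                            # kallisto index to create
READS_R1="${READS_R1:-input_reads_R1.fastq.gz}"                    # placeholder forward reads
READS_R2="${READS_R2:-input_reads_R2.fastq.gz}"                    # placeholder reverse reads
OUT_DIR="${OUT_DIR:-kallisto_out/sample1}"                         # per-sample output directory
THREADS="${THREADS:-4}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = execute them

# Print a command; execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN}" = "0" ]; then
        "$@"
    fi
}

# Build the index once per transcriptome.
run kallisto index -i "${INDEX}" "${TRANSCRIPTS_FA}"

# Quantify one sample; results (including abundance.tsv) are written to OUT_DIR.
run kallisto quant -i "${INDEX}" -o "${OUT_DIR}" -t "${THREADS}" \
    "${READS_R1}" "${READS_R2}"
```

For single-end libraries, `kallisto quant` additionally requires `--single` together with `-l` and `-s` (estimated fragment length mean and standard deviation), and takes a single reads file. Those values depend on the library and are not given in the source.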
2
The reads were then aligned to the C. elegans genome (WS247).
$ Bash example
# Install STAR (if not already installed)
# conda install -c bioconda star

# Define paths and reference genome for C. elegans WS247
# The C. elegans WS247 genome FASTA and GTF/GFF3 files can be obtained from WormBase
# (e.g., ftp://ftp.wormbase.org/pub/wormbase/releases/WS247/)
GENOME_DIR="/path/to/C_elegans_WS247_STAR_index"  # Directory containing STAR genome index
READS_R1="input_reads_R1.fastq.gz"                # Placeholder for input forward reads
READS_R2="input_reads_R2.fastq.gz"                # Placeholder for input reverse reads (if paired-end, remove if single-end)
OUTPUT_PREFIX="aligned_to_WS247"

# --- Genome Index Generation (Run this once if the index does not exist) ---
# Assuming you have the genome FASTA and GTF files for WS247, e.g.:
# GENOME_FASTA="/path/to/c_elegans.PRJNA13758.WS247.genomic.fa"
# GTF_FILE="/path/to/c_elegans.PRJNA13758.WS247.annotations.gtf"  # Convert GFF3 to GTF if only GFF3 is available
# STAR --runMode genomeGenerate \
#      --genomeDir "${GENOME_DIR}" \
#      --genomeFastaFiles "${GENOME_FASTA}" \
#      --sjdbGTFfile "${GTF_FILE}" \
#      --runThreadN 8  # Adjust number of threads as needed
# ---------------------------------------------------------------------------

# Align reads to C. elegans genome WS247
STAR --genomeDir "${GENOME_DIR}" \
     --readFilesIn "${READS_R1}" "${READS_R2}" \
     --readFilesCommand zcat \
     --outFileNamePrefix "${OUTPUT_PREFIX}." \
     --outSAMtype BAM SortedByCoordinate \
     --outSAMunmapped Within \
     --outSAMattributes Standard \
     --runThreadN 8  # Adjust number of threads as needed

# Note: For single-end reads, remove "${READS_R2}" from --readFilesIn.
Raw Source Text
Read counts were quantified using kallisto. These were then aligned to C. elegans genome WS247.
Genome_build: WS247
Supplementary_files_format_and_content: Csv; Contains tpm values for each replicate
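The record describes the supplementary files as CSV tables of TPM values per replicate. As a rough sketch of how such a table could be assembled from kallisto output, the hypothetical helper `merge_tpm` below reads several `abundance.tsv` files (whose columns are `target_id`, `length`, `eff_length`, `est_counts`, `tpm`) and prints one TPM column per replicate. The function name, the replicate directory names, and the output layout are assumptions for illustration; the authors' actual script is not given in the source.

```shell
#!/usr/bin/env bash
# Sketch only: merge_tpm is a hypothetical helper, not the authors' script.
set -euo pipefail

# Usage: merge_tpm rep1/abundance.tsv rep2/abundance.tsv > tpm_per_replicate.csv
# Prints a CSV with target_id followed by one TPM column per input file.
# Each column is labelled with the name of the directory holding that file.
# Assumes all files come from runs against the same kallisto index.
merge_tpm() {
    awk -F'\t' -v OFS=',' '
        FNR == 1 {                       # header line of each file starts a new column
            ncol++
            name = FILENAME
            sub("/[^/]*$", "", name)     # drop the trailing "/abundance.tsv"
            sub("^.*/", "", name)        # keep only the last directory component
            label[ncol] = name
            next
        }
        {
            if (!($1 in seen)) {         # remember targets in first-seen order
                seen[$1] = 1
                order[++n] = $1
            }
            tpm[$1, ncol] = $5           # column 5 of abundance.tsv is tpm
        }
        END {
            header = "target_id"
            for (c = 1; c <= ncol; c++) header = header OFS label[c]
            print header
            for (i = 1; i <= n; i++) {
                row = order[i]
                for (c = 1; c <= ncol; c++) row = row OFS tpm[order[i], c]
                print row
            }
        }
    ' "$@"
}
```

Targets missing from a replicate are left as empty cells rather than filled with zeros, so gaps stay visible when the table is inspected.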