GSE98869 Processing Pipeline

RNA-Seq code_examples 13 steps

Publication

The C. elegans neural editome reveals an ADAR target mRNA required for proper chemotaxis.

eLife (2017) — PMID 28925356

Dataset

GSE98869

The C. elegans neural editome reveals an ADAR target mRNA required for proper chemotaxis

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1.

    Reads were trimmed of adapter sequences TCGTATGCCGTCTTCTGCTTG, ATCTCGTATGCCGTCTTCTGCTTG, CGACAGGTTCAGAGTTCTACAGTCCGACGATC, and GATCGGAAGAGCACACGTCTGAACTCCAGTCAC using cutadapt v1.9.1

    cutadapt v1.9.1 GitHub
    $ Bash example
    # Install cutadapt if not already installed
    # conda install -c bioconda cutadapt=1.9.1
    
    # Define adapter sequences
    ADAPTER1="TCGTATGCCGTCTTCTGCTTG"
    ADAPTER2="ATCTCGTATGCCGTCTTCTGCTTG"
    ADAPTER3="CGACAGGTTCAGAGTTCTACAGTCCGACGATC"
    ADAPTER4="GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
    
    # Placeholder for input and output files
    INPUT_FILE="input.fastq.gz"
    OUTPUT_FILE="output.trimmed.fastq.gz"
    
    # Execute cutadapt command
    cutadapt -a "${ADAPTER1}" \
             -a "${ADAPTER2}" \
             -a "${ADAPTER3}" \
             -a "${ADAPTER4}" \
             -o "${OUTPUT_FILE}" \
             "${INPUT_FILE}"
  2.

    Trimmed reads were aligned to RepBase v18.05 using STAR v2.4.01

    $ Bash example
    # Install STAR (if not already installed)
    # conda install -c bioconda star=2.4.01
    
    # Define variables
    TRIMMED_READS="trimmed_reads.fastq.gz" # Placeholder for actual trimmed reads file
    REPBASE_INDEX_DIR="/path/to/RepBase_v18.05_STAR_index" # Placeholder for STAR index directory
    OUTPUT_PREFIX="repbase_alignment"
    
    # Note: RepBase v18.05 FASTA file would be required to build the STAR index.
    # Example command to build STAR index (run once):
    # STAR --runMode genomeGenerate \
    #      --genomeDir ${REPBASE_INDEX_DIR} \
    #      --genomeFastaFiles /path/to/RepBase_v18.05.fasta \
    #      --runThreadN 8 # Adjust threads as needed
    
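    # Align trimmed reads to RepBase v18.05
    # --readFilesCommand zcat: required because the input FASTQ is gzip-compressed
    # --outReadsUnmapped Fastx: writes reads that did not align to <prefix>Unmapped.out.mate1 for the next step
    # The multimapper and mismatch limits below are illustrative; the source does not state them.
    STAR --runThreadN 8 \
         --genomeDir "${REPBASE_INDEX_DIR}" \
         --readFilesIn "${TRIMMED_READS}" \
         --readFilesCommand zcat \
         --outFileNamePrefix "${OUTPUT_PREFIX}_" \
         --outSAMtype BAM SortedByCoordinate \
         --outSAMunmapped Within \
         --outReadsUnmapped Fastx \
         --outFilterMultimapNmax 100 \
         --outFilterMismatchNmax 10 \
         --alignIntronMax 1 \
         --alignEndsType Local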
  3.

    Reads that didn't align to RepBase were aligned to ce11 using STAR v2.4.01

    $ Bash example
    # Install STAR (if not already installed)
    # conda install -c bioconda star=2.4.01
    
    # Define variables
    GENOME_DIR="/path/to/STAR_index/ce11" # Placeholder: replace with the actual path to the ce11 STAR index
    # Reads that did not align to RepBase. STAR writes these as uncompressed FASTQ
    # (<prefix>Unmapped.out.mate1) when the RepBase alignment is run with --outReadsUnmapped Fastx.
    # Single-end input is assumed here; pass a second file to --readFilesIn for paired-end data.
    UNALIGNED_READS="repbase_alignment_Unmapped.out.mate1" # Placeholder: replace with the actual unaligned reads file
    OUTPUT_PREFIX="ce11_aligned_"
    THREADS=8 # Number of threads to use
    
    # Align reads to ce11 using STAR
    STAR --genomeDir "${GENOME_DIR}" \
         --readFilesIn "${UNALIGNED_READS}" \
         --runThreadN "${THREADS}" \
         --outFileNamePrefix "${OUTPUT_PREFIX}" \
         --outSAMtype BAM SortedByCoordinate \
         --outBAMcompression 6
  4.

    Featurecounts v1.5.0-p1 and DESeq2 were used to count and normalize genes for differential expression, respectively.

    featureCounts v1.5.0 GitHub
    $ Bash example
    # Install featureCounts (part of Subread package)
    # conda install -c bioconda subread
    
    # Use a C. elegans WS254 gene annotation in GTF format (placeholder path; the source names WormBase WS254 annotations)
    ANNOTATION_GTF="/path/to/c_elegans.WS254.gtf"
    
    # --- featureCounts execution ---
    # Assuming input BAM files are sample1_rep1.bam, sample1_rep2.bam, etc.
    # Replace with actual BAM file paths and the correct GTF path
    # -F GTF: annotation file format; -t exon: feature type to count; -g gene_id: attribute used to group exons into genes.
    featureCounts -a "${ANNOTATION_GTF}" -o gene_counts.txt -F GTF -t exon -g gene_id \
      sample1_rep1.bam sample1_rep2.bam sample2_rep1.bam sample2_rep2.bam
    
    # Install DESeq2 R package
    # R -e 'if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager"); BiocManager::install("DESeq2")'
    
    # --- DESeq2 execution (R script) ---
    # Create an R script file, e.g., run_deseq2.R
    cat << 'EOF' > run_deseq2.R
    library(DESeq2)
    
    # Read featureCounts output
    # The first column is gene_id, subsequent columns are counts for each sample.
    # skip=1 to skip the comment line, header=TRUE to use the second line as column names,
    # row.names=1 to set the Geneid column as row names.
    counts_data <- read.table("gene_counts.txt", header = TRUE, row.names = 1, skip = 1)
    
    # Remove the 'Chr', 'Start', 'End', 'Strand', 'Length' columns if present from featureCounts output.
    # These are typically columns 1-5 after the gene_id column in the raw featureCounts output.
    # Adjust column indices if your featureCounts output format differs.
    counts_data <- counts_data[, 6:ncol(counts_data)]
    
    # Ensure column names are clean (remove .bam suffix if present for easier matching with colData)
    colnames(counts_data) <- gsub("\\.bam$", "", colnames(counts_data))
    
    # Create sample metadata (colData)
    # THIS IS A PLACEHOLDER. You MUST replace it with your actual experimental design.
    # Example: two conditions (WT and adr-2 knockout) with two replicates each, assuming the
    # sample1_* columns are WT and the sample2_* columns are the knockout.
    # (The deposited dataset lists three BAM files per condition.)
    # The order of the condition labels must match the order of columns in counts_data.
    sample_names <- colnames(counts_data)
    conditions <- factor(c("WT", "WT", "adr2", "adr2"), levels = c("WT", "adr2")) # Example conditions
    
    colData <- data.frame(condition = conditions, row.names = sample_names)
    
    # Create DESeqDataSet object
    dds <- DESeqDataSetFromMatrix(countData = counts_data, colData = colData, design = ~ condition)
    
    # Run DESeq2 differential expression analysis
    dds <- DESeq(dds)
    
    # Get results for the knockout vs WT contrast
    res <- results(dds, contrast = c("condition", "adr2", "WT"))
    
    # Order results by adjusted p-value
    res <- res[order(res$padj), ]
    
    # Save results to a CSV file
    write.csv(as.data.frame(res), file = "deseq2_differential_expression_results.csv")
    
    # Optional: Save normalized counts
    normalized_counts <- counts(dds, normalized = TRUE)
    write.csv(as.data.frame(normalized_counts), file = "deseq2_normalized_counts.csv")
    EOF
    
    # Execute the R script
    Rscript run_deseq2.R
  5.

    Gene regions were described using WS254 (Wormbase) annotations.

    Wormbase (Inferred with models/gemini-2.5-flash) vWS254
    $ Bash example
    # Download C. elegans WS254 annotations (GFF3 format)
    # Source: ftp://ftp.wormbase.org/pub/wormbase/releases/WS254/
    wget -O c_elegans.WS254.annotations.gff3.gz ftp://ftp.wormbase.org/pub/wormbase/releases/WS254/species/c_elegans/PRJNA13758/c_elegans.PRJNA13758.WS254.annotations.gff3.gz
    gunzip c_elegans.WS254.annotations.gff3.gz
  6.

    Reads were removed of potential duplicates and combined respective to their WT or Adr condition for editing calls using samtools v1.3.1

    samtools v1.3.1 GitHub
    $ Bash example
    # Install samtools if not already installed
    # conda install -c bioconda samtools=1.3.1
    
    # Define input BAM files (placeholders)
    # For WT condition
    WT_REP1_BAM="wt_replicate1.bam"
    WT_REP2_BAM="wt_replicate2.bam"
    
    # For Adr condition
    ADR_REP1_BAM="adr_replicate1.bam"
    ADR_REP2_BAM="adr_replicate2.bam"
    
    # --- Process WT samples ---
    # samtools markdup was only introduced in samtools 1.6, so with v1.3.1 duplicates are removed with rmdup.
    # rmdup expects coordinate-sorted input. If the BAMs are not sorted, sort them first:
    # samtools sort -o ${WT_REP1_BAM%.bam}_sorted.bam ${WT_REP1_BAM}
    # -s treats the reads as single-end (an assumption here; drop -s for paired-end libraries).
    # Deduplicate WT replicate 1
    samtools rmdup -s "${WT_REP1_BAM}" "${WT_REP1_BAM%.bam}_dedup.bam"
    
    # Deduplicate WT replicate 2
    samtools rmdup -s "${WT_REP2_BAM}" "${WT_REP2_BAM%.bam}_dedup.bam"
    
    # Combine deduplicated WT replicates
    samtools merge wt_combined.bam "${WT_REP1_BAM%.bam}_dedup.bam" "${WT_REP2_BAM%.bam}_dedup.bam"
    
    # Sort and index the combined WT BAM for editing calls
    samtools sort -o wt_combined_sorted.bam wt_combined.bam
    samtools index wt_combined_sorted.bam
    
    # --- Process Adr samples ---
    # Deduplicate Adr replicate 1
    samtools rmdup -s "${ADR_REP1_BAM}" "${ADR_REP1_BAM%.bam}_dedup.bam"
    
    # Deduplicate Adr replicate 2
    samtools rmdup -s "${ADR_REP2_BAM}" "${ADR_REP2_BAM%.bam}_dedup.bam"
    
    # Combine deduplicated Adr replicates
    samtools merge adr_combined.bam "${ADR_REP1_BAM%.bam}_dedup.bam" "${ADR_REP2_BAM%.bam}_dedup.bam"
    
    # Sort and index the combined Adr BAM for editing calls
    samtools sort -o adr_combined_sorted.bam adr_combined.bam
    samtools index adr_combined_sorted.bam
  7.

    Reads were filtered and removed using the following parameters: reads containing junction overhangs less than 10nt long, reads containing indels, reads containing mutations within 5nt of the 3' end, reads containing more than 1 non AG or TC antisense mutation

    eclip_filter_reads.py (Inferred with models/gemini-2.5-flash) vv1.0.0 GitHub
    $ Bash example
    # This script is hypothetical. The source does not name the tool used for this step, and neither
    # the script name (eclip_filter_reads.py) nor the flags below are confirmed to exist in any
    # published pipeline. Treat the command as pseudocode for the four filtering criteria.
    
    # Example installation (assuming the eCLIP pipeline repository is cloned)
    # git clone https://github.com/yeolab/eclip.git
    # cd eclip
    # # Ensure Python dependencies like pysam are installed
    # # conda install -c bioconda pysam
    
    # Define input and output files
    INPUT_BAM="aligned_reads.bam"
    OUTPUT_BAM="filtered_reads.bam"
    
    # Execute the filtering command with inferred parameters
    # --min-junction-overhang 10: Removes reads with junction overhangs less than 10nt long.
    # --remove-indels: Removes reads containing indels.
    # --min-dist-3prime 5: Removes reads containing mutations within 5nt of the 3' end.
    # --max-non-canonical-antisense-mutations 1: Removes reads containing more than 1 non-AG or TC antisense mutation.
    # Note: The last parameter is highly specific and might be implemented with a combination of flags or internal logic within the actual script.
    
    python eclip_filter_reads.py \
        --input-bam "${INPUT_BAM}" \
        --output-bam "${OUTPUT_BAM}" \
        --min-junction-overhang 10 \
        --remove-indels \
        --min-dist-3prime 5 \
        --max-non-canonical-antisense-mutations 1
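
The two CIGAR-based criteria above (indels and short junction overhangs) can be checked with standard tools alone. What follows is a minimal sketch under stated assumptions, not the filter used in the study: the function name `filter_sam_by_cigar` and the demo reads are hypothetical, and a junction overhang is assumed to mean the number of aligned bases in each block flanking an `N` operation. The mismatch-based criteria are left out because they need the MD tag or the reference sequence.

```shell
# Hypothetical helper: drop SAM records with indels or short junction overhangs.
# Usage: filter_sam_by_cigar [min_overhang] < in.sam > out.sam
filter_sam_by_cigar() {
  awk -v min_overhang="${1:-10}" '
    BEGIN { FS = OFS = "\t" }
    /^@/ { print; next }               # keep SAM header lines
    {
      cigar = $6
      if (cigar == "*") next           # no alignment, nothing to check
      if (cigar ~ /[ID]/) next         # insertion or deletion in the read
      if (cigar ~ /N/) {
        # Split the CIGAR at every junction and sum aligned bases per block
        nblocks = split(cigar, blocks, /[0-9]+N/)
        for (i = 1; i <= nblocks; i++) {
          aligned = 0
          rest = blocks[i]
          while (match(rest, /^[0-9]+[MSHP=X]/)) {
            op = substr(rest, RLENGTH, 1)
            if (op == "M" || op == "=" || op == "X")
              aligned += substr(rest, 1, RLENGTH - 1)
            rest = substr(rest, RLENGTH + 1)
          }
          if (aligned < min_overhang) next   # overhang too short
        }
      }
      print
    }'
}

# Demo with made-up reads: r2 and r5 carry indels, r3 has a 5 nt overhang,
# so only r1 (unspliced) and r4 (25 nt on both sides) are printed.
printf 'r1\t0\tI\t100\t255\t50M\t*\t0\t0\t*\t*\nr2\t0\tI\t100\t255\t20M1I29M\t*\t0\t0\t*\t*\nr3\t0\tI\t100\t255\t5M100N45M\t*\t0\t0\t*\t*\nr4\t0\tI\t100\t255\t25M100N25M\t*\t0\t0\t*\t*\nr5\t0\tI\t100\t255\t20M2D30M\t*\t0\t0\t*\t*\n' \
  | filter_sam_by_cigar 10 | cut -f1
```

With real data the function would sit between `samtools view -h` and `samtools view -b`, so the header is preserved and the output can be written back to BAM.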
  8.

    Reads were piled up using samtools 1.3.1 [-d 1000 -E -I -p -o -f -t DP,DV,DPR,INFO/DPR,DP4,SP]

    samtools v1.3.1 GitHub
    $ Bash example
    # Install samtools if not already installed
    # conda install -c bioconda samtools=1.3.1
    
    # Placeholder for the reference genome (ce11 for this dataset)
    # Replace 'path/to/your/reference.fa' with the actual path to your reference genome.
    REFERENCE_GENOME="path/to/your/reference.fa"
    
    # Placeholder for input BAM file
    # Replace 'path/to/your/input.bam' with the actual path to your aligned BAM file.
    INPUT_BAM="path/to/your/input.bam"
    
    # Placeholder for output BCF file
    # Replace 'path/to/your/output.bcf' with your desired output file name.
    OUTPUT_BCF="path/to/your/output.bcf"
    
    # Reads were piled up using samtools 1.3.1
    # -d 1000: maximum per-file depth
    # -E: recalculate BAQ (base alignment quality) on the fly
    # -I: skip indel calling
    # -p: apply the -m and -F thresholds per sample
    # -u: uncompressed BCF output (not among the listed flags; added so the -t tags are written for bcftools)
    # -o: output file
    # -f: reference FASTA file
    # -t: extra tags to output (DP,DV,DPR,INFO/DPR,DP4,SP)
    samtools mpileup -d 1000 -E -I -p -u -o "${OUTPUT_BCF}" -f "${REFERENCE_GENOME}" -t DP,DV,DPR,INFO/DPR,DP4,SP "${INPUT_BAM}"
  9.

    Variants were called using bcftools 1.2.1 [-O -c -A -i -v]

    bcftools v1.2.1 GitHub
    $ Bash example
    # Install bcftools (if not already installed)
    # conda install -c bioconda bcftools=1.2.1
    
    # Placeholder for input BCF file (e.g., the samtools mpileup output from the previous step)
    INPUT_FILE="path/to/your/input.bcf"
    
    # Placeholder for output VCF file
    OUTPUT_FILE="variants.vcf"
    
    # Call variants using bcftools
    # -Ov: uncompressed VCF output (the source lists -O without its argument; v is an assumption)
    # -c: consensus caller
    # -A: keep all possible alternate alleles at variant sites
    # -v: output variant sites only
    # -i: listed in the source. In bcftools call this is --insert-missed, a flag that takes no argument
    #     (not a filter expression) and that only has an effect together with a targets file (-T).
    #     It is omitted below because no targets file is given and newer releases may reject it.
    # bcftools call does not take a reference FASTA; its -f option selects FORMAT fields.
    bcftools call \
        -Ov \
        -c \
        -A \
        -v \
        -o "${OUTPUT_FILE}" \
        "${INPUT_FILE}"
  10.

    Variants were filtered for and removed: known SNPs (wormbase WS254), positions covered by less than 5 reads, variants which were not A-G or T-C antisense

    bcftools (Inferred with models/gemini-2.5-flash) v1.16 GitHub
    $ Bash example
    # Install bcftools if not already installed
    # conda install -c bioconda bcftools
    
    # Placeholder for input VCF and known SNPs VCF
    INPUT_VCF="input.vcf.gz"
    KNOWN_SNPS_VCF="wormbase_ws254_snps.vcf.gz" # This file should contain known SNPs from Wormbase WS254
    
    # Example of how to obtain known SNPs from WormBase WS254 (adjust URL if needed, assuming C. elegans)
    # wget -O "${KNOWN_SNPS_VCF}" ftp://ftp.wormbase.org/pub/wormbase/releases/WS254/species/c_elegans/PRJNA13758/variation/c_elegans.PRJNA13758.WS254.vcf.gz
    
    # bcftools isec requires bgzip-compressed, indexed inputs (run these once if the index files do not exist yet)
    # bcftools index "${INPUT_VCF}"
    # bcftools index "${KNOWN_SNPS_VCF}"
    
    # Step 1: Filter out known SNPs
    # Use 'bcftools isec' to output sites present in INPUT_VCF but NOT in KNOWN_SNPS_VCF.
    # -C: keep positions present in the first file and absent from the others
    # -w 1: write the records of the first file (INPUT_VCF)
    # -Oz: output gzipped VCF
    # -o: output file
    bcftools isec -C -w 1 "${INPUT_VCF}" "${KNOWN_SNPS_VCF}" -Oz -o temp_no_known_snps.vcf.gz
    
    # Step 2: Apply remaining filters on the VCF without known SNPs
    # -i: include sites that satisfy the expression
    # INFO/DP >= 5: Keep positions covered by at least 5 reads (assuming INFO/DP is the site-level depth)
    # (REF="A" && ALT="G" || REF="T" && ALT="C"): Keep only A-G or T-C transitions.
    #   (The term "antisense" here is interpreted as referring to these specific transition types regardless of strand, 
    #    as A->G on sense is T->C on antisense, and T->C on sense is A->G on antisense.)
    bcftools view -i 'INFO/DP >= 5 && (REF="A" && ALT="G" || REF="T" && ALT="C")' \
                 temp_no_known_snps.vcf.gz \
                 -Oz -o filtered.vcf.gz
    
    # Clean up temporary file
    rm temp_no_known_snps.vcf.gz
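
To make the three criteria of this step explicit, the same logic can be written as a single pass over plain-text VCF files. This is a minimal sketch rather than the study's implementation: the function name `filter_editing_candidates`, the file names, and the demo records are hypothetical, depth is assumed to be stored as `DP=` in the INFO column, and known SNPs are matched by chromosome and position only. As in the example above, genomic A-to-G and T-to-C changes are both kept regardless of strand.

```shell
# Hypothetical helper: keep candidate editing sites from a plain-text VCF.
# Usage: filter_editing_candidates known_snps.vcf candidates.vcf [min_depth]
filter_editing_candidates() {
  awk -v min_dp="${3:-5}" '
    BEGIN { FS = OFS = "\t" }
    # First file: remember chromosome:position of every known SNP
    FILENAME == ARGV[1] { if ($0 !~ /^#/) known[$1 ":" $2] = 1; next }
    /^#/ { print; next }                     # keep VCF header lines
    (($1 ":" $2) in known) { next }          # site is a known SNP
    {
      depth = -1                             # records without DP are dropped
      n = split($8, info, ";")
      for (i = 1; i <= n; i++)
        if (info[i] ~ /^DP=/) depth = substr(info[i], 4) + 0
      if (depth < min_dp) next               # covered by too few reads
      if (($4 == "A" && $5 == "G") || ($4 == "T" && $5 == "C")) print
    }' "$1" "$2"
}

# Demo with made-up records: I:100 is a known SNP, I:200 has depth 3,
# II:60 is a C-to-T change, so only I:300 and II:50 are printed.
tmpdir=$(mktemp -d)
printf 'I\t100\t.\tA\tG\t.\t.\t.\n' > "$tmpdir/known.vcf"
printf 'I\t100\t.\tA\tG\t50\t.\tDP=20\nI\t200\t.\tA\tG\t50\t.\tDP=3\nI\t300\t.\tA\tG\t50\t.\tDP=12\nII\t50\t.\tT\tC\t50\t.\tDP=8\nII\t60\t.\tC\tT\t50\t.\tDP=30\n' > "$tmpdir/candidates.vcf"
filter_editing_candidates "$tmpdir/known.vcf" "$tmpdir/candidates.vcf" 5 | cut -f1,2
rm -r "$tmpdir"
```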
  11.

    Variants were scored using a previously published bayesian method and filtered for those passing a 0.99 confidence parameter

    FreeBayes (Inferred with models/gemini-2.5-flash) v1.3.6 GitHub
    $ Bash example
    # Install FreeBayes (if not already installed)
    # conda install -c bioconda freebayes
    
    # Install bcftools (if not already installed)
    # conda install -c bioconda bcftools
    
    # Define variables
    REFERENCE_GENOME="ce11.fa"  # Placeholder: Replace with the actual ce11 reference genome path
    INPUT_BAM="input.bam"       # Placeholder: Replace with actual input BAM file path (e.g., /path/to/sample.bam)
    OUTPUT_RAW_VCF="raw_variants.vcf"
    OUTPUT_FILTERED_VCF="filtered_variants.vcf"
    CONFIDENCE_QUAL_THRESHOLD=20 # Phred-scaled QUAL of 20 = -10*log10(1-0.99), i.e. a 1% error probability
    
    # Step 1: Score variants with a Bayesian model
    # FreeBayes is a stand-in here: the source only says "a previously published bayesian method" and does not
    # name the tool, so these scores are not necessarily comparable to the confidence values used in the study.
    # Adjust parameters like -q (min base quality), -m (min mapping quality), -C (min coverage) as needed.
    freebayes -f "${REFERENCE_GENOME}" "${INPUT_BAM}" > "${OUTPUT_RAW_VCF}"
    
    # Step 2: Keep variants at or above the threshold
    # A QUAL score of 20 corresponds to 1 - 10^(-20/10) = 1 - 0.01 = 0.99.
    # Mapping the "0.99 confidence parameter" onto QUAL is an assumption.
    bcftools filter -i "QUAL >= ${CONFIDENCE_QUAL_THRESHOLD}" "${OUTPUT_RAW_VCF}" -o "${OUTPUT_FILTERED_VCF}"
  12.

    Surviving variants from the Adr2- samples were filtered out.

    bcftools (Inferred with models/gemini-2.5-flash) v1.19 GitHub
    $ Bash example
    # Install bcftools if not already installed
    # conda install -c bioconda bcftools
    
    # Define input and output VCF files (placeholders; inputs must be bgzip-compressed and indexed).
    WT_VCF="wt_variants.vcf.gz"       # candidate editing sites called in the WT samples
    ADR2_VCF="adr2_variants.vcf.gz"   # sites that survived the same filters in the Adr2- samples
    OUTPUT_VCF="wt_minus_adr2.vcf.gz"
    
    # One plausible reading of this step: sites that are also called in the Adr2- (knockout) samples
    # are treated as false positives and removed from the WT calls, which matches the description of
    # editing_calls.tsv as "removed of adr2- false positives".
    # -C: keep positions present in the first file and absent from the others
    # -w 1: write the records of the first file (WT)
    # The -Oz flag compresses the output VCF with bgzip.
    bcftools isec -C -w 1 "${WT_VCF}" "${ADR2_VCF}" -Oz -o "${OUTPUT_VCF}"
    
    # Index the filtered VCF file for quick access
    bcftools index "${OUTPUT_VCF}"
  13.

    Variants were annotated using bedtools 2.26 on WS254 annotations, prioritized using the following: 3'UTR, 5'UTR, CDS, Intron, mRNA, piRNA, ncRNA, tRNA, pre-ncRNA, miRNA, snoRNA, pre_miRNA, lincRNA, snRNA, antisense RNA, rRNA, pre-miRNA, downstream from gene, upstream from gene, downstream noncoding RNA, upstream noncoding RNA and scRNA

    bedtools v2.26 GitHub
    $ Bash example
    # Install bedtools 2.26
    # conda install -c bioconda bedtools=2.26
    
    # Define input and annotation files
    VARIANTS_BED="variants.bed" # Placeholder for your variant BED file (e.g., from variant calling)
    # WS254 annotations typically refer to a specific release from WormBase (e.g., C. elegans).
    # This file should contain genomic features in BED format, with a column indicating feature type (e.g., 4th column).
    WS254_ANNOTATIONS_BED="WS254_annotations.bed" # Placeholder for WS254 annotations (e.g., downloaded from WormBase)
    OUTPUT_ANNOTATED_VARIANTS="annotated_variants_raw.bed"
    
    # Annotate variants using bedtools intersect
    # -a: input variants (e.g., a BED file of variant coordinates)
    # -b: annotation features (e.g., WS254_annotations.bed)
    # -wao: write the original entry in A and the overlapping entry in B. If no overlap, B fields are set to '.'
    # This command will output each variant along with all overlapping features from the WS254 annotations.
    bedtools intersect -a "${VARIANTS_BED}" -b "${WS254_ANNOTATIONS_BED}" -wao > "${OUTPUT_ANNOTATED_VARIANTS}"
    
    # The following step involves prioritizing annotations based on the specified list:
    # 3'UTR, 5'UTR, CDS, Intron, mRNA, piRNA, ncRNA, tRNA, pre-ncRNA, miRNA, snoRNA, pre_miRNA, lincRNA, snRNA, antisense RNA, rRNA, pre-miRNA, downstream from gene, upstream from gene, downstream noncoding RNA, upstream noncoding RNA and scRNA
    # This prioritization typically requires a custom script (e.g., Python, Awk, or R) to parse the output
    # from bedtools intersect ("${OUTPUT_ANNOTATED_VARIANTS}") and apply the defined hierarchy.
    # The script would read each variant and its associated features, then select the highest-priority feature
    # based on the provided list.
    # Example (conceptual, assuming a Python script):
    # python prioritize_annotations.py "${OUTPUT_ANNOTATED_VARIANTS}" "prioritized_variants.bed" \
    #   --priority_list "3'UTR,5'UTR,CDS,Intron,mRNA,piRNA,ncRNA,tRNA,pre-ncRNA,miRNA,snoRNA,pre_miRNA,lincRNA,snRNA,antisense RNA,rRNA,pre-miRNA,downstream from gene,upstream from gene,downstream noncoding RNA,upstream noncoding RNA,scRNA"
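
The prioritization described above can be sketched with awk instead of a separate script. This is an illustration under stated assumptions, not the study's code: the function name `prioritize_annotations` is hypothetical, the input is assumed to be `bedtools intersect -wao` output in which the feature type is the 4th column of the annotation BED, and variants with no overlapping feature are labelled `unannotated`. The demo uses a shortened priority list and made-up records.

```shell
# Hypothetical helper: pick the highest-priority feature type per variant.
# $1: comma-separated feature types, highest priority first
# $2: number of columns in the variant BED (the feature type is then read
#     from column $2 + 4 of the intersect output)
prioritize_annotations() {
  awk -v priority="$1" -v ncol_a="$2" '
    BEGIN {
      FS = OFS = "\t"
      n = split(priority, p, ",")
      for (i = 1; i <= n; i++) rank[p[i]] = i
    }
    {
      key = $1 OFS $2 OFS $3
      ftype = $(ncol_a + 4)
      r = (ftype in rank) ? rank[ftype] : n + 1    # unlisted types rank last
      if (!(key in best) || r < best[key]) {
        if (!(key in best)) order[++m] = key       # remember input order
        best[key] = r
        label[key] = (r <= n) ? ftype : "unannotated"
      }
    }
    END { for (i = 1; i <= m; i++) print order[i], label[order[i]] }'
}

# Demo: I:100 overlaps mRNA, CDS and Intron (CDS wins), II:10 overlaps nothing,
# III:5 overlaps ncRNA and a 3'UTR (the 3'UTR wins).
printf "I\t100\t101\tI\t50\t500\tmRNA\t1\nI\t100\t101\tI\t90\t120\tCDS\t1\nI\t100\t101\tI\t90\t400\tIntron\t1\nII\t10\t11\t.\t-1\t-1\t.\t0\nIII\t5\t6\tIII\t0\t50\tncRNA\t1\nIII\t5\t6\tIII\t0\t50\t3'UTR\t1\n" \
  | prioritize_annotations "3'UTR,5'UTR,CDS,Intron,mRNA,ncRNA" 3
```

Passing the order as a plain comma-separated string keeps the full list from the step description usable as-is, and each variant is reported once with its best-ranked label.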

Raw Source Text
Reads were trimmed of adapter sequences TCGTATGCCGTCTTCTGCTTG, ATCTCGTATGCCGTCTTCTGCTTG, CGACAGGTTCAGAGTTCTACAGTCCGACGATC GATCGGAAGAGCACACGTCTGAACTCCAGTCAC using cutadapt v1.9.1
Trimmed reads were aligned to RepBase v18.05 using STAR v2.4.01
Reads that didn't align to RepBase were aligned to ce11 using STAR v2.4.01
Featurecounts v1.5.0-p1 and DESeq2 were used to count and normalize genes for differential expression, respectively. Gene regions were described using WS254 (Wormbase) annotations.
Reads were removed of potential duplicates and combined respective to their WT or Adr condition for editing calls using samtools v1.3.1
Reads were filtered and removed using the following parameters: reads containing junction overhangs less than 10nt long, reads containing indels, reads containing mutations within 5nt of the 3' end, reads containing more than 1 non AG or TC antisense mutation
Reads were piled up using samtools 1.3.1 [-d 1000 -E -I -p -o -f -t DP,DV,DPR,INFO/DPR,DP4,SP]
Variants were called using bcftools 1.2.1 [-O -c -A -i -v]
Variants were filtered for and removed: known SNPs (wormbase WS254), positions covered by less than 5 reads, variants which were not A-G or T-C antisense
Variants were scored using a previously published bayesian method and filtered for those passing a 0.99 confidence parameter
Surviving variants from the Adr2- samples were filtered out.
Variants were annotated using bedtools 2.26 on WS254 annotations, prioritized using the following: 3'UTR, 5'UTR, CDS, Intron, mRNA, piRNA, ncRNA, tRNA, pre-ncRNA, miRNA, snoRNA, pre_miRNA, lincRNA, snRNA, antisense RNA, rRNA, pre-miRNA, downstream from gene, upstream from gene, downstream noncoding RNA, upstream noncoding RNA and scRNA
Genome_build: ce11
Supplementary_files_format_and_content: diffexp.tsv contains DESeq2-processed differential expression results for WS254 wormbase genes between WT and Adr2- knockout conditions. editing_calls.tsv contains all annotated editing calls removed of adr2- false positives. GSF973-Hundley-SarahD-1_S1_R1_001.polyATrim.adapterTrim.rmRep.sorted.rg.bam, GSF973-Hundley-SarahD-3_S3_R1_001.polyATrim.adapterTrim.rmRep.sorted.rg.bam, and GSF973-Hundley-SarahD-6_S6_R1_001.polyATrim.adapterTrim.rmRep.sorted.rg.bam contain ce11-aligned reads pertaining to WT neural C elegan RNA. GSF973-Hundley-SarahD-2_S2_R1_001.polyATrim.adapterTrim.rmRep.sorted.rg.bam, GSF973-Hundley-SarahD-5_S5_R1_001.polyATrim.adapterTrim.rmRep.sorted.rg.bam, and GSF973-Hundley-SarahD-7_S7_R1_001.polyATrim.adapterTrim.rmRep.sorted.rg.bam contain ce11-aligned reads pertaining to adr2 knockout neural C elegan RNA.