GSE33584 Processing Pipeline

RNA-Seq, 2 steps

Publication

High-resolution profiling and analysis of viral and host small RNAs during human cytomegalovirus infection.

Journal of virology (2012) — PMID 22013051

Dataset

GSE33584

High-resolution profiling and analysis of viral and host small RNAs during human cytomegalovirus infection

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1. Step 1

    Adapter-trimmed sequencing libraries were mapped against an index composed of both the human (UCSC hg19, http://genome.ucsc.edu) and HCMV Towne (Genbank FJ616285.1) genomes, using Bowtie software (version 0.12.7, with parameters -k 1 -m 10 -l 25 --best).

    Bowtie v0.12.7
    $ Bash example
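    # Install Bowtie 0.12.7. If that exact build is not offered by your package source,
    # a later Bowtie 1.x release accepts the same -k/-m/-l/--best options.
    # conda install -c bioconda bowtie=0.12.7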
    
    # --- Reference Genome Preparation ---
    # 1. Download human genome (UCSC hg19)
    wget -O hg19.fa.gz http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz
    gunzip hg19.fa.gz
    
    # 2. Download HCMV Towne genome (Genbank FJ616285.1)
    # Ensure 'entrez-direct' is installed for 'efetch': conda install -c bioconda entrez-direct
    efetch -db nuccore -id FJ616285.1 -format fasta > HCMV_Towne.fa
    
    # 3. Combine genomes into a single FASTA file
    cat hg19.fa HCMV_Towne.fa > combined_human_hcmv.fa
    
    # 4. Build Bowtie index for the combined genome
    # The index name will be 'combined_human_hcmv'
    bowtie-build combined_human_hcmv.fa combined_human_hcmv
    
    # --- Alignment Step ---
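    # Replace 'trimmed_reads.fastq' with your adapter-trimmed input file and
    # 'mapped_reads.sam' with your desired output file name.
    # -k 1: report one alignment per read; -m 10: suppress reads with more than 10 reportable alignments;
    # -l 25: seed length of 25 bases; --best: report alignments in best-to-worst order.
    # -S (SAM output) is NOT among the published parameters; it is added here because Bowtie 1
    # otherwise writes its own tab-delimited format, not SAM, regardless of the output file name.
    bowtie -S -k 1 -m 10 -l 25 --best combined_human_hcmv trimmed_reads.fastq mapped_reads.sam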
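Step 2 works on mapped coordinates rather than on alignment records, so the alignments need to be turned into intervals at some point. The source text does not say how this was done; the following is a minimal sketch, assuming the alignments are in SAM format and that a BED6 file is a convenient intermediate. The function name `sam_to_bed` and the output file name `mapped_coordinates.bed` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the publication): convert SAM records to BED6.
# Assumes ungapped alignments, so the aligned span equals the read length.
# Bowtie 1 does not report gapped alignments, so this holds for its output.
sam_to_bed() {
    awk -F'\t' 'BEGIN { OFS = "\t" }
        /^@/ { next }                       # skip SAM header lines
        int($2 / 4) % 2 == 1 { next }       # skip unmapped reads (FLAG bit 0x4)
        {
            # FLAG bit 0x10 marks a read aligned to the reverse strand
            strand = (int($2 / 16) % 2 == 1) ? "-" : "+"
            start = $4 - 1                  # SAM POS is 1-based, BED start is 0-based
            print $3, start, start + length($10), $1, 0, strand
        }' "$1"
}

# Example call, run only if the alignment file from the step above exists
if [ -f mapped_reads.sam ]; then
    sam_to_bed mapped_reads.sam > mapped_coordinates.bed
fi
```

Because the index combines both genomes, the first BED column distinguishes human from HCMV alignments by reference sequence name.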
  2. Step 2

    Custom Perl scripts were used to overlap mapped coordinates with smRNA annotations, which were obtained from miRBase 16.0 (http://www.mirbase.org) and the UCSC Table Browser.

    Custom Perl scripts
    $ Bash example
    # Placeholder for the mapped coordinates from step 1 (e.g., the Bowtie alignments converted to BED)
    INPUT_MAPPED_COORDINATES="path/to/your/mapped_coordinates.bed"
    
    # Reference smRNA annotations from miRBase 16.0
    # miRBase 16.0 is an older version. The exact download URL for a BED file might vary or require conversion from GFF.
    # For demonstration, assume a pre-processed BED file is available.
    MIRBASE_16_0_SMRNA_BED="path/to/mirbase_16.0_smrna.bed"
    
    # Reference smRNA annotations from UCSC Table Browser
    # These would typically be downloaded via the UCSC Table Browser interface for a specific genome and track.
    # For demonstration, assume a pre-processed BED file is available.
    UCSC_SMRNA_BED="path/to/ucsc_smrna.bed"
    
    # Combine smRNA annotations if the custom script expects a single file
    # (this may instead be handled inside the script itself).
    # Concatenating and sorting keeps each annotation's name. 'bedtools merge' is avoided here because,
    # by default, it collapses overlapping records and outputs only chrom/start/end.
    # cat "${MIRBASE_16_0_SMRNA_BED}" "${UCSC_SMRNA_BED}" | sort -k1,1 -k2,2n > combined_smrna_annotations.bed
    COMBINED_SMRNA_ANNOTATIONS="path/to/combined_smrna_annotations.bed" # Or use the output of the above command
    
    # Execute the custom Perl script to overlap mapped coordinates with smRNA annotations.
    # The exact script name and parameters are unknown, so this is a generic representation.
    # The script would take mapped coordinates and smRNA annotations as input and produce overlapped regions.
    perl custom_overlap_script.pl \
        --input_mapped_coordinates "${INPUT_MAPPED_COORDINATES}" \
        --smrna_annotations "${COMBINED_SMRNA_ANNOTATIONS}" \
        --output_overlapped_regions "overlapped_smrna_regions.bed"
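The custom Perl scripts are not published with the dataset, so their exact logic is unknown. As a rough stand-in, the overlap step can be sketched with awk: two half-open BED intervals on the same chromosome overlap when each one starts before the other ends. The function name `overlap_bed` and the file names are hypothetical, strand is ignored (the source does not say whether the overlap was strand-specific), and the all-against-all comparison is only practical for small inputs.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the unpublished Perl scripts.
# $1 = annotation BED (chrom, start, end, name), $2 = mapped-read BED (same first 4 columns)
# Prints: read chrom, read start, read end, read name, overlapping annotation name.
# Note: assumes the annotation file is non-empty (required by the NR == FNR idiom).
overlap_bed() {
    awk -F'\t' 'BEGIN { OFS = "\t" }
        NR == FNR {                         # first file: load all annotations into memory
            n++
            chrom[n] = $1; s[n] = $2 + 0; e[n] = $3 + 0; name[n] = $4
            next
        }
        {                                   # second file: test each read against every annotation
            rs = $2 + 0; re = $3 + 0
            for (i = 1; i <= n; i++)
                if ($1 == chrom[i] && rs < e[i] && s[i] < re)
                    print $1, rs, re, $4, name[i]
        }' "$1" "$2"
}

# Example call, run only if both (hypothetical) input files exist
if [ -f combined_smrna_annotations.bed ] && [ -f mapped_coordinates.bed ]; then
    overlap_bed combined_smrna_annotations.bed mapped_coordinates.bed > overlapped_smrna_regions.tsv
fi
```

For full-size libraries, an interval tool such as `bedtools intersect -a mapped_coordinates.bed -b combined_smrna_annotations.bed -wa -wb` is likely a better fit than this quadratic loop.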
Raw Source Text
Adapter-trimmed sequencing libraries were mapped against an index composed of both the human (UCSC hg19, http://genome.ucsc.edu) and HCMV Towne (Genbank FJ616285.1) genomes, using Bowtie software (version 0.12.7, with parameters -k 1 -m 10 -l 25 --best). Custom Perl scripts were used to overlap mapped coordinates with smRNA annotations, which were obtained from miRBase 16.0 (http://www.mirbase.org) and the UCSC Table Browser.