GSE181139 Processing Pipeline


Publication

Identification of the global miR-130a targetome reveals a role for TBL1XR1 in hematopoietic stem cell self-renewal and t(8;21) AML.

Cell Reports (2022), PMID 35263585

Dataset

GSE181139

Identification of the Global miR-130a Targetome Reveals a Novel Role for TBL1XR1 in Hematopoietic Stem Cell Self-Renewal and t(8;21) AML [TBL1XR1 KD]

Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
  1. Raw reads were aligned with STAR against known GENCODE 32 transcripts.

    STAR v2.7.10a (version inferred with models/gemini-2.5-flash; the source text does not state one)
    Bash example:
    # --- Installation (example using Conda) ---
    # conda create -n star_env -c conda-forge -c bioconda star=2.7.10a -y
    # conda activate star_env
    
    # --- Reference Data Setup (GENCODE 32 for GRCh38) ---
    # This section demonstrates how to prepare the STAR genome index for GENCODE 32.
    # The actual alignment command assumes this index is already built.
    
    # # Create a directory for the genome index
    # mkdir -p star_gencode32_index
    # cd star_gencode32_index
    
    # # Download GENCODE 32 human genome (GRCh38 primary assembly) FASTA and GTF files
    # wget -c ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/GRCh38.primary_assembly.genome.fa.gz
    # wget -c ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/gencode.v32.annotation.gtf.gz
    
    # # Unzip the downloaded files
    # gunzip GRCh38.primary_assembly.genome.fa.gz
    # gunzip gencode.v32.annotation.gtf.gz
    
    # # Build the STAR genome index
    # # Set --sjdbOverhang to the maximum read length minus 1 (e.g., 74 for 75 bp reads).
    # # The source does not give the read length for this dataset; 74 is only an example value.
    # STAR --runMode genomeGenerate \
    #      --genomeDir . \
    #      --genomeFastaFiles GRCh38.primary_assembly.genome.fa \
    #      --sjdbGTFfile gencode.v32.annotation.gtf \
    #      --sjdbOverhang 74 \
    #      --runThreadN 8 # Adjust number of threads as needed
    # cd ..
    
    # --- Alignment Step ---
    # Define variables for input and output
    GENOME_DIR="star_gencode32_index" # Path to your pre-built STAR genome index
    READS_R1="raw_reads_R1.fastq.gz" # Placeholder for input FASTQ file (Read 1)
    READS_R2="raw_reads_R2.fastq.gz" # Placeholder for input FASTQ file (Read 2, remove for single-end)
    OUTPUT_PREFIX="aligned_sample" # Prefix for output files (e.g., aligned_sample.Log.out, aligned_sample.Aligned.sortedByCoord.out.bam)
    NUM_THREADS=8 # Adjust number of threads as needed
    
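    # Execute STAR alignment.
    # The source states only the aligner and the annotation; the filtering options below
    # resemble the ENCODE settings listed in the STAR manual and are illustrative.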
    STAR --genomeDir "${GENOME_DIR}" \
         --readFilesIn "${READS_R1}" "${READS_R2}" \
         --readFilesCommand zcat \
         --runThreadN "${NUM_THREADS}" \
         --outFileNamePrefix "${OUTPUT_PREFIX}." \
         --outSAMtype BAM SortedByCoordinate \
         --outSAMattributes All \
         --outFilterMultimapNmax 20 \
         --outFilterMismatchNmax 999 \
         --outFilterMismatchNoverLmax 0.04 \
         --alignIntronMin 20 \
         --alignIntronMax 1000000 \
         --alignMatesGapMax 1000000 \
         --sjdbScore 1 \
         --quantMode GeneCounts # Optional: Output gene counts (ReadsPerGene.out.tab)
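
With `--quantMode GeneCounts`, STAR writes a `ReadsPerGene.out.tab` file with four tab-separated columns: gene ID, unstranded counts, counts for read 1 on the gene strand, and counts for read 2 on the gene strand. Its first four rows are summary rows (`N_unmapped`, `N_multimapping`, `N_noFeature`, `N_ambiguous`). The source does not say which column, if any, the authors used, so the following is only a minimal sketch. The function names `extract_gene_counts` and `column_totals` are hypothetical helpers, not part of STAR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# extract_gene_counts FILE [COLUMN]
# Prints "gene_id<TAB>count" for one sample, skipping the four N_* summary rows.
# COLUMN: 2 = unstranded (default), 3 = read 1 on the gene strand,
#         4 = read 2 on the gene strand (reverse-stranded libraries).
extract_gene_counts() {
    local counts_file="$1"
    local column="${2:-2}"
    awk -F'\t' -v col="${column}" \
        'BEGIN { OFS = "\t" } $1 !~ /^N_/ { print $1, $col }' \
        "${counts_file}"
}

# column_totals FILE
# Sums columns 2-4 over gene rows. Comparing the forward and reverse totals
# is a rough way to guess library strandedness when it is not documented.
column_totals() {
    awk -F'\t' '
        BEGIN { OFS = "\t" }
        $1 !~ /^N_/ { u += $2; f += $3; r += $4 }
        END {
            print "unstranded", u + 0
            print "forward", f + 0
            print "reverse", r + 0
        }
    ' "$1"
}
```

Given the output prefix used in the alignment example, a call could look like `extract_gene_counts aligned_sample.ReadsPerGene.out.tab 2 > aligned_sample.counts.tsv`. If the forward and reverse totals from `column_totals` are similar, the library is probably unstranded. If one is much larger than the other, the larger column is the likelier choice.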

Tools Used

Raw Source Text
Raw Reads were aligned using STAR against known GENCODE 32 transcripts
Genome_build: hg38
Supplementary_files_format_and_content: tab delimited read counts
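
A tab-delimited read count table like the one described above could be assembled from per-sample STAR `ReadsPerGene.out.tab` files roughly as follows. This is a sketch under assumptions, not the authors' documented procedure: the function name `merge_gene_counts`, the choice of count column, and the sample file names are all hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# merge_gene_counts COLUMN FILE [FILE...]
# Builds a gene-by-sample matrix from STAR ReadsPerGene.out.tab files.
# COLUMN selects the count column (2 = unstranded, 3 or 4 = stranded).
# Sample names are taken from the file names with the directory and the
# ".ReadsPerGene.out.tab" suffix removed. Genes missing from a file get 0.
merge_gene_counts() {
    local column="$1"
    shift
    awk -F'\t' -v col="${column}" '
        BEGIN { OFS = "\t" }
        FNR == 1 {
            nfiles++
            name = FILENAME
            sub(/.*\//, "", name)
            sub(/\.ReadsPerGene\.out\.tab$/, "", name)
            names[nfiles] = name
        }
        $1 ~ /^N_/ { next }   # skip the four summary rows
        {
            if (!($1 in seen)) { seen[$1] = 1; order[++ngenes] = $1 }
            counts[$1, nfiles] = $col
        }
        END {
            header = "gene_id"
            for (i = 1; i <= nfiles; i++) header = header OFS names[i]
            print header
            for (g = 1; g <= ngenes; g++) {
                row = order[g]
                for (i = 1; i <= nfiles; i++) {
                    key = order[g] SUBSEP i
                    row = row OFS ((key in counts) ? counts[key] : 0)
                }
                print row
            }
        }
    ' "$@"
}
```

For example, `merge_gene_counts 2 sample1.ReadsPerGene.out.tab sample2.ReadsPerGene.out.tab > counts_matrix.tsv` would write one row per gene and one column per sample, with genes in the order they first appear.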