GSE181137 Processing Pipeline
Publication
Identification of the global miR-130a targetome reveals a role for TBL1XR1 in hematopoietic stem cell self-renewal and t(8;21) AML. Cell Reports (2022), PMID 35263585
Dataset
GSE181137: Identification of the Global miR-130a Targetome Reveals a Novel Role for TBL1XR1 in Hematopoietic Stem Cell Self-Renewal and t(8;21) AML [miR130aKD_K…
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1. Raw reads were aligned using STAR against known GENCODE 32 transcripts.
STAR v2.7.10a (Inferred with models/gemini-2.5-flash)
Bash example
# --- Installation (example using Conda) ---
# conda create -n star_env -c conda-forge -c bioconda star=2.7.10a -y
# conda activate star_env

# --- Reference Data Setup (GENCODE 32 for GRCh38) ---
# This section demonstrates how to prepare the STAR genome index for GENCODE 32.
# The actual alignment command assumes this index is already built.
#
# Create a directory for the genome index
# mkdir -p star_gencode32_index
# cd star_gencode32_index
#
# Download GENCODE 32 human genome (GRCh38 primary assembly) FASTA and GTF files
# wget -c ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/GRCh38.primary_assembly.genome.fa.gz
# wget -c ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/gencode.v32.annotation.gtf.gz
#
# Unzip the downloaded files
# gunzip GRCh38.primary_assembly.genome.fa.gz
# gunzip gencode.v32.annotation.gtf.gz
#
# Build the STAR genome index
# Adjust --sjdbOverhang to the maximum read length minus 1 (e.g., 74 for 75 bp reads).
# The read length is not stated in the source text; 74 is only an example value.
# STAR --runMode genomeGenerate \
#   --genomeDir . \
#   --genomeFastaFiles GRCh38.primary_assembly.genome.fa \
#   --sjdbGTFfile gencode.v32.annotation.gtf \
#   --sjdbOverhang 74 \
#   --runThreadN 8   # Adjust number of threads as needed
# cd ..

# --- Alignment Step ---
# Define variables for input and output
GENOME_DIR="star_gencode32_index"   # Path to your pre-built STAR genome index
READS_R1="raw_reads_R1.fastq.gz"    # Placeholder for input FASTQ file (Read 1)
READS_R2="raw_reads_R2.fastq.gz"    # Placeholder for input FASTQ file (Read 2, remove for single-end)
OUTPUT_PREFIX="aligned_sample"      # Prefix for output files (e.g., aligned_sample.Log.out, aligned_sample.Aligned.sortedByCoord.out.bam)
NUM_THREADS=8                       # Adjust number of threads as needed

# Execute STAR alignment
# The filtering and intron parameters below are inferred, not taken from the source text.
STAR --genomeDir "${GENOME_DIR}" \
  --readFilesIn "${READS_R1}" "${READS_R2}" \
  --readFilesCommand zcat \
  --runThreadN "${NUM_THREADS}" \
  --outFileNamePrefix "${OUTPUT_PREFIX}." \
  --outSAMtype BAM SortedByCoordinate \
  --outSAMattributes All \
  --outFilterMultimapNmax 20 \
  --outFilterMismatchNmax 999 \
  --outFilterMismatchNoverLmax 0.04 \
  --alignIntronMin 20 \
  --alignIntronMax 1000000 \
  --alignMatesGapMax 1000000 \
  --sjdbScore 1 \
  --quantMode GeneCounts   # Optional: output gene counts (ReadsPerGene.out.tab)
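The index-building comments above set --sjdbOverhang to the maximum read length minus 1, but the source text does not give the read length. The following is a minimal sketch of how that value could be derived from a FASTQ file rather than guessed. The function name sjdb_overhang is hypothetical and not part of STAR; it assumes standard four-line FASTQ records arriving uncompressed on standard input.

```shell
# Hypothetical helper (not part of STAR): read an uncompressed FASTQ stream
# on stdin and print (maximum read length - 1), the value suggested above
# for --sjdbOverhang. Assumes four-line FASTQ records.
sjdb_overhang() {
  awk 'NR % 4 == 2 {                      # sequence line of each record
         if (length($0) > max) max = length($0)
       }
       END {
         if (max > 0) print max - 1       # overhang = longest read - 1
         else exit 1                      # no reads seen: signal failure
       }'
}
```

With the placeholder file name used in the alignment step, a possible invocation is `zcat raw_reads_R1.fastq.gz | sjdb_overhang`. Scanning the whole file is simple but slow for large libraries; sampling only the first records would be faster, at the risk of missing a longer read later in the file.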
Tools Used
Raw Source Text
Raw Reads were aligned using STAR against known GENCODE 32 transcripts
Genome_build: hg38
Supplementary_files_format_and_content: tab delimited read counts
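The source text lists tab-delimited read counts as the supplementary output but does not say how they were produced. If STAR is run with --quantMode GeneCounts, one way to obtain such a table is to trim the resulting ReadsPerGene.out.tab, as sketched below. The function name star_gene_counts is hypothetical, and the choice of count column is an assumption, because the library strandedness is not stated in the source.

```shell
# Hypothetical helper (not part of STAR): reduce a ReadsPerGene.out.tab file
# to a two-column "gene<TAB>count" table.
#   $1 = path to ReadsPerGene.out.tab
#   $2 = count column to keep (default 2):
#        2 = unstranded, 3 = read strand matches the gene, 4 = reverse strand
star_gene_counts() {
  awk -F '\t' -v OFS='\t' -v c="${2:-2}" '
    $1 !~ /^N_/ { print $1, $c }   # skip the N_unmapped etc. summary rows
  ' "$1"
}
```

For example, `star_gene_counts aligned_sample.ReadsPerGene.out.tab 2 > aligned_sample.counts.tsv` would write unstranded counts for the placeholder sample used in the alignment step. Comparing the N_noFeature totals across columns 2 to 4 is a common way to judge which column fits a given library.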