GSE262542 Processing Pipeline
RNA-Seq
4 steps
Publication
Evaluation of novel computational methods to identify RNA-binding protein footprints from structural data. RNA (New York, N.Y.) (2025), PMID 40399037
Dataset
GSE262542: Evaluation of novel computational methods that identify RNA-binding protein footprints from structural data
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
Data processing was done using the Skipper pipeline, freely available at https://github.com/yeolab/skipper.
$ Bash example
# Clone the Skipper pipeline repository
git clone https://github.com/yeolab/skipper.git
cd skipper

# Skipper is a Snakemake workflow. It needs a configuration file and input data
# (e.g., FASTQ files); the exact file names and settings, such as the reference
# genome (hg38 for this dataset), are described in the repository's documentation.

# Execute the workflow with Snakemake.
# --cores sets the number of CPU cores to use.
# --use-conda lets Snakemake manage software environments via Conda.
# NOTE: this invocation is generic. If the repository does not provide a default
# Snakefile, pass the workflow file explicitly with --snakefile (see the README).
snakemake --cores 8 --use-conda
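Before launching the workflow, it can help to confirm that the command-line tools named in the processing steps are on the PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper, not part of Skipper, and the binary names in the usage comment are assumptions about a typical installation.

```shell
# Print the name of every command that is NOT found on PATH.
# missing_tools is a hypothetical helper, not part of Skipper.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example (binary names are assumptions; adjust to your installation):
# missing_tools snakemake skewer STAR umicollapse
```

An empty result means every listed tool was found. Tools that Snakemake installs through Conda at run time would not appear on the PATH beforehand, so a non-empty result is a prompt to check, not necessarily an error.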
2
Adapter trimming was done with Skewer.
$ Bash example
# Install Skewer (example using conda)
# conda install -c bioconda skewer

# Define input and output file names
INPUT_FASTQ="input.fastq.gz"
OUTPUT_PREFIX="trimmed_reads"
ADAPTER_FILE="adapters.fa"  # Placeholder: FASTA file of adapter sequences

# Execute Skewer for adapter trimming.
# The thresholds below are illustrative; the dataset record does not state them.
# -x: adapter sequence(s) or a FASTA file of adapters
# -m: trimming mode (tail = trim adapters from the 3' end)
# -l: minimum read length to keep after trimming
# -q: quality threshold for 3' end trimming
# -z: gzip-compress the output
# -o: output file prefix
skewer -x "${ADAPTER_FILE}" -m tail -l 18 -q 20 -z -o "${OUTPUT_PREFIX}" "${INPUT_FASTQ}"
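A quick sanity check after trimming is to confirm that no read shorter than the chosen minimum length survived. The sketch below assumes a gzipped FASTQ file with standard four-line records; `count_short_reads` is a hypothetical helper, and the file name in the usage comment is a placeholder, since the actual output name depends on the prefix passed to Skewer.

```shell
# Count reads shorter than a minimum length in a gzipped FASTQ file.
# Usage: count_short_reads FILE MIN_LENGTH
# count_short_reads is a hypothetical helper; assumes 4-line FASTQ records.
count_short_reads() {
    gzip -dc "$1" |
        awk -v min="$2" 'NR % 4 == 2 && length($0) < min { n++ } END { print n + 0 }'
}

# Example (placeholder file name):
# count_short_reads trimmed_reads-trimmed.fastq.gz 18
```

A result of 0 indicates that the length filter was applied as expected.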
3
Processed reads were mapped with STAR (2.7.10a_alpha_220314).
$ Bash example
# Install STAR (if not already installed)
# conda install -c bioconda star

# Placeholder for the reference genome directory.
# Replace with your actual STAR genome index path (e.g., for hg38).
GENOME_DIR="/path/to/STAR_genome_index/hg38"

# Placeholders for the input FASTQ files (the trimmed reads).
# Paired-end input is assumed here; for single-end data, pass only one file.
READS_R1="processed_reads_R1.fastq.gz"
READS_R2="processed_reads_R2.fastq.gz"

# Placeholder for the output directory and file prefix
OUTPUT_DIR="star_mapping_output"
OUTPUT_PREFIX="${OUTPUT_DIR}/aligned_reads_"

# Create the output directory if it doesn't exist
mkdir -p "${OUTPUT_DIR}"

# Run STAR alignment.
# These are common RNA-seq settings, not parameters taken from the dataset record.
STAR --genomeDir "${GENOME_DIR}" \
     --readFilesIn "${READS_R1}" "${READS_R2}" \
     --readFilesCommand zcat \
     --runThreadN 8 \
     --outFileNamePrefix "${OUTPUT_PREFIX}" \
     --outSAMtype BAM SortedByCoordinate \
     --outSAMattributes Standard \
     --outFilterMultimapNmax 20 \
     --outFilterMismatchNmax 999 \
     --outFilterMismatchNoverLmax 0.1 \
     --alignIntronMin 20 \
     --alignIntronMax 1000000 \
     --alignMatesGapMax 1000000
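STAR writes a summary file named `Log.final.out` (prefixed with the value of `--outFileNamePrefix`) that reports, among other statistics, the percentage of uniquely mapped reads. The following sketch extracts that value for a quick check of mapping quality; `star_unique_pct` is a hypothetical helper, and the path in the usage comment is a placeholder.

```shell
# Print the "Uniquely mapped reads %" value from a STAR Log.final.out file,
# without the percent sign.
# star_unique_pct is a hypothetical helper, not part of STAR.
star_unique_pct() {
    awk -F '|' '/Uniquely mapped reads %/ { v = $2; gsub(/[ \t%]/, "", v); print v }' "$1"
}

# Example (placeholder path):
# star_unique_pct star_mapping_output/aligned_reads_Log.final.out
```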
4
PCR bias was removed using UMIcollapse.
$ Bash example
# UMICollapse is a Java tool for collapsing reads that share a UMI; it requires
# a Java runtime. One way to install it is through Bioconda:
# conda install -c bioconda umicollapse

# Example usage (a sketch; check the tool's README for the options your data need).
# input.bam:  coordinate-sorted BAM with the UMI stored in each read name
# output.bam: deduplicated BAM
# Index the input first if your setup requires it:
# samtools index input.bam
umicollapse bam -i input.bam -o output.bam
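To see how much PCR duplication was removed, the read counts before and after deduplication can be compared. This is a minimal sketch: `dedup_rate` is a hypothetical helper, and the usage comment assumes that samtools is installed and that the BAM file names match those used above.

```shell
# Print the percentage of reads removed by deduplication.
# Usage: dedup_rate READS_BEFORE READS_AFTER
# dedup_rate is a hypothetical helper; prints NA if READS_BEFORE is not positive.
dedup_rate() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        if (before <= 0) { print "NA"; exit }
        printf "%.1f\n", 100 * (before - after) / before
    }'
}

# Example with samtools (assumed to be installed):
# dedup_rate "$(samtools view -c input.bam)" "$(samtools view -c output.bam)"
```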
Raw Source Text
data processing was done using the Skipper pipeline, freelly available at https://github.com/yeolab/skipper.
Adapters trimming was done with Skewer
proccessed reads were mapped with STAR (2.7.10a_alpha_220314)
PCR bias was removed using UMIcollapse
Assembly: hg38
Supplementary files format and content: tab separated values files
Supplementary files format and content: Supplementary files - reproducible enriched window including p and q values, enrichment scores, and the annotated regions for the significantly bound transcriptome tiled windows.
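Because the supplementary files are tab-separated tables that include p and q values, a common first step is to keep only the windows below a significance threshold. The sketch below filters any header-bearing TSV by a named numeric column. `filter_by_column` is a hypothetical helper, and the file name and column name in the usage comment are assumptions, since the record does not list the actual column headers.

```shell
# Keep the header plus every row whose value in COLUMN_NAME is <= MAX_VALUE.
# Usage: filter_by_column FILE COLUMN_NAME MAX_VALUE
# filter_by_column is a hypothetical helper; it exits with status 1 if the
# column is not present in the header line.
filter_by_column() {
    awk -F '\t' -v col="$2" -v max="$3" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == col) c = i
            if (!c) { print "column not found: " col > "/dev/stderr"; exit 1 }
            print
            next
        }
        ($c + 0) <= (max + 0)
    ' "$1"
}

# Example ("windows.tsv" and "qvalue" are hypothetical names):
# filter_by_column windows.tsv qvalue 0.05 > significant_windows.tsv
```

Inspect the header of the downloaded file first (for example with `head -n 1`) to find the real column name.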