GSE137810 Processing Pipeline
Publication
Investigational eIF2B activator DNL343 modulates the integrated stress response in preclinical models of TDP-43 pathology and individuals with ALS in a randomized clinical trial. Nature Communications (2025), PMID 40825784
Processing Steps
1
fastq Illumina RNASeq paired-end reads were aligned using STAR (2.5.2a).
$ Bash example
# Install STAR (example using conda)
# conda create -n star_env star=2.5.2a -c bioconda -c conda-forge
# conda activate star_env

# Build STAR index (example for human hg38 genome and Gencode annotation)
# mkdir -p /path/to/STAR_index_hg38
# STAR --runMode genomeGenerate \
#      --genomeDir /path/to/STAR_index_hg38 \
#      --genomeFastaFiles /path/to/hg38.fa \
#      --sjdbGTFfile /path/to/gencode.vXX.annotation.gtf \
#      --runThreadN 8   # Adjust number of threads as needed

# Align paired-end RNA-Seq reads using STAR
# Reference genome: hg38 (GRCh38) - placeholder, replace with actual path
# Input files: sample_R1.fastq.gz, sample_R2.fastq.gz - placeholder, replace with actual file names
# Output prefix: sample_aligned_ - placeholder, replace with desired prefix
STAR --genomeDir /path/to/STAR_index_hg38 \
     --readFilesIn sample_R1.fastq.gz sample_R2.fastq.gz \
     --readFilesCommand zcat \
     --runThreadN 8 \
     --outFileNamePrefix sample_aligned_ \
     --outSAMtype BAM SortedByCoordinate \
     --outSAMattributes All \
     --outFilterType BySJout \
     --outFilterMultimapNmax 20 \
     --alignSJDBoverhangMin 1 \
     --alignSJoverhangMin 8 \
     --alignIntronMin 20 \
     --alignIntronMax 1000000 \
     --alignMatesGapMax 1000000 \
     --sjdbScore 1 \
     --quantMode GeneCounts \
     --limitBAMsortRAM 30000000000   # Adjust RAM limit (e.g., 30GB) as needed
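Leafcutter works from per-sample junction files, whereas STAR reports junctions in SJ.out.tab. The source does not say how the junction files were produced, so the following is only a minimal sketch of one possible bridge: it rewrites SJ.out.tab rows as six-column junction rows (chrom, start, end, ".", read count, strand). The demo file names, the choice of uniquely mapped reads as the count, and passing coordinates through unchanged are all assumptions; check the coordinate convention against the Leafcutter version in use.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo input: three rows in STAR SJ.out.tab layout
# (chrom, intron start, intron end, strand code, motif, annotated,
#  unique reads, multi-mapped reads, max overhang).
workdir=$(mktemp -d)
printf 'chr1\t1001\t2000\t1\t1\t1\t25\t3\t40\n' >  "$workdir/demo_SJ.out.tab"
printf 'chr2\t5001\t6000\t2\t1\t1\t10\t0\t38\n' >> "$workdir/demo_SJ.out.tab"
printf 'chr3\t7001\t8000\t0\t0\t0\t4\t0\t20\n'  >> "$workdir/demo_SJ.out.tab"

# Keep junctions with a defined strand (code 1 = "+", 2 = "-"; 0 = undefined is dropped)
# and write: chrom, start, end, ".", unique read count, strand.
awk 'BEGIN { OFS = "\t" }
     $4 == 1 { print $1, $2, $3, ".", $7, "+" }
     $4 == 2 { print $1, $2, $3, ".", $7, "-" }' \
    "$workdir/demo_SJ.out.tab" > "$workdir/demo.junc"

cat "$workdir/demo.junc"
```

In a real run the awk command would be applied to each sample's SJ.out.tab, and the resulting paths would be listed in the junction file list used in the next step.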
2
We used Leafcutter (0.2.6) to perform differential splicing analyses of each of four defined groups of ALS cases (ALSlow, ALShigh, C9, FTD grouped samples for a total of 54 samples referred to as the training set) against the Control group (all groups are described in: link to manuscript here).
$ Bash example
# Leafcutter is distributed via GitHub (https://github.com/davidaknowles/leafcutter);
# leafcutter_cluster.py lives under clustering/ and leafcutter_ds.R under scripts/.
# All paths, file names, and parameter values below are placeholders.

# samples.tsv (hypothetical manifest): sample ID <TAB> group,
# where group is one of ALSlow, ALShigh, C9, FTD, Control.
# Sample IDs must match the column names of the counts file written in step 1 below.

# juncfiles.txt lists one junction file per sample
cut -f1 samples.tsv | sed 's|^|junctions/|; s|$|.junc|' > juncfiles.txt

# Leafcutter does not read a genome FASTA; the reference (GRCh38) enters only
# through the upstream alignment that produced the junction files.

# 1. Cluster introns and build the per-sample counts matrix.
#    Writes leafcutter_counts_perind.counts.gz and leafcutter_counts_perind_numers.counts.gz
python leafcutter_cluster.py -j juncfiles.txt -o leafcutter_counts

# 2. Differential splicing: each case group against the Control group
mkdir -p leafcutter_results
for group in ALSlow ALShigh C9 FTD; do
    echo "Running differential splicing for ${group} vs Control..."
    groups_file="groups_${group}_vs_Control.txt"
    awk -F'\t' -v g="$group" '$2 == g || $2 == "Control"' samples.tsv > "$groups_file"

    # The counts file and the groups file are positional arguments.
    # Optional: add --exon_file <path_to_exon_file.txt.gz> to annotate clusters with gene names.
    Rscript leafcutter_ds.R \
        --num_threads 8 \
        --output_prefix "leafcutter_results/${group}_vs_Control" \
        leafcutter_counts_perind_numers.counts.gz \
        "$groups_file"

    # Clean up the temporary groups file
    rm "$groups_file"
done
3
In each differential splicing analysis, Leafcutter outputs a file listing the set of cluster padj values, and a second file listing the splice junctions that reside within those clusters and their delta PSIs as a result of the differential analysis.
$ Bash example
# leafcutter_ds.R writes two tab-separated files per comparison,
# named after the prefix passed with --output_prefix:
#   <prefix>_cluster_significance.txt : one row per cluster, with p and p.adjust (padj) columns
#   <prefix>_effect_sizes.txt         : one row per intron (splice junction), with a deltapsi column
# The prefix below is a placeholder for the ALSlow vs Control comparison.
PREFIX="leafcutter_results/ALSlow_vs_Control"

# Cluster-level results, including the adjusted p-value
head -n 5 "${PREFIX}_cluster_significance.txt"

# Junction-level results, including delta PSI
head -n 5 "${PREFIX}_effect_sizes.txt"
4
Clusters with a padj value <0.1 were selected for further analysis; we aggregated splice junctions and their corresponding delta PSIs across the 4 analyses into a single matrix.
$ Bash example
# This script filters input files based on a padj threshold and then aggregates
# the delta PSI values for selected splice junctions across multiple analyses
# into a single matrix.
# Assume input files are named 'analysis1_results.tsv', 'analysis2_results.tsv', etc.
# Each file is expected to be tab-separated and contain at least
# 'JunctionID', 'DeltaPSI', and 'padj' columns (placeholder column names).

# Example dummy data creation (for demonstration purposes).
# Real inputs would be derived from the Leafcutter differential splicing output.
echo -e "JunctionID\tDeltaPSI\tPValue\tpadj\tOther" > analysis1_results.tsv
echo -e "J1\t0.1\t0.001\t0.005\tdata_a1" >> analysis1_results.tsv
echo -e "J2\t0.2\t0.05\t0.08\tdata_a1" >> analysis1_results.tsv
echo -e "J3\t0.05\t0.1\t0.15\tdata_a1" >> analysis1_results.tsv
echo -e "J4\t0.3\t0.002\t0.003\tdata_a1" >> analysis1_results.tsv

echo -e "JunctionID\tDeltaPSI\tPValue\tpadj\tOther" > analysis2_results.tsv
echo -e "J1\t0.15\t0.002\t0.008\tdata_a2" >> analysis2_results.tsv
echo -e "J4\t0.35\t0.01\t0.02\tdata_a2" >> analysis2_results.tsv
echo -e "J5\t0.08\t0.08\t0.12\tdata_a2" >> analysis2_results.tsv
echo -e "J6\t0.25\t0.005\t0.007\tdata_a2" >> analysis2_results.tsv

echo -e "JunctionID\tDeltaPSI\tPValue\tpadj\tOther" > analysis3_results.tsv
echo -e "J1\t0.12\t0.003\t0.006\tdata_a3" >> analysis3_results.tsv
echo -e "J2\t0.18\t0.04\t0.07\tdata_a3" >> analysis3_results.tsv
echo -e "J7\t0.4\t0.001\t0.002\tdata_a3" >> analysis3_results.tsv

echo -e "JunctionID\tDeltaPSI\tPValue\tpadj\tOther" > analysis4_results.tsv
echo -e "J1\t0.11\t0.004\t0.009\tdata_a4" >> analysis4_results.tsv
echo -e "J4\t0.32\t0.008\t0.015\tdata_a4" >> analysis4_results.tsv
echo -e "J8\t0.28\t0.003\t0.004\tdata_a4" >> analysis4_results.tsv

# Python script for filtering and aggregation (uses pandas)
# conda install -c conda-forge pandas
python3 - <<'EOF'
import glob

import pandas as pd

input_files = sorted(glob.glob('analysis*_results.tsv'))
padj_threshold = 0.1
output_matrix_file = 'aggregated_delta_psis.tsv'

all_filtered_data = []
for i, file_path in enumerate(input_files):
    df = pd.read_csv(file_path, sep='\t')
    # Filter for padj < threshold
    filtered_df = df[df['padj'] < padj_threshold]
    # Keep JunctionID and DeltaPSI; rename DeltaPSI so each analysis gets its own column
    filtered_df = filtered_df[['JunctionID', 'DeltaPSI']].rename(
        columns={'DeltaPSI': f'DeltaPSI_analysis{i+1}'})
    all_filtered_data.append(filtered_df)

if all_filtered_data:
    # Outer-join on JunctionID to keep every junction that passed the filter in at least one analysis
    merged_df = all_filtered_data[0]
    for filtered_df in all_filtered_data[1:]:
        merged_df = pd.merge(merged_df, filtered_df, on='JunctionID', how='outer')
    # Fill NaN values (junction not significant in a particular analysis) with 0
    merged_df = merged_df.fillna(0)
    merged_df.to_csv(output_matrix_file, sep='\t', index=False)
    print(f'Aggregated matrix saved to {output_matrix_file}')
else:
    print('No data found after filtering.')
EOF
5
Differentially spliced events are ordered by the max delta PSI across the 4 analyses (file: sig.junctions.padj01.sorted_by_max_deltapsi_Conlon_et_al_2018.txt).
$ Bash example
# Order the aggregated junctions by their largest delta PSI across the 4 analyses.
# Assumes a tab-separated matrix such as aggregated_delta_psis.tsv from the previous step:
# a header row, a JunctionID column, then one delta PSI column per analysis.
# Ranking by absolute value is an assumption; the source says only "max delta PSI".
# Note: "padj01" in the output file name corresponds to the padj < 0.1 cluster filter.
INPUT_FILE="aggregated_delta_psis.tsv"
OUTPUT_FILE="sig.junctions.padj01.sorted_by_max_deltapsi_Conlon_et_al_2018.txt"

# Keep the header first; prefix each data row with its max |delta PSI|,
# sort on that key in descending numeric order, then drop the key again.
{
    head -n 1 "${INPUT_FILE}"
    tail -n +2 "${INPUT_FILE}" |
        awk 'BEGIN { FS = OFS = "\t" }
             { max = 0
               for (i = 2; i <= NF; i++) { v = ($i < 0) ? -$i : $i; if (v > max) max = v }
               print max, $0 }' |
        sort -k1,1nr |
        cut -f2-
} > "${OUTPUT_FILE}"
6
Splice junction coordinates are intersected with Gencode release 25 annotations.
$ Bash example
# Define input and output files
# Replace 'splice_junctions.bed' with the actual path to your splice junction coordinates file.
SPLICE_JUNCTIONS_BED="splice_junctions.bed"

# Replace 'gencode.v25.annotation.bed' with the actual path to your Gencode release 25 annotations in BED format.
# If starting from GTF, you would first need to download and convert it:
# # Download Gencode v25 GTF:
# # wget -O gencode.v25.annotation.gtf.gz ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_25/gencode.v25.annotation.gtf.gz
# # gunzip gencode.v25.annotation.gtf.gz
# # Convert GTF to BED (e.g., using bedops gtf2bed or a custom script):
# # gtf2bed < gencode.v25.annotation.gtf > gencode.v25.annotation.bed
GENCODE_ANNOTATIONS_BED="gencode.v25.annotation.bed"
INTERSECTED_OUTPUT="intersected_junctions.bed"

# Ensure bedtools is installed (e.g., via conda)
# conda install -c bioconda bedtools

# Intersect splice junction coordinates with Gencode release 25 annotations.
# It is good practice to sort both input BED files before intersection for optimal performance.
# Example: sort -k1,1 -k2,2n "$SPLICE_JUNCTIONS_BED" > "${SPLICE_JUNCTIONS_BED}.sorted"
# Example: sort -k1,1 -k2,2n "$GENCODE_ANNOTATIONS_BED" > "${GENCODE_ANNOTATIONS_BED}.sorted"
# Then use the sorted files in the command below.
bedtools intersect -a "$SPLICE_JUNCTIONS_BED" -b "$GENCODE_ANNOTATIONS_BED" > "$INTERSECTED_OUTPUT"
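The intersection needs junction coordinates in BED form, and the source does not describe how they were extracted. As a rough sketch only, junction identifiers can be split into BED columns with awk. The `chrom:start:end:cluster` ID layout, the demo file names, and passing coordinates through without any 0-based/1-based adjustment are assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo input: junction IDs in an assumed chrom:start:end:cluster layout.
workdir=$(mktemp -d)
printf 'chr2:2050:2150:clu_2\nchr1:1000:1100:clu_1\n' > "$workdir/junction_ids.txt"

# Split each ID on ":" and emit BED columns: chrom, start, end, name (the full ID).
# Coordinates are passed through unchanged; shift the start if your IDs are 1-based.
# The output is sorted by chromosome and start, as suggested for bedtools.
awk -F':' 'BEGIN { OFS = "\t" } { print $1, $2, $3, $0 }' \
    "$workdir/junction_ids.txt" | sort -k1,1 -k2,2n > "$workdir/splice_junctions.bed"

cat "$workdir/splice_junctions.bed"
```

Keeping the full ID in the name column means each intersected row can be traced back to its junction and cluster.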
7
PSI ratios across all samples in Conlon et al. 2018 (77 training and test samples, see link to manuscript here) were computed using the leafcutter_quantify_psi.R script starting from the splice junction counts (see https://github.com/davidaknowles/leafcutter/issues/34).
$ Bash example
# leafcutter_quantify_psi.R ships in the scripts/ directory of the Leafcutter repository
# (https://github.com/davidaknowles/leafcutter). It requires R and the leafcutter R package.

# Input: the per-sample junction counts written by leafcutter_cluster.py.
# The file name is a placeholder.
COUNTS_FILE="leafcutter_counts_perind_numers.counts.gz"

# Compute PSI ratios for every junction in every sample.
# Output-file and thread options can differ between versions; list them with:
#   Rscript leafcutter_quantify_psi.R --help
Rscript leafcutter_quantify_psi.R "${COUNTS_FILE}"
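To make the idea of a PSI ratio concrete, the sketch below divides each junction's read count by the total count of its cluster. It is not a reimplementation of leafcutter_quantify_psi.R; the demo file, its space-separated "numerator/denominator" cell layout, and the NA convention for empty clusters are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo input: each cell is "junction reads / total reads in the cluster".
workdir=$(mktemp -d)
printf 'junction sampleA sampleB\n'       >  "$workdir/demo_perind.counts"
printf 'chr1:1000:1100:clu_1 10/40 0/0\n' >> "$workdir/demo_perind.counts"
printf 'chr1:1050:1150:clu_1 30/40 0/0\n' >> "$workdir/demo_perind.counts"

# Copy the header, then divide numerator by denominator in every cell.
# Cells whose cluster has no reads are reported as NA.
awk 'NR == 1 { print; next }
     { out = $1
       for (i = 2; i <= NF; i++) {
           split($i, part, "/")
           out = out " " ((part[2] + 0 > 0) ? sprintf("%.3f", part[1] / part[2]) : "NA")
       }
       print out }' "$workdir/demo_perind.counts" > "$workdir/demo_ratios.txt"

cat "$workdir/demo_ratios.txt"
```

Within a cluster the ratios of a sample sum to 1 whenever the cluster has reads, which is a quick sanity check on any PSI table.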
Tools Used
Raw Source Text
fastq Illumina RNASeq paired-end reads were aligned using STAR (2.5.2a). We used Leafcutter (0.2.6) to perform differential splicing analyses of each of four defined groups of ALS cases (ALSlow, ALShigh, C9, FTD grouped samples for a total of 54 samples referred to as the training set) against the Control group (all groups are described in: link to manuscript here). In each differential splicing analysis, Leafcutter outputs a file listing the set of cluster padj values, and a second file listing the splice junctions that reside within those clusters and their delta PSIs as a result of the differential analysis. Clusters with a padj value <0.1 were selected for further analysis; we aggregated splice junctions and their corresponding delta PSIs across the 4 analyses into a single matrix. Differentially spliced events are ordered by the max delta PSI across the 4 analyses (file: sig.junctions.padj01.sorted_by_max_deltapsi_Conlon_et_al_2018.txt). Splice junction coordinates are intersected with Gencode release 25 annotations. PSI ratios across all samples in Conlon et al. 2018 (77 training and test samples, see link to manuscript here) were computed using the leafcutter_quantify_psi.R script starting from the splice junction counts (see https://github.com/davidaknowles/leafcutter/issues/34)
Genome_build: GRCh38
Supplementary_files_format_and_content: sig.junctions.padj01.sorted_by_max_deltapsi_Conlon_et_al_2018.txt : Differentially spliced events, coordinates and gene names are ordered by the max delta PSI across the 4 analyses of 54 ALS and Control samples.
Supplementary_files_format_and_content: DS_leafcutter.ratios_Conlon_et_al_2018.txt : PSI ratios across all samples (77 training and test samples in Conlon et al. 2018, link to manuscript here) were computed using the leafcutter_quantify_psi.R script starting from the splice junction counts (see https://github.com/davidaknowles/leafcutter/issues/34)