GSE16681 Processing Pipeline
1 step
Publication
A distinct microRNA signature for definitive endoderm derived from human embryonic stem cells. Stem Cells and Development (2010), PMID 19807270
Dataset
GSE16681: mRNA expression data from differentiation of human ESCs into definitive endoderm, Cyt49 on matrigel
Warning: Pipeline descriptions and code snippets may be inferred or AI-generated. Use them only as a starting point to guide analysis, and validate before use.
Processing Steps
1
The data were normalised using quantile normalisation with IlluminaGUI in R
$ Bash example
# Install R and limma if not already present
# conda install -c conda-forge r-base
# conda install -c bioconda bioconductor-limma

# Create a dummy R script for quantile normalization.
# This script assumes 'input_data.tsv' contains the data matrix
# and will output 'normalized_data.tsv'.
# Note: the source names IlluminaGUI; limma is used here as a stand-in
# (an assumption), so results may differ from the published processing.
cat << 'EOF' > normalize_data.R
# Load necessary library
library(limma)

# --- Configuration ---
input_file <- "input_data.tsv"        # Placeholder for input data file
output_file <- "normalized_data.tsv"  # Placeholder for output data file

# --- Load Data ---
# Assuming input_data.tsv is a tab-separated file with a header row,
# the first column holds gene/feature IDs, and the remaining columns are samples.
# Adjust the read.delim parameters to match the actual file format.
data_matrix <- as.matrix(read.delim(input_file, row.names = 1, sep = "\t", header = TRUE))

# --- Perform Quantile Normalization ---
# The method = "quantile" argument selects quantile normalization
normalized_matrix <- normalizeBetweenArrays(data_matrix, method = "quantile")

# --- Save Normalized Data ---
# Write the normalized matrix to a new tab-separated file
write.table(normalized_matrix, file = output_file, sep = "\t", quote = FALSE, col.names = NA)
message(paste("Quantile normalization complete. Normalized data saved to:", output_file))
EOF

# Create a dummy input file for demonstration
echo -e "Gene\tSample1\tSample2\tSample3" > input_data.tsv
echo -e "GeneA\t100\t120\t90" >> input_data.tsv
echo -e "GeneB\t50\t60\t45" >> input_data.tsv
echo -e "GeneC\t200\t210\t180" >> input_data.tsv
echo -e "GeneD\t75\t80\t70" >> input_data.tsv

# Execute the R script
Rscript normalize_data.R
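To make the normalisation step itself concrete, the following is a minimal sketch of quantile normalisation in plain Python, applied to the same toy matrix as the Bash example. It is illustrative only and is not the code used by IlluminaGUI or limma: the function name `quantile_normalize` is hypothetical, ties are broken by row order as a simplification, and missing values are not handled, so real packages may behave differently on such inputs.

```python
# Minimal sketch of quantile normalisation (illustrative only; not the
# IlluminaGUI or limma implementation). Rows are features, columns are samples.

def quantile_normalize(matrix):
    """Force every column of `matrix` (a list of rows) onto one shared
    distribution: the mean of the sorted columns at each rank.
    Assumption: ties are broken by row order and there are no missing values."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]

    # Reference distribution: average value across samples at each rank.
    sorted_cols = [sorted(col) for col in columns]
    reference = [
        sum(sc[rank] for sc in sorted_cols) / n_cols for rank in range(n_rows)
    ]

    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        # Row indices of this sample, ordered from smallest to largest value.
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            # Replace each value with the reference value at its rank.
            normalized[i][j] = reference[rank]
    return normalized


# Toy data matching the dummy input file above.
genes = ["GeneA", "GeneB", "GeneC", "GeneD"]
toy = [
    [100, 120, 90],
    [50, 60, 45],
    [200, 210, 180],
    [75, 80, 70],
]

result = quantile_normalize(toy)
for gene, row in zip(genes, result):
    print(gene, [round(value, 2) for value in row])
```

After normalisation every sample column contains the same set of values, differing only in which feature receives which value. In this toy matrix the features have the same rank order in all three samples, so each row ends up constant across samples.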
Tools Used
IlluminaGUI (R)
Raw Source Text
The data were normalised using quantile normalisation with IlluminaGUI in R