Generic term for methods to go from raw instrumental data to clean data for data processing.
Transforming the clean data to make them ready for data processing (scaling, centering, etc.).
The actual data analysis (PCA, PLS, etc.).
Transforming the results from the processing for interpretation and visualization.
All activities aimed at assuring the quality of the conclusions drawn from the data analysis.
Hypotheses generated, pathways affected, or visualizations of the data.
Resolving overlapping peaks in an NMR spectrum or GC or LC chromatogram using the second dimension (usually MS). In the case of GC or LC this generates a peak table where each metabolite is represented by one variable.
Selecting peaks in an NMR spectrum or a GC-MS or LC-MS chromatogram that may represent signals. This results in a table with rt_m/z channels and corresponding intensities.
Synchronizing the chromatograms or NMR spectra such that each metabolite signal has the same retention time or chemical shift in each sample.
Operation performed within or across rows to make the row profiles comparable in size.
Operation across the rows to translate the center of gravity of the dataset.
Commonly used method for centering in which each column is expressed as deviations from its mean (taken across the rows). Subtracting the column mean translates the center of gravity of the data to the origin.
Operation performed within a column to make the column profiles more comparable.
A form of scaling in which each column is mean-centered and each entry of the column is then divided by the standard deviation of that column. Also called UV (unit variance) scaling or Z-scaling.
Mean-centering followed by dividing each entry of a column by the square root of the standard deviation of that column.
Transformations to linearize or otherwise change the scale of the data, e.g., to remove heteroscedastic noise.
Data in the table which are not available for the analysis.
Data points (samples, variables or a specific combination of both) which deviate from the distribution of the majority of the data.
The model selected for analyzing the data (PCA, PLS, OPLS, etc.).
Parameters in models/methods that have to be fitted to the data.
A parameter that helps define the structure and optimization of the model.
Transforming the data back to the original domain (if a transformation was performed prior to the analysis).
Plots that represent the original data or the results from the data analysis in such a way that interpretation is facilitated.
Subset of samples used to estimate the parameters.
Subset of samples used to estimate the metaparameters.
Subset of samples used to establish the generalizability of the model/method.
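
The column-wise operations defined above (mean-centering, autoscaling, and scaling by the square root of the standard deviation, commonly known as Pareto scaling) can be illustrated with a minimal sketch. It assumes a samples-by-variables table stored as a list of rows and uses only the Python standard library. The function names are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the definitions do not specify which estimator is meant.

```python
import math
import statistics


def _columns(table):
    """Transpose a samples-by-variables table (list of rows) into columns."""
    return [list(col) for col in zip(*table)]


def _rows(columns):
    """Transpose a list of columns back into a list of rows."""
    return [list(row) for row in zip(*columns)]


def _center_and_divide(table, divisor):
    """Mean-center each column, then divide it by divisor(column SD)."""
    scaled = []
    for col in _columns(table):
        mean = statistics.mean(col)
        # Assumption: sample standard deviation (n - 1 denominator).
        # A constant column has SD 0 and would raise ZeroDivisionError.
        sd = statistics.stdev(col)
        scaled.append([(value - mean) / divisor(sd) for value in col])
    return _rows(scaled)


def mean_center(table):
    """Express each column as deviations from its mean."""
    return _center_and_divide(table, lambda sd: 1.0)


def autoscale(table):
    """Mean-center, then divide each column by its standard deviation."""
    return _center_and_divide(table, lambda sd: sd)


def pareto_scale(table):
    """Mean-center, then divide each column by the square root of its SD."""
    return _center_and_divide(table, math.sqrt)


# Illustrative data only: three samples (rows), two variables (columns).
example = [[1.0, 10.0], [2.0, 30.0], [3.0, 50.0]]
centered = mean_center(example)
autoscaled = autoscale(example)
pareto_scaled = pareto_scale(example)
```

After autoscaling, every column has unit standard deviation, so all variables carry equal weight. Pareto scaling is a compromise: a column's standard deviation is reduced to its square root, so high-variance variables are damped but still weigh more than low-variance ones.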