Glossary
This glossary consolidates uncommon or specialized terms used in TIM Detect's documentation, and is meant to save time and promote consistency.
term | definition |
---|---|
Anomalous behavior model | An important part of TIM Detect's engine for kpi-driven anomaly detection, responsible for creating a model for each detection perspective resulting in corresponding anomaly indicators. |
Anomaly | An observation that does not conform to the expected behavior of a given group, i.e. an observation that does not fit well with the rest of the data (often referred to as anomaly, outlier, exception or contaminant). |
Anomaly indicator | An anomaly indicator is a number in the interval (0, infinity) that specifies the extent to which a given observation is anomalous. The number 1 is the anomaly indicator threshold: if the indicator is below or equal to 1, the corresponding observation is considered normal; if it is above 1, it is considered an anomaly. The higher the number, the more anomalous that particular observation is. Anomaly indicators are a final output of TIM Detect's model building for both kpi-driven and system-driven anomaly detection. |
Anomaly indicator window | A time range over which the anomaly indicator is smoothed. This smoothing happens by averaging the last n successive values, where n is determined by the length of the window (the time range). |
Approach | The form of solving the multidimensional anomaly detection problem. Currently, TIM supports a kpi-driven approach and a system-driven approach. |
Autocorrelation | Autocorrelation, also known as serial correlation, is the correlation of a signal with a delayed copy of itself as a function of delay. |
Causation | Causation indicates a relationship between two variables where one variable affects another. Thus, when the value of one variable changes as a result of a change in the value of another variable, there is a causal link between these variables. |
Collective anomaly | A type of anomaly composed of a group of observations that are individually not anomalous (neither in a contextual, nor in a global sense), but whose occurrence together (as a group) is abnormal. |
Configuration | All the available parameters to configure TIM Detect's models throughout their lifecycle. The configuration options differ between TIM Detect's kpi-driven and system-driven approaches. |
Contextual anomaly | A type of anomaly that occurs when one or more observations are anomalous regarding the context, meaning the values preceding it and/or the values of the influencers at the same point in time. |
Correlation | A statistical measure that reveals the degree to which two variables are linearly related (without declaration of cause and effect). |
Dependency-oriented data | Data that contains observations that may be linked to each other by implicit or explicit relationships. Time series are dependency-oriented data that often contain implicit dependencies: two successive observations are likely related to each other, so the time attribute implicitly specifies a dependency between them. In such data, anomalies are usually defined in a contextual or collective sense and are harder to distinguish from noise. |
Experiment design | The choices made given the context of an experiment, characterized by the data (chosen KPI (if any), influencers) and the domain specifics. |
Detection | The evaluation or application of an existing model, typically on new data, to detect anomalies in that data. |
Detection perspective | A viewpoint from which to look at anomalies; each detection perspective is dedicated to identifying a different type of anomaly. Detection perspectives come into play in TIM Detect's kpi-driven anomaly detection. |
Domain specifics | Settings that are related to the domain of the use case and the data. These settings differ between TIM Detect's kpi-driven and system-driven approaches. |
Feature | A transformation (by the TIM engine) of the original variable/variables. |
Global anomaly | A type of anomaly that occurs when an observation deviates strongly from most observations of a given dataset. It is also called a global outlier. |
Influencer | Influencers, also known as explanatory or independent variables, are variables influencing the KPI, also known as the dependent variable. |
In-sample period | The period used for configuring, training and creating a model. |
Key performance indicator (KPI) | The KPI, also known as the dependent variable, is the variable on which the detection focuses; anomalies are detected on this variable (with the kpi-driven approach). |
Model | A representation of reality (data). |
Model building | The process that creates a new model. |
Model rebuilding | The reconstruction of an existing model. |
Multivariate AD with kpi-driven approach | An anomaly detection problem on a dataset with multiple variables, focusing on anomalies in one of the variables (the KPI). |
Multivariate AD with system-driven approach | An anomaly detection problem on a dataset with multiple variables, focusing on anomalies over the entire group of variables. |
Normal behavior | The output of a normal behavior model, representing the expected value of a given KPI. |
Normal behavior model | An important part of TIM Detect's engine for kpi-driven anomaly detection, responsible for creating a model characterizing the expected (normal) behavior of a given KPI. |
Out-of-sample period | The period used for evaluating a model's performance. |
Rebuild type | The type of rebuild that is performed for a rebuild model job, defining what part of the model has to be reconstructed. |
Root cause analysis | An interpretation of what drives normal behavior. |
Semi-supervised AD | A type of algorithm for anomaly detection problems where a model representing normal behavior is constructed from a given normal training data set, and then used to test the likelihood that a test instance is generated by the learned model. |
Sensitivity | A percentage that defines the share of observations in the in-sample period that is expected to be anomalous; as such, it represents the sensitivity of the underlying model to anomalies. The sensitivity relates to each of the detection perspectives in the kpi-driven approach as well as to the system-driven approach. |
Residuals | The difference between actual values and normal behavior values of a given KPI. |
Supervised AD | A type of algorithm for anomaly detection problems on data for which all training observations are labeled as either "normal" or "anomalous" prior to training. This involves training a classifier, with the key difference to many other classification problems lying in the inherently unbalanced nature of outlier detection. |
System | The process of a given problem. |
Temporal continuity | Refers to the fact that the patterns in the data are not expected to change abruptly, unless there are abnormal processes at work. |
Time-series data | Time-series data is a collection of observations for a single subject (entity) obtained through repeated measurements at different time intervals (generally equally spaced as in the case of metrics or unequally spaced as in the case of events). |
Univariate AD | A one-dimensional anomaly detection problem, meaning an anomaly detection problem on a single variable. |
Unsupervised AD | A type of algorithm for anomaly detection problems where there is no label available. (It is not known in advance which of the training observations are anomalous and which are normal.) |
Variable | A column in the dataset with the potential of characterizing the system; this can be either a KPI (dependent variable) or an influencer (explanatory variable). |
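
To make a few of the definitions above concrete, the following is a minimal Python sketch of how residuals, the anomaly indicator window and the anomaly indicator threshold relate to one another. It is not TIM Detect's implementation or API: the function names (`residuals`, `smooth_indicator`, `flag_anomalies`), the sample values and the handling of the first few observations (averaging over however many values are available so far) are assumptions made purely for illustration.

```python
def residuals(actual, normal_behavior):
    """Difference between actual values and normal behavior values of a KPI."""
    return [a - n for a, n in zip(actual, normal_behavior)]


def smooth_indicator(raw_indicator, window):
    """Average the last `window` successive indicator values.

    Assumption: at the start of the series, fewer than `window` values
    exist, so the average is taken over the values available so far.
    """
    smoothed = []
    for i in range(len(raw_indicator)):
        start = max(0, i - window + 1)
        chunk = raw_indicator[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def flag_anomalies(indicator, threshold=1.0):
    """An observation counts as an anomaly only if its indicator is above the threshold."""
    return [value > threshold for value in indicator]


# Hypothetical values, for illustration only.
kpi_actual = [10.0, 12.0]
kpi_normal = [9.0, 12.5]
kpi_residuals = residuals(kpi_actual, kpi_normal)

raw = [0.5, 0.5, 3.0, 0.5]
smoothed = smooth_indicator(raw, window=2)
flags = flag_anomalies(smoothed)
```

Note how smoothing over a window of two values spreads the single raw spike across two observations: a longer anomaly indicator window dampens isolated spikes, but it also delays the return to normal.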