Outputs
This section summarizes the mathematical outputs of TIM Detect with the drift approach. Note that this approach does not return accuracies in the outputs because there is no target or label to compare against.
CSV result (table)
build-model, detect.
column_name | distance | p_value | drift |
---|---|---|---|
Temperature | 0.099 | 0.569 | false |
Pressure | 0.175 | 0.042 | true |
Speed | 0.161 | 0.971 | false |
Column name
The column_name is the name of the dataset column that the given row of drift outputs refers to.
Distance
The distance is the Kolmogorov-Smirnov statistic computed between the reference and test data of the given column.
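The statistic can be illustrated with a minimal pure-Python sketch. It assumes the standard two-sample definition (the largest absolute difference between the two empirical CDFs); the function name `ks_statistic` and the sample values are hypothetical and are not part of the TIM API.

```python
from bisect import bisect_right


def ks_statistic(reference, test):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute
    difference between the empirical CDFs of the two samples."""
    ref = sorted(reference)
    tst = sorted(test)
    if not ref or not tst:
        raise ValueError("both samples must be non-empty")
    distance = 0.0
    # Both empirical CDFs are step functions, so the largest difference
    # is always reached at one of the observed values.
    for x in ref + tst:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_tst = bisect_right(tst, x) / len(tst)
        distance = max(distance, abs(cdf_ref - cdf_tst))
    return distance


# Hypothetical values for one column: reference data versus test data.
reference_values = [1.0, 2.0, 3.0, 4.0]
test_values = [3.0, 4.0, 5.0, 6.0]
print(ks_statistic(reference_values, test_values))  # 0.5
```

A distance of 0 means the two empirical distributions coincide, and a distance of 1 means the samples do not overlap at all.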
P-value
The p-value of the test of the null hypothesis that the reference and test data come from the same distribution, that is, the probability of observing a distance at least this large if the null hypothesis were true. If the p-value is less than the given p-value threshold, drift is detected.
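As a rough sketch of how a distance turns into a p-value and a drift decision, the code below uses the classical asymptotic Kolmogorov approximation. This is an assumption made for illustration: the source does not state how TIM computes the p-value, and the function names and sample sizes are hypothetical.

```python
import math


def ks_p_value(distance, n_reference, n_test, terms=100):
    """Asymptotic two-sample KS p-value for a distance and two sample sizes."""
    if n_reference <= 0 or n_test <= 0:
        raise ValueError("sample sizes must be positive")
    effective_n = n_reference * n_test / (n_reference + n_test)
    lam = math.sqrt(effective_n) * distance
    if lam < 0.05:
        # The series converges slowly here and the p-value is practically 1.
        return 1.0
    # Kolmogorov series: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * lam^2)
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))


P_VALUE_THRESHOLD = 0.05  # plays the same role as "pValue" in the model settings


def drift_detected(p_value, threshold=P_VALUE_THRESHOLD):
    """Drift is flagged when the p-value is less than the threshold."""
    return p_value < threshold


# Hypothetical sample sizes of 200 rows each.
print(drift_detected(ks_p_value(0.30, 200, 200)))
print(drift_detected(ks_p_value(0.03, 200, 200)))
```

With the same distance, larger samples give a smaller p-value, so the distance alone does not determine whether drift is flagged.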
Drift
The drift column contains boolean values indicating whether there is drift between the reference and test data for the given column.
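The result table can be post-processed with a few lines of Python. The sketch below assumes that the CSV file is comma-separated, uses the header names shown in the table above, and writes booleans as `true`/`false`; the helper name `drifted_columns` is hypothetical.

```python
import csv
import io

# Example content mirroring the table above; the exact file layout is an assumption.
csv_text = """column_name,distance,p_value,drift
Temperature,0.099,0.569,false
Pressure,0.175,0.042,true
Speed,0.161,0.971,false
"""


def drifted_columns(text):
    """Return the names of the columns for which drift was detected."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        row["column_name"]
        for row in reader
        if row["drift"].strip().lower() == "true"
    ]


print(drifted_columns(csv_text))  # ['Pressure']
```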
API Model
Model version
The version of the model. Each approach has its own version.
"modelVersion": "5.0"
Approach
The approach used to build the model.
"approach": "drift"
Algorithm
The algorithm used to build the model.
"algorithm": "kolmogorov-smirnov"
Model
The model for drift contains only settings and parameters. There is no actual fitted model; the CDF of the reference data is always calculated from the data.
Settings
The settings that were used to build the model and are needed to execute detection correctly. There are three of them:
- reference rows - the rows to be used as reference data. Reference rows are always stored as an array of "from", "to" ranges, even if they were entered relatively.
- columns - an array of the variable names for which drift detection is performed.
- p-value - the p-value threshold.
"settings": {
  "referenceRows": [
    {
      "to": "2022-06-16T00:00:00.000Z",
      "from": "2022-01-01T00:00:00.000Z"
    }
  ],
  "columns": [
    "variable1",
    "variable2"
  ],
  "pValue": 0.05
}
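To illustrate how the stored settings might be consumed, the sketch below parses the fragment and checks whether a timestamp falls into one of the reference ranges. Wrapping the fragment in braces to obtain valid JSON, treating both range ends as inclusive, and the helper names are all assumptions; the source does not specify how range boundaries are handled.

```python
import json
from datetime import datetime, timezone

# The settings fragment from above, wrapped in braces to form valid JSON.
model_json = """
{
  "settings": {
    "referenceRows": [
      {"to": "2022-06-16T00:00:00.000Z", "from": "2022-01-01T00:00:00.000Z"}
    ],
    "columns": ["variable1", "variable2"],
    "pValue": 0.05
  }
}
"""


def parse_timestamp(text):
    """Parse a UTC timestamp such as 2022-06-16T00:00:00.000Z."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_reference_row(timestamp, reference_rows):
    """True if the timestamp lies in any range (both ends assumed inclusive)."""
    return any(
        parse_timestamp(r["from"]) <= timestamp <= parse_timestamp(r["to"])
        for r in reference_rows
    )


settings = json.loads(model_json)["settings"]
march = datetime(2022, 3, 1, tzinfo=timezone.utc)
july = datetime(2022, 7, 1, tzinfo=timezone.utc)
print(is_reference_row(march, settings["referenceRows"]))  # True
print(is_reference_row(july, settings["referenceRows"]))   # False
```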
Parameters
Parameters describe the properties of the data for which the model was built. There are two of them:
- sampling period - the time scale given by the user or, if none was given, the sampling period estimated from the data. The sampling period used when building the model is stored in the ISO 8601 duration format.
- time zone - the time zone of the dataset.
"parameters": {
  "samplingPeriod": "P1D",
  "timeZone": "UTC"
}
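Because the sampling period is stored as an ISO 8601 duration, a client has to convert it before doing arithmetic on timestamps. The following is a minimal sketch that handles only day, hour, minute, and second components; the function name `parse_sampling_period` is hypothetical, and month or year components are rejected because they have no fixed length.

```python
import re
from datetime import timedelta

# Supports durations such as P1D, PT15M, or P1DT12H (a subset of ISO 8601).
_DURATION = re.compile(
    r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$"
)


def parse_sampling_period(text):
    """Convert a simple ISO 8601 duration into a timedelta."""
    match = _DURATION.match(text)
    if match is None or all(group is None for group in match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes, seconds = match.groups()
    return timedelta(
        days=int(days or 0),
        hours=int(hours or 0),
        minutes=int(minutes or 0),
        seconds=float(seconds or 0),
    )


print(parse_sampling_period("P1D"))  # 1 day, 0:00:00
```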
Signature
The signature is used to verify the authenticity of the model.
"signature": "395b068eb6747efe4f9eb78"