Filters
Filters are operations that can be defined under preprocessors. Preprocessors should be written as an array of filters. TIM applies all filters in the specified order, combining the array items with the AND operation. There are two main categories of filters: row filters and column filters. Currently only the category filter, a row filter, is supported.
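The AND behaviour between array items can be sketched in a few lines of Python. This is only a local illustration of the semantics, not TIM's implementation: the helper `apply_preprocessors`, the sample rows, and the choice of two category filters in one array are assumptions made for the example, and the filter objects follow the category filter format described under Row filters.

```python
# Hypothetical preprocessors array holding two category filters.
preprocessors = [
    {"type": "CategoryFilter",
     "value": {"column": "ColumnName_1", "categories": [1, 2]}},
    {"type": "CategoryFilter",
     "value": {"column": "ColumnName_2", "categories": ["a"]}},
]


def apply_preprocessors(rows, filters):
    """Apply each filter in array order (illustrative helper, not a TIM API)."""
    for item in filters:
        column = item["value"]["column"]
        allowed = set(item["value"]["categories"])
        # Each filter narrows the result of the previous one, so a row
        # survives only if it satisfies every filter (logical AND).
        rows = [row for row in rows if row[column] in allowed]
    return rows


# Made-up sample data for the illustration.
rows = [
    {"ColumnName_1": 1, "ColumnName_2": "a"},
    {"ColumnName_1": 2, "ColumnName_2": "b"},
    {"ColumnName_1": 3, "ColumnName_2": "a"},
]

kept = apply_preprocessors(rows, preprocessors)
```

Only the first sample row passes both filters, because the second fails the `ColumnName_2` condition and the third fails the `ColumnName_1` condition.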
Row filters
Category filter
For a specified categorical column, this filter enables the selection of only those rows with a specific value. Currently only group keys of panel data are supported. Training models only on a small subset of the existing groups saves training time, which can be useful when a user wants to forecast only those specific groups.
There are two ways to set a category filter: as a simple filter or as a complex filter.
Simple filter
A simple category filter contains only a single filter object, with the relevant column name and the categories whose rows should be kept. Categories can be expressed as integers or as strings.
{
  "type": "CategoryFilter",
  "value": {
    "column": "ColumnName_1",
    "categories": [1, 2, 3]
  }
}
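To make the effect of the simple filter above concrete, the following Python sketch reproduces the selection locally on a list of dictionaries. It is an illustration under assumptions, not TIM's implementation: the helper name `apply_simple_filter` and the sample rows (including the `y` column) are hypothetical.

```python
import json


def apply_simple_filter(rows, category_filter):
    """Keep rows whose value in the filtered column is a listed category.

    Illustrative helper only; it is not part of the TIM API.
    """
    value = category_filter["value"]
    allowed = set(value["categories"])
    return [row for row in rows if row[value["column"]] in allowed]


# The same filter object as in the JSON example above.
simple_filter = json.loads("""
{
  "type": "CategoryFilter",
  "value": {"column": "ColumnName_1", "categories": [1, 2, 3]}
}
""")

# Made-up sample data: ColumnName_1 plays the role of the group key.
rows = [
    {"ColumnName_1": 1, "y": 10.0},
    {"ColumnName_1": 4, "y": 11.5},
    {"ColumnName_1": 3, "y": 9.2},
]

selected = apply_simple_filter(rows, simple_filter)
```

Here `selected` contains the rows for groups 1 and 3; the row for group 4 is dropped because 4 is not among the listed categories.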
Complex filter
A complex category filter allows for the definition of more complex filtering, enabling both AND and OR operations between filters. Such a filter is written as an array of arrays of filter values. In the inner arrays, filters are combined using the AND operation; between those inner arrays, (sets of) filters are combined using the OR operation.
The following example selects only those rows where
- the values of ColumnName_1 are 1 or 2,
- AND the values of ColumnName_2 are "a" or "b",
OR the rows where
- the values of ColumnName_1 are 3 or 4,
- AND the values of ColumnName_2 are "c" or "d".
Schematically, this means:
( ColumnName_1 in [1, 2] AND ColumnName_2 in ["a", "b"] )
OR
( ColumnName_1 in [3, 4] AND ColumnName_2 in ["c", "d"] )
The JSON for passing this set of filters to the TIM API looks like this:
{
  "type": "CategoryFilter",
  "value": [
    [
      {
        "column": "ColumnName_1",
        "categories": [1, 2]
      },
      {
        "column": "ColumnName_2",
        "categories": ["a", "b"]
      }
    ],
    [
      {
        "column": "ColumnName_1",
        "categories": [3, 4]
      },
      {
        "column": "ColumnName_2",
        "categories": ["c", "d"]
      }
    ]
  ]
}
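The OR-of-ANDs logic of the complex filter can be checked with a short Python sketch. It evaluates the same filter object locally and is meant only to illustrate the semantics described above; the helper names `matches` and `apply_complex_filter` and the sample rows are hypothetical and not part of TIM.

```python
def matches(row, condition):
    """True if the row's value in the condition's column is a listed category."""
    return row[condition["column"]] in condition["categories"]


def apply_complex_filter(rows, category_filter):
    """Keep rows that satisfy every condition of at least one inner array.

    Illustrative helper only; it is not part of the TIM API.
    """
    groups = category_filter["value"]
    return [
        row for row in rows
        # all(...) is the AND inside an inner array,
        # any(...) is the OR between inner arrays.
        if any(all(matches(row, cond) for cond in group) for group in groups)
    ]


# The same filter as in the JSON example above, written as a Python dict.
complex_filter = {
    "type": "CategoryFilter",
    "value": [
        [
            {"column": "ColumnName_1", "categories": [1, 2]},
            {"column": "ColumnName_2", "categories": ["a", "b"]},
        ],
        [
            {"column": "ColumnName_1", "categories": [3, 4]},
            {"column": "ColumnName_2", "categories": ["c", "d"]},
        ],
    ],
}

# Made-up sample data for the illustration.
rows = [
    {"ColumnName_1": 1, "ColumnName_2": "a"},  # satisfies the first inner array
    {"ColumnName_1": 1, "ColumnName_2": "c"},  # satisfies neither inner array
    {"ColumnName_1": 4, "ColumnName_2": "d"},  # satisfies the second inner array
    {"ColumnName_1": 3, "ColumnName_2": "b"},  # satisfies neither inner array
]

selected = apply_complex_filter(rows, complex_filter)
```

The second and fourth sample rows are rejected even though each of their values appears somewhere in the filter, because no single inner array is satisfied in full.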
Column filters
These filters are currently not available in the preprocessors; it is only possible to manually list those columns that should be returned.