Forecasting
After uploading a dataset, the user can use it to build and apply forecasting models.
This section explains how to use the TIM Python client to create a forecasting job, execute such a job, combine both into a single method call and retrieve results. It also explains how to delete jobs from the TIM Repository and how to perform a clean forecast, deleting both dataset and job from the TIM Repository after receiving the results.
clean_forecast - perform a clean forecast
clean_forecast(self, dataset: pandas.core.frame.DataFrame, dataset_configuration: tim.data_sources.dataset.types.UploadCSVConfiguration = {}, job_configuration: Union[tim.data_sources.forecast.types.BuildForecastingModelConfiguration, NoneType] = None, handle_dataset_upload_status_poll: Union[Callable[[tim.data_sources.dataset.types.DatasetStatusResponse], NoneType], NoneType] = None, handle_forecast_status_poll: Union[Callable[[tim.types.StatusResponse], NoneType], NoneType] = None) -> Union[tim.data_sources.forecast.types.CleanForecastResponse, tim.types.ExecuteResponse]
The clean_forecast method uploads the dataset to the default workspace, creates a forecast job in this workspace, executes it, returns the results and deletes the dataset and job from the TIM Repository. This method is called on the authenticated instance of Tim created as described in Authentication ("client"), as in the following statement:
client.clean_forecast(dataset = <dataset>, dataset_configuration = <dataset configuration>, job_configuration = <job configuration>, handle_dataset_upload_status_poll = <dataset upload callback function>, handle_forecast_status_poll = <forecasting job execution callback function>)
using keyword arguments, or in the following statement:
client.clean_forecast(<dataset>, <dataset configuration>, <job configuration>, <dataset upload callback function>, <forecasting job execution callback function>)
using positional arguments, where <dataset>, <dataset configuration> and <job configuration> are replaced by the DataFrame and Dictionaries representing them, respectively, and <dataset upload callback function> and <forecasting job execution callback function> are replaced by optional callback functions for status polling on dataset upload and forecasting job execution, respectively.
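According to the signature above, a status-polling callback is any callable that accepts one status object and returns nothing. The following is a minimal sketch of such a callback. The helper name make_status_recorder is hypothetical, and because the fields of the status object are not described in this section, the callback only stores and prints whatever it receives.

```python
from typing import Any, Callable, List, Tuple


def make_status_recorder(label: str) -> Tuple[Callable[[Any], None], List[Any]]:
    """Return a callback plus the list to which it appends every polled status."""
    history: List[Any] = []

    def handle_status_poll(status: Any) -> None:
        # The structure of `status` is not documented here, so it is kept untouched.
        history.append(status)
        print(f"[{label}] poll #{len(history)}: {status}")

    return handle_status_poll, history


upload_callback, upload_history = make_status_recorder("dataset upload")
forecast_callback, forecast_history = make_status_recorder("forecast")

# With an authenticated client, the callbacks would be passed like this:
# client.clean_forecast(
#     dataset=dataset,
#     handle_dataset_upload_status_poll=upload_callback,
#     handle_forecast_status_poll=forecast_callback,
# )
```

Keeping the history in a list makes it possible to inspect the polled statuses after the call returns.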
The arguments are:
- dataset: a DataFrame containing the dataset, which consists of time-series data
- dataset_configuration: a Dictionary containing metadata of the dataset. This is an optional argument, available keys are:
- timestampFormat: a string describing the format of the timestamps,
- timestampColumn: a string containing the name of the timestamp column, or an integer containing the index of the timestamp column,
- decimalSeparator: the decimal separator used,
- name: the desired name for the dataset in the TIM repository,
- description: the desired description for the dataset in the TIM repository,
- samplingPeriod: the sampling period of the data,
- job_configuration: a Dictionary containing model building and forecasting configuration for the TIM engine. This is an optional argument, available keys are:
- name: the name to give to the forecasting job. This is an optional argument.
- configuration: a Dictionary containing the configuration for the model building and model application process. This is an optional argument. The following keys are available:
- predictionTo: defines how far ahead TIM should forecast (the so-called forecasting horizon); its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- predictionFrom: complements predictionTo to create a forecasting horizon, samples outside of this range are skipped; its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- modelQuality: controls the model complexity/training time tradeoff. The higher the quality, the longer the required time to build the model zoo. Options are: "Combined", "Low", "Medium", "High", "VeryHigh" and "UltraHigh"; the default value is "Combined", meaning that the quality "VeryHigh" will be used for the intraday and day-ahead forecasts and the "High" quality will be set for further forecasting horizons. This parameter is deprecated and is replaced by setting 'targetOffsets' and 'predictorOffsets',
- targetOffsets: determines the target offsets used in the model building process. The more specific the offsets are for each situation, the longer it takes to build the model zoo. Options are: "None", "Common", "Close" and "Combined". The default setting is "Combined" if allowOffsets is "true" and predictorOffsets is "Common". "Combined" means that "Close" offsets will be used for the first 2 days from the last target timestamp and "Common" will be used for the rest,
- predictorOffsets: determines the predictor offsets used in the model building process. Options are: "Common" and "Close". The default setting is "Common", which means that common predictor offsets for situations within one day will be used, whereas "Close" means that the closest possible offsets of predictors are used for each situation in the forecasting horizon. "Close" takes more time,
- normalization: a boolean indicating whether predictors are normalized (scaled by their mean and standard deviation); the default setting is true; switching off may help to model data with structural changes,
- maxModelComplexity: determines the maximal possible number of terms in each model in the model zoo. Available values lie between 1 and 100 (inclusive). If not set, TIM will calculate the complexity automatically based on the sampling period of the dataset,
- features: an enumeration of the types of transformations TIM can use during the feature engineering. Available transformations are "ExponentialMovingAverage", "RestOfWeek", "Periodic", "Intercept", "PiecewiseLinear", "TimeOffsets", "Polynomial", "Identity", "SimpleMovingAverage", "Month", "Trend", "DayOfWeek", "Fourier" and "PublicHolidays". If not provided, the TIM Engine will determine the optimal features automatically,
- dailyCycle: a boolean indicating whether models should focus on respective times within the day (specific hours, quarter-hours, etc.); if not set, TIM will determine this property automatically using autocorrelation analysis,
- memoryLimitCheck: a boolean enabling reducing datasets by dynamically throwing away columns and rows taking into consideration the current RAM state; the default setting is true,
- predictionIntervals: the confidence level of the returned symmetric prediction intervals, expressed as a percentage,
- predictionBoundaries: a Dictionary to set upper and lower boundaries for predictions; if not provided TIM will default to boundaries created based on the Inter-Quartile-Range of the target. Available keys are:
- type: indicates whether explicit boundaries should be used, options are "Explicit" and "None",
- maxValue: the upper boundary,
- minValue: the lower boundary,
- rollingWindow: a Dictionary defining the backtesting behavior; a new backtest forecast is generated every 'n' samples rolling backwards from the last target timestamp. Forecasts generated like this are of the same length as the production forecast generated inside the prediction horizon defined by "predictionTo" and "predictionFrom". If not provided, TIM will use 1 Day for daily cycle data and "predictionTo" otherwise. Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 1,
- backtest: determines whether in-sample and out-of-sample forecasts should be returned. Possible values are "All", "Production" and "OutOfSample"; "Production" results in only the production forecast being returned; "OutOfSample" returns the out-of-sample forecasts on top of that.
- data: a Dictionary containing the configuration for the data that is used for model building and application. This is an optional argument, available keys are:
- inSampleRows: a Dictionary or list of Dictionaries defining which samples should be used for model building; if not set, all observations but the ones defined in "outOfSampleRows" will be used; if there is an overlap with the "outOfSampleRows", timestamps in the intersection will be considered out-of-sample. Either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the in-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the in-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- outOfSampleRows: a Dictionary or list of Dictionaries defining which samples should be used to backtest the model zoo; if not set, none will be used. This is an optional argument, and either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the out-of-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the out-of-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- imputation: a Dictionary to define the imputation of missing data; if not provided, TIM will default to the "Linear" imputation with a gap length of 6 samples. This is an optional argument, available keys are:
- type: the type of imputation to be applied; options are "Linear", "LOCF" (Last Observation Carried Forward) and "None",
- maxGapLength: the maximum length of gaps that should be imputed; the minimum value is 0,
- columns: a list defining which columns (variables) should be used to build the model; if not set, all columns are selected. Each list item can either be a string (the name of the column) or an integer (the index of the column),
- targetColumn: defines which column represents the target variable. The desired target has to be included in the "columns" selection; if not set, the first column in the "columns" selection is used. The target column can be indicated either by a string (the name of the column) or an integer (the index of the column),
- holidayColumn: defines which column represents the holiday variable. This is an optional argument; if not set, no column is defined as holiday column. The holiday column can be indicated either by a string (the name of the column) or an integer (the index of the column),
- timeScale: a Dictionary defining the sampling period to which TIM should aggregate the dataset before model building and forecasting. This is an optional parameter; if not set, TIM uses the dataset's sampling period. Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute" and "Second",
- value: the number of base units; the minimum value is 1,
- aggregation: defines the aggregation function to be used for the target variable. The default aggregation is mean for numerical variables and maximum for boolean variables. This is an optional argument; options are "Mean", "Sum", "Maximum" and "Minimum",
- handle_dataset_upload_status_poll: an optional callback function handling polling for the status and progress of the dataset upload,
- handle_forecast_status_poll: an optional callback function handling polling for the status and progress of the forecasting job execution.
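The arguments described above can be put together as in the following sketch. The dataset, its column names and all configuration values are made up for illustration; only keys listed above are used. The function run_clean_forecast is a hypothetical wrapper that expects an authenticated Tim instance created as described in Authentication.

```python
import pandas as pd

# Illustrative dataset: a timestamp column followed by one target column.
# The column names and values are made up for this example.
timestamps = pd.date_range("2021-01-01", periods=48, freq=pd.Timedelta(hours=1))
dataset = pd.DataFrame(
    {
        "timestamp": timestamps.strftime("%Y-%m-%d %H:%M:%S"),
        "load": [100.0 + (hour % 24) for hour in range(48)],
    }
)

# Metadata of the dataset; every key is optional.
dataset_configuration = {
    "timestampColumn": "timestamp",
    "decimalSeparator": ".",
    "name": "example-load",
    "description": "Hourly load, illustrative data only",
}

# Model building and forecasting configuration; every key is optional.
job_configuration = {
    "name": "Example day-ahead forecast",
    "configuration": {
        "predictionTo": {"baseUnit": "Day", "value": 1},
        "predictionIntervals": 90,
        "backtest": "All",
    },
    "data": {
        "targetColumn": "load",
        "outOfSampleRows": {"baseUnit": "Day", "value": 1},
        "imputation": {"type": "Linear", "maxGapLength": 6},
    },
}


def run_clean_forecast(client):
    """Run a clean forecast with an authenticated Tim instance."""
    return client.clean_forecast(
        dataset=dataset,
        dataset_configuration=dataset_configuration,
        job_configuration=job_configuration,
    )
```

Keeping the configuration in plain Dictionaries makes it easy to store it next to the results or to reuse it for build_forecasting_model.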
This method returns the following data:
- metadata: a Dictionary containing metadata of the forecasting job (or None when the job failed); available keys are:
- id: the ID of the forecasting job,
- name: the name of the forecasting job,
- type: the type of the forecasting job, options are "Registered", "Running", "Finished", "FinishedWithWarning", "Failed", "Queued",
- status: the status of the forecasting job, options are "Finished", "FinishedWithWarning",
- parentId: the ID of the parent job of the forecasting job (if any),
- experiment: a Dictionary containing the key id, referring to the ID of the experiment in which the forecasting job resides,
- useCase: a Dictionary containing the key id, referring to the ID of the use case in which the experiment containing the forecasting job resides,
- dataset: a Dictionary containing the key dataset, which in turn is a Dictionary containing the key id, referring to the ID of the dataset version the forecasting job is based on,
- createdAt: the datetime at which the forecasting job was created,
- executedAt: the datetime at which the forecasting job was executed,
- completedAt: the datetime at which the forecasting job was completed,
- workerVersion: the version of the worker the forecasting job has been executed with,
- errorMeasures: a Dictionary containing the error measures related to the forecasting job; available keys are: all, bin, samplesAhead, each of which contains the keys:
- name: the name of the error measure,
- inSample: the value of the error measure on the in-sample data,
- outOfSample: the value of the error measure on the out-of-sample data,
- registrationBody: a Dictionary containing the settings that were used when registering the forecasting job (e.g. by calling build_forecasting_model()); available keys are:
- name: the name given to the forecasting job upon registration,
- useCase: a Dictionary containing the key id, referring to the ID of the use case the job was registered in,
- configuration: the configuration of the job, see the key configuration in the argument job_configuration of build_forecasting_model() for more information,
- data: the data-related configuration of the job, see the key data in the argument job_configuration of build_forecasting_model() for more information,
- jobLoad: the load of the forecasting job, options are "Light" and "Heavy",
- calculationTime: the amount of time that was required to execute the job,
- model_result: a Dictionary containing information related to the model; available keys are:
- modelVersion: the version of the model,
- model: the model, containing the key Model Zoo, which in turn contains the following keys:
- samplingPeriod: the sampling period of the dataset the model is built with,
- averageTrainingLength: the average number of observations of the target variable that entered the model building process for the model,
- models: a list of Dictionaries containing the models themselves; each dictionary contains the following keys:
- index: the index of the model in the Model Zoo,
- terms: a list of Dictionaries containing the terms that compose the model; for each Dictionary the available keys are:
- importance: the importance (level of contribution) of the term or feature in the model,
- parts: a list of Dictionaries containing the parts that compose the term or feature; the following keys can be available:
- type: the type of the term or feature; this determines the other key(s) that are present,
- predictor: the predictor the term or feature relates to; used together with offset for some terms or features,
- offset: the offset of the predictor the term or feature relates to; used together with predictor for some terms or features,
- value: the value of the term or feature; used instead of predictor and offset for some terms or features,
- period: the period related to the term or feature; used together with unit for some terms or features,
- unit: the unit related to the period of a term or feature; used together with period for some terms or features,
- window: the window of some transformation to a predictor creating some terms or features,
- dayTime: the time of day the model is built to forecast for in case of a daily cycle model, null in case of a non-daily cycle model,
- dataOffsets: Dictionaries containing the offsets of the variables that are used by the model; the available key in each Dictionary is
- the name of the variable, containing a Dictionary with the following keys:
- start: the offset at which the model starts using the variable,
- stop: the offset at which the model stops using the variable,
- samplesAhead: a list containing the number of samples ahead the model is built to forecast,
- modelQuality: the level of quality the model is built for,
- predictionIntervals: a list containing the (lower and upper boundaries of) the prediction intervals,
- lastTargetTimestamp: the last timestamp of the target variable,
- RInv: a mathematical parameter used for the computation of the root cause analysis,
- g: a mathematical parameter used for the computation of the root cause analysis,
- mx: a mathematical parameter used for the computation of the root cause analysis,
- cases: a list of Dictionaries identifying the cases the model is suitable to be used in; available keys in each Dictionary are:
- dayTime: the time of day the model is suitable for forecasting for (for daily cycle models),
- offsets: a Dictionary identifying the offsets of each variable that is used by the model to forecast in this case; available keys are the names of the variables,
- difficulty: a score indicating how difficult (complex) the data to be modeled is,
- targetName: the name of the target variable,
- holidayName: the name of the holiday column, if there is one,
- upperBoundary: the upper boundary for forecasted values,
- lowerBoundary: the lower boundary for forecasted values,
- dailyCycle: whether the model follows a daily cycle,
- variableProperties: a list of Dictionaries containing properties of each of the variables; each dictionary contains the following keys:
- name: the name of the variable,
- min: the minimum value of the variable,
- max: the maximum value of the variable,
- dataFrom: the largest offset of the variable used by the model,
- importance: the importance (level of contribution) of the variable in the model,
- signature: the signature of the model,
- table_result: a DataFrame containing the result table of the forecasting job; available columns are: datetime, date_from, time_from, target, forecast, forecast_type, relative_distance, model_index, samples_ahead, lower_bound, upper_bound and bin; this table is empty for jobs of type "RCA",
- accuracies: a Dictionary containing accuracy measures that can be used to evaluate the model and compare it to other models; available keys are:
- all: a Dictionary containing (aggregated) accuracy measures for all results; available keys are:
- name: the name of the group of accuracy measures; in this case "all",
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- bin: a list of Dictionaries containing accuracy measures aggregated by bin; available keys are:
- name: the name of the bin (samples ahead indicating the start and end of the bin),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- samplesAhead: a list of Dictionaries containing accuracy measures aggregated by the number of samples ahead the model is forecasting for; available keys are:
- name: the name of the group of accuracy measures (the number of samples ahead),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- dataset_logs: a list of Dictionaries, each of which contains the following keys:
- message: the log message,
- messageType: the type of the message, possible values are "Info", "Debug" and "Warning",
- createdAt: the time of creation of the log,
- origin: the origin of the log, in this case this will be "Upload",
- forecast_logs: a list of Dictionaries, each of which contains the following keys:
- message: the actual message of the log,
- messageType: the type of the message; possible values are "Info", "Warning" and "Error",
- createdAt: the time the message was raised,
- origin: the phase of processing during which the message was raised; possible values are "Registration" and "Execution".
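To make the accuracy measures listed above more concrete, the following sketch computes mae, mape and rmse from the target and forecast columns of a table shaped like table_result. The formulas are the standard textbook definitions, with mape expressed as a percentage; whether TIM computes them in exactly this way is an assumption. The accuracy key is left out because its definition is not given in this section, and the example values are made up.

```python
import math

import pandas as pd


def error_measures(table: pd.DataFrame) -> dict:
    """Compute textbook MAE, MAPE (in percent) and RMSE from 'target' and 'forecast'."""
    # Rows without a target (e.g. production forecasts) cannot be evaluated.
    rows = table.dropna(subset=["target", "forecast"])
    errors = rows["forecast"] - rows["target"]
    return {
        "mae": float(errors.abs().mean()),
        # Note: a target value of 0 makes the percentage error infinite.
        "mape": float((errors.abs() / rows["target"].abs()).mean() * 100),
        "rmse": float(math.sqrt((errors ** 2).mean())),
    }


# Made-up values; the last row has no target and is skipped.
example = pd.DataFrame(
    {
        "target": [10.0, 20.0, 40.0, None],
        "forecast": [12.0, 18.0, 40.0, 41.0],
    }
)
measures = error_measures(example)
print(measures)
```

Filtering the table on forecast_type or bin before calling the function gives measures for a subset of the results.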
Upon successful execution of the dataset upload and the forecasting job, metadata, model_result, table_result and accuracies will be populated; forecast_logs will be returned for any job, including failed jobs; dataset_logs will be included even upon failed dataset upload.
If an error is encountered, a Dictionary will be returned with the keys message and code containing additional information about the error.
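The error Dictionary and the log lists described above can be handled with small helpers such as the ones sketched below. Both function names are hypothetical and the sample values are made up. How forecast_logs is read from the returned object (attribute or key) is not specified in this section, so the helpers work on the plain list of Dictionaries.

```python
from typing import Any, Dict, List


def is_error_response(response: Any) -> bool:
    """Return True for the error Dictionary carrying 'message' and 'code'."""
    return isinstance(response, dict) and {"message", "code"} <= set(response)


def messages_of_type(logs: List[Dict[str, Any]], *message_types: str) -> List[str]:
    """Collect the messages of all logs whose messageType is one of message_types."""
    return [log["message"] for log in logs if log.get("messageType") in message_types]


# Made-up logs shaped like the forecast_logs described above.
sample_logs = [
    {"message": "Job registered", "messageType": "Info",
     "createdAt": "2021-01-01T00:00:00Z", "origin": "Registration"},
    {"message": "Gap in data was imputed", "messageType": "Warning",
     "createdAt": "2021-01-01T00:00:05Z", "origin": "Execution"},
]

# Made-up error response with the two documented keys.
sample_error = {"message": "Something went wrong", "code": "ExampleCode"}

print(messages_of_type(sample_logs, "Warning", "Error"))
print(is_error_response(sample_error))
```

Checking for the error Dictionary first avoids accessing results that were never populated.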
build_forecasting_model - create a forecasting model building job
build_forecasting_model(self, dataset_id: str, job_configuration: tim.data_sources.forecast.types.BuildForecastingModelConfiguration) -> str
The build_forecasting_model method registers a forecasting model building job in the TIM repository. This method is called on the authenticated instance of Tim created as described in Authentication ("client"), as in the following statement:
client.build_forecasting_model(dataset_id = <dataset ID>, job_configuration = <job configuration>)
using keyword arguments, or in the following statement:
client.build_forecasting_model(<dataset ID>, <job configuration>)
using positional arguments, where <dataset ID> and <job configuration> are replaced by the ID and Dictionary representing them, respectively.
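The call can be wrapped as in the following sketch. The function name register_forecasting_job and the configuration values are hypothetical; client is assumed to be an authenticated Tim instance. The signature above states that the method returns a string, which is presumably the ID of the registered job.

```python
from typing import Any, Dict


def register_forecasting_job(client: Any, dataset_id: str, horizon_days: int = 1) -> str:
    """Register a forecasting model building job for an uploaded dataset."""
    if horizon_days < 1:
        # predictionTo.value has a minimum of 1.
        raise ValueError("horizon_days must be at least 1")

    job_configuration: Dict[str, Any] = {
        "name": f"Forecast {horizon_days} day(s) ahead",
        "configuration": {
            "predictionTo": {"baseUnit": "Day", "value": horizon_days},
        },
    }
    # Assumption: the returned string identifies the registered job.
    return client.build_forecasting_model(
        dataset_id=dataset_id,
        job_configuration=job_configuration,
    )
```

Validating the horizon before the call gives a clearer error than a rejected request.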
The arguments are:
- dataset_id: the ID of a dataset in the TIM repository,
- job_configuration: a Dictionary containing model building and forecasting configuration for the TIM engine. This is an optional argument, available keys are:
- name: the name to give to the forecasting job. This is an optional argument.
- configuration: a Dictionary containing the configuration for the model building and model application process. This is an optional argument. The following keys are available:
- predictionTo: defines how far ahead TIM should forecast (the so-called forecasting horizon); its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- predictionFrom: complements predictionTo to create a forecasting horizon, samples outside of this range are skipped; its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- modelQuality: controls the model complexity/training time tradeoff. The higher the quality, the longer the required time to build the model zoo. Options are: "Combined", "Low", "Medium", "High", "VeryHigh" and "UltraHigh"; the default value is "Combined", meaning that the quality "VeryHigh" will be used for the intraday and day-ahead forecasts and the "High" quality will be set for further forecasting horizons. This parameter is deprecated and is replaced by setting 'targetOffsets' and 'predictorOffsets',
- targetOffsets: determines the target offsets used in the model building process. The more specific the offsets are for each situation, the longer it takes to build the model zoo. Options are: "None", "Common", "Close" and "Combined". The default setting is "Combined" if allowOffsets is "true" and predictorOffsets is "Common". "Combined" means that "Close" offsets will be used for the first 2 days from the last target timestamp and "Common" will be used for the rest,
- predictorOffsets: determines the predictor offsets used in the model building process. Options are: "Common" and "Close". The default setting is "Common", which means that common predictor offsets for situations within one day will be used, whereas "Close" means that the closest possible offsets of predictors are used for each situation in the forecasting horizon. "Close" takes more time,
- normalization: a boolean indicating whether predictors are normalized (scaled by their mean and standard deviation); the default setting is true; switching off may help to model data with structural changes,
- maxModelComplexity: determines the maximal possible number of terms in each model in the model zoo. Available values lie between 1 and 100 (inclusive). If not set, TIM will calculate the complexity automatically based on the sampling period of the dataset,
- features: an enumeration of the types of transformations TIM can use during the feature engineering. Available transformations are "ExponentialMovingAverage", "RestOfWeek", "Periodic", "Intercept", "PiecewiseLinear", "TimeOffsets", "Polynomial", "Identity", "SimpleMovingAverage", "Month", "Trend", "DayOfWeek", "Fourier" and "PublicHolidays". If not provided, the TIM Engine will determine the optimal features automatically,
- dailyCycle: a boolean indicating whether models should focus on respective times within the day (specific hours, quarter-hours, etc.); if not set, TIM will determine this property automatically using autocorrelation analysis,
- allowOffsets: a boolean indicating whether the models can use offsets of the variables,
- offsetLimit: the bottom limit for offsets, defining how far offsets are taken into account in the model building process, if not set TIM will determine this automatically,
- memoryLimitCheck: a boolean enabling reducing datasets by dynamically throwing away columns and rows taking into consideration the current RAM state; the default setting is true,
- predictionIntervals: the confidence level of the returned symmetric prediction intervals, expressed as a percentage,
- predictionBoundaries: a Dictionary to set upper and lower boundaries for predictions; if not provided TIM will default to boundaries created based on the Inter-Quartile-Range of the target. Available keys are:
- type: indicates whether explicit boundaries should be used, options are "Explicit" and "None",
- maxValue: the upper boundary,
- minValue: the lower boundary,
- rollingWindow: a Dictionary defining the backtesting behavior; a new backtest forecast is generated every 'n' samples rolling backwards from the last target timestamp. Forecasts generated like this are of the same length as the production forecast generated inside the prediction horizon defined by "predictionTo" and "predictionFrom". If not provided, TIM will use 1 Day for daily cycle data and "predictionTo" otherwise. Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 1,
- backtest: determines whether in-sample and out-of-sample forecasts should be returned. Possible values are "All", "Production" and "OutOfSample"; "Production" results in only the production forecast being returned; "OutOfSample" returns the out-of-sample forecasts on top of that.
- data: a Dictionary containing the configuration for the data that is used for model building and application. This is an optional argument, available keys are:
- version: a Dictionary containing the key id, referring to the ID of the dataset version to use; if not set, the latest version will be used,
- inSampleRows: a Dictionary or list of Dictionaries defining which samples should be used for model building; if not set, all observations but the ones defined in "outOfSampleRows" will be used; if there is an overlap with the "outOfSampleRows", timestamps in the intersection will be considered out-of-sample. Either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the in-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the in-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- outOfSampleRows: a Dictionary or list of Dictionaries defining which samples should be used to backtest the model zoo; if not set, none will be used. This is an optional argument, and either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the out-of-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the out-of-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- imputation: a Dictionary to define the imputation of missing data; if not provided, TIM will default to the "Linear" imputation with a gap length of 6 samples. This is an optional argument, available keys are:
- type: the type of imputation to be applied; options are "Linear", "LOCF" (Last Observation Carried Forward) and "None",
- maxGapLength: the maximum length of gaps that should be imputed; the minimum value is 0,
- columns: a list defining which columns (variables) should be used to build the model; if not set, all columns are selected. Each list item can either be a string (the name of the column) or an integer (the index of the column),
- targetColumn: defines which column represents the target variable. The desired target has to be included in the "columns" selection; if not set, the first column in the "columns" selection is used. The target column can be indicated either by a string (the name of the column) or an integer (the index of the column),
- holidayColumn: defines which column represents the holiday variable. This is an optional argument; if not set, no column is defined as holiday column. The holiday column can be indicated either by a string (the name of the column) or an integer (the index of the column),
- timeScale: a Dictionary defining the sampling period to which TIM should aggregate the dataset before model building and forecasting. This is an optional parameter; if not set, TIM uses the dataset's sampling period. Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute" and "Second",
- value: the number of base units; the minimum value is 1,
- aggregation: defines the aggregation function to be used for the target variable. The default aggregation is mean for numerical variables and maximum for boolean variables. This is an optional argument; options are "Mean", "Sum", "Maximum" and "Minimum".
This method returns the ID of the forecasting job that has been created.
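The data keys described above can be combined into a single Dictionary. The following is a minimal sketch under stated assumptions: the column names, the timestamps (including their format) and the numeric values are hypothetical and only illustrate the shape of the configuration.

```python
# Illustrative values only: column names, timestamps and sizes are hypothetical.
data_configuration = {
    # Relative range: backtest on the last 7 days counted back from the
    # last value of the target variable.
    "outOfSampleRows": {"baseUnit": "Day", "value": 7},
    # Absolute ranges: a list of explicit from-to Dictionaries
    # (the timestamp format used here is an assumption).
    "inSampleRows": [
        {"from": "2021-01-01 00:00:00", "to": "2021-06-30 23:00:00"},
    ],
    # Impute gaps of at most 4 samples by carrying the last observation forward.
    "imputation": {"type": "LOCF", "maxGapLength": 4},
    # Columns can be given by name (string) or by index (integer).
    "columns": ["load", "temperature", "is_holiday"],
    "targetColumn": "load",
    "holidayColumn": "is_holiday",
    # Aggregate the dataset to an hourly sampling period using the mean.
    "timeScale": {"baseUnit": "Hour", "value": 1},
    "aggregation": "Mean",
}

# The target has to be part of the "columns" selection.
assert data_configuration["targetColumn"] in data_configuration["columns"]

# The data configuration is passed under the key "data" of the job configuration.
job_configuration = {"name": "example job", "data": data_configuration}
```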
execute_forecast - execute a forecasting job
execute_forecast(self, forecast_job_id: str, wait_to_finish: bool = True, handle_status_poll: Union[Callable[[tim.types.StatusResponse], NoneType], NoneType] = None) -> Union[tim.data_sources.forecast.types.ForecastResultsResponse, tim.types.ExecuteResponse]
The execute_forecast method executes a forecasting job that has been registered. This method is called on the authenticated instance of Tim created as described in Authentication ("client"), like in the following statement:
client.execute_forecast(forecast_job_id = <forecasting job ID>, wait_to_finish = <wait to finish>, handle_status_poll = <callback function>)
using keyword arguments, or in the following statement:
client.execute_forecast(<forecasting job ID>, <wait to finish>, <callback function>)
using positional arguments, where <forecasting job ID> and <wait to finish> are replaced by the ID of the forecasting job to execute and an optional boolean indicating whether to wait for the execution to finish before returning, respectively, and <callback function> is replaced by an optional callback function for status polling.
The arguments are:
- forecast_job_id: the ID of the forecasting job to execute,
- wait_to_finish: a boolean indicating whether to wait for the execution to finish before returning; this is an optional parameter that defaults to True,
- handle_status_poll: an optional callback function handling polling for the status and progress of the forecasting job execution.
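The arguments above could be used together as in the following minimal sketch. The helper names print_status and run_forecasting_job are hypothetical, and because the fields of the polled status object are not described in this section, the callback simply prints whatever it receives.

```python
def print_status(status):
    """Callback passed as handle_status_poll; called with each polled status."""
    # The exact fields of the status object are not described here,
    # so this sketch prints the object as-is.
    print("forecasting job status:", status)


def run_forecasting_job(client, forecast_job_id):
    """Execute a registered forecasting job and wait for it to finish.

    `client` is assumed to be the authenticated Tim instance.
    """
    return client.execute_forecast(
        forecast_job_id=forecast_job_id,
        wait_to_finish=True,
        handle_status_poll=print_status,
    )
```

With an authenticated client, `run_forecasting_job(client, "<forecasting job ID>")` would block until the job has finished and return its results.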
If wait_to_finish is set to True, this method returns the following data:
- metadata: a Dictionary containing metadata of the forecasting job (or None when the job failed); available keys are:
- id: the ID of the forecasting job,
- name: the name of the forecasting job,
- type: the type of the forecasting job, options are "build-model", "rebuild-model", "rca",
- status: the status of the forecasting job, options are "Finished", "FinishedWithWarning",
- parentId: the ID of the parent job of the forecasting job (if any),
- experiment: a Dictionary containing the key id, referring to the ID of the experiment in which the forecasting job resides,
- useCase: a Dictionary containing the key id, referring to the ID of the use case in which the experiment containing the forecasting job resides,
- dataset: a Dictionary containing the key dataset, which in turn is a Dictionary containing the key id, referring to the ID of the dataset version the forecasting job is based on,
- createdAt: the datetime at which the forecasting job was created,
- executedAt: the datetime at which the forecasting job was executed,
- completedAt: the datetime at which the forecasting job was completed,
- workerVersion: the version of the worker the forecasting job has been executed with,
- errorMeasures: a Dictionary containing the error measures related to the forecasting job; available keys are: all, bin, samplesAhead, each of which contain the keys:
- name: the name of the error measure,
- inSample: the value of the error measure on the in-sample data,
- outOfSample: the value of the error measure on the out-of-sample data,
- registrationBody: a Dictionary containing the settings that were used when registering the forecasting job (e.g. by calling build_forecasting_model()); available keys are:
- name: the name given to the forecasting job upon registration,
- useCase: a Dictionary containing the key id, referring to the ID of the use case the job was registered in,
- configuration: the configuration of the job; see the key configuration in the argument job_configuration of build_forecasting_model() for more information,
- data: the data-related configuration of the job; see the key data in the argument job_configuration of build_forecasting_model() for more information,
- jobLoad: the load of the forecasting job, options are "Light" and "Heavy",
- calculationTime: the amount of time that was required to execute the job,
- model_result: a Dictionary containing information related to the model; available keys are:
- modelVersion: the version of the model,
- model: the model, containing the key Model Zoo, which in turn contains the following keys:
- samplingPeriod: the sampling period of the dataset the model is built with,
- averageTrainingLength: the average number of observations of the target variable that entered the model building process for the model,
- models: a list of Dictionaries containing the models themselves; each dictionary contains the following keys:
- index: the index of the model in the Model Zoo,
- terms: a list of Dictionaries containing the terms that compose the model; for each Dictionary the available keys are:
- importance: the importance (level of contribution) of the term or feature in the model,
- parts: a list of Dictionaries containing the parts that compose the term or feature; the following keys can be available:
- type: the type of the term or feature; this determines the other key(s) that are present,
- predictor: the predictor the term or feature relates to; used together with offset for some terms or features,
- offset: the offset of the predictor the term or feature relates to; used together with predictor for some terms or features,
- value: the value of the term or feature; used instead of predictor and offset for some terms or features,
- period: the period related to the term or feature; used together with unit for some terms or features,
- unit: the unit related to the period of a term or feature; used together with period for some terms or features,
- window: the window of some transformation to a predictor creating some terms or features,
- dayTime: the time of day the model is built to forecast for in case of a daily cycle model, null in case of a non-daily cycle model,
- dataOffsets: Dictionaries containing the offsets of the variables that are used by the model; the available key in each Dictionary is
- the name of the variable, containing a Dictionary with the following keys:
- start: the offset at which the model starts using the variable,
- stop: the offset at which the model stops using the variable,
- samplesAhead: a list containing the number of samples ahead the model is built to forecast,
- modelQuality: the level of quality the model is built for,
- predictionIntervals: a list containing the (lower and upper boundaries of) the prediction intervals,
- lastTargetTimestamp: the last timestamp of the target variable,
- RInv: a mathematical parameter used for the computation of the root cause analysis,
- g: a mathematical parameter used for the computation of the root cause analysis,
- mx: a mathematical parameter used for the computation of the root cause analysis,
- cases: a list of Dictionaries identifying the cases the model is suitable to be used in; available keys in each Dictionary are:
- dayTime: the time of day the model is suitable for forecasting for (for daily cycle models),
- offsets: a Dictionary identifying the offsets of each variable that is used by the model to forecast in this case; available keys are the names of the variables,
- difficulty: a score indicating how difficult (complex) the data to be modeled is,
- targetName: the name of the target variable,
- holidayName: the name of the holiday column, if there is one,
- upperBoundary: the upper boundary for forecasted values,
- lowerBoundary: the lower boundary for forecasted values,
- dailyCycle: whether the model follows a daily cycle,
- variableProperties: a list of Dictionaries containing properties of each of the variables; each dictionary contains the following keys:
- name: the name of the variable,
- min: the minimum value of the variable,
- max: the maximum value of the variable,
- dataFrom: the largest offset of the variable used by the model,
- importance: the importance (level of contribution) of the variable in the model,
- signature: the signature of the model,
- table_result: a DataFrame containing the result table of the forecasting job; available columns are: datetime, date_from, time_from, target, forecast, forecast_type, relative_distance, model_index, samples_ahead, lower_bound, upper_bound and bin; this table is empty for jobs of type "RCA",
- accuracies: a Dictionary containing accuracy measures that can be used to evaluate the model and compare it to other models; available keys are:
- all: a Dictionary containing (aggregated) accuracy measures for all results; available keys are:
- name: the name of the group of accuracy measures; in this case "all",
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- bin: a list of Dictionaries containing accuracy measures aggregated by bin; available keys are:
- name: the name of the bin (samples ahead indicating the start and end of the bin),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- samplesAhead: a list of Dictionaries containing accuracy measures aggregated by the number of samples ahead the model is forecasting for; available keys are:
- name: the name of the group of accuracy measures (the amount of samples ahead),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- logs: a list of Dictionaries, each of which contains the following keys:
- message: the actual message of the log,
- messageType: the type of the message; possible values are "Info", "Warning" and "Error",
- createdAt: the time the message was raised,
- origin: the phase of processing during which the message was raised; possible values are "Registration" and "Execution".
Upon successful execution of the forecasting job, metadata, model_result, table_result and accuracies will be populated; logs will be returned for any job, including failed jobs.
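The nested structure of accuracies and logs can be navigated with plain Dictionary and list operations. The following is a minimal sketch: the helper names are hypothetical, and the example values (including the bin name) are made up purely to show the shape described above, not taken from a real job.

```python
def out_of_sample_mape_by_bin(accuracies):
    """Map each bin name to its out-of-sample MAPE."""
    # accuracies is only populated for successfully executed jobs.
    if not accuracies:
        return {}
    return {
        entry["name"]: entry["outOfSample"]["mape"]
        for entry in accuracies["bin"]
    }


def warnings_and_errors(logs):
    """Return the messages of all logs of type "Warning" or "Error"."""
    return [
        log["message"]
        for log in logs
        if log["messageType"] in ("Warning", "Error")
    ]


# Hypothetical values, shaped like the accuracies and logs described above.
example_accuracies = {
    "bin": [
        {
            "name": "S+1:S+24",
            "inSample": {"mae": 1.0, "mape": 2.0, "rmse": 1.5, "accuracy": 98.0},
            "outOfSample": {"mae": 2.0, "mape": 4.0, "rmse": 2.5, "accuracy": 96.0},
        },
    ],
}
example_logs = [
    {"message": "Job registered.", "messageType": "Info",
     "createdAt": "2021-07-01T10:00:00Z", "origin": "Registration"},
    {"message": "Some values were imputed.", "messageType": "Warning",
     "createdAt": "2021-07-01T10:00:05Z", "origin": "Execution"},
]

print(out_of_sample_mape_by_bin(example_accuracies))
print(warnings_and_errors(example_logs))
```

Because logs are returned for failed jobs as well, filtering them by messageType is one way to find out why a job did not produce results.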
If wait_to_finish is set to False, this method returns a Dictionary with the following keys:
- message: a message indicating what has happened (the forecasting job has been posted to a queue),
- code: a code providing more information on this message.
If an error is encountered, a similar Dictionary will be returned with the keys message and code containing additional information about the error.
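Since both the queued response and the error response are Dictionaries with the keys message and code, they can be handled by the same code. The sketch below uses a hypothetical helper, and the example message and code values are invented for illustration.

```python
def summarize_execute_response(response):
    """Format a message/code Dictionary as a single readable line."""
    # Fall back to placeholders so the helper never raises on a missing key.
    message = response.get("message", "<no message>")
    code = response.get("code", "<no code>")
    return f"[{code}] {message}"


# Hypothetical response; real messages and codes may look different.
example_response = {"message": "Job has been posted to the queue.", "code": "EX0000"}
print(summarize_execute_response(example_response))
```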
build_forecasting_model_and_execute - create and execute a forecasting model building job
build_forecasting_model_and_execute(self, dataset_id: str, job_configuration: Union[tim.data_sources.forecast.types.BuildForecastingModelConfiguration, NoneType] = None, wait_to_finish: bool = True, handle_status_poll: Union[Callable[[tim.types.StatusResponse], NoneType], NoneType] = None) -> Union[tim.data_sources.forecast.types.ForecastResultsResponse, tim.types.ExecuteResponse]
The build_forecasting_model_and_execute method registers a forecasting job in the TIM repository and then immediately executes it. It combines the functionality of build_forecasting_model() and execute_forecast(). This method is called on the authenticated instance of Tim created as described in Authentication ("client"), like in the following statement:
client.build_forecasting_model_and_execute(dataset_id = <dataset ID>, job_configuration = <job configuration>, wait_to_finish = <wait to finish>, handle_status_poll = <callback function>)
using keyword arguments, or in the following statement:
client.build_forecasting_model_and_execute(<dataset ID>, <job configuration>, <wait to finish>, <callback function>)
using positional arguments, where <dataset ID> and <job configuration> are replaced by the ID and Dictionary representing them, <wait to finish> represents an optional boolean indicating whether to wait for the execution to finish before returning, and <callback function> is replaced by an optional callback function for status polling.
The arguments are:
- dataset_id: the ID of a dataset in the TIM repository,
- job_configuration: model building and forecasting configuration for the TIM engine. This is an optional argument, available keys are:
- name: the name to give to the forecasting job. This is an optional argument.
- configuration: a Dictionary containing the configuration for the model building and model application process. This is an optional argument. The following keys are available:
- description:
- predictionTo: defines how far ahead TIM should forecast (the so-called forecasting horizon); its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- predictionFrom: complements predictionTo to create a forecasting horizon, samples outside of this range are skipped; its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- modelQuality: controls the trade-off between model complexity and training time. The higher the quality, the longer the time required to build the model zoo. Options are: "Combined", "Low", "Medium", "High", "VeryHigh" and "UltraHigh"; the default value is "Combined", meaning that the quality "VeryHigh" will be used for the intraday and day-ahead forecasts and the "High" quality will be set for further forecasting horizons. This parameter is deprecated and is replaced by setting 'targetOffsets' and 'predictorOffsets',
- targetOffsets: determines the target offsets used in the model building process. The more specific the offsets are for each situation, the longer the time required to build the model zoo. Options are: "None", "Common", "Close" and "Combined". The default setting is "Combined" if allowOffsets is "true" and predictorOffsets is "Common". "Combined" means that "Close" offsets will be used for the first 2 days from the last target timestamp and "Common" will be used for the rest,
- predictorOffsets: determines the predictor offsets used in the model building process. Options are: "Common" and "Close". The default setting is "Common", which means that common predictor offsets for situations within one day will be used; "Close" means that the closest possible offsets of predictors are used for each situation in the forecasting horizon, which takes more time,
- normalization: a boolean indicating whether predictors are normalized (scaled by their mean and standard deviation); the default setting is true; switching off may help to model data with structural changes,
- maxModelComplexity: determines the maximal possible number of terms in each model in the model zoo. Available values lie between 1 and 100 (inclusive). If not set, TIM will calculate the complexity automatically based on the sampling period of the dataset,
- features: an enumeration of the types of transformations TIM can use during the feature engineering. Available transformations are "ExponentialMovingAverage", "RestOfWeek", "Periodic", "Intercept", "PiecewiseLinear", "TimeOffsets", "Polynomial", "Identity", "SimpleMovingAverage", "Month", "Trend", "DayOfWeek", "Fourier" and "PublicHolidays". If not provided, the TIM Engine will determine the optimal features automatically,
- dailyCycle: a boolean indicating whether models should focus on respective times within the day (specific hours, quarter-hours, etc.); if not set, TIM will determine this property automatically using autocorrelation analysis,
- memoryLimitCheck: a boolean enabling reducing datasets by dynamically throwing away columns and rows taking into consideration the current RAM state; the default setting is true,
- predictionIntervals: the confidence level of the returned symmetric prediction intervals, expressed as a percentage,
- predictionBoundaries: a Dictionary to set upper and lower boundaries for predictions; if not provided TIM will default to boundaries created based on the Inter-Quartile-Range of the target. Available keys are:
- type: indicates whether explicit boundaries should be used, options are "Explicit" and "None",
- maxValue: the upper boundary,
- minValue: the lower boundary,
- rollingWindow: a Dictionary defining the backtesting behavior; a new backtest forecast is generated every 'n' samples rolling backwards from the last target timestamp. Forecasts generated like this are of the same length as the production forecast generated inside the prediction horizon defined by "predictionTo" and "predictionFrom". If not provided, TIM will use 1 Day for daily cycle data and "predictionTo" otherwise. Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 1,
- backtest: determines whether in-sample and out-of-sample forecasts should be returned. Possible values are "All", "Production" and "OutOfSample"; "Production" results in only the production forecast being returned; "OutOfSample" returns the out-of-sample forecasts on top of that.
- data: a Dictionary containing the configuration for the data that is used for model building and application. This is an optional argument, available keys are:
- version: a Dictionary containing the key id, referring to the ID of the dataset version to use; if not set, the latest version will be used,
- inSampleRows: a Dictionary or list of Dictionaries defining which samples should be used for model building; if not set, all observations but the ones defined in "outOfSampleRows" will be used; if there is an overlap with the "outOfSampleRows", timestamps in the intersection will be considered out-of-sample. Either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the in-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the in-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- outOfSampleRows: a Dictionary or list of Dictionaries defining which samples should be used to backtest the model zoo; if not set, none will be used. This is an optional argument, and either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the out-of-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the out-of-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- imputation: a Dictionary to define the imputation of missing data; if not provided, TIM will default to the "Linear" imputation with a gap length of 6 samples. This is an optional argument, available keys are:
- type: the type of imputation to be applied; options are "Linear", "LOCF" (Last Observation Carried Forward) and "None",
- maxGapLength: the maximum length of gaps that should be imputed; the minimum value is 0,
- columns: a list defining which columns (variables) should be used to build the model; if not set, all columns are selected. Each list item can either be a string (the name of the column) or an integer (the index of the column),
- targetColumn: defines which column represents the target variable. The desired target has to be included in the "columns" selection; if not set, the first column in the "columns" selection is used. The target column can be indicated either by a string (the name of the column) or an integer (the index of the column),
- holidayColumn: defines which column represents the holiday variable. This is an optional argument; if not set, no column is defined as holiday column. The holiday column can be indicated either by a string (the name of the column) or an integer (the index of the column),
- timeScale: a Dictionary defining the sampling period to which TIM should aggregate the dataset before model building and forecasting. This is an optional parameter; if not set, TIM uses the dataset's sampling period. Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute" and "Second",
- value: the number of base units; the minimum value is 1,
- aggregation: defines the aggregation function to be used for the target variable. The default aggregation is mean for numerical variables and maximum for boolean variables. This is an optional argument; options are "Mean", "Sum", "Maximum" and "Minimum",
- wait_to_finish: a boolean indicating whether to wait for the execution to finish before returning; this is an optional parameter that defaults to True,
- handle_status_poll: an optional callback function handling polling for the status and progress of the forecasting job execution.
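A job configuration using several of the keys above might look like the following minimal sketch. All values are illustrative choices rather than recommendations, the helper name build_and_execute is hypothetical, and `client` is assumed to be the authenticated Tim instance.

```python
# Illustrative values only; every key is optional.
job_configuration = {
    "name": "day-ahead forecast",
    "configuration": {
        # Forecast from the first sample up to one day ahead.
        "predictionFrom": {"baseUnit": "Sample", "value": 1},
        "predictionTo": {"baseUnit": "Day", "value": 1},
        "targetOffsets": "Combined",
        "predictorOffsets": "Common",
        "normalization": True,
        # At most 50 terms per model (allowed range: 1 to 100).
        "maxModelComplexity": 50,
        "features": ["Polynomial", "TimeOffsets", "DayOfWeek", "PublicHolidays"],
        # Confidence level of the prediction intervals, as a percentage.
        "predictionIntervals": 90,
        # Generate a new backtest forecast every day.
        "rollingWindow": {"baseUnit": "Day", "value": 1},
        "backtest": "All",
    },
    "data": {
        # Keep the last 7 days out of model building for backtesting.
        "outOfSampleRows": {"baseUnit": "Day", "value": 7},
    },
}


def build_and_execute(client, dataset_id, configuration):
    """Register a forecasting job and execute it, waiting for the results."""
    return client.build_forecasting_model_and_execute(
        dataset_id=dataset_id,
        job_configuration=configuration,
        wait_to_finish=True,
    )
```

Calling `build_and_execute(client, "<dataset ID>", job_configuration)` would then register the job and return once it has been executed.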
If wait_to_finish is set to True, this method returns the following data:
- metadata: a Dictionary containing metadata of the forecasting job (or None when the job failed); available keys are:
- id: the ID of the forecasting job,
- name: the name of the forecasting job,
- type: the type of the forecasting job, options are "build-model", "rebuild-model", "rca",
- status: the status of the forecasting job, options are "Finished", "FinishedWithWarning",
- parentId: the ID of the parent job of the forecasting job (if any),
- experiment: a Dictionary containing the key id, referring to the ID of the experiment in which the forecasting job resides,
- useCase: a Dictionary containing the key id, referring to the ID of the use case in which the experiment containing the forecasting job resides,
- dataset: a Dictionary containing the key dataset, which in turn is a Dictionary containing the key id, referring to the ID of the dataset version the forecasting job is based on,
- createdAt: the datetime at which the forecasting job was created,
- executedAt: the datetime at which the forecasting job was executed,
- completedAt: the datetime at which the forecasting job was completed,
- workerVersion: the version of the worker the forecasting job has been executed with,
- errorMeasures: a Dictionary containing the error measures related to the forecasting job; available keys are: all, bin, samplesAhead, each of which contain the keys:
- name: the name of the error measure,
- inSample: the value of the error measure on the in-sample data,
- outOfSample: the value of the error measure on the out-of-sample data,
- registrationBody: a Dictionary containing the settings that were used when registering the forecasting job (e.g. by calling build_forecasting_model()); available keys are:
- name: the name given to the forecasting job upon registration,
- useCase: a Dictionary containing the key id, referring to the ID of the use case the job was registered in,
- configuration: the configuration of the job; see the key configuration in the argument job_configuration of build_forecasting_model() for more information,
- data: the data-related configuration of the job; see the key data in the argument job_configuration of build_forecasting_model() for more information,
- jobLoad: the load of the forecasting job, options are "Light" and "Heavy",
- calculationTime: the amount of time that was required to execute the job,
- model_result: a Dictionary containing information related to the model; available keys are:
- modelVersion: the version of the model,
- model: the model, containing the key Model Zoo, which in turn contains the following keys:
- samplingPeriod: the sampling period of the dataset the model is built with,
- averageTrainingLength: the average number of observations of the target variable that entered the model building process for the model,
- models: a list of Dictionaries containing the models themselves; each dictionary contains the following keys:
- index: the index of the model in the Model Zoo,
- terms: a list of Dictionaries containing the terms that compose the model; for each Dictionary the available keys are:
- importance: the importance (level of contribution) of the term or feature in the model,
- parts: a list of Dictionaries containing the parts that compose the term or feature; the following keys can be available:
- type: the type of the term or feature; this determines the other key(s) that are present,
- predictor: the predictor the term or feature relates to; used together with offset for some terms or features,
- offset: the offset of the predictor the term or feature relates to; used together with predictor for some terms or features,
- value: the value of the term or feature; used instead of predictor and offset for some terms or features,
- period: the period related to the term or feature; used together with unit for some terms or features,
- unit: the unit related to the period of a term or feature; used together with period for some terms or features,
- window: the window of some transformation to a predictor creating some terms or features,
- dayTime: the time of day the model is built to forecast for in case of a daily cycle model, null in case of a non-daily cycle model,
- dataOffsets: Dictionaries containing the offsets of the variables that are used by the model; the available key in each Dictionary is
- the name of the variable, containing a Dictionary with the following keys:
- start: the offset at which the model starts using the variable,
- stop: the offset at which the model stops using the variable,
- samplesAhead: a list containing the number of samples ahead the model is built to forecast,
- modelQuality: the level of quality the model is built for,
- predictionIntervals: a list containing the (lower and upper boundaries of) the prediction intervals,
- lastTargetTimestamp: the last timestamp of the target variable,
- RInv: a mathematical parameter used for the computation of the root cause analysis,
- g: a mathematical parameter used for the computation of the root cause analysis,
- mx: a mathematical parameter used for the computation of the root cause analysis,
- cases: a list of Dictionaries identifying the cases the model is suitable to be used in; available keys in each Dictionary are:
- dayTime: the time of day the model is suitable for forecasting for (for daily cycle models),
- offsets: a Dictionary identifying the offsets of each variable that is used by the model to forecast in this case; available keys are the names of the variables,
- difficulty: a score indicating how difficult (complex) the data to be modeled is,
- targetName: the name of the target variable,
- holidayName: the name of the holiday column, if there is one,
- upperBoundary: the upper boundary for forecasted values,
- lowerBoundary: the lower boundary for forecasted values,
- dailyCycle: whether the model follows a daily cycle,
- variableProperties: a list of Dictionaries containing properties of each of the variables; each dictionary contains the following keys:
- name: the name of the variable,
- min: the minimum value of the variable,
- max: the maximum value of the variable,
- dataFrom: the largest offset of the variable used by the model,
- importance: the importance (level of contribution) of the variable in the model,
- signature: the signature of the model,
- table_result: a DataFrame containing the result table of the forecasting job; available columns are: datetime, date_from, time_from, target, forecast, forecast_type, relative_distance, model_index, samples_ahead, lower_bound, upper_bound and bin; this table is empty for jobs of type "RCA",
- accuracies: a Dictionary containing accuracy measures that can be used to evaluate the model and compare it to other models; available keys are:
- all: a Dictionary containing (aggregated) accuracy measures for all results; available keys are:
- name: the name of the group of accuracy measures; in this case "all",
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- bin: a list of Dictionaries containing accuracy measures aggregated by bin; available keys are:
- name: the name of the bin (samples ahead indicating the start and end of the bin),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- samplesAhead: a list of Dictionaries containing accuracy measures aggregated by the number of samples ahead the model is forecasting for; available keys are:
- name: the name of the group of accuracy measures (the number of samples ahead),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- logs: a list of Dictionaries, each of which contains the following keys:
- message: the actual message of the log,
- messageType: the type of the message; possible values are "Info", "Warning" and "Error",
- createdAt: the time the message was raised,
- origin: the phase of processing during which the message was raised; possible values are "Registration" and "Execution".
Upon successful execution of the forecasting job, metadata, model_result, table_result and accuracies will be populated; logs will be returned for any job, including failed jobs.
If wait_to_finish is set to False, this method returns a Dictionary with the following keys:
- message: a message indicating what has happened (the forecasting job has been posted to a queue),
- code: a code providing more information on this message.
If an error is encountered, a similar Dictionary will be returned with the keys message and code containing additional information about the error.
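To illustrate how the accuracies structure described above can be read, the following minimal sketch collects the out-of-sample MAPE for each bin. The dictionary example_accuracies is made-up sample data that only mimics the documented keys (the bin names and all values are invented), and out_of_sample_mape_per_bin is a hypothetical helper, not part of the TIM Python client.

```python
from typing import Any, Dict

# Made-up sample data that follows the documented shape of "accuracies";
# real values come from the results of a forecasting job.
example_accuracies: Dict[str, Any] = {
    "all": {
        "name": "all",
        "inSample": {"mae": 1.2, "mape": 3.4, "rmse": 1.8, "accuracy": 96.6},
        "outOfSample": {"mae": 1.5, "mape": 4.1, "rmse": 2.2, "accuracy": 95.9},
    },
    "bin": [
        {
            "name": "1-12",  # invented bin name
            "inSample": {"mae": 1.0, "mape": 3.0, "rmse": 1.5, "accuracy": 97.0},
            "outOfSample": {"mae": 1.3, "mape": 3.9, "rmse": 2.0, "accuracy": 96.1},
        },
        {
            "name": "13-24",  # invented bin name
            "inSample": {"mae": 1.4, "mape": 3.8, "rmse": 2.1, "accuracy": 96.2},
            "outOfSample": {"mae": 1.9, "mape": 5.2, "rmse": 2.6, "accuracy": 94.8},
        },
    ],
    "samplesAhead": [],
}


def out_of_sample_mape_per_bin(accuracies: Dict[str, Any]) -> Dict[str, float]:
    """Map each bin name to its out-of-sample MAPE, skipping bins without one."""
    result: Dict[str, float] = {}
    for group in accuracies.get("bin", []):
        out_of_sample = group.get("outOfSample") or {}
        if "mape" in out_of_sample:
            result[group["name"]] = out_of_sample["mape"]
    return result


mape_per_bin = out_of_sample_mape_per_bin(example_accuracies)
print(mape_per_bin)
```

The helper tolerates missing outOfSample entries, which is a defensive assumption for jobs that were run without out-of-sample rows.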
create_forecast - create a forecasting prediction job
create_forecast(self, parent_job_id: str, job_configuration: Optional[tim.data_sources.forecast.types.ForecastingPredictConfiguration] = None) -> str
The create_forecast
method registers a forecasting prediction job in the TIM repository. This method is called on the authenticated instance of Tim created as described in Authentication ("client"), like in the following statement:
client.create_forecast(parent_job_id = <parent forecasting job ID>, job_configuration = <job configuration>)
using keyword arguments, or in the following statement:
client.create_forecast(<parent forecasting job ID>, <job configuration>)
using positional arguments, where <parent forecasting job ID>
and <job configuration>
are replaced by the ID and Dictionary representing them, respectively.
The arguments are:
- parent_job_id: the ID of a forecasting job (of type build model or rebuild model) in the TIM repository,
- job_configuration: a Dictionary containing forecasting configuration for the TIM engine. This is an optional argument, available keys are:
- name: the name to give to the forecasting job. This is an optional argument.
- configuration: a Dictionary containing the configuration for the model application process. This is an optional argument. The following keys are available:
- predictionTo: defines how far ahead TIM should forecast (the so-called forecasting horizon); its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- predictionFrom: complements predictionTo to create a forecasting horizon, samples outside of this range are skipped; its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- predictionBoundaries: a Dictionary to set upper and lower boundaries for predictions; if not provided TIM will default to boundaries created based on the Inter-Quartile-Range of the target. Available keys are:
- type: indicates whether explicit boundaries should be used, options are "Explicit" and "None",
- maxValue: the upper boundary,
- minValue: the lower boundary,
- rollingWindow: a Dictionary defining the backtesting behavior; a new backtest forecast is generated every 'n' samples rolling backwards from the last target timestamp. Forecasts generated like this are of the same length as the production forecast generated inside the prediction horizon defined by "predictionTo" and "predictionFrom". If not provided, TIM will use 1 Day for daily cycle data and "predictionTo" otherwise. Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 1,
- data: a Dictionary containing the configuration for the data that is used for model building and application. This is an optional argument, available keys are:
- version: a Dictionary containing the key id, referring to the ID of the dataset version to use; if not set, the latest version will be used,
- outOfSampleRows: a Dictionary or list of Dictionaries defining which samples should be used to backtest the model zoo; if not set, none will be used. This is an optional argument, and either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the out-of-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the out-of-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- imputation: a Dictionary to define the imputation of missing data; if not provided, TIM will default to the "Linear" imputation with a gap length of 6 samples. This is an optional argument, available keys are:
- type: the type of imputation to be applied; options are "Linear", "LOCF" (Last Observation Carried Forward) and "None",
- maxGapLength: the maximum length of gaps that should be imputed; the minimum value is 0.
This method returns the ID of the forecasting job that has been created.
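As a minimal sketch of the arguments described above, the following block builds a job_configuration Dictionary using only keys listed in this section. The job name and all chosen values are illustrative assumptions, and the commented-out call assumes that client is the authenticated instance of Tim and that parent_job_id holds the ID of an existing build model or rebuild model job.

```python
from typing import Any, Dict

# Illustrative configuration; every value below is an example, not a recommendation.
job_configuration: Dict[str, Any] = {
    "name": "example prediction job",  # hypothetical job name
    "configuration": {
        # Forecast from the first sample up to one day ahead.
        "predictionFrom": {"baseUnit": "Sample", "value": 1},
        "predictionTo": {"baseUnit": "Day", "value": 1},
        # Generate a new backtest forecast every day.
        "rollingWindow": {"baseUnit": "Day", "value": 1},
    },
    "data": {
        # Relative range: backtest on the last 7 days of data.
        "outOfSampleRows": {"baseUnit": "Day", "value": 7},
        # Carry the last observation forward over gaps of at most 3 samples.
        "imputation": {"type": "LOCF", "maxGapLength": 3},
    },
}

# Assumed usage (requires an authenticated client and an existing parent job):
# prediction_job_id = client.create_forecast(
#     parent_job_id=parent_job_id,
#     job_configuration=job_configuration,
# )
print(sorted(job_configuration))
```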
create_forecast_and_execute - create and execute a forecasting prediction job
create_forecast_and_execute(self, parent_job_id: str, job_configuration: Optional[tim.data_sources.forecast.types.ForecastingPredictConfiguration] = None, wait_to_finish: bool = True, handle_status_poll: Optional[Callable[[tim.types.StatusResponse], NoneType]] = None) -> Union[tim.data_sources.forecast.types.ForecastResultsResponse, tim.types.ExecuteResponse]
The create_forecast_and_execute
method registers a forecasting prediction job in the TIM repository and then immediately executes it. It combines the functionality of create_forecast()
and execute_forecast()
. This method is called on the authenticated instance of Tim created as described in Authentication ("client"), like in the following statement:
client.create_forecast_and_execute(parent_job_id = <parent forecasting job ID>, job_configuration = <job configuration>, wait_to_finish = <wait to finish>, handle_status_poll = <callback function>)
using keyword arguments, or in the following statement:
client.create_forecast_and_execute(<parent forecasting job ID>, <job configuration>, <wait to finish>, <callback function>)
using positional arguments, where <parent forecasting job ID>
and <job configuration>
are replaced by the ID and Dictionary representing them, <wait to finish>
represents an optional boolean indicating whether to wait for the execution to finish before returning, and <callback function>
is replaced by an optional callback function for status polling.
The arguments are:
- parent_job_id: the ID of a forecasting job (of type build model or rebuild model) in the TIM repository,
- job_configuration: a Dictionary containing forecasting configuration for the TIM engine. This is an optional argument, available keys are:
- name: the name to give to the forecasting job. This is an optional argument.
- configuration: a Dictionary containing the configuration for the model application process. This is an optional argument. The following keys are available:
- predictionTo: defines how far ahead TIM should forecast (the so-called forecasting horizon); its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- predictionFrom: complements predictionTo to create a forecasting horizon, samples outside of this range are skipped; its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- predictionBoundaries: a Dictionary to set upper and lower boundaries for predictions; if not provided TIM will default to boundaries created based on the Inter-Quartile-Range of the target. Available keys are:
- type: indicates whether explicit boundaries should be used, options are "Explicit" and "None",
- maxValue: the upper boundary,
- minValue: the lower boundary,
- rollingWindow: a Dictionary defining the backtesting behavior; a new backtest forecast is generated every 'n' samples rolling backwards from the last target timestamp. Forecasts generated like this are of the same length as the production forecast generated inside the prediction horizon defined by "predictionTo" and "predictionFrom". If not provided, TIM will use 1 Day for daily cycle data and "predictionTo" otherwise. Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 1,
- data: a Dictionary containing the configuration for the data that is used for model building and application. This is an optional argument, available keys are:
- version: a Dictionary containing the key id, referring to the ID of the dataset version to use; if not set, the latest version will be used,
- outOfSampleRows: a Dictionary or list of Dictionaries defining which samples should be used to backtest the model zoo; if not set, none will be used. This is an optional argument, and either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the out-of-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the out-of-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- imputation: a Dictionary to define the imputation of missing data; if not provided, TIM will default to the "Linear" imputation with a gap length of 6 samples. This is an optional argument, available keys are:
- type: the type of imputation to be applied; options are "Linear", "LOCF" (Last Observation Carried Forward) and "None",
- maxGapLength: the maximum length of gaps that should be imputed; the minimum value is 0,
- wait_to_finish: a boolean indicating whether to wait for the execution to finish before returning; this is an optional parameter,
- handle_status_poll: an optional callback function handling polling for the status and progress of the forecasting job execution.
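The handle_status_poll argument can be any function that accepts a single status object and returns nothing. The sketch below only records and prints whatever it receives, since the fields of the status object are not described in this section; the name received_statuses is a hypothetical choice, and the commented-out call assumes an authenticated client and an existing parent_job_id.

```python
from typing import Any, List

# Collected status updates, kept so they can be inspected after the job finishes.
received_statuses: List[Any] = []


def handle_status_poll(status: Any) -> None:
    """Record and print every status update passed in during polling."""
    received_statuses.append(status)
    print(f"status update: {status}")


# Assumed usage (requires an authenticated client and an existing parent job):
# results = client.create_forecast_and_execute(
#     parent_job_id=parent_job_id,
#     wait_to_finish=True,
#     handle_status_poll=handle_status_poll,
# )
```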
If wait_to_finish is set to True, this method returns the following data:
- metadata: a Dictionary containing metadata of the forecasting job (or None when the job failed); available keys are:
- id: the ID of the forecasting job,
- name: the name of the forecasting job,
- type: the type of the forecasting job, options are "Registered", "Running", "Finished", "FinishedWithWarning", "Failed", "Queued",
- status: the status of the forecasting job, options are "Finished", "FinishedWithWarning",
- parentId: the ID of the parent job of the forecasting job (if any),
- experiment: a Dictionary containing the key id, referring to the ID of the experiment in which the forecasting job resides,
- useCase: a Dictionary containing the key id, referring to the ID of the use case in which the experiment containing the forecasting job resides,
- dataset: a Dictionary containing the key dataset, which in turn is a Dictionary containing the key id, referring to the ID of the dataset version the forecasting job is based on,
- createdAt: the datetime at which the forecasting job was created,
- executedAt: the datetime at which the forecasting job was executed,
- completedAt: the datetime at which the forecasting job was completed,
- workerVersion: the version of the worker the forecasting job has been executed with,
- errorMeasures: a Dictionary containing the error measures related to the forecasting job; available keys are: all, bin, samplesAhead, each of which contain the keys:
- name: the name of the error measure,
- inSample: the value of the error measure on the in-sample data,
- outOfSample: the value of the error measure on the out-of-sample data,
- registrationBody: a Dictionary containing the settings that were used when registering the forecasting job (e.g. by calling build_forecasting_model()); available keys are:
- name: the name given to the forecasting job upon registration,
- useCase: a Dictionary containing the key id, referring to the ID of the use case the job was registered in,
- configuration: the configuration of the job, see the key configuration in the argument job_configuration of build_forecasting_model() for more information,
- data: the data-related configuration of the job, see the key data in the argument job_configuration of build_forecasting_model() for more information,
- jobLoad: the load of the forecasting job, options are "Light" and "Heavy",
- calculationTime: the amount of time that was required to execute the job,
- model_result: a Dictionary containing information related to the model; available keys are:
- modelVersion: the version of the model,
- model: the model, containing the key Model Zoo, which in turn contains the following keys:
- samplingPeriod: the sampling period of the dataset the model is built with,
- averageTrainingLength: the average number of observations of the target variable that entered the model building process for the model,
- models: a list of Dictionaries containing the models themselves; each dictionary contains the following keys:
- index: the index of the model in the Model Zoo,
- terms: a list of Dictionaries containing the terms that compose the model; for each Dictionary the available keys are:
- importance: the importance (level of contribution) of the term or feature in the model,
- parts: a list of Dictionaries containing the parts that compose the term or feature; the following keys can be available:
- type: the type of the term or feature; this determines the other key(s) that are present,
- predictor: the predictor the term or feature relates to; used together with offset for some terms or features,
- offset: the offset of the predictor the term or feature relates to; used together with predictor for some terms or features,
- value: the value of the term or feature; used instead of predictor and offset for some terms or features,
- period: the period related to the term or feature; used together with unit for some terms or features,
- unit: the unit related to the period of a term or feature; used together with period for some terms or features,
- window: the window of some transformation to a predictor creating some terms or features,
- dayTime: the time of day the model is built to forecast for in case of a daily cycle model, null in case of a non-daily cycle model,
- dataOffsets: Dictionaries containing the offsets of the variables that are used by the model; the available key in each Dictionary is
- the name of the variable, containing a Dictionary with the following keys:
- start: the offset at which the model starts using the variable,
- stop: the offset at which the model stops using the variable,
- samplesAhead: a list containing the number of samples ahead the model is built to forecast,
- modelQuality: the level of quality the model is built for,
- predictionIntervals: a list containing the (lower and upper boundaries of) the prediction intervals,
- lastTargetTimestamp: the last timestamp of the target variable,
- RInv: a mathematical parameter used for the computation of the root cause analysis,
- g: a mathematical parameter used for the computation of the root cause analysis,
- mx: a mathematical parameter used for the computation of the root cause analysis,
- cases: a list of Dictionaries identifying the cases the model is suitable to be used in; available keys in each Dictionary are:
- dayTime: the time of day the model is suitable for forecasting for (for daily cycle models),
- offsets: a Dictionary identifying the offsets of each variable that is used by the model to forecast in this case; available keys are the names of the variables,
- difficulty: a score indicating how difficult (complex) the data to be modeled is,
- targetName: the name of the target variable,
- holidayName: the name of the holiday column, if there is one,
- upperBoundary: the upper boundary for forecasted values,
- lowerBoundary: the lower boundary for forecasted values,
- dailyCycle: whether the model follows a daily cycle,
- variableProperties: a list of Dictionaries containing properties of each of the variables; each dictionary contains the following keys:
- name: the name of the variable,
- min: the minimum value of the variable,
- max: the maximum value of the variable,
- dataFrom: the largest offset of the variable used by the model,
- importance: the importance (level of contribution) of the variable in the model,
- signature: the signature of the model,
- table_result: a DataFrame containing the result table of the forecasting job; available columns are: datetime, date_from, time_from, target, forecast, forecast_type, relative_distance, model_index, samples_ahead, lower_bound, upper_bound and bin; this table is empty for jobs of type "RCA",
- accuracies: a Dictionary containing accuracy measures that can be used to evaluate the model and compare it to other models; available keys are:
- all: a Dictionary containing (aggregated) accuracy measures for all results; available keys are:
- name: the name of the group of accuracy measures; in this case "all",
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- bin: a list of Dictionaries containing accuracy measures aggregated by bin; available keys are:
- name: the name of the bin (samples ahead indicating the start and end of the bin),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- samplesAhead: a list of Dictionaries containing accuracy measures aggregated by the number of samples ahead the model is forecasting for; available keys are:
- name: the name of the group of accuracy measures (the number of samples ahead),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- logs: a list of Dictionaries, each of which contains the following keys:
- message: the actual message of the log,
- messageType: the type of the message; possible values are "Info", "Warning" and "Error",
- createdAt: the time the message was raised,
- origin: the phase of processing during which the message was raised; possible values are "Registration" and "Execution".
Upon successful execution of the forecasting job, metadata, model_result, table_result and accuracies will be populated; logs will be returned for any job, including failed jobs.
If wait_to_finish is set to False, this method returns a Dictionary with the following keys:
- message: a message indicating what has happened (the forecasting job has been posted to a queue),
- code: a code providing more information on this message.
If an error is encountered, a similar Dictionary will be returned with the keys message and code containing additional information about the error.
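Because logs are returned for any job, including failed ones, it can be useful to surface only the warnings and errors. The following is a minimal sketch that filters a list of log Dictionaries by the documented messageType key; select_logs is a hypothetical helper, and example_logs is made-up sample data (the messages and the timestamp format are invented).

```python
from typing import Any, Dict, List, Sequence


def select_logs(
    logs: List[Dict[str, Any]],
    message_types: Sequence[str] = ("Warning", "Error"),
) -> List[Dict[str, Any]]:
    """Return only the log entries whose messageType is in message_types."""
    return [log for log in logs if log.get("messageType") in message_types]


# Made-up sample data following the documented keys of each log entry.
example_logs: List[Dict[str, Any]] = [
    {"message": "Job registered.", "messageType": "Info",
     "createdAt": "2021-01-01T00:00:00Z", "origin": "Registration"},
    {"message": "Predictor has missing values.", "messageType": "Warning",
     "createdAt": "2021-01-01T00:00:05Z", "origin": "Execution"},
    {"message": "Model application failed.", "messageType": "Error",
     "createdAt": "2021-01-01T00:00:09Z", "origin": "Execution"},
]

problems = select_logs(example_logs)
for log in problems:
    print(f'[{log["origin"]}] {log["messageType"]}: {log["message"]}')
```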
rebuild_forecasting_model - create a forecasting model rebuilding job
rebuild_forecasting_model(self, parent_job_id: str, job_configuration: Union[tim.data_sources.forecast.types.ForecastingRebuildModelConfiguration, NoneType] = None) -> str
The rebuild_forecasting_model
method registers a forecasting model rebuilding job in the TIM repository. This method is called on the authenticated instance of Tim created as described in Authentication ("client"), like in the following statement:
client.rebuild_forecasting_model(parent_job_id = <parent forecasting job ID>, job_configuration = <job configuration>)
using keyword arguments, or in the following statement:
client.rebuild_forecasting_model(<parent forecasting job ID>, <job configuration>)
using positional arguments, where <parent forecasting job ID>
and <job configuration>
are replaced by the ID and Dictionary representing them, respectively.
The arguments are:
- parent_job_id: the ID of a forecasting job (of type build model or rebuild model) in the TIM repository,
- job_configuration: a Dictionary containing model rebuilding and forecasting configuration for the TIM engine. This is an optional argument, available keys are:
- name: the name to give to the forecasting job. This is an optional argument.
- configuration: a Dictionary containing the configuration for the model building and model application process. This is an optional argument. The following keys are available:
- predictionTo: defines how far ahead TIM should forecast (the so-called forecasting horizon); its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- predictionFrom: complements predictionTo to create a forecasting horizon, samples outside of this range are skipped; its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- modelQuality: controls the model complexity/training time tradeoff. The higher the quality, the longer the time required to build the model zoo. Options are: "Combined", "Low", "Medium", "High", "VeryHigh" and "UltraHigh"; the default value is "Combined", meaning that the quality "VeryHigh" will be used for the intraday and day-ahead forecasts and the "High" quality will be set for further forecasting horizons. This parameter is deprecated and is replaced by setting 'targetOffsets' and 'predictorOffsets',
- targetOffsets: determines the target offsets used in the model building process. The more specific the offsets for each situation are, the longer the time required to build the model zoo. Options are: "None", "Common", "Close" and "Combined". The default setting is "Combined" if allowOffsets is true and predictorOffsets is "Common". "Combined" means that "Close" offsets will be used for the first 2 days from the last target timestamp and "Common" will be used for the rest,
- predictorOffsets: determines the predictor offsets used in the model building process. Options are: "Common" and "Close". The default setting is "Common", which means that common predictor offsets will be used for situations within one day, while "Close" means that the closest possible offsets of predictors are used for each situation in the forecasting horizon. "Close" is more time-consuming,
- normalization: a boolean indicating whether predictors are normalized (scaled by their mean and standard deviation); the default setting is true; switching off may help to model data with structural changes,
- maxModelComplexity: determines the maximal possible number of terms in each model in the model zoo. Available values lie between 1 and 100 (inclusive). If not set, TIM will calculate the complexity automatically based on the sampling period of the dataset,
- features: an enumeration of the types of transformations TIM can use during the feature engineering. Available transformations are "ExponentialMovingAverage", "RestOfWeek", "Periodic", "Intercept", "PiecewiseLinear", "TimeOffsets", "Polynomial", "Identity", "SimpleMovingAverage", "Month", "Trend", "DayOfWeek", "Fourier" and "PublicHolidays". If not provided, the TIM Engine will determine the optimal features automatically,
- allowOffsets: a boolean indicating whether the models can use offsets of the variables,
- memoryLimitCheck: a boolean enabling reducing datasets by dynamically throwing away columns and rows taking into consideration the current RAM state; the default setting is true,
- rebuildingPolicy: a Dictionary to set the policy that determines which models to rebuild. Available keys are:
- type: indicates the type of rebuilding policy, options are "NewSituations", "All" and "OlderThan",
- time: only makes sense when combined with type "OlderThan", a Dictionary to determine the boundary in model age that separates "old" models (models to rebuild) from "new" models (not (yet) to rebuild). Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- predictionBoundaries: a Dictionary to set upper and lower boundaries for predictions; if not provided TIM will default to boundaries created based on the Inter-Quartile-Range of the target. Available keys are:
- type: indicates whether explicit boundaries should be used, options are "Explicit" and "None",
- maxValue: the upper boundary,
- minValue: the lower boundary,
- rollingWindow: a Dictionary defining the backtesting behavior; a new backtest forecast is generated every 'n' samples rolling backwards from the last target timestamp. Forecasts generated like this are of the same length as the production forecast generated inside the prediction horizon defined by "predictionTo" and "predictionFrom". If not provided, TIM will use 1 Day for daily cycle data and "predictionTo" otherwise. Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 1,
- backtest: determines whether in-sample and out-of-sample forecasts should be returned. Possible values are "All", "Production" and "OutOfSample"; "Production" results in only the production forecast being returned; "OutOfSample" returns the out-of-sample forecasts on top of that.
- data: a Dictionary containing the configuration for the data that is used for model building and application. This is an optional argument, available keys are:
- version: a Dictionary containing the key id, referring to the ID of the dataset version to use; if not set, the latest version will be used,
- inSampleRows: a Dictionary or list of Dictionaries defining which samples should be used for model building; if not set, all observations but the ones defined in "outOfSampleRows" will be used; if there is an overlap with the "outOfSampleRows", timestamps in the intersection will be considered out-of-sample. Either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the in-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the in-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- outOfSampleRows: a Dictionary or list of Dictionaries defining which samples should be used to backtest the model zoo; if not set, none will be used. This is an optional argument, and either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the out-of-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the out-of-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- imputation: a Dictionary to define the imputation of missing data; if not provided, TIM will default to the "Linear" imputation with a gap length of 6 samples. This is an optional argument, available keys are:
- type: the type of imputation to be applied; options are "Linear", "LOCF" (Last Observation Carried Forward) and "None",
- maxGapLength: the maximum length of gaps that should be imputed; the minimum value is 0,
- columns: a list defining which columns (variables) should be used to build the model; if not set, all columns are selected. Each list item can either be a string (the name of the column) or an integer (the index of the column).
This method returns the ID of the forecasting job that has been created.
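As a rough illustration of the data-related keys described above, the following sketch builds a job_configuration Dictionary that combines absolute in-sample ranges, a relative out-of-sample range, LOCF imputation and a column selection. The job name, column names and timestamps are made-up placeholders, and the timestamp format is an assumption. The call that would register the job is shown only as a comment, since it requires an authenticated client and its arguments are assumed to mirror those of rebuild_forecasting_model_and_execute().

```python
# Hypothetical values throughout: the job name, timestamps and column names
# are placeholders and do not come from a real dataset.
job_configuration = {
    "name": "weekly rebuild (example)",
    "data": {
        # Absolute ranges: a list of from-to Dictionaries.
        # The timestamp format used here is an assumption.
        "inSampleRows": [
            {"from": "2021-01-01 00:00:00", "to": "2021-06-30 23:00:00"},
            {"from": "2021-08-01 00:00:00", "to": "2021-10-31 23:00:00"},
        ],
        # Relative range: the last 7 days counted back from the last target value.
        "outOfSampleRows": {"baseUnit": "Day", "value": 7},
        # Carry the last observation forward over gaps of at most 3 samples.
        "imputation": {"type": "LOCF", "maxGapLength": 3},
        # Columns can be selected by name (string) or by index (integer).
        "columns": ["load", "temperature", 3],
    },
}

# With an authenticated client (see Authentication), the job could then be
# registered roughly as follows (argument names assumed):
# job_id = client.rebuild_forecasting_model(
#     parent_job_id="<parent forecasting job ID>",
#     job_configuration=job_configuration,
# )
```

Because "inSampleRows" and "outOfSampleRows" may overlap, it can help to remember that timestamps in the intersection are treated as out-of-sample.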
rebuild_forecasting_model_and_execute - create and execute a forecasting model rebuilding job
rebuild_forecasting_model_and_execute(self, parent_job_id: str, job_configuration: Union[tim.data_sources.forecast.types.ForecastingRebuildModelConfiguration, NoneType] = None, wait_to_finish: bool = True, handle_status_poll: Union[Callable[[tim.types.StatusResponse], NoneType], NoneType] = None) -> Union[tim.data_sources.forecast.types.ForecastResultsResponse, tim.types.ExecuteResponse]
The rebuild_forecasting_model_and_execute method registers a forecasting model rebuilding job in the TIM repository and then immediately executes it. It combines the functionality of rebuild_forecasting_model() and execute_forecast(). This method is called on the authenticated instance of Tim created as described in Authentication ("client"), like in the following statement:
client.rebuild_forecasting_model_and_execute(parent_job_id = <parent forecasting job ID>, job_configuration = <job configuration>, wait_to_finish = <wait to finish>, handle_status_poll = <callback function>)
using keyword arguments, or in the following statement:
client.rebuild_forecasting_model_and_execute(<parent forecasting job ID>, <job configuration>, <wait to finish>, <callback function>)
using positional arguments, where <parent forecasting job ID> and <job configuration> are replaced by the ID and Dictionary representing them, <wait to finish> represents an optional boolean indicating whether to wait for the execution to finish before returning, and <callback function> is replaced by an optional callback function for status polling.
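The keyword-argument form above can be wrapped in a small helper, sketched below. The helper name rebuild_and_wait and the callback print_status are hypothetical and not part of the TIM client; the client is passed in as a parameter so that the sketch does not depend on how authentication is set up. The fields of the polled status object are not described in this section, so the callback simply prints whatever it receives.

```python
from typing import Any, Optional


def print_status(status: Any) -> None:
    """Callback for handle_status_poll; it receives each polled status."""
    # The exact fields of the status object are not assumed here.
    print("status update:", status)


def rebuild_and_wait(client: Any, parent_job_id: str,
                     job_configuration: Optional[dict] = None) -> Any:
    """Rebuild the models of an existing job and block until results arrive."""
    return client.rebuild_forecasting_model_and_execute(
        parent_job_id=parent_job_id,
        job_configuration=job_configuration,
        wait_to_finish=True,
        handle_status_poll=print_status,
    )
```

A call would then look like rebuild_and_wait(client, "<parent forecasting job ID>"), with client being the authenticated instance of Tim.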
The arguments are:
- parent_job_id: the ID of a forecasting job (of type build model or rebuild model) in the TIM repository,
- job_configuration: a Dictionary containing model rebuilding and forecasting configuration for the TIM engine. This is an optional argument, available keys are:
- name: the name to give to the forecasting job. This is an optional argument.
- configuration: a Dictionary containing the configuration for the model building and model application process. This is an optional argument. The following keys are available:
- predictionTo: defines how far ahead TIM should forecast (the so-called forecasting horizon); its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- predictionFrom: complements predictionTo to create a forecasting horizon, samples outside of this range are skipped; its keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- modelQuality: controls the model complexity/training time tradeoff. The higher the quality, the longer the required time to build the model zoo. Options are: "Combined", "Low", "Medium", "High", "VeryHigh" and "UltraHigh"; the default value is "Combined", meaning that the quality "VeryHigh" will be used for the intraday and day-ahead forecasts and the "High" quality will be set for further forecasting horizons. This parameter is deprecated and is replaced by setting 'targetOffsets' and 'predictorOffsets',
- targetOffsets: determines the target offsets used in the model building process. The more specific the offsets for each situation are, the longer the required time to build the model zoo. Options are: "None", "Common", "Close" and "Combined". The default setting is "Combined" if allowOffsets is true and predictorOffsets is "Common". "Combined" means that "Close" offsets will be used for the first 2 days from the last target timestamp and "Common" will be used for the rest,
- predictorOffsets: determines the predictor offsets used in the model building process. Options are: "Common" and "Close". The default setting is "Common", which means that common predictor offsets for situations within one day will be used, whereas "Close" means that the closest possible offsets of predictors are used for each situation in the forecasting horizon. "Close" is more time-consuming,
- normalization: a boolean indicating whether predictors are normalized (scaled by their mean and standard deviation); the default setting is true; switching off may help to model data with structural changes,
- maxModelComplexity: determines the maximal possible number of terms in each model in the model zoo. Available values lie between 1 and 100 (inclusive). If not set, TIM will calculate the complexity automatically based on the sampling period of the dataset,
- features: an enumeration of the types of transformations TIM can use during the feature engineering. Available transformations are "ExponentialMovingAverage", "RestOfWeek", "Periodic", "Intercept", "PiecewiseLinear", "TimeOffsets", "Polynomial", "Identity", "SimpleMovingAverage", "Month", "Trend", "DayOfWeek", "Fourier" and "PublicHolidays". If not provided, the TIM Engine will determine the optimal features automatically,
- allowOffsets: a boolean indicating whether the models can use offsets of the variables,
- offsetLimit: the bottom limit for offsets, defining how far offsets are taken into account in the model building process, if not set TIM will determine this automatically,
- memoryLimitCheck: a boolean enabling reducing datasets by dynamically throwing away columns and rows taking into consideration the current RAM state; the default setting is true,
- rebuildingPolicy: a Dictionary to set the policy that determines which models to rebuild. Available keys are:
- type: indicates the type of rebuilding policy, options are "NewSituations", "All" and "OlderThan",
- time: only makes sense when combined with type "OlderThan", a Dictionary to determine the boundary in model age that separates "old" models (models to rebuild) from "new" models (not (yet) to rebuild). Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the default (and minimum) value is 1,
- predictionBoundaries: a Dictionary to set upper and lower boundaries for predictions; if not provided TIM will default to boundaries created based on the Inter-Quartile-Range of the target. Available keys are:
- type: indicates whether explicit boundaries should be used, options are "Explicit" and "None",
- maxValue: the upper boundary,
- minValue: the lower boundary,
- rollingWindow: a Dictionary defining the backtesting behavior; a new backtest forecast is generated every 'n' samples rolling backwards from the last target timestamp. Forecasts generated like this are of the same length as the production forecast generated inside the prediction horizon defined by "predictionTo" and "predictionFrom". If not provided, TIM will use 1 Day for daily cycle data and "predictionTo" otherwise. Available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 1,
- backtest: determines whether in-sample and out-of-sample forecasts should be returned. Possible values are "All", "Production" and "OutOfSample"; "Production" results in only the production forecast being returned; "OutOfSample" returns the out-of-sample forecasts on top of that.
- data: a Dictionary containing the configuration for the data that is used for model building and application. This is an optional argument, available keys are:
- version: a Dictionary containing the key id, referring to the ID of the dataset version to use; if not set, the latest version will be used,
- inSampleRows: a Dictionary or list of Dictionaries defining which samples should be used for model building; if not set, all observations but the ones defined in "outOfSampleRows" will be used; if there is an overlap with the "outOfSampleRows", timestamps in the intersection will be considered out-of-sample. Either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the in-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the in-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- outOfSampleRows: a Dictionary or list of Dictionaries defining which samples should be used to backtest the model zoo; if not set, none will be used. This is an optional argument, and either a relative range or (absolute) ranges can be defined:
- for a relative range, the timestamps of the out-of-sample range are defined as the last period of data counted backwards value baseUnits from the last value of the target variable. The available keys are:
- baseUnit: base unit in which time is counted, one of "Day", "Hour", "Minute", "Second", "Month" and "Sample"; the default value is "Sample",
- value: the number of base units; the minimum value is 0,
- for (absolute) ranges, the timestamps of the out-of-sample range are defined as a list of explicit from-to ranges. A list of Dictionaries should be passed; the available keys in each Dictionary are:
- from: the start timestamp of the range,
- to: the end timestamp of the range,
- imputation: a Dictionary to define the imputation of missing data; if not provided, TIM will default to the "Linear" imputation with a gap length of 6 samples. This is an optional argument, available keys are:
- type: the type of imputation to be applied; options are "Linear", "LOCF" (Last Observation Carried Forward) and "None",
- maxGapLength: the maximum length of gaps that should be imputed; the minimum value is 0,
- columns: a list defining which columns (variables) should be used to build the model; if not set, all columns are selected. Each list item can either be a string (the name of the column) or an integer (the index of the column),
- wait_to_finish: a boolean indicating whether to wait for the execution to finish before returning; this is an optional parameter,
- handle_status_poll: an optional callback function handling polling for the status and progress of the forecasting job execution.
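To make the configuration keys above more concrete, here is a minimal sketch of a job_configuration that rebuilds only models older than 30 days, forecasts two days ahead, and applies explicit prediction boundaries. All values are illustrative. The helper invalid_base_units is a hypothetical local check, not part of the TIM client; it only verifies that every baseUnit in the Dictionary is one of the documented options before the configuration is sent.

```python
ALLOWED_BASE_UNITS = {"Day", "Hour", "Minute", "Second", "Month", "Sample"}

# Illustrative values only; tune them to the dataset at hand.
job_configuration = {
    "name": "rebuild stale models (example)",
    "configuration": {
        "predictionTo": {"baseUnit": "Day", "value": 2},
        # Rebuild only the models that are older than 30 days.
        "rebuildingPolicy": {
            "type": "OlderThan",
            "time": {"baseUnit": "Day", "value": 30},
        },
        "predictionBoundaries": {"type": "Explicit", "maxValue": 5000, "minValue": 0},
        # Generate a new backtest forecast every day, rolling backwards.
        "rollingWindow": {"baseUnit": "Day", "value": 1},
        "backtest": "All",
    },
}


def invalid_base_units(node, path=""):
    """Return the paths of all baseUnit values outside the documented options."""
    problems = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            if key == "baseUnit" and value not in ALLOWED_BASE_UNITS:
                problems.append(child_path)
            else:
                problems.extend(invalid_base_units(value, child_path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(invalid_base_units(item, f"{path}[{index}]"))
    return problems
```

An empty list returned by invalid_base_units(job_configuration) indicates that no unexpected base unit was found; it says nothing about whether the TIM engine accepts the rest of the configuration.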
If wait_to_finish is set to True, this method returns the following data:
- metadata: a Dictionary containing metadata of the forecasting job (or None when the job failed); available keys are:
- id: the ID of the forecasting job,
- name: the name of the forecasting job,
- type: the type of the forecasting job, options are "build-model", "rebuild-model", "rca",
- status: the status of the forecasting job, options are "Finished", "FinishedWithWarning",
- parentId: the ID of the parent job of the forecasting job (if any),
- experiment: a Dictionary containing the key id, referring to the ID of the experiment in which the forecasting job resides,
- useCase: a Dictionary containing the key id, referring to the ID of the use case in which the experiment containing the forecasting job resides,
- dataset: a Dictionary containing the key dataset, which in turn is a Dictionary containing the key id, referring to the ID of the dataset version the forecasting job is based on,
- createdAt: the datetime at which the forecasting job was created,
- executedAt: the datetime at which the forecasting job was executed,
- completedAt: the datetime at which the forecasting job was completed,
- workerVersion: the version of the worker the forecasting job has been executed with,
- errorMeasures: a Dictionary containing the error measures related to the forecasting job; available keys are: all, bin, samplesAhead, each of which contains the keys:
- name: the name of the error measure,
- inSample: the value of the error measure on the in-sample data,
- outOfSample: the value of the error measure on the out-of-sample data,
- registrationBody: a Dictionary containing the settings that were used when registering the forecasting job (e.g. by calling rebuild_forecasting_model()); available keys are:
- name: the name given to the forecasting job upon registration,
- useCase: a Dictionary containing the key id, referring to the ID of the use case the job was registered in,
- configuration: the configuration of the job, see the key configuration in the argument job_configuration of rebuild_forecasting_model() for more information,
- data: the data-related configuration of the job, see the key data in the argument job_configuration of rebuild_forecasting_model() for more information,
- jobLoad: the load of the forecasting job, options are "Light" and "Heavy",
- calculationTime: the amount of time that was required to execute the job,
- model_result: a Dictionary containing information related to the model; available keys are:
- modelVersion: the version of the model,
- model: the model, containing the key Model Zoo, which in turn contains the following keys:
- samplingPeriod: the sampling period of the dataset the model is built with,
- averageTrainingLength: the average number of observations of the target variable that entered the model building process for the model,
- models: a list of Dictionaries containing the models themselves; each dictionary contains the following keys:
- index: the index of the model in the Model Zoo,
- terms: a list of Dictionaries containing the terms that compose the model; for each Dictionary the available keys are:
- importance: the importance (level of contribution) of the term or feature in the model,
- parts: a list of Dictionaries containing the parts that compose the term or feature; the following keys can be available:
- type: the type of the term or feature; this determines the other key(s) that are present,
- predictor: the predictor the term or feature relates to; used together with offset for some terms or features,
- offset: the offset of the predictor the term or feature relates to; used together with predictor for some terms or features,
- value: the value of the term or feature; used instead of predictor and offset for some terms or features,
- period: the period related to the term or feature; used together with unit for some terms or features,
- unit: the unit related to the period of a term or feature; used together with period for some terms or features,
- window: the window of some transformation to a predictor creating some terms or features,
- dayTime: the time of day the model is built to forecast for in case of a daily cycle model, null in case of a non-daily cycle model,
- dataOffsets: Dictionaries containing the offsets of the variables that are used by the model; the available key in each Dictionary is
- the name of the variable, containing a Dictionary with the following keys:
- start: the offset at which the model starts using the variable,
- stop: the offset at which the model stops using the variable,
- samplesAhead: a list containing the number of samples ahead the model is built to forecast,
- modelQuality: the level of quality the model is built for,
- predictionIntervals: a list containing the (lower and upper boundaries of) the prediction intervals,
- lastTargetTimestamp: the last timestamp of the target variable,
- RInv: a mathematical parameter used for the computation of the root cause analysis,
- g: a mathematical parameter used for the computation of the root cause analysis,
- mx: a mathematical parameter used for the computation of the root cause analysis,
- cases: a list of Dictionaries identifying the cases the model is suitable to be used in; available keys in each Dictionary are:
- dayTime: the time of day the model is suitable for forecasting for (for daily cycle models),
- offsets: a Dictionary identifying the offsets of each variable that is used by the model to forecast in this case; available keys are the names of the variables,
- difficulty: a score indicating how difficult (complex) the data to be modeled is,
- targetName: the name of the target variable,
- holidayName: the name of the holiday column, if there is one,
- upperBoundary: the upper boundary for forecasted values,
- lowerBoundary: the lower boundary for forecasted values,
- dailyCycle: whether the model follows a daily cycle,
- variableProperties: a list of Dictionaries containing properties of each of the variables; each dictionary contains the following keys:
- name: the name of the variable,
- min: the minimum value of the variable,
- max: the maximum value of the variable,
- dataFrom: the largest offset of the variable used by the model,
- importance: the importance (level of contribution) of the variable in the model,
- signature: the signature of the model,
- table_result: a DataFrame containing the result table of the forecasting job; available columns are: datetime, date_from, time_from, target, forecast, forecast_type, relative_distance, model_index, samples_ahead, lower_bound, upper_bound and bin; this table is empty for jobs of type "RCA",
- accuracies: a Dictionary containing accuracy measures that can be used to evaluate the model and compare it to other models; available keys are:
- all: a Dictionary containing (aggregated) accuracy measures for all results; available keys are:
- name: the name of the group of accuracy measures; in this case "all",
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- bin: a list of Dictionaries containing accuracy measures aggregated by bin; available keys are:
- name: the name of the bin (samples ahead indicating the start and end of the bin),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- samplesAhead: a list of Dictionaries containing accuracy measures aggregated by the number of samples ahead the model is forecasting for; available keys are:
- name: the name of the group of accuracy measures (the number of samples ahead),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- logs: a list of Dictionaries, each of which contains the following keys:
- message: the actual message of the log,
- messageType: the type of the message; possible values are "Info", "Warning" and "Error",
- createdAt: the time the message was raised,
- origin: the phase of processing during which the message was raised; possible values are "Registration" and "Execution".
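The nested model information described above can be explored with plain Dictionary and list operations. The sketch below ranks variables by importance and counts the terms per model. Both helpers (rank_variables and count_terms) are hypothetical, and they take the Model Zoo Dictionary itself as input, because the exact key under which it is stored inside model_result is not assumed here. The sample data is made up and only mimics the documented shape.

```python
def rank_variables(model_zoo: dict) -> list:
    """Return (name, importance) pairs, most important variable first."""
    properties = model_zoo.get("variableProperties", [])
    pairs = [(item["name"], item["importance"]) for item in properties]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def count_terms(model_zoo: dict) -> dict:
    """Map each model index to the number of terms in that model."""
    return {
        model["index"]: len(model.get("terms", []))
        for model in model_zoo.get("models", [])
    }


# Made-up sample that only mimics the documented structure.
sample_model_zoo = {
    "models": [
        {"index": 1, "terms": [{"importance": 60.0, "parts": []},
                               {"importance": 40.0, "parts": []}]},
        {"index": 2, "terms": [{"importance": 100.0, "parts": []}]},
    ],
    "variableProperties": [
        {"name": "temperature", "importance": 35.5},
        {"name": "load", "importance": 64.5},
    ],
}

print(rank_variables(sample_model_zoo))
print(count_terms(sample_model_zoo))
```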
Upon successful execution of the forecasting job, metadata, model_result, table_result and accuracies will be populated; logs will be returned for any job, including failed jobs.
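Since accuracies is a nested Dictionary and logs is a list of Dictionaries, both can be summarized with a few lines of plain Python. The following sketch collects the out-of-sample MAPE per bin and filters the logs for warnings and errors. The helper names are hypothetical, and the sample values (including the bin names and timestamps) are invented for illustration.

```python
def out_of_sample_mape_by_bin(accuracies: dict) -> dict:
    """Map each bin name to its out-of-sample MAPE (None if unavailable)."""
    result = {}
    for entry in accuracies.get("bin", []):
        measures = entry.get("outOfSample") or {}
        result[entry["name"]] = measures.get("mape")
    return result


def problem_messages(logs: list) -> list:
    """Return the messages of all log entries of type Warning or Error."""
    return [
        entry["message"]
        for entry in logs
        if entry.get("messageType") in ("Warning", "Error")
    ]


# Invented sample data that follows the documented keys.
sample_accuracies = {
    "bin": [
        {"name": "S+1:S+24", "inSample": {"mape": 2.1}, "outOfSample": {"mape": 3.4}},
        {"name": "S+25:S+48", "inSample": {"mape": 2.9}, "outOfSample": {"mape": 4.8}},
    ]
}
sample_logs = [
    {"message": "Job started.", "messageType": "Info",
     "createdAt": "2021-01-01T00:00:00Z", "origin": "Execution"},
    {"message": "Some predictors were dropped.", "messageType": "Warning",
     "createdAt": "2021-01-01T00:00:05Z", "origin": "Execution"},
]

print(out_of_sample_mape_by_bin(sample_accuracies))
print(problem_messages(sample_logs))
```

Because logs are returned even for failed jobs, a check like problem_messages can be run regardless of whether the other parts of the response are populated.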
If wait_to_finish is set to False, this method returns a Dictionary with the following keys:
- message: a message indicating what has happened (the forecasting job has been posted to a queue),
- code: a code providing more information on this message.
If an error is encountered, a similar Dictionary will be returned with the keys message and code containing additional information about the error.
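Because the method can return either the full results or a message Dictionary, calling code may want to tell the two apart before accessing any result fields. The sketch below shows one possible check; the function name is hypothetical, and it assumes that the message response behaves like a regular Python dict at runtime, which is not confirmed by this section.

```python
from typing import Any


def is_message_response(response: Any) -> bool:
    """Return True if the response looks like a {message, code} Dictionary."""
    # Assumption: queue notifications and errors are plain dicts that
    # contain at least the keys "message" and "code".
    return isinstance(response, dict) and {"message", "code"} <= set(response)
```

When the check returns True, the message and code keys describe what happened; otherwise the response can be treated as the results described above.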
get_forecast_results - retrieve the results of an executed forecasting job
get_forecast_results(self, forecast_job_id: str) -> tim.data_sources.forecast.types.ForecastResultsResponse
The get_forecast_results method retrieves the results of an executed forecasting job from the TIM repository. This method is called on the authenticated instance of Tim created as described in Authentication ("client"), like in the following statement:
client.get_forecast_results(forecast_job_id = <forecasting job ID>)
using a keyword argument, or in the following statement:
client.get_forecast_results(<forecasting job ID>)
using a positional argument, where <forecasting job ID> is replaced by the ID of the forecasting job whose results should be retrieved.
The arguments are:
- forecast_job_id: the ID of the forecasting job of which to retrieve the results.
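A typical use of this method is to fetch the results of a job that was executed earlier and condense them into a short summary. The sketch below does this for the job status and the number of log entries. The helper fetch_job_summary is hypothetical, and it assumes that the returned object exposes metadata and logs as attributes; if the response turns out to be a Dictionary, key access would be needed instead.

```python
from typing import Any


def fetch_job_summary(client: Any, forecast_job_id: str) -> dict:
    """Retrieve the results of a job and return a compact summary."""
    results = client.get_forecast_results(forecast_job_id=forecast_job_id)
    # Assumption: the response exposes its parts as attributes.
    metadata = results.metadata
    return {
        "id": forecast_job_id,
        # metadata is None when the job failed.
        "status": metadata["status"] if metadata is not None else None,
        "log_count": len(results.logs),
    }
```

With an authenticated client, fetch_job_summary(client, "<forecasting job ID>") would return a Dictionary with the keys id, status and log_count.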
This method returns the following data:
- metadata: a Dictionary containing metadata of the forecasting job (or None when the job failed); available keys are:
- id: the ID of the forecasting job,
- name: the name of the forecasting job,
- type: the type of the forecasting job, options are "build-model", "rebuild-model", "rca",
- status: the status of the forecasting job, options are "Finished", "FinishedWithWarning",
- parentId: the ID of the parent job of the forecasting job (if any),
- experiment: a Dictionary containing the key id, referring to the ID of the experiment in which the forecasting job resides,
- useCase: a Dictionary containing the key id, referring to the ID of the use case in which the experiment containing the forecasting job resides,
- dataset: a Dictionary containing the key dataset, which in turn is a Dictionary containing the key id, referring to the ID of the dataset version the forecasting job is based on,
- createdAt: the datetime at which the forecasting job was created,
- executedAt: the datetime at which the forecasting job was executed,
- completedAt: the datetime at which the forecasting job was completed,
- workerVersion: the version of the worker the forecasting job has been executed with,
- errorMeasures: a Dictionary containing the error measures related to the forecasting job; available keys are: all, bin, samplesAhead, each of which contains the keys:
- name: the name of the error measure,
- inSample: the value of the error measure on the in-sample data,
- outOfSample: the value of the error measure on the out-of-sample data,
- registrationBody: a Dictionary containing the settings that were used when registering the forecasting job (e.g. by calling build_forecasting_model()); available keys are:
- name: the name given to the forecasting job upon registration,
- useCase: a Dictionary containing the key id, referring to the ID of the use case the job was registered in,
- configuration: the configuration of the job, see the key configuration in the argument job_configuration of build_forecasting_model() for more information,
- data: the data-related configuration of the job, see the key data in the argument job_configuration of build_forecasting_model() for more information,
- jobLoad: the load of the forecasting job, options are "Light" and "Heavy",
- calculationTime: the amount of time that was required to execute the job,
- model_result: a Dictionary containing information related to the model; available keys are:
- modelVersion: the version of the model,
- model: the model, containing the key Model Zoo, which in turn contains the following keys:
- samplingPeriod: the sampling period of the dataset the model is built with,
- averageTrainingLength: the average number of observations of the target variable that entered the model building process for the model,
- models: a list of Dictionaries containing the models themselves; each dictionary contains the following keys:
- index: the index of the model in the Model Zoo,
- terms: a list of Dictionaries containing the terms that compose the model; for each Dictionary the available keys are:
- importance: the importance (level of contribution) of the term or feature in the model,
- parts: a list of Dictionaries containing the parts that compose the term or feature; the following keys can be available:
- type: the type of the term or feature; this determines the other key(s) that are present,
- predictor: the predictor the term or feature relates to; used together with offset for some terms or features,
- offset: the offset of the predictor the term or feature relates to; used together with predictor for some terms or features,
- value: the value of the term or feature; used instead of predictor and offset for some terms or features,
- period: the period related to the term or feature; used together with unit for some terms or features,
- unit: the unit related to the period of a term or feature; used together with period for some terms or features,
- window: the window of some transformation to a predictor creating some terms or features,
- dayTime: the time of day the model is built to forecast for in case of a daily cycle model, null in case of a non-daily cycle model,
- dataOffsets: Dictionaries containing the offsets of the variables that are used by the model; the available key in each Dictionary is
- the name of the variable, containing a Dictionary with the following keys:
- start: the offset at which the model starts using the variable,
- stop: the offset at which the model stops using the variable,
- samplesAhead: a list containing the number of samples ahead the model is built to forecast,
- modelQuality: the level of quality the model is built for,
- predictionIntervals: a list containing the (lower and upper boundaries of) the prediction intervals,
- lastTargetTimestamp: the last timestamp of the target variable,
- RInv: a mathematical parameter used for the computation of the root cause analysis,
- g: a mathematical parameter used for the computation of the root cause analysis,
- mx: a mathematical parameter used for the computation of the root cause analysis,
- cases: a list of Dictionaries identifying the cases the model is suitable to be used in; available keys in each Dictionary are:
- dayTime: the time of day the model is suitable for forecasting for (for daily cycle models),
- offsets: a Dictionary identifying the offsets of each variable that is used by the model to forecast in this case; available keys are the names of the variables,
- difficulty: a score indicating how difficult (complex) the data to be modeled is,
- targetName: the name of the target variable,
- holidayName: the name of the holiday column, if there is one,
- upperBoundary: the upper boundary for forecasted values,
- lowerBoundary: the lower boundary for forecasted values,
- dailyCycle: whether the model follows a daily cycle,
- variableProperties: a list of Dictionaries containing properties of each of the variables; each dictionary contains the following keys:
- name: the name of the variable,
- min: the minimum value of the variable,
- max: the maximum value of the variable,
- dataFrom: the largest offset of the variable used by the model,
- importance: the importance (level of contribution) of the variable in the model,
- signature: the signature of the model,
- table_result: a DataFrame containing the result table of the forecasting job; available columns are: datetime, date_from, time_from, target, forecast, forecast_type, relative_distance, model_index, samples_ahead, lower_bound, upper_bound and bin; this table is empty for jobs of type "RCA",
- accuracies: a Dictionary containing accuracy measures that can be used to evaluate the model and compare it to other models; available keys are:
- all: a Dictionary containing (aggregated) accuracy measures for all results; available keys are:
- name: the name of the group of accuracy measures; in this case "all",
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- bin: a list of Dictionaries containing accuracy measures aggregated by bin; available keys are:
- name: the name of the bin (samples ahead indicating the start and end of the bin),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- samplesAhead: a list of Dictionaries containing accuracy measures aggregated by the number of samples ahead the model is forecasting for; available keys are:
- name: the name of the group of accuracy measures (the number of samples ahead),
- inSample: a Dictionary containing the in-sample accuracy measures (i.e. on the training data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- outOfSample: a Dictionary containing the out-of-sample accuracy measures (i.e. on the validation data); available keys are:
- mae: the mean absolute error,
- mape: the mean absolute percentage error,
- rmse: the root mean squared error,
- accuracy: the accuracy,
- logs: a list of Dictionaries, each of which containing the following keys:
- message: the actual message of the log,
- messageType: the type of the message; possible values are "Info", "Warning" and "Error",
- createdAt: the time the message was raised,
- origin: the phase of processing during which the message was raised; possible values are "Registration" and "Execution".
Upon successful execution of the forecasting job, metadata, model_result, table_result and accuracies will be populated; logs will be returned for any job, including failed jobs.
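As an illustration of how the accuracies key described above might be read, the following minimal sketch collects the out-of-sample measures for each bin. The helper name out_of_sample_by_bin, the bin name and all numeric values are hypothetical; only the key names are taken from the description above. The sketch also assumes that a failed job either omits the accuracies key or leaves it empty.

```python
from typing import Any, Dict, List


def out_of_sample_by_bin(results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one row per bin with its out-of-sample accuracy measures."""
    # Only successful jobs populate accuracies, so tolerate a missing or empty value.
    accuracies = results.get("accuracies") or {}
    rows = []
    for entry in accuracies.get("bin", []):
        measures = entry.get("outOfSample") or {}
        rows.append({"bin": entry.get("name"), **measures})
    return rows


# Hypothetical, hand-written results; real values come from a forecasting job.
example_results = {
    "accuracies": {
        "bin": [
            {
                "name": "example bin",
                "inSample": {"mae": 1.0, "mape": 2.0, "rmse": 1.5, "accuracy": 90.0},
                "outOfSample": {"mae": 1.2, "mape": 2.4, "rmse": 1.8, "accuracy": 88.0},
            }
        ]
    },
    "logs": [],
}

rows = out_of_sample_by_bin(example_results)
print(rows)
```

The same pattern applies to the all and samplesAhead keys, with the difference that all holds a single Dictionary rather than a list.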
get_forecasting_jobs - retrieve a list of forecasting jobs
get_forecasting_jobs(self, offset: Optional[int] = None, limit: Optional[int] = None, sort: Optional[str] = None, experiment_id: Optional[str] = None, use_case_id: Optional[str] = None, type: Optional[str] = None, status: Optional[str] = None, parent_id: Optional[str] = None, from_datetime: Optional[str] = None, to_datetime: Optional[str] = None) -> List[tim.data_sources.forecast.types.ForecastMetadata]
The get_forecasting_jobs method retrieves a list of available forecasting jobs from the TIM repository. This method is called on the authenticated instance of Tim created as described in Authentication ("client"), like in the following statement:
client.get_forecasting_jobs(offset = <offset>, limit = <limit>, sort = <sorting order>, experiment_id = <experiment ID>, use_case_id = <use case ID>, type = <job type>, status = <job status>, parent_id = <parent job ID>, from_datetime = <from datetime>, to_datetime = <to datetime>)
using keyword arguments, or in the following statement:
client.get_forecasting_jobs(<offset>, <limit>, <sorting order>, <experiment ID>, <use case ID>, <type>, <status>, <parent job ID>, <from datetime>, <to datetime>)
using positional arguments, where <offset>, <limit>, <sorting order>, <experiment ID>, <use case ID>, <type>, <status>, <parent job ID>, <from datetime> and <to datetime> are replaced by the relevant values.
The arguments are:
- offset: the number of forecasting jobs to be skipped from the beginning of the list (related to pagination), this is an optional argument with a default value of 0,
- limit: the maximum number of forecasting jobs to be returned, this is an optional argument with a default value of 10000,
- sort: a sorting order to sort results by, possible values are "+createdAt", "-createdAt", "+executedAt", "-executedAt", "+completedAt", "-completedAt", "+priority" and "-priority", where "+" and "-" indicate ascending and descending order, respectively. This is an optional argument with a default value of "-createdAt" (most recently created forecasting jobs are returned first),
- experiment_id: a filter on the ID of the experiment a forecasting job resides in, this is an optional argument with a default value of None,
- use_case_id: a filter on the ID of the use case a forecasting job resides in, this is an optional argument with a default value of None,
- type: a filter on the type of the forecasting job, possible values are any comma-separated list of "build-model", "rebuild-model", "predict" and "rca". This is an optional argument with a default value of None,
- status: a filter on the status of the forecasting job, possible values are any comma-separated list of "Registered", "Queued", "Running", "Finished", "FinishedWithWarning" and "Failed". This is an optional argument with a default value of None,
- parent_id: a filter on the ID of the parent job of the forecasting job(s) to retrieve, this is an optional argument with a default value of None,
- from_datetime: a filter for a minimal date and time of job creation, this is an optional argument with a default value of None,
- to_datetime: a filter for a maximal date and time of job creation, this is an optional argument with a default value of None.
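The offset and limit arguments can be combined to walk through a long list of jobs page by page. The following is a minimal sketch of that idea, not part of the TIM client itself: the helper name fetch_all_forecasting_jobs and its page_size parameter are hypothetical, and client is assumed to be the authenticated instance described in Authentication.

```python
from typing import Any, Dict, List


def fetch_all_forecasting_jobs(
    client: Any, page_size: int = 100, **filters: Any
) -> List[Dict[str, Any]]:
    """Collect forecasting jobs page by page using offset and limit."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    jobs: List[Dict[str, Any]] = []
    offset = 0
    while True:
        # Any extra keyword arguments (status, type, sort, ...) are passed through.
        page = client.get_forecasting_jobs(offset=offset, limit=page_size, **filters)
        jobs.extend(page)
        # A page shorter than page_size means the end of the list was reached.
        if len(page) < page_size:
            return jobs
        offset += page_size
```

A call such as fetch_all_forecasting_jobs(client, status="Finished,FinishedWithWarning", sort="+createdAt") would then return all matching jobs, oldest first, using the filter values listed above.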
This method returns a list of Dictionaries, each of which can include the following data:
- id: the ID of the forecasting job,
- name: the name of the forecasting job,
- type: the type of the forecasting job,
- status: the status of the forecasting job,
- parentJob: a Dictionary containing the following keys:
- id: the ID of the parent job of the forecasting job,
- useCase: a Dictionary containing the following keys:
- id: the ID of the use case the forecasting job resides in,
- experiment: a Dictionary containing the following keys:
- id: the ID of the experiment the forecasting job resides in,
- dataset: a Dictionary containing the following keys:
- version: a Dictionary containing the following keys:
- id: the ID of the dataset version the job is executed on,
- createdAt: the date and time of creation of this forecasting job,
- completedAt: the date and time of completion of this forecasting job,
- executedAt: the date and time of execution of this forecasting job,
- workerVersion: the version of the worker used to execute this forecasting job,
- registrationBody: a Dictionary containing the settings that were used when registering the forecasting job (e.g. by calling build_forecasting_model()); available keys are:
- name: the name given to the forecasting job upon registration,
- useCase: a Dictionary containing the key id, referring to the ID of the use case the job was registered in,
- configuration: the configuration of the job, see the key configuration in the argument job_configuration of build_forecasting_model() for more information,
- data: the data-related configuration of the job, see the key data in the argument job_configuration of build_forecasting_model() for more information.
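Because each returned Dictionary can include only some of the keys listed above, code that processes the list should not assume every key is present. The following minimal sketch summarizes such a list; the helper names count_jobs_by_status and child_job_ids and the sample job IDs are hypothetical, and the sample list is hand-written rather than actual output of get_forecasting_jobs.

```python
from collections import Counter
from typing import Any, Dict, List


def count_jobs_by_status(jobs: List[Dict[str, Any]]) -> Dict[str, int]:
    """Count jobs per status; jobs without a status are counted as 'unknown'."""
    return dict(Counter(job.get("status", "unknown") for job in jobs))


def child_job_ids(jobs: List[Dict[str, Any]], parent_id: str) -> List[Any]:
    """Return the IDs of the jobs whose parentJob has the given ID."""
    return [
        job.get("id")
        for job in jobs
        # parentJob may be absent for jobs that have no parent.
        if (job.get("parentJob") or {}).get("id") == parent_id
    ]


# Hypothetical sample shaped like the list described above.
example_jobs = [
    {"id": "job-1", "type": "build-model", "status": "Finished"},
    {"id": "job-2", "type": "predict", "status": "Failed", "parentJob": {"id": "job-1"}},
    {"id": "job-3", "type": "predict", "status": "Finished", "parentJob": {"id": "job-1"}},
]

status_counts = count_jobs_by_status(example_jobs)
child_ids = child_job_ids(example_jobs, "job-1")
print(status_counts, child_ids)
```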
delete_forecast - delete a forecasting job
delete_forecast(self, forecast_job_id) -> tim.types.ExecuteResponse
The delete_forecast method deletes a forecasting job from the TIM repository. This method is called on the authenticated instance of Tim created as described in Authentication ("client"), like in the following statement:
client.delete_forecast(forecast_job_id = <forecast job ID>)
using keyword arguments, or in the following statement:
client.delete_forecast(<forecast job ID>)
using positional arguments, where <forecast job ID> is replaced by the ID of the forecast job to delete.
The argument is:
- forecast_job_id: the ID of the forecast job to delete.
This method returns a Dictionary with the following keys:
- message: a message indicating what has happened (the forecast job has successfully been deleted),
- code: a code providing more information on this message; if the deletion was successful, this code will be "FC07027".
If an error is encountered, a similar Dictionary will be returned with the keys message and code containing additional information about the error.
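Since a failed deletion also returns a Dictionary, the code key is what tells the two outcomes apart. The following minimal sketch wraps the call and checks for the success code quoted above; the helper name delete_forecast_checked and the constant name are hypothetical, and client is assumed to be the authenticated instance described in Authentication.

```python
from typing import Any

# Success code quoted in the description of the returned Dictionary.
DELETE_SUCCESS_CODE = "FC07027"


def delete_forecast_checked(client: Any, forecast_job_id: str) -> bool:
    """Delete a forecasting job and report whether the deletion succeeded."""
    response = client.delete_forecast(forecast_job_id)
    if response.get("code") == DELETE_SUCCESS_CODE:
        return True
    # On an error, message and code carry additional information about it.
    print(f"Deletion failed: {response.get('code')} {response.get('message')}")
    return False
```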