Contact centers rely on a pool of resources ready to help customers when they reach out via call, email, chat, or another channel. Predicting the volume of incoming requests at specific times is a critical input to resource scheduling (very short- and short-term horizons) and resource management (mid- to long-term horizons). A typical short-term task is predicting volumes for the next 7 days, hour by hour. A high-quality forecast brings confidence that the FTEs (full-time equivalents, a measure of the workload of an employed person) planned for the next week are just right for delivering on SLAs. It also brings other benefits, such as higher confidence when planning absences (due to vacation, education, etc.) and better morale among employees, who no longer face overload from "sudden" volume peaks.
To build a high-quality forecast, it is necessary to gather relevant, valid data with predictive power. With such data in place, an ML technology like TIM RTInstantML can build models for time-series data in a fraction of the usual time.
In this sample use case we showcase how TIM can predict request volumes for the next 7 days on an hourly basis.
Business objective | Business value | KPI |
---|---|---|
Reduce risk of resource shortage | Optimal resource planning | - |
Reduce risk of not meeting SLAs | Better customer relations, lower/no penalties | - |
Reduce effort spent on forecasting | Freed-up capacity of highly skilled personnel | - |
import logging
import pandas as pd
import plotly as plt
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import json
import tim_client
with open('credentials.json') as f:
    credentials_json = json.load(f)  # loading the credentials from credentials.json
TIM_URL = 'https://timws.tangent.works/v4/api' # URL to which the requests are sent
SAVE_JSON = False # if True - JSON requests and responses are saved to JSON_SAVING_FOLDER
JSON_SAVING_FOLDER = 'logs/' # folder where the requests and responses are stored
LOGGING_LEVEL = 'INFO'
level = logging.getLevelName(LOGGING_LEVEL)
logging.basicConfig(level=level, format='[%(levelname)s] %(asctime)s - %(name)s:%(funcName)s:%(lineno)s - %(message)s')
logger = logging.getLogger(__name__)
credentials = tim_client.Credentials(credentials_json['license_key'], credentials_json['email'], credentials_json['password'], tim_url=TIM_URL)
api_client = tim_client.ApiClient(credentials)
api_client.save_json = SAVE_JSON
api_client.json_saving_folder_path = JSON_SAVING_FOLDER
The dataset contains information about request volumes, temperature, public holidays, the number of regular customers, marketing campaigns, the number of customers whose contracts will expire within the next 30 or 60 days, the number of invoices sent, a flag indicating whether invoices are sent at a given timestamp, and a flag indicating whether the contact center is open at a given timestamp.
The data is sampled hourly.
Structure of CSV file:
Column name | Description | Type | Availability |
---|---|---|---|
Date | Timestamp | Timestamp column | |
Volumes | No. of requests | Target | t+0 |
Temperature | Temperature in Celsius | Predictor | t+168 |
PublicHolidays | Binary flag for holidays | Predictor | t+168 |
IsOpen | Binary flag to show if contact center is open at given timestamp | Predictor | t+168 |
IsMktingCampaign | Binary flag to show if product team is running marketing campaign at given timestamp | Predictor | t+168 |
ContractsToExpireIn30days | No. of regular contracts that will expire within 30 days | Predictor | t+168 |
ContractsToExpireIn60days | No. of regular contracts that will expire within 60 days | Predictor | t+168 |
RegularCustomers | No. of active contracts for regular customers | Predictor | t+168 |
InvoiceDay | Binary flag to show if invoices are sent at given timestamp | Predictor | t+168 |
InvoicesSent | No. of invoices sent at given timestamp | Predictor | t+168 |
We want to predict volumes for the next 7 days, hour by hour, with the time of prediction at 23:00 every day. This situation is reflected in the values present in the CSV file: the target is available up to the time of prediction (t+0), while the predictors are known 168 hours, i.e. 7 days, ahead (t+168). TIM simulates this situation throughout the whole out-of-sample interval to calculate accuracy metrics.
The CSV file used in this experiment can be downloaded here.
This is a synthetic dataset, generated by simulating the outcome of events relevant to the operations of a contact center.
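To verify the scenario encoded in the file, one can check how many rows contain predictor values but no target yet. This is a minimal sketch, assuming the CSV follows the convention described above (the target column is left empty throughout the forecasting horizon):
# A minimal sketch (assumption: rows in the forecasting horizon keep their
# predictor values but leave the 'Volumes' target empty)
raw = pd.read_csv('data2B.csv', sep=',')
horizon = raw[raw['Volumes'].isna()]          # rows still to be predicted
print('Samples to predict:', len(horizon))    # expected: 7 * 24 = 168
print('Horizon starts at:', horizon['Date'].iloc[0])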
data = tim_client.load_dataset_from_csv_file('data2B.csv', sep=',')
data
target_column = 'Volumes'
timestamp_column = 'Date'
fig = go.Figure()
fig.add_trace( go.Scatter( x=data[ timestamp_column ], y=data[ target_column ] ) )
fig.update_layout( width=1300, height=700, title='Volumes' )
fig.show()
Two parameters need to be set: the prediction horizon (predictionTo) and the length of the backtesting interval (backtestLength). We also ask the engine for additional data to see details of the sub-models, so we define the extendedOutputConfiguration parameter as well.
back_test_length = int( data.shape[0] * .33 )  # use the last 33% of the data for out-of-sample backtesting
prediction_horizon_samples = 7*24              # 7 days ahead at hourly sampling = 168 samples
configuration_backtest = {
'usage': {
'predictionTo': {
'baseUnit': 'Sample',
'offset': prediction_horizon_samples
},
'backtestLength': back_test_length
},
'extendedOutputConfiguration': {
'returnExtendedImportances': True
}
}
# Build the model and calculate predictions in a single call
backtest = api_client.prediction_build_model_predict(data, configuration_backtest)
backtest.status
backtest.result_explanations
Simple and extended importances are available, showing to what extent each predictor, and each feature generated from the predictors, contributes to explaining the variance of the target variable.
simple_importances = backtest.predictors_importances['simpleImportances']
simple_importances = sorted(simple_importances, key = lambda i: i['importance'], reverse=True)
simple_importances = pd.DataFrame.from_dict( simple_importances )
fig = go.Figure()
fig.add_trace( go.Bar( x = simple_importances['predictorName'],
y = simple_importances['importance'] ) )
fig.update_layout(
title='Simple importances',
width = 1200,
height = 700
)
fig.show()
extended_importances = backtest.predictors_importances['extendedImportances']
extended_importances = sorted(extended_importances, key = lambda i: i['importance'], reverse=True)
extended_importances = pd.DataFrame.from_dict( extended_importances )
extended_importances[ extended_importances['time']=='11:00:00' ]
fig = go.Figure()
fig.add_trace( go.Bar( x = extended_importances[ extended_importances['time'] == '11:00:00' ]['termName'],
y = extended_importances[ extended_importances['time'] == '11:00:00' ]['importance'] ) )
fig.update_layout(
title='Features generated from predictors used by model for 11:00',
width = 1200,
height = 700
)
fig.show()
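The filter above inspects the sub-model for a single hour. As a small additional sketch (not part of the original workflow), the hourly importances can also be averaged across all sub-models to obtain a single overall ranking of features, assuming extended_importances has one row per feature and per hourly sub-model, as shown by the filter above:
# A small additional sketch: average each feature's importance across all
# hourly sub-models to get one overall ranking
avg_importances = ( extended_importances
                        .groupby('termName')['importance']
                        .mean()
                        .sort_values(ascending=False) )
print( avg_importances.head(10) )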
Results for the in-sample and out-of-sample intervals.
# Helper function: merges actual and predicted values together
def create_eval_df( predictions ):
    data2 = data.copy()
    data2[ timestamp_column ] = pd.to_datetime( data2[ timestamp_column ] ).dt.tz_localize('UTC')
    data2.rename( columns={ timestamp_column: 'Timestamp' }, inplace=True )
    data2.set_index( 'Timestamp', inplace=True )
    eval_data = data2[ [ target_column ] ].join( predictions, how='inner' )
    return eval_data
# RMSE for each day of the prediction horizon, in-sample interval
for i in range(0, 7):
    print( 'Day:', i+1, 'RMSE:', backtest.aggregated_predictions[i]['accuracyMetrics']['RMSE'] )
# backtest.aggregated_predictions[0]['type'], backtest.aggregated_predictions[6]['type']
edf = create_eval_df( backtest.aggregated_predictions[0]['values'] )
fig = go.Figure()
fig.add_trace( go.Scatter( x = edf.index, y=edf['Prediction'], name='In-Sample') )
fig.add_trace( go.Scatter( x = edf.index, y=edf[ target_column ], name='Actual') )
fig.update_layout( width=1200, height=700, title='Actual vs. predicted' )
fig.show()
# RMSE for each day of the prediction horizon, out-of-sample interval
for i in range(7, 14):
    print( 'Day:', i-6, 'RMSE:', backtest.aggregated_predictions[i]['accuracyMetrics']['RMSE'] )
# backtest.aggregated_predictions[7]['type'], backtest.aggregated_predictions[13]['type']
edf = create_eval_df( backtest.aggregated_predictions[7]['values'] )
fig = go.Figure()
fig.add_trace( go.Scatter( x = edf.index, y=edf['Prediction'], name='Out-of-Sample') )
fig.add_trace( go.Scatter( x = edf.index, y=edf[ target_column ], name='Actual') )
fig.update_layout( width=1200, height=700, title='Actual vs. predicted' )
fig.show()
We demonstrated how TIM can automate the forecasting of a key input to resource planning and scheduling: the volume of incoming requests.
The predictors used, and their quality, play a vital role in building such a forecasting system; it is therefore assumed that cooperation is established with the LoBs (lines of business) that possess the relevant information, preferably including forecasted values of the predictors.
Contact centers that support multiple channels through which customers can submit queries may benefit from forecasts from various perspectives. With TIM RTInstantML it is possible to build a new model and make predictions for each such perspective, e.g. volume per channel (incoming calls, social media messages, emails, etc.), volume per region, or consolidated volumes. Equally, the need for various prediction horizons does not mean any additional burden for TIM; depending on the sampling of your data, you can predict from minutes to years ahead.
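A hypothetical sketch of this idea follows, reusing the client and workflow shown above. The per-channel file names ('data_calls.csv', 'data_email.csv') and their horizons are illustrative assumptions, not part of the sample dataset:
# A hypothetical sketch: reusing the workflow above for other perspectives.
# 'data_calls.csv' and 'data_email.csv' are assumed per-channel datasets with
# the same structure as data2B.csv; the horizons are illustrative only.
perspectives = [ ('data_calls.csv', 7*24), ('data_email.csv', 24) ]
for file_name, horizon in perspectives:
    channel_data = tim_client.load_dataset_from_csv_file( file_name, sep=',' )
    configuration = { 'usage': { 'predictionTo': { 'baseUnit': 'Sample', 'offset': horizon } } }
    result = api_client.prediction_build_model_predict( channel_data, configuration )
    print( file_name, result.status )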