The accelerating adoption of electric vehicles (EVs) is driving rapid improvements in battery technology; we have not seen such progress in decades. Bigger capacities, faster charging, and longer battery lifespans are the focus. And it is not only EVs that challenge the battery-tech status quo: smartphones, household energy storage, and other applications would all benefit from progress in this field.
One of the factors affecting battery health (lifespan in particular) and capacity is temperature. The amount of energy available for discharge rises with temperature (a battery can provide the most at 45°C) and falls as it drops. However, the optimal temperature for maximizing battery lifespan and usable capacity is between 15°C and 35°C. Temperature is also an important factor during charging: to maximize the lifespan of Li-ion batteries, they should not be charged below 0°C.
Nowadays, advanced battery systems rely on cooling/heating mechanisms that keep batteries operating, and in healthy condition, even in extreme conditions. It is also possible to control the current flow and thus help balance battery temperature.
In this use case, we will demonstrate how TIM can predict battery temperature.
TIM can be deployed at the edge and can scale, so various deployment scenarios are imaginable, from the device level (e.g., in an EV) to a cloud to which vast battery grids could be connected.
Business objective: Improved quality of products
Business value: Greater value for customers
KPI: -
import logging
import pandas as pd
import plotly.graph_objects as go
import numpy as np
import json
import datetime
from sklearn.metrics import mean_squared_error, mean_absolute_error
import math
import tim_client
with open('credentials.json') as f:
credentials_json = json.load(f) # loading the credentials from credentials.json
TIM_URL = 'https://timws.tangent.works/v4/api' # URL to which the requests are sent
SAVE_JSON = False # if True - JSON requests and responses are saved to JSON_SAVING_FOLDER
JSON_SAVING_FOLDER = 'logs/' # folder where the requests and responses are stored
LOGGING_LEVEL = 'INFO'
level = logging.getLevelName(LOGGING_LEVEL)
logging.basicConfig(level=level, format='[%(levelname)s] %(asctime)s - %(name)s:%(funcName)s:%(lineno)s - %(message)s')
logger = logging.getLogger(__name__)
credentials = tim_client.Credentials(credentials_json['license_key'], credentials_json['email'], credentials_json['password'], tim_url=TIM_URL)
api_client = tim_client.ApiClient(credentials)
results = dict()
The data contains measurements of multiple Li-ion batteries used in a charge/discharge experiment.
The batteries were operated continuously in cycles - charging, discharging, and resting, with various modifications. Our focus was on discharging with randomly changing current (random walk); current setpoints were selected from 4.5A, 3.75A, 3A, 2.25A, 1.5A and 0.75A.
During the experiment, each selected current setpoint was applied until either the battery voltage dropped to 3.2V or 5 minutes passed. Since we wanted to focus solely on natural discharge events, not timeouts, we selected the later part of the original data, where battery aging/degradation was already visible.
Values were resampled from the original (almost regular) 1-second sampling to a regular 5-second basis using mean aggregation.
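For reference, this kind of resampling is a one-liner in pandas. The sketch below is illustrative only; it assumes a hypothetical raw frame raw_df with a 'timestamp' column at the original 1-second sampling.
# Illustrative sketch of the 5-second mean resampling (raw_df is hypothetical).
raw_df['timestamp'] = pd.to_datetime( raw_df['timestamp'] )
resampled = raw_df.set_index('timestamp').resample('5S').mean().reset_index()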
The data may contain gaps; this is mainly the case for the dataset merging data from multiple batteries (of the same type and parameters). During the experiments, the distribution of current to which the batteries were exposed differed. Merging such data means higher variance of values, but at the same time it makes the resulting models more stable.
Column name | Description | Type | Availability |
---|---|---|---|
timestamp | Absolute time stamp of sample | Timestamp column | |
temperature | Temperature of battery (Celsius) | Predictor | t+0 |
current | Current measured in Amps | Predictor | t+0 |
voltage | Voltage measured in Volts | Predictor | t+0 |
voltage_cumsum | Cumulative sum of voltage within given cycle (Volts) | Predictor | t+0 |
capacity_removed | Removed capacity within given cycle (current x timeframe) | Predictor | t+0 |
capacity_removed_cumulative | Cumulative capacity reduced within given cycle | Predictor | t+0 |
Regarding ambient temperature: it is not provided, although it is known that for RW10 it was "room temperature", while for RW25 and RW21 it was "approximately 40°C".
The original file was obtained in MATLAB format. It was transformed into CSV, enhanced with additional information derived from the original data (the capacity removed column and the cumulative sum columns), and filtered to discharge cycles only.
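The derived columns can be reproduced with pandas. This is a sketch under the assumption that the discharge rows carry a per-cycle identifier (the 'cycle_id' column here is hypothetical) and that the timeframe follows the original 1-second sampling.
# Illustrative derivation of the added columns (df and 'cycle_id' are hypothetical).
TIMEFRAME = 1  # seconds per original sample, before the 5-second resampling
df['capacity_removed'] = df['current'] * TIMEFRAME
df['capacity_removed_cumsum'] = df.groupby('cycle_id')['capacity_removed'].cumsum()
df['voltage_cumsum'] = df.groupby('cycle_id')['voltage'].cumsum()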
To demonstrate results for the out-of-sample interval, the dataset was resampled to regular sampling so that timestamps could be matched and residuals calculated. Technically, TIM is capable of working with any kind of sampling - from milliseconds to irregularly sampled data, data with gaps, etc.
TIM detects the forecasting situation from the current "shape" of the data, i.e., if the values for the target and all predictors end at the same timestamp, this availability pattern is also assumed for model building and for the calculation of out-of-sample values.
Our forecasting situation assumes we have no forecasted values for the predictors; we simply aim to forecast N steps ahead.
The CSV files used in the experiments can be downloaded here.
The raw dataset was obtained from the NASA Prognostics Center of Excellence website.
B. Bole, C. Kulkarni, and M. Daigle, "Randomized Battery Usage Data Set", NASA Ames Prognostics Data Repository (http://ti.arc.nasa.gov/project/prognostic-data-repository), NASA Ames Research Center, Moffett Field, CA. Analysis of a similar dataset is published in: B. Bole, C. Kulkarni, and M. Daigle, "Adaptation of an Electrochemistry-based Li-Ion Battery Model to Account for Deterioration Observed Under Randomized Use", Proceedings of the Annual Conference of the Prognostics and Health Management Society, 2014.
List of files for each experiment iteration:
datafiles = {
'RW10': 'datasets/battery_discharge_r_RW10_temperature.csv',
'RW25': 'datasets/battery_discharge_r_RW25_temperature.csv',
'RW21': 'datasets/battery_discharge_r_RW21_temperature.csv',
'merged': 'datasets/battery_discharge_r_merged_temperature.csv',
}
# NOTE: to re-run an iteration, change FOCUS to the key of the desired CSV file
FOCUS = 'RW10'
data = tim_client.load_dataset_from_csv_file( datafiles.get(FOCUS), sep=',' )
prediction_horizon = 12
data.shape
(20289, 7)
data.tail(prediction_horizon+1)
timestamp | temperature | voltage | current | capacity_removed | capacity_removed_cumsum | voltage_cumsum | |
---|---|---|---|---|---|---|---|
20276 | 2014-06-02 06:06:30 | 35.846852 | 3.2452 | 3.75 | 3.75 | 41.2496 | 39.793 |
20277 | 2014-06-02 06:06:35 | NaN | NaN | NaN | NaN | NaN | NaN |
20278 | 2014-06-02 06:06:40 | NaN | NaN | NaN | NaN | NaN | NaN |
20279 | 2014-06-02 06:06:45 | NaN | NaN | NaN | NaN | NaN | NaN |
20280 | 2014-06-02 06:06:50 | NaN | NaN | NaN | NaN | NaN | NaN |
20281 | 2014-06-02 06:06:55 | NaN | NaN | NaN | NaN | NaN | NaN |
20282 | 2014-06-02 06:07:00 | NaN | NaN | NaN | NaN | NaN | NaN |
20283 | 2014-06-02 06:07:05 | NaN | NaN | NaN | NaN | NaN | NaN |
20284 | 2014-06-02 06:07:10 | NaN | NaN | NaN | NaN | NaN | NaN |
20285 | 2014-06-02 06:07:15 | NaN | NaN | NaN | NaN | NaN | NaN |
20286 | 2014-06-02 06:07:20 | NaN | NaN | NaN | NaN | NaN | NaN |
20287 | 2014-06-02 06:07:25 | NaN | NaN | NaN | NaN | NaN | NaN |
20288 | 2014-06-02 06:07:30 | NaN | NaN | NaN | NaN | NaN | NaN |
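The trailing NaN rows above are exactly what defines this forecasting situation. The CSV used here already contains them; if your data ended at the last measurement, such rows could be appended as in this illustrative sketch.
# Illustrative: append empty rows for the horizon so TIM forecasts beyond the data.
last_ts = pd.to_datetime( data['timestamp'] ).max()
future_rows = pd.DataFrame( { 'timestamp': [ ( last_ts + datetime.timedelta( seconds=5*i ) ).strftime('%Y-%m-%d %H:%M:%S') for i in range( 1, prediction_horizon+1 ) ] } )
data_extended = pd.concat( [ data, future_rows ], ignore_index=True )  # predictor columns become NaN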
timestamp_column = 'timestamp'
target_column = 'temperature'
Visualization
def apply_regular_timeline( df, sec, timestamp_column ):
    # Re-index the frame onto a regular timeline with `sec`-second steps so that
    # gaps show up as NaN rows (and therefore as gaps in the plots).
    vis_df = df.copy()
    vis_df[ timestamp_column ] = pd.to_datetime( vis_df[ timestamp_column ] )
    vis_df.set_index( timestamp_column, inplace=True )
    timestamps = [ vis_df.index.min() + i * datetime.timedelta( seconds=sec ) for i in range( int( ( vis_df.index.max() - vis_df.index.min() ).total_seconds()/sec ) + 1 ) ]
    temp_df = pd.DataFrame( { timestamp_column: timestamps } )
    temp_df.set_index( timestamp_column, inplace=True )
    vis_df = temp_df.join( vis_df )
    vis_df[ timestamp_column ] = vis_df.index
    return vis_df.reset_index( drop=True )
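As a side note, the same regular timeline can be produced with pandas built-ins; the helper below (a hypothetical variant, not used further) is equivalent to the function above.
def apply_regular_timeline_v2( df, sec, timestamp_column ):
    # Same behaviour as above, using date_range/reindex instead of a manual loop.
    vis_df = df.copy()
    vis_df[ timestamp_column ] = pd.to_datetime( vis_df[ timestamp_column ] )
    vis_df.set_index( timestamp_column, inplace=True )
    full_index = pd.date_range( vis_df.index.min(), vis_df.index.max(), freq=str(sec)+'S' )
    vis_df = vis_df.reindex( full_index )  # missing timestamps become NaN rows
    vis_df[ timestamp_column ] = vis_df.index
    return vis_df.reset_index( drop=True )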
last_n = 10000
vis_df = apply_regular_timeline( data.iloc[-last_n:], 5, timestamp_column )
#vis_df = apply_regular_timeline( data.iloc[:], 5, timestamp_column )
fig = go.Figure()
fig.add_trace(go.Scatter( x = vis_df[ timestamp_column ], y = vis_df[ target_column ], name=target_column ) )
#fig.update_layout( height = 700, width = 1200, title='Target visualization' )
fig.update_layout( height = 700, width = 1200, title='Target visualization, last '+str(last_n)+' records' )
fig.show()
The parameters that need to be set are the prediction horizon, the backtest length, and the interpolation settings.
For datasets with many gaps, when the model is expected to use lagged values, it is recommended to set the interpolation length accordingly. If it is not set, TIM will use the default value of 6.
We also ask the engine for additional data, to see the details of sub-models, so we define the extendedOutputConfiguration parameter as well.
backtest_length = 14000
configuration_backtest = {
    'usage': {
        'predictionTo': {
            'baseUnit': 'Sample',          # horizon is defined in samples
            'offset': prediction_horizon   # 12 samples = 1 minute ahead at 5-second sampling
        },
        'backtestLength': backtest_length  # samples reserved for out-of-sample evaluation
    },
    'interpolation': { 'type': 'Linear', 'maxLength': 40 },  # interpolate gaps up to 40 samples
    'extendedOutputConfiguration': {
        'returnExtendedImportances': True,  # return per-step feature importances
    }
}
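For comparison, once accuracy is validated in backtest, a live forecasting run would reuse the same structure without the backtest part. This is a minimal sketch using only the keys shown above; your production setup may require more.
# Sketch of a forecast-only configuration; reuses keys from the backtest config.
configuration_forecast = {
    'usage': {
        'predictionTo': {
            'baseUnit': 'Sample',
            'offset': prediction_horizon
        }
    },
    'interpolation': { 'type': 'Linear', 'maxLength': 40 },
}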
We will run the experiment for the different datasets; they differ in the distribution of values for current and in the data derived from it.
Results for accuracy, together with plots, are shown in the Evaluation section.
backtest = api_client.prediction_build_model_predict( data, configuration_backtest )
backtest.status
'FinishedWithWarning'
backtest.result_explanations
[{'index': 1, 'message': 'Predictor temperature has a value missing for timestamp 2014-05-29 09:31:55.'}, {'index': 2, 'message': 'Using weak model for some timestamp, there is a gap in some of the predictors in its most recent records. Try changing the interpolationLength.'}]
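The warnings relate to gaps in the data; where the gaps sit can be checked quickly with pandas, as in this illustrative snippet.
# Locate jumps in the 5-second timeline that may trigger the weak-model warning.
ts = pd.to_datetime( data['timestamp'] )
step = ts.diff().dt.total_seconds()
gaps = data.loc[ step > 5, 'timestamp' ]  # first timestamp after each gap
print( str( len(gaps) ) + ' gaps larger than one 5-second step' )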
Simple and extended importances are available, so you can see to what extent each predictor contributes to explaining the variance of the target variable.
simple_importances = pd.DataFrame.from_dict( backtest.predictors_importances['simpleImportances'], orient='columns' )
simple_importances
importance | predictorName | |
---|---|---|
0 | 50.79 | temperature |
1 | 22.33 | current |
2 | 13.85 | voltage |
3 | 6.35 | capacity_removed |
4 | 4.47 | capacity_removed_cumsum |
5 | 2.22 | voltage_cumsum |
fig = go.Figure()
fig.add_trace(go.Bar( x = simple_importances['predictorName'],
y = simple_importances['importance'] ) )
fig.update_layout( width = 1200, height = 700, title='Simple importances' )
fig.show()
extended_importances_temp = backtest.predictors_importances['extendedImportances']
extended_importances_temp = sorted( extended_importances_temp, key = lambda i: i['importance'], reverse=True )
extended_importances = pd.DataFrame.from_dict( extended_importances_temp )
extended_importances
time | type | termName | importance | |
---|---|---|---|---|
0 | [2] | TargetAndTargetTransformation | temperature(t-2) | 30.63 |
1 | [5] | TargetAndTargetTransformation | temperature(t-5) | 29.04 |
2 | [1] | TargetAndTargetTransformation | temperature(t-1) | 28.55 |
3 | [6] | TargetAndTargetTransformation | temperature(t-6) | 28.31 |
4 | [4] | TargetAndTargetTransformation | temperature(t-4) | 28.19 |
... | ... | ... | ... | ... |
235 | [2] | Interaction | capacity_removed(t-12) & voltage_cumsum(t-12) | 0.70 |
236 | [4] | Interaction | cos(2πt / 8.0 hours) & cos(2πt / 6.0 hours) | 0.70 |
237 | [2] | Interaction | temperature(t-12) & capacity_removed_cumsum(t-12) | 0.65 |
238 | [2] | Interaction | temperature(t-12) & voltage_cumsum(t-12) | 0.64 |
239 | [1] | Interaction | capacity_removed(t-12) & voltage_cumsum(t-12) | 0.44 |
240 rows × 4 columns
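To see how the importance mix changes along the prediction horizon, the extended importances can be aggregated per term type and model step with a quick pandas pivot.
# Total importance per term type for each step of the prediction horizon.
extended_importances.groupby( ['time', 'type'] )['importance'].sum().unstack( 'type', fill_value=0 )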
fig = go.Figure()
fig.add_trace(go.Bar( x = extended_importances[ extended_importances['time'] == '[1]' ]['termName'],
y = extended_importances[ extended_importances['time'] == '[1]' ]['importance'] ) )
fig.update_layout(
title='Model features sorted by importance (model for the 1st step in prediction horizon)',
width = 1200,
height = 700
)
fig.show()
fig = go.Figure()
fig.add_trace(go.Bar( x = extended_importances[ extended_importances['time'] == '['+str(prediction_horizon)+']' ]['termName'],
y = extended_importances[ extended_importances['time'] == '['+str(prediction_horizon)+']' ]['importance'] ) )
fig.update_layout(
title='Model features sorted by importance (model for the last step in prediction horizon)',
width = 1200,
height = 700
)
fig.show()
Results for the out-of-sample interval.
def build_evaluation_data( backtest, data, timestamp_column, target_column ):
    # Join the out-of-sample predictions with actual values so residuals can be calculated.
    out_of_sample_predictions = backtest.aggregated_predictions[1]['values']
    out_of_sample_predictions.rename( columns = {'Prediction': target_column+'_pred'}, inplace=True )
    out_of_sample_timestamps = out_of_sample_predictions.index.tolist()
    evaluation_data = data.copy()
    evaluation_data[ timestamp_column ] = pd.to_datetime( data[ timestamp_column ] ).dt.tz_localize('UTC')  # match the engine's timestamps
    evaluation_data = evaluation_data[ evaluation_data[ timestamp_column ].isin( out_of_sample_timestamps ) ]
    evaluation_data.set_index( timestamp_column, inplace=True )
    evaluation_data = evaluation_data[ [ target_column ] ]
    evaluation_data = evaluation_data.join( out_of_sample_predictions )
    return evaluation_data
def plot_results( e ):
    # Plot actual vs. predicted values; the y-axis range is widened for readability.
    fig = go.Figure()
    fig.add_trace(go.Scatter( x = e.index, y = e.iloc[:,1], name=e.columns[1] ) )
    fig.add_trace(go.Scatter( x = e.index, y = e.iloc[:,0], name=e.columns[0] ) )
    fig.update_yaxes( range=[ int( e.iloc[:,0].min()*0.33 ), int( e.iloc[:,0].max()*1.5 ) ] )
    fig.update_layout( height = 700, width = 1200, title='Actual vs. predicted' )
    fig.show()
e = build_evaluation_data( backtest, data, timestamp_column, target_column )
e['timestamp'] = e.index
e['timestamp'] = e['timestamp'].apply( lambda x: datetime.datetime.strftime(x, '%Y-%m-%d %H:%M:%S' ) )
# e
e = apply_regular_timeline( e, 5, 'timestamp')
backtest.aggregated_predictions[1]['accuracyMetrics']
{'MAE': 0.050870133254161995, 'MSE': 0.006706561483426089, 'MAPE': 0.137741146601865, 'RMSE': 0.08189359854974068}
results[FOCUS] = backtest.aggregated_predictions[1]['accuracyMetrics'] # save results for consolidated overview
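As a sanity check, the engine's metrics can be recomputed from the evaluation frame with sklearn (rows made empty by the regular timeline are dropped first).
# Recompute MAE and RMSE from actual vs. predicted values.
pairs = e[ [ target_column, target_column+'_pred' ] ].dropna()
mae = mean_absolute_error( pairs[ target_column ], pairs[ target_column+'_pred' ] )
rmse = math.sqrt( mean_squared_error( pairs[ target_column ], pairs[ target_column+'_pred' ] ) )
print( 'MAE: ' + str( round( mae, 4 ) ) + ', RMSE: ' + str( round( rmse, 4 ) ) )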
Zoom in on the chart to compare actual vs. predicted values.
Iteration with "RW21" dataset
plot_results(e)
Iteration with "RW25" dataset
plot_results(e)
Iteration with "RW10" dataset
plot_results(e)
Iteration with "merged" dataset
plot_results(e)
pd.DataFrame.from_dict(results)
merged | RW21 | RW25 | RW10 | |
---|---|---|---|---|
MAE | 0.133795 | 0.096543 | 0.123426 | 0.050870 |
MSE | 0.044368 | 0.031783 | 0.037176 | 0.006707 |
MAPE | 0.302775 | 0.216050 | 0.257891 | 0.137741 |
RMSE | 0.210636 | 0.178277 | 0.192811 | 0.081894 |
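The consolidated overview can also be visualized; the snippet below charts RMSE for the four iterations.
# Compare RMSE across all iterations.
overview = pd.DataFrame.from_dict( results )
fig = go.Figure()
fig.add_trace( go.Bar( x = overview.columns, y = overview.loc['RMSE'] ) )
fig.update_layout( width = 1200, height = 700, title='RMSE per dataset' )
fig.show()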
We can see that TIM, with default settings, achieved remarkable accuracy. We used 5-second sampling and predicted values 1 minute (12 x 5 sec.) ahead.
It is assumed that actual values for temperature are available and that the models rely on them in their calculations. This use case can be further extended by bringing in additional data, e.g., ambient temperature or more load information, while at the same time experimenting with removing sensory measurements from the dataset. This would benefit battery manufacturers, as eliminating sensors means cheaper production.
This use case worked with (almost) the same dataset as the use case demonstrating forecasting of the remaining time until discharge; we encourage you to check that one as well.