Battery temperature¶

Title: Battery temperature
Author: Michal Bezak, Tangent Works
Industry: Electronics, Automotive, Manufacturing ...
Area: Customer experience, Operations
Type: Forecasting

Description¶

Accelerating adoption of electric vehicles (EVs) is driving improvements in battery technology at remarkable speed; we have not seen such progress in decades. Bigger capacities, faster charging and longer battery lifespan are in focus. And it is not only EVs that challenge the status quo of battery technology: smartphones, household battery installations, and other areas would all benefit from progress in this field.

One of the factors that affects battery health (lifespan in particular) and capacity is temperature. The amount of energy available for discharge is influenced by temperature: with rising temperature there is more of it (a battery can provide the most at 45°C) and vice versa. However, the optimal temperature for maximizing battery lifespan and usable capacity is between 15 and 35°C. Temperature is also an important factor during charging; to maximize the lifespan of Li-ion batteries, they should not be charged below 0°C.

Nowadays, advanced battery systems rely on cooling/heating mechanisms that help batteries operate (and stay in healthy condition) even in extreme conditions. It is also possible to control current flow and thus help balance battery temperature.

In this use case, we will demonstrate how TIM can predict the temperature of a battery.

TIM can be deployed at the edge and at scale, so we can imagine various deployment scenarios, from the device level (e.g. in an EV) to a cloud to which vast battery grids could be connected.

Business parameters¶

Business objective: Improved quality of products
Business value: Greater value for customers
KPI: -
In [200]:
import logging
import pandas as pd
import plotly.graph_objects as go
import numpy as np
import json
import datetime

from sklearn.metrics import mean_squared_error, mean_absolute_error
import math

import tim_client
In [201]:
with open('credentials.json') as f:
    credentials_json = json.load(f)                     # loading the credentials from credentials.json

TIM_URL = 'https://timws.tangent.works/v4/api'          # URL to which the requests are sent

SAVE_JSON = False                                       # if True - JSON requests and responses are saved to JSON_SAVING_FOLDER
JSON_SAVING_FOLDER = 'logs/'                            # folder where the requests and responses are stored

LOGGING_LEVEL = 'INFO'
In [202]:
level = logging.getLevelName(LOGGING_LEVEL)
logging.basicConfig(level=level, format='[%(levelname)s] %(asctime)s - %(name)s:%(funcName)s:%(lineno)s - %(message)s')
logger = logging.getLogger(__name__)
In [203]:
credentials = tim_client.Credentials(credentials_json['license_key'], credentials_json['email'], credentials_json['password'], tim_url=TIM_URL)
api_client = tim_client.ApiClient(credentials)

Dataset(s)¶

The data contain measurements of multiple Li-ion batteries used in a charge/discharge experiment.

Batteries were continuously operated in cycles - charging, discharging and resting cycles with various modifications. Our focus was on discharging with randomly changing current (random walk); current setpoints were selected from 4.5A, 3.75A, 3A, 2.25A, 1.5A and 0.75A.

During the experiment, each selected current setpoint was applied until either the battery voltage dropped to 3.2V or 5 minutes passed. However, we wanted to focus solely on natural discharge events, not timeouts, so we selected the latter part of the original data, where battery aging/degradation was already visible. A minimal sketch of how such filtering might look is below; steps_df and its step_type / min_voltage columns, as well as the split date, are hypothetical stand-ins, since the actual step labels live in the original MATLAB structures.
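# Hypothetical sketch - 'steps_df', 'step_type' and 'min_voltage' are
# illustrative names, not columns from the actual files.
natural = steps_df[(steps_df['step_type'] == 'discharge') &
                   (steps_df['min_voltage'] <= 3.2)]   # cutoff reached, not timeout
aged = natural[pd.to_datetime(natural['timestamp']) >= '2014-05-01']   # split date illustrative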

Sampling and gaps¶

Values were resampled from the original (almost regular) 1-second sampling to a regular 5-second basis with mean aggregation. For illustration, this step could be reproduced with pandas as below; raw_df (the original ~1-second measurements) is an assumption, not one of the files listed later.
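# Sketch of the resampling step, assuming raw_df holds the original
# ~1-second measurements (a 'timestamp' column plus numeric columns only).
raw_df['timestamp'] = pd.to_datetime(raw_df['timestamp'])
resampled = (raw_df.set_index('timestamp')
                   .resample('5s')   # regular 5-second basis
                   .mean()           # mean aggregation within each bin
                   .reset_index())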

Data may contain gaps; this is the case mainly for the dataset with merged data from multiple batteries (of the same type and parameters). During the experiments, the distribution of current to which the batteries were exposed differed. Merging such data means higher variance of values, but at the same time makes the models built on it more stable.

Data¶

| Column name | Description | Type | Availability |
| --- | --- | --- | --- |
| timestamp | Absolute timestamp of sample | Timestamp column | |
| temperature | Temperature of battery (Celsius) | Predictor | t+0 |
| current | Current measured in Amps | Predictor | t+0 |
| voltage | Voltage measured in Volts | Predictor | t+0 |
| voltage_cumsum | Cumulative sum of voltage within given cycle (Volts) | Predictor | t+0 |
| capacity_removed | Capacity removed within given cycle (current × timeframe) | Predictor | t+0 |
| capacity_removed_cumsum | Cumulative capacity removed within given cycle | Predictor | t+0 |

Regarding ambient temperature: it is not provided, although it is known that for RW10 it was "room temperature", while for RW25 and RW21 it was "approximately 40°C".

The original file was obtained in MATLAB format. It was transformed into CSV, enhanced with additional information derived from the original data (the capacity removed column and the cumulative sum columns), and filtered to discharge cycles only. The derived columns can be reconstructed roughly as in the sketch below; the 5-second timeframe and the cycle_id grouping column are assumptions based on the descriptions above, not names from the CSV files.
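# Sketch of the derived columns, assuming df is already on the 5-second grid
# and 'cycle_id' (an assumed helper column) marks each discharge cycle.
SAMPLE_SECONDS = 5

df['capacity_removed'] = df['current'] * SAMPLE_SECONDS   # current x timeframe

# cumulative sums restart with every discharge cycle
df['capacity_removed_cumsum'] = df.groupby('cycle_id')['capacity_removed'].cumsum()
df['voltage_cumsum'] = df.groupby('cycle_id')['voltage'].cumsum()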

To demonstrate results for the out-of-sample interval, the dataset was resampled to regular sampling so it was possible to match timestamps and calculate residuals. Technically, TIM is capable of working with any kind of sampling, from milliseconds to irregularly sampled data, data with gaps, etc.

Forecasting situation¶

TIM detects the forecasting situation from the current "shape" of the data: if the values for the target and the predictors all end at the same timestamp, this availability pattern is assumed for model building and for the calculation of out-of-sample values as well.

Our forecasting situation assumes we have no forecasted values for the predictors; we simply aim to forecast N steps ahead. Concretely, the uploaded dataset ends with timestamp-only rows spanning the horizon (visible in the data.tail() output further below); a sketch of producing such a tail follows, using the pandas and datetime imports from the top of the notebook.
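# Sketch: append timestamp-only rows for the horizon so TIM infers an
# N-steps-ahead situation with no future predictor values
# (prediction_horizon is set to 12 further below in this notebook).
last_ts = pd.to_datetime(df['timestamp'].iloc[-1])
future = pd.DataFrame({'timestamp': [last_ts + datetime.timedelta(seconds=5 * (i + 1))
                                     for i in range(prediction_horizon)]})
df = pd.concat([df, future], ignore_index=True)   # all other columns become NaN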

CSV files used in experiments can be downloaded here.

Source¶

The raw dataset was obtained from the NASA Prognostics Center of Excellence website.

B. Bole, C. Kulkarni, and M. Daigle, "Randomized Battery Usage Data Set", NASA Ames Prognostics Data Repository (http://ti.arc.nasa.gov/project/prognostic-data-repository), NASA Ames Research Center, Moffett Field, CA. Analysis of a similar dataset is published in: B. Bole, C. Kulkarni, and M. Daigle, "Adaptation of an Electrochemistry-based Li-Ion Battery Model to Account for Deterioration Observed Under Randomized Use", in Proceedings of the Annual Conference of the Prognostics and Health Management Society, 2014.

List of files for each experiment iteration:

In [18]:
datafiles = {
    'RW10': 'datasets/battery_discharge_r_RW10_temperature.csv',
    'RW25': 'datasets/battery_discharge_r_RW25_temperature.csv',
    'RW21': 'datasets/battery_discharge_r_RW21_temperature.csv',
    'merged': 'datasets/battery_discharge_r_merged_temperature.csv',
}
In [19]:
results = dict()
In [162]:
# NOTE: to re-run each iteration, change the key pointing to the CSV file
FOCUS = 'RW10'
In [163]:
data = tim_client.load_dataset_from_csv_file( datafiles.get(FOCUS), sep=',' )
In [164]:
prediction_horizon = 12
In [165]:
data.shape
Out[165]:
(20289, 7)
In [166]:
data.tail(prediction_horizon+1)
Out[166]:
timestamp temperature voltage current capacity_removed capacity_removed_cumsum voltage_cumsum
20276 2014-06-02 06:06:30 35.846852 3.2452 3.75 3.75 41.2496 39.793
20277 2014-06-02 06:06:35 NaN NaN NaN NaN NaN NaN
20278 2014-06-02 06:06:40 NaN NaN NaN NaN NaN NaN
20279 2014-06-02 06:06:45 NaN NaN NaN NaN NaN NaN
20280 2014-06-02 06:06:50 NaN NaN NaN NaN NaN NaN
20281 2014-06-02 06:06:55 NaN NaN NaN NaN NaN NaN
20282 2014-06-02 06:07:00 NaN NaN NaN NaN NaN NaN
20283 2014-06-02 06:07:05 NaN NaN NaN NaN NaN NaN
20284 2014-06-02 06:07:10 NaN NaN NaN NaN NaN NaN
20285 2014-06-02 06:07:15 NaN NaN NaN NaN NaN NaN
20286 2014-06-02 06:07:20 NaN NaN NaN NaN NaN NaN
20287 2014-06-02 06:07:25 NaN NaN NaN NaN NaN NaN
20288 2014-06-02 06:07:30 NaN NaN NaN NaN NaN NaN
In [167]:
timestamp_column = 'timestamp'

target_column = 'temperature'

Visualization

In [168]:
def apply_regular_timeline( df, sec, timestamp_column ):
    # Reindex df onto a regular timeline with steps of `sec` seconds;
    # timestamps missing from df become NaN rows, making gaps visible in plots.
    vis_df = df.copy()

    vis_df[ timestamp_column ] = pd.to_datetime( vis_df[ timestamp_column ] )
    vis_df.set_index( timestamp_column, inplace=True )

    timestamps = [ vis_df.index.min() + i * datetime.timedelta( seconds=sec ) for i in range( int( ( vis_df.index.max() - vis_df.index.min() ).total_seconds()/sec ) + 1 ) ]

    temp_df = pd.DataFrame( { timestamp_column: timestamps } )
    temp_df.set_index( timestamp_column, inplace=True )

    vis_df = temp_df.join( vis_df )
    vis_df[ timestamp_column ] = vis_df.index

    return vis_df.reset_index( drop=True )
In [169]:
last_n = 10000

vis_df = apply_regular_timeline( data.iloc[-last_n:], 5, timestamp_column )

#vis_df = apply_regular_timeline( data.iloc[:], 5, timestamp_column )
In [170]:
fig = go.Figure()

fig.add_trace(go.Scatter( x = vis_df[ timestamp_column ], y = vis_df[ target_column ], name=target_column ) )

#fig.update_layout( height = 700, width = 1200, title='Target visualization' )

fig.update_layout( height = 700, width = 1200, title='Target visualization, last '+str(last_n)+' records' )

fig.show()

Engine settings¶

Parameters that need to be set are:

  • Prediction horizon.
  • Back-test length, to obtain values for out-of-sample interval.

For datasets with many gaps, when the model is expected to use lagged values, it is recommended to set the interpolation length accordingly. If not set, TIM will use the default value of 6.

We also ask the engine for additional data to see details of sub-models, so we define the extendedOutputConfiguration parameter as well.

In [210]:
backtest_length = 14000
In [227]:
configuration_backtest = {
    'usage': {                                 
        'predictionTo': { 
            'baseUnit': 'Sample',             
            'offset': prediction_horizon
        }, 
        'backtestLength': backtest_length   
    }, 
    'interpolation':{'type':'Linear','maxLength':40}, 
    'extendedOutputConfiguration': {
        'returnExtendedImportances': True,
    }
}

Experiment iteration(s)¶

We will run experiments for different datasets; they differ in the distribution of values for current and in the data derived from it:

  • uniform distribution (RW10)
  • high skew (RW25)
  • low skew (RW21)
  • merged (all three in single dataset)

Results for accuracy and plots are shown in the Evaluation section. The iterations below were run by changing FOCUS and re-executing the cells; the same procedure could also be scripted, as in this sketch (it assumes every run finishes successfully - production code would check the returned status):
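# Sketch: run all four iterations in one loop instead of changing FOCUS
# manually, collecting the accuracy metrics per dataset.
for focus, path in datafiles.items():
    data_i = tim_client.load_dataset_from_csv_file(path, sep=',')
    backtest_i = api_client.prediction_build_model_predict(data_i, configuration_backtest)
    results[focus] = backtest_i.aggregated_predictions[1]['accuracyMetrics']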

In [229]:
backtest = api_client.prediction_build_model_predict( data, configuration_backtest )  

backtest.status   
Out[229]:
'FinishedWithWarning'
In [230]:
backtest.result_explanations
Out[230]:
[{'index': 1,
  'message': 'Predictor temperature has a value missing for timestamp 2014-05-29 09:31:55.'},
 {'index': 2,
  'message': 'Using weak model for some timestamp, there is a gap in some of the predictors in its most recent records. Try changing the interpolationLength.'}]

Insights - inspecting ML models¶

Simple and extended importances are available, so you can see to what extent each predictor contributes to explaining the variance of the target variable.

In [231]:
simple_importances = pd.DataFrame.from_dict( backtest.predictors_importances['simpleImportances'], orient='columns' )

simple_importances
Out[231]:
importance predictorName
0 50.79 temperature
1 22.33 current
2 13.85 voltage
3 6.35 capacity_removed
4 4.47 capacity_removed_cumsum
5 2.22 voltage_cumsum
In [232]:
fig = go.Figure()

fig.add_trace(go.Bar( x = simple_importances['predictorName'],
                      y = simple_importances['importance'] ) )

fig.update_layout( width = 1200, height = 700, title='Simple importances' )

fig.show()
In [233]:
extended_importances_temp = backtest.predictors_importances['extendedImportances']
extended_importances_temp = sorted( extended_importances_temp, key = lambda i: i['importance'], reverse=True ) 
extended_importances = pd.DataFrame.from_dict( extended_importances_temp )

extended_importances
Out[233]:
time type termName importance
0 [2] TargetAndTargetTransformation temperature(t-2) 30.63
1 [5] TargetAndTargetTransformation temperature(t-5) 29.04
2 [1] TargetAndTargetTransformation temperature(t-1) 28.55
3 [6] TargetAndTargetTransformation temperature(t-6) 28.31
4 [4] TargetAndTargetTransformation temperature(t-4) 28.19
... ... ... ... ...
235 [2] Interaction capacity_removed(t-12) & voltage_cumsum(t-12) 0.70
236 [4] Interaction cos(2πt / 8.0 hours) & cos(2πt / 6.0 hours) 0.70
237 [2] Interaction temperature(t-12) & capacity_removed_cumsum(t-12) 0.65
238 [2] Interaction temperature(t-12) & voltage_cumsum(t-12) 0.64
239 [1] Interaction capacity_removed(t-12) & voltage_cumsum(t-12) 0.44

240 rows × 4 columns

In [234]:
fig = go.Figure()

fig.add_trace(go.Bar( x = extended_importances[ extended_importances['time'] == '[1]' ]['termName'],
                      y = extended_importances[ extended_importances['time'] == '[1]' ]['importance'] ) )

fig.update_layout(
        title='Model features sorted by importance (model for the 1st step in prediction horizon)',
        width = 1200,
        height = 700
)

fig.show()

fig = go.Figure()

fig.add_trace(go.Bar( x = extended_importances[ extended_importances['time'] == '['+str(prediction_horizon)+']' ]['termName'],
                      y = extended_importances[ extended_importances['time'] == '['+str(prediction_horizon)+']' ]['importance'] ) )

fig.update_layout(
        title='Model features sorted by importance (model for the last step in prediction horizon)',
        width = 1200,
        height = 700
)

fig.show()

Evaluation of results¶

Results for the out-of-sample interval.

In [235]:
def build_evaluation_data( backtest, data, timestamp_column, target_column ):
    out_of_sample_predictions = backtest.aggregated_predictions[1]['values']

    out_of_sample_predictions.rename( columns = {'Prediction':target_column+'_pred'}, inplace=True)

    out_of_sample_timestamps = out_of_sample_predictions.index.tolist()

    evaluation_data = data.copy()

    evaluation_data[ timestamp_column ] = pd.to_datetime(data[ timestamp_column ]).dt.tz_localize('UTC')
    evaluation_data = evaluation_data[ evaluation_data[ timestamp_column ].isin( out_of_sample_timestamps ) ]

    evaluation_data.set_index( timestamp_column,inplace=True)

    evaluation_data = evaluation_data[ [ target_column ] ]

    evaluation_data = evaluation_data.join( out_of_sample_predictions )

    return evaluation_data
In [236]:
def plot_results( e ):
    fig = go.Figure()
   
    fig.add_trace(go.Scatter( x = e.index, y = e.iloc[:,1], name=e.columns[1] ) )
    fig.add_trace(go.Scatter( x = e.index, y = e.iloc[:,0], name=e.columns[0] ) )

    fig.update_yaxes( range=[ int( e.iloc[:,0].min()*0.33 ) ,int( e.iloc[:,0].max()*1.5 ) ] )
    fig.update_layout( height = 700, width = 1200, title='Actual vs. predicted' )

    fig.show()
In [237]:
e = build_evaluation_data( backtest, data, timestamp_column, target_column )
In [238]:
e['timestamp'] = e.index
e['timestamp'] = e['timestamp'].apply( lambda x: datetime.datetime.strftime(x, '%Y-%m-%d %H:%M:%S' ) )

# e
In [239]:
e = apply_regular_timeline( e, 5, 'timestamp')

Accuracy metrics¶

In [240]:
backtest.aggregated_predictions[1]['accuracyMetrics']
Out[240]:
{'MAE': 0.050870133254161995,
 'MSE': 0.006706561483426089,
 'MAPE': 0.137741146601865,
 'RMSE': 0.08189359854974068}
In [241]:
results[FOCUS] = backtest.aggregated_predictions[1]['accuracyMetrics']   # save results for consolidated overview
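
The engine-reported metrics can also be cross-checked locally from the evaluation frame e, using the sklearn and math imports from the top of the notebook; a minimal sketch:

# Local cross-check of MAE/RMSE on the out-of-sample interval; the NaN gap
# rows introduced by the regular timeline are dropped first.
valid = e.dropna(subset=[target_column, target_column + '_pred'])

mae = mean_absolute_error(valid[target_column], valid[target_column + '_pred'])
rmse = math.sqrt(mean_squared_error(valid[target_column], valid[target_column + '_pred']))
print(f'MAE: {mae:.4f}, RMSE: {rmse:.4f}')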

Out-of-sample charts¶

Zoom-in chart to compare actual vs. predicted values.

Iteration with "RW21" dataset

In [110]:
plot_results(e) 

Iteration with "RW25" dataset

In [161]:
plot_results(e) 

Iteration with "RW10" dataset

In [244]:
plot_results(e) 

Iteration with "merged" dataset

In [47]:
plot_results(e) 

Summary¶

In [245]:
pd.DataFrame.from_dict(results)
Out[245]:
merged RW21 RW25 RW10
MAE 0.133795 0.096543 0.123426 0.050870
MSE 0.044368 0.031783 0.037176 0.006707
MAPE 0.302775 0.216050 0.257891 0.137741
RMSE 0.210636 0.178277 0.192811 0.081894

We can see that TIM, with default settings, achieved solid accuracy: out-of-sample MAE ranged from roughly 0.05°C (RW10) to 0.13°C (merged). We used 5-second sampling and predicted values 1 minute (12 × 5 sec.) ahead.

It is assumed that actual values for temperature are available and that the models rely on them in calculations. This use case can be further altered by bringing in additional data (e.g. ambient temperature, more load information) and, at the same time, experimenting with reducing the sensory measurements in the dataset. This would benefit battery manufacturers, as eliminating sensors means cheaper production. As an illustration of the sensor-reduction idea, one could drop a predictor and re-run the backtest to quantify the accuracy cost; a hypothetical sketch (the choice of the voltage columns is arbitrary):
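# Hypothetical follow-up: drop the voltage-based predictors and re-run the
# backtest to quantify the accuracy cost of removing a sensor.
reduced_data = data.drop(columns=['voltage', 'voltage_cumsum'])
backtest_reduced = api_client.prediction_build_model_predict(reduced_data, configuration_backtest)
backtest_reduced.aggregated_predictions[1]['accuracyMetrics']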

This use case worked with (almost) the same dataset as the use case demonstrating forecasting of the remaining time until discharge; we encourage you to check that one as well.