Time left till battery discharge¶

Title: Time left till battery discharge
Author: Michal Bezak, Tangent Works
Industry: Electronics, Automotive, Manufacturing ...
Area: Customer experience, Utility
Type: Forecasting

Description¶

Accelerating adoption of electric vehicles (EVs) is driving improvements in battery technology at a pace not seen in decades. Bigger capacities, faster charging and longer battery lifespans are in focus. It is not only EVs that challenge the battery status quo: smartphones, household battery storage, and other areas would all benefit from progress in this field.

Until battery capacities increase substantially, knowing how much time is left until complete discharge is critical to everyone. This parameter depends on how the vehicle or device is used, the load profile, environmental conditions, etc. Knowing how much time is left helps us plan further actions, e.g. re-planning a route or charging.

In our use case, we will demonstrate how TIM can predict the time left until discharge.

TIM can be deployed on the edge and it scales: deployment scenarios range from the device level to (at scale) the cloud, to which vast battery grids could be connected.

Business parameters¶

| Business objective | Business value | KPI |
| --- | --- | --- |
| Process optimization | Operational decisions timed more accurately | - |
| Improved quality of products | Greater value and utility delivered to customers | - |
In [1]:
import logging
import pandas as pd
import plotly.graph_objects as go
import numpy as np
import json
import datetime

from sklearn.metrics import mean_squared_error, mean_absolute_error
import math

import tim_client
In [2]:
with open('credentials.json') as f:
    credentials_json = json.load(f)                     # loading the credentials from credentials.json

TIM_URL = 'https://timws.tangent.works/v4/api'          # URL to which the requests are sent

SAVE_JSON = False                                       # if True - JSON requests and responses are saved to JSON_SAVING_FOLDER
JSON_SAVING_FOLDER = 'logs/'                            # folder where the requests and responses are stored

LOGGING_LEVEL = 'INFO'
In [3]:
level = logging.getLevelName(LOGGING_LEVEL)
logging.basicConfig(level=level, format='[%(levelname)s] %(asctime)s - %(name)s:%(funcName)s:%(lineno)s - %(message)s')
logger = logging.getLogger(__name__)
In [4]:
credentials = tim_client.Credentials(credentials_json['license_key'], credentials_json['email'], credentials_json['password'], tim_url=TIM_URL)
api_client = tim_client.ApiClient(credentials)
In [5]:
results = dict()

Dataset(s)¶

The data contain measurements of multiple Li-ion batteries used in a charge/discharge experiment.

The batteries were operated continuously in charging, discharging and resting cycles with various modifications. Our focus was on discharging with the current changing randomly (random walk); current setpoints were selected from 4.5A, 3.75A, 3A, 2.25A, 1.5A and 0.75A.

During the experiment, each selected current setpoint was applied until either the battery voltage dropped to 3.2V or 5 minutes passed. Since we wanted to focus solely on natural discharge events, not timeouts, we selected the later part of the original data where battery aging/degradation was already visible.

Sampling and gaps¶

The data were resampled from the original (almost regular) 1-second sampling to a regular 5-second basis with mean aggregation.

The data may contain gaps; this is mainly the case for the dataset with merged data from multiple batteries (of the same type and parameters). During the experiments, the distribution of current to which the batteries were exposed differed. Merging such data means higher variance of values, but at the same time it makes the resulting models more stable.
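For illustration, such a resampling step could look as follows in pandas (a minimal sketch; the raw file name and its columns are assumptions, not the actual preprocessing pipeline used here):

    import pandas as pd

    # hypothetical raw file with ~1-second measurements
    raw = pd.read_csv('raw_battery_measurements.csv', parse_dates=['timestamp'])

    # aggregate onto a regular 5-second grid using the mean of the samples in each bin
    resampled = raw.set_index('timestamp').resample('5S').mean().reset_index()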

Data¶

| Column name | Description | Type | Availability |
| --- | --- | --- | --- |
| timestamp | Absolute timestamp of the sample | Timestamp column | |
| remaining_time_in_cycle | Time left till discharge, in seconds | Target | t-1 |
| temperature | Temperature of the battery (Celsius) | Predictor | t+0 |
| current | Current measured in Amps | Predictor | t+0 |
| voltage | Voltage measured in Volts | Predictor | t+0 |
| voltage_cumsum | Cumulative sum of voltage within given cycle (Volts) | Predictor | t+0 |
| capacity_removed | Capacity removed within given cycle (current x timeframe) | Predictor | t+0 |
| capacity_removed_cumsum | Cumulative capacity removed within given cycle | Predictor | t+0 |

remaining_time_in_cycle was calculated as the time left (at timestamp t) until the voltage dropped close to 3.2V, which essentially marks the end of the cycle.

Ambient temperature values are not provided, although it is known that for RW10 it was "room temperature", while for RW25 and RW21 it was "approximately 40°C".

The original file was obtained in Matlab format; it was transformed into CSV, enhanced with additional information derived from the original data (the capacity removed and cumsum columns), and filtered to random discharge cycles only.
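For illustration, the derived columns could be computed per discharge cycle along these lines (a minimal sketch, assuming a dataframe df with a datetime timestamp column and a hypothetical cycle_id column; the timeframe factor and units are illustrative, not the exact preparation code):

    step_seconds = 5                                                        # regular sampling step (illustrative)

    df['capacity_removed'] = df['current'] * step_seconds                   # current x timeframe
    df['capacity_removed_cumsum'] = df.groupby('cycle_id')['capacity_removed'].cumsum()
    df['voltage_cumsum'] = df.groupby('cycle_id')['voltage'].cumsum()

    # time left (seconds) until the end of the cycle, i.e. until voltage drops close to 3.2V
    cycle_end = df.groupby('cycle_id')['timestamp'].transform('max')
    df['remaining_time_in_cycle'] = (cycle_end - df['timestamp']).dt.total_seconds()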

To demonstrate results for the out-of-sample interval, the dataset was resampled to regular sampling so it was possible to match timestamps and calculate residuals. Technically, TIM is capable of working with any kind of sampling, from millisecond sampling to irregularly sampled data, data with gaps, etc.

Forecasting situation¶

TIM detects the forecasting situation from the current "shape" of the data, i.e. if values for the target and the predictors end at a particular timestamp, it assumes the same pattern of availability for model building and for the calculation of out-of-sample values.

Our forecasting situation assumes that values for all predictors are available at time t, except for the target. We frame the problem as the calculation of the target value based on predictor values only, and we will not rely on any lagged target values.
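The CSVs used below already encode this situation: the last row contains fresh predictor values while the target is missing (visible in the data.tail() output later). As a minimal sketch, assuming a dataframe df holding the dataset, the same pattern could be created like this:

    import numpy as np

    df_for_tim = df.copy()
    # leave the target missing for the timestamp we want TIM to calculate,
    # while all predictor values at time t stay filled in
    df_for_tim.loc[df_for_tim.index[-1], 'remaining_time_in_cycle'] = np.nan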

CSV files used in experiments can be downloaded here.

Source¶

The raw dataset was obtained from the NASA Prognostics Center of Excellence website.

B. Bole, C. Kulkarni, and M. Daigle "Randomized Battery Usage Data Set", NASA Ames Prognostics Data Repository (http://ti.arc.nasa.gov/project/prognostic-data-repository), NASA Ames Research Center, Moffett Field, CA. Analysis of a similar dataset is published in: Brian Bole, Chetan Kulkarni, and Matthew Daigle, "Adaptation of an Electrochemistry-based Li-Ion Battery Model to Account for Deterioration Observed Under Randomized Use", in the proceedings of the Annual Conference of the Prognostics and Health Management Society, 2014

In [7]:
datafiles = {
    'RW10': 'datasets/battery_discharge_r_RW10.csv',
    'RW25': 'datasets/battery_discharge_r_RW25.csv',
    'RW21': 'datasets/battery_discharge_r_RW21.csv',
    'merged': 'datasets/merged_r_RW10_RW25_RW21.csv',
}
In [8]:
results = dict()
In [84]:
# NOTE: re-run the cells below for each iteration, changing this key to select a different CSV file
FOCUS = 'RW10'
In [85]:
data = tim_client.load_dataset_from_csv_file( datafiles.get(FOCUS), sep=',' )

data.tail()
Out[85]:
timestamp remaining_time_in_cycle voltage current temperature capacity_removed capacity_removed_cumsum voltage_cumsum
20284 2014-06-02 06:07:10 9.650 3.24200 2.250200 36.484470 2.250200 83.250000 126.624200
20285 2014-06-02 06:07:15 4.650 3.22000 2.249600 36.533760 2.249600 94.499200 142.767800
20286 2014-06-02 06:07:20 0.575 3.19625 2.592750 36.580737 1.490375 77.425875 117.499750
20287 2014-06-02 06:07:25 5.150 3.26900 2.251333 36.628223 1.500000 2.250000 6.567333
20288 2014-06-02 06:07:30 NaN 3.20960 2.249800 36.650810 1.867130 10.865730 19.448200
In [86]:
data.shape
Out[86]:
(20289, 8)
In [87]:
prediction_horizon = 1
In [88]:
target_column    = 'remaining_time_in_cycle'

timestamp_column = 'timestamp'

Visualization

In [89]:
def apply_regular_timeline( df, sec, timestamp_column ):
    """Re-index df onto a regular timeline with a step of `sec` seconds; missing timestamps become gap rows."""
    vis_df = df.copy()

    vis_df[ timestamp_column ] = pd.to_datetime( vis_df[ timestamp_column ] )
    vis_df.set_index( timestamp_column, inplace=True )

    # generate the complete list of timestamps between the first and the last sample
    timestamps = [ vis_df.index.min() + i * datetime.timedelta( seconds=sec ) for i in range( int( ( vis_df.index.max() - vis_df.index.min() ).total_seconds()/sec ) + 1 ) ]

    temp_df = pd.DataFrame( { timestamp_column: timestamps } )
    temp_df.set_index( timestamp_column, inplace=True )

    # left-join the original data onto the regular timeline (rows without data stay NaN)
    vis_df = temp_df.join( vis_df )
    vis_df[ timestamp_column ] = vis_df.index

    return vis_df.reset_index( drop=True )
In [90]:
last_n = 10000

vis_df = apply_regular_timeline( data.iloc[-last_n:], 5, timestamp_column )
In [91]:
fig = go.Figure()

fig.add_trace(go.Scatter( x = vis_df[ timestamp_column ], y = vis_df[ target_column ], name=target_column ) )

fig.update_layout( height = 700, width = 1200, title='Target visualization (last '+str(last_n)+' records)' )

fig.show()
[Figure: Target visualization (last 10000 records)]

Engine settings¶

Parameters that need to be set are:

  • Prediction horizon = 1, as we want to calculate how much time is left at a given timestamp based on predictor values only.
  • Back-test length, to obtain values for the out-of-sample interval.
  • modelQuality and/or allowOffsets - we must not rely on lagged target values. Setting modelQuality to Medium instructs the TIM engine not to use target offsets during model building; setting allowOffsets to false does the same for all columns in the dataset (including predictors). Experiments were run with both options and accuracy was not significantly different, although it is recommended not to switch off the use of predictor offsets, as they are expected to improve overall accuracy.

We also ask for additional data from the engine to see details of the sub-models, so we define the extendedOutputConfiguration parameter as well.

In [92]:
backtest_length = int( data.shape[0] * .3 ) 

backtest_length
Out[92]:
6086
In [93]:
configuration_backtest = {
    'usage': {                                 
        'predictionTo': { 
            'baseUnit': 'Sample',             
            'offset': prediction_horizon
        }, 
        'modelQuality':  [ {'day':0, 'quality':'Medium'} ],
        'backtestLength': backtest_length   
    }, 
    #'allowOffsets': False,
    'extendedOutputConfiguration': {
        'returnExtendedImportances': True,
    }
}

Experiment iteration(s)¶

We will run experiments for different datasets; they differ in the distribution of values for current and the data derived from it.

  • uniform distribution (RW10)
  • skewed towards high values (RW25)
  • skewed towards low values (RW21)
  • merged (all three in single dataset)

Results for accuracy and plot are shown in Evaluation section at the bottom.

In [94]:
backtest = api_client.prediction_build_model_predict( data, configuration_backtest )  

backtest.status   
Out[94]:
'FinishedWithWarning'
In [95]:
backtest.result_explanations
Out[95]:
[{'index': 1,
  'message': 'Predictor remaining_time_in_cycle has a value missing for timestamp 2014-05-29 09:31:55.'},
 {'index': 2,
  'message': 'There are 495 gaps with median length 39.0 in your target. Consider changing the imputation length to enable more transformations.'}]

Insights - inspecting ML models¶

Simple and extended importances are available so you can see to what extent each predictor contributes to explaining the variance of the target variable.

In [96]:
simple_importances = pd.DataFrame.from_dict( backtest.predictors_importances['simpleImportances'], orient='columns' )

simple_importances
Out[96]:
importance predictorName
0 34.99 voltage
1 21.97 capacity_removed_cumsum
2 19.28 current
3 15.27 voltage_cumsum
4 7.41 capacity_removed
5 1.07 temperature
In [97]:
fig = go.Figure()

fig.add_trace(go.Bar( x = simple_importances['predictorName'],
                      y = simple_importances['importance'] ) )

fig.update_layout( width = 1200, height = 700, title='Simple importances' )

fig.show()
[Figure: Simple importances]
In [98]:
extended_importances_temp = backtest.predictors_importances['extendedImportances']
extended_importances_temp = sorted( extended_importances_temp, key = lambda i: i['importance'], reverse=True ) 
extended_importances = pd.DataFrame.from_dict( extended_importances_temp )

extended_importances
Out[98]:
time type termName importance
0 [1] Interaction (voltage(t) - 3.25)⁺ & (-capacity_removed_cums... 22.42
1 [1] Interaction (current(t) - 1.50)⁺ & (voltage(t) - 3.25)⁺ 10.25
2 [1] Interaction (-capacity_removed_cumsum(t) + 211.67)⁺ & (-vo... 7.24
3 [1] Interaction (voltage(t) - 3.25)⁺ & (-voltage(t) + 3.58)⁺ 6.84
4 [1] Interaction (voltage_cumsum(t) - 432.03)⁺ & voltage_cumsum 6.80
5 [1] Interaction current & (-capacity_removed_cumsum(t) + 211.67)⁺ 6.14
6 [1] Interaction (current(t) - 2.25)⁺ & capacity_removed 3.61
7 [1] Interaction (voltage_cumsum(t) - 432.03)⁺ & (-capacity_rem... 3.54
8 [1] Interaction capacity_removed & (-voltage(t) + 3.58)⁺ 3.30
9 [1] Interaction (-voltage(t) + 3.36)⁺ & voltage_cumsum 3.17
10 [1] Interaction voltage_cumsum & voltage(t-1) 3.14
11 [1] Interaction capacity_removed_cumsum & (voltage(t) - 3.25)⁺ 2.95
12 [1] Interaction (current(t) - 1.50)⁺ & (-capacity_removed_cums... 2.89
13 [1] Interaction capacity_removed_cumsum & capacity_removed 2.86
14 [1] Interaction (-capacity_removed_cumsum(t) + 211.67)⁺ & (-vo... 2.68
15 [1] Interaction (voltage_cumsum(t) - 432.03)⁺ & (-voltage(t) +... 2.60
16 [1] Interaction current & (-voltage(t) + 3.58)⁺ 2.55
17 [1] Interaction (voltage_cumsum(t) - 432.03)⁺ & (-voltage(t) +... 2.41
18 [1] Interaction (-voltage(t) + 3.36)⁺ & current 2.40
19 [1] Interaction capacity_removed & current(t-1) 2.21
In [99]:
fig = go.Figure()

fig.add_trace(go.Bar( x = extended_importances[ extended_importances['time'] == '[1]' ]['termName'],
                      y = extended_importances[ extended_importances['time'] == '[1]' ]['importance'] ) )

fig.update_layout(
        title='Model features sorted by importance',
        width = 1200,
        height = 700
)

fig.show()
[Figure: Model features sorted by importance]

Evaluation of results¶

Results for out-of-sample interval.

In [100]:
def build_evaluation_data( backtest, data, timestamp_column, target_column ):
    # out-of-sample predictions returned by the engine
    out_of_sample_predictions = backtest.aggregated_predictions[1]['values']

    out_of_sample_predictions.rename( columns = {'Prediction':target_column+'_pred'}, inplace=True)

    out_of_sample_timestamps = out_of_sample_predictions.index.tolist()

    evaluation_data = data.copy()

    # keep only the actual values that fall into the out-of-sample interval
    evaluation_data[ timestamp_column ] = pd.to_datetime(data[ timestamp_column ]).dt.tz_localize('UTC')
    evaluation_data = evaluation_data[ evaluation_data[ timestamp_column ].isin( out_of_sample_timestamps ) ]

    evaluation_data.set_index( timestamp_column,inplace=True)

    evaluation_data = evaluation_data[ [ target_column ] ]

    # join actual and predicted values on the timestamp index
    evaluation_data = evaluation_data.join( out_of_sample_predictions )

    return evaluation_data
In [101]:
def plot_results( e ):
    fig = go.Figure()

    fig.add_trace(go.Scatter( x = e.index, y = e.iloc[:,1], name=e.columns[1] ) )   # predicted
    fig.add_trace(go.Scatter( x = e.index, y = e.iloc[:,0], name=e.columns[0] ) )   # actual

    # clip the y-axis range so that occasional spikes do not dominate the chart
    fig.update_yaxes( range=[ int( e.iloc[:,0].min()*0.33 ) ,int( e.iloc[:,0].max()*1.5 ) ] )
    fig.update_layout( height = 700, width = 1200, title='Actual vs. predicted' )

    fig.show()
In [102]:
e = build_evaluation_data( backtest, data, timestamp_column, target_column )
In [103]:
e['timestamp'] = e.index
e['timestamp'] = e['timestamp'].apply( lambda x: datetime.datetime.strftime(x, '%Y-%m-%d %H:%M:%S' ) )

# e
In [104]:
e = apply_regular_timeline( e, 5, 'timestamp')

Accuracy metrics¶

In [105]:
backtest.aggregated_predictions[1]['accuracyMetrics']
Out[105]:
{'MAE': 15.83182040866166,
 'MSE': 992.6355711814798,
 'MAPE': 271.6063984370307,
 'RMSE': 31.506119583050527}
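As a cross-check, similar numbers can be recomputed from the evaluation dataframe e prepared above, using the sklearn metrics imported at the beginning of the notebook (a sketch; gap rows introduced by apply_regular_timeline are dropped first):

    valid = e.dropna( subset=[ target_column, target_column + '_pred' ] )

    mae  = mean_absolute_error( valid[ target_column ], valid[ target_column + '_pred' ] )
    rmse = math.sqrt( mean_squared_error( valid[ target_column ], valid[ target_column + '_pred' ] ) )

    mae, rmse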
In [106]:
results[FOCUS] = backtest.aggregated_predictions[1]['accuracyMetrics']  # store results for overview

Out-of-sample chart¶

Zoom-in chart to compare actual vs. predicted values.

Iteration with "RW21" dataset

In [59]:
plot_results(e) 
[Figure: Actual vs. predicted (RW21)]

Iteration with "RW25" dataset

In [83]:
plot_results(e) 
[Figure: Actual vs. predicted (RW25)]

Iteration with "RW10" dataset

In [107]:
plot_results(e) 
[Figure: Actual vs. predicted (RW10)]

Iteration with "merged" dataset

In [35]:
plot_results(e) 
[Figure: Actual vs. predicted (merged)]

Summary¶

In [247]:
pd.DataFrame.from_dict(results)
Out[247]:
merged RW10 RW25 RW21
MAE 6.489112 15.831820 1.549011 1.009662
MSE 71.892082 992.635571 17.188447 6.530812
MAPE 45.330211 271.606398 13.596176 5.445381
RMSE 8.478920 31.506120 4.145895 2.555545

We demonstrated how TIM, with default mathematical settings, can predict the time left until battery discharge.

This use case can be further altered by bringing in additional data, e.g. ambient temperature, more load information, etc., while at the same time experimenting with reducing the sensory measurements in the dataset. This would benefit battery manufacturers, as eliminating sensors means cheaper production.

In this use case we worked with (almost) the same dataset as the use case demonstrating forecasting of battery temperature; we encourage you to check that one as well.