Predictive maintenance for Aircraft Engines (Forecasting)¶

Title: Predictive maintenance for Aircraft Engines
Author: Michal Bezak, Tangent Works
Industry: Manufacturing, Transportation
Area: Predictive maintenance
Type: Forecasting

Description¶

Predictive maintenance covers a variety of areas, including but not limited to:

  • prediction of failure,
  • root cause analysis,
  • failure detection,
  • failure type classification,
  • recommendation of mitigation or maintenance actions after failure.

To avoid downtime caused by the malfunction of technical equipment, companies follow a maintenance schedule to discover and fix potential issues. Nevertheless, with more data available (from installed sensors), there are new possibilities to improve the maintenance process. With AI/ML it is possible to further optimize the maintenance schedule.

This has tangible financial implications; imagine two extreme situations:

  • Maintenance planned for later than needed - the equipment reaches the point of failure and disrupts operations; every minute of downtime can bring big losses and delays to other parts of the process, not to mention the implications for people's health or lives.
  • Maintenance performed too often or too soon - the time and material spent cost more than the situation required, and if a component is replaced completely, the capital invested into the machine/components is not utilized to its full potential.

There is zero tolerance for failure of aircraft engines, so they are checked rather sooner than later. This makes them a very good example for our sample Use case.

In this Use case we will solve the following problem: how much useful life (time) remains for a given engine (until it fails)? This kind of problem is also known as predicting Remaining Useful Life (RUL). From an ML perspective it can be framed as a forecasting problem.

Business parameters¶

| Business objective | Business value | KPI |
| --- | --- | --- |
| Financial: Maximize value of capital, minimize operational costs | Lower cost, capital utilization | - |
| Operations: Increase aircraft’s safety | Enhanced personnel/passengers safety | - |
In [5]:
import logging
import pandas as pd
import plotly as plt
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import json
import datetime

from sklearn.metrics import mean_squared_error, mean_absolute_error
import math

import tim_client

Credentials and logging

(Do not forget to fill in your credentials in the credentials.json file)

In [6]:
with open('credentials.json') as f:
    credentials_json = json.load(f)                     # loading the credentials from credentials.json

TIM_URL = 'https://timws.tangent.works/v4/api'          # URL to which the requests are sent

SAVE_JSON = False                                       # if True - JSON requests and responses are saved to JSON_SAVING_FOLDER
JSON_SAVING_FOLDER = 'logs/'                            # folder where the requests and responses are stored

LOGGING_LEVEL = 'INFO'
In [7]:
level = logging.getLevelName(LOGGING_LEVEL)
logging.basicConfig(level=level, format='[%(levelname)s] %(asctime)s - %(name)s:%(funcName)s:%(lineno)s - %(message)s')
logger = logging.getLogger(__name__)
In [8]:
credentials = tim_client.Credentials(credentials_json['license_key'], credentials_json['email'], credentials_json['password'], tim_url=TIM_URL)
api_client = tim_client.ApiClient(credentials)

api_client.save_json = SAVE_JSON
api_client.json_saving_folder_path = JSON_SAVING_FOLDER
[INFO] 2021-02-08 17:30:27,019 - tim_client.api_client:save_json:66 - Saving JSONs functionality has been disabled
[INFO] 2021-02-08 17:30:27,020 - tim_client.api_client:json_saving_folder_path:75 - JSON destination folder changed to logs

Dataset¶

The data contain operational information about 100 aircraft engines, consolidated into a single table with columns for:

  • operational settings (3),
  • measurements from sensors (21),
  • cycle no.

The train part of the dataset contains information about when (at which cycle) each engine failed; the RUL variable was created based on this.
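
For illustration, the RUL derivation can be sketched as follows (a minimal sketch on hypothetical data; the raw per-engine frame with an engine_id column mirrors the original raw files, not the consolidated file):

import pandas as pd

# Hypothetical raw per-engine records: each engine's last recorded cycle is its failure cycle.
raw = pd.DataFrame({
    'engine_id': [1, 1, 1, 2, 2],
    'cycle':     [1, 2, 3, 1, 2],
})

# RUL at a given cycle = number of cycles remaining until that engine's failure cycle.
raw['RUL'] = raw.groupby('engine_id')['cycle'].transform('max') - raw['cycle']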

Sampling¶

Data are sampled on a daily basis.

Data¶

The original raw data files were consolidated into a single time series file - the train file is used for the in-sample interval only, and the test file is used for the out-of-sample interval.

Structure of CSV file:

| Column name | Description | Type | Availability |
| --- | --- | --- | --- |
| timestamp | Date | Timestamp column | |
| RUL | Remaining useful life (in days) | Target | t-1 |
| cycle | Operations cycle no. | Predictor | t+1 |
| setting1 ... setting3 | Operational settings | Predictor | t+1 |
| s1 ... s21 | Sensor measurements | Predictor | t+1 |

Due to the consolidation into a single file, timestamps were re-generated to avoid duplicates and to be continuous (they are not real timestamps from operations records).

Please note that the engine_id column is not present in the file because it is an identifier, not a real measurement/setting.
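
A minimal sketch of such a consolidation (illustrative data and an arbitrary start date; as noted above, the generated timestamps are not real operational timestamps):

import pandas as pd

# Hypothetical per-engine records to be consolidated.
raw = pd.DataFrame({
    'engine_id': [1, 1, 1, 2, 2],
    'cycle':     [1, 2, 3, 1, 2],
})

# Concatenate engines in order and generate a continuous daily timestamp to avoid duplicates.
consolidated = raw.sort_values(['engine_id', 'cycle']).reset_index(drop=True)
consolidated['timestamp'] = pd.date_range('2000-01-01', periods=len(consolidated), freq='D')

# Drop the identifier column - it is not a real measurement/setting.
consolidated = consolidated.drop(columns=['engine_id'])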

Data situation¶

If we want TIM to forecast the target, the very last record of the target must be kept empty (NaN/None). TIM will then use all available predictors to predict the value for that record. Furthermore, this situation is replicated to calculate results for all out-of-sample records during back-testing.
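
For illustration, a minimal sketch of this preparation (assuming a pandas DataFrame data with the structure described above, as loaded later in this notebook):

import numpy as np

# Keep the very last record of the target empty so TIM forecasts it from the predictors.
data.loc[data.index[-1], 'RUL'] = np.nan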

CSV files used in experiments can be downloaded here.

Source¶

Raw data files were acquired from Microsoft's server:

http://azuremlsamples.azureml.net/templatedata/PM_train.txt

http://azuremlsamples.azureml.net/templatedata/PM_test.txt

http://azuremlsamples.azureml.net/templatedata/PM_truth.txt

In [9]:
fpath = 'data_rul.csv'
data = tim_client.load_dataset_from_csv_file( fpath , sep=',')

data.tail()
Out[9]:
timestamp RUL cycle setting1 setting2 setting3 s1 s2 s3 s4 ... s12 s13 s14 s15 s16 s17 s18 s19 s20 s21
33722 2102-08-29 24.0 194 0.0049 0.0000 100.0 518.67 643.24 1599.45 1415.79 ... 520.69 2388.00 8213.28 8.4715 0.03 394 2388 100.0 38.65 23.1974
33723 2102-08-30 23.0 195 -0.0011 -0.0001 100.0 518.67 643.22 1595.69 1422.05 ... 521.05 2388.09 8210.85 8.4512 0.03 395 2388 100.0 38.57 23.2771
33724 2102-08-31 22.0 196 -0.0006 -0.0003 100.0 518.67 643.44 1593.15 1406.82 ... 521.18 2388.04 8217.24 8.4569 0.03 395 2388 100.0 38.62 23.2051
33725 2102-09-01 21.0 197 -0.0038 0.0001 100.0 518.67 643.26 1594.99 1419.36 ... 521.33 2388.08 8220.48 8.4711 0.03 395 2388 100.0 38.66 23.2699
33726 2102-09-02 NaN 198 0.0013 0.0003 100.0 518.67 642.95 1601.62 1424.99 ... 521.07 2388.05 8214.64 8.4903 0.03 396 2388 100.0 38.70 23.1855

5 rows × 27 columns

In [10]:
data.shape
Out[10]:
(33727, 27)

Visualization of data.

In [13]:
fig = go.Figure()

fig.add_trace(go.Scatter( x = data.index,
                      y = data['RUL']
             ))

fig.update_layout(
        height = 700,
        width = 1200,
        title = 'RUL'
)

fig.show()
In [12]:
timestamp_column = 'timestamp'
target_column = 'RUL'

Engine settings¶

Parameters that need to be set are:

  • Prediction horizon (Sample + 1) - because we want to predict the next value only.
  • Back-test length (which defines the out-of-sample interval) - set explicitly to match the exact size of the test dataset.
  • Model quality - by setting it to 'Medium' we prevent TIM from relying on target lags (offsets).

We also ask the engine for additional data to see the details of sub-models, so we define the extendedOutputConfiguration parameter as well.

In [25]:
backtest_length = 13096
In [26]:
configuration_backtest = {
    'usage': {                                 
        'predictionTo': { 
            'baseUnit': 'Sample',             
            'offset': 1
        }, 
        'modelQuality':[ {'day':0,'quality':'Medium'} ],
        'backtestLength': backtest_length  
    },
    'extendedOutputConfiguration': {
        'returnExtendedImportances': True,
    }
}

Experiment iteration(s)¶

We will evaluate the results from two perspectives:

  1. evaluate only the last data point per engine,
  2. evaluate all points in the out-of-sample interval.

Iteration 1¶

In [27]:
backtest = api_client.prediction_build_model_predict(data, configuration_backtest)     # running the RTInstantML forecasting using data and defined configuration
backtest.status                                                                        # status of the job
Out[27]:
'Finished'
In [28]:
backtest.result_explanations
Out[28]:
[]
In [54]:
out_of_sample_predictions = backtest.aggregated_predictions[1]['values']
out_of_sample_predictions.rename( columns = {'Prediction':target_column+'_pred'}, inplace=True)
In [55]:
out_of_sample_timestamps = out_of_sample_predictions.index.tolist()
In [56]:
evaluation_data = data.copy()

evaluation_data[ timestamp_column ] = pd.to_datetime(data[ timestamp_column ]).dt.tz_localize('UTC')
evaluation_data = evaluation_data[ evaluation_data[ timestamp_column ].isin( out_of_sample_timestamps ) ]

evaluation_data.set_index( timestamp_column,inplace=True)

evaluation_data = evaluation_data[ [ target_column,'cycle' ] ]

evaluation_data = evaluation_data.join( out_of_sample_predictions )

evaluation_data.head()
Out[56]:
RUL cycle RUL_pred
timestamp
2066-10-24 00:00:00+00:00 0.0 200 0.000
2066-10-25 00:00:00+00:00 142.0 1 210.700
2066-10-26 00:00:00+00:00 141.0 2 225.228
2066-10-27 00:00:00+00:00 140.0 3 217.445
2066-10-28 00:00:00+00:00 139.0 4 206.110
In [57]:
# Raw data contains records for 100 engines, and each engine's records end at the cycle at which it failed. In our consolidated file we can therefore detect the last record of each engine by a sudden drop in the cycle value.

# Filter the evaluation data to keep only the last record for each engine.
evaluation_data['cycle_diff'] = evaluation_data['cycle'].diff(-1)      # current minus next value; positive only at an engine's last record, where the next engine's cycle counter restarts
evaluation_data = evaluation_data[ evaluation_data['cycle_diff']>0 ]

evaluation_data.head()
Out[57]:
RUL cycle RUL_pred cycle_diff
timestamp
2066-10-24 00:00:00+00:00 0.0 200 0.0000 199.0
2066-11-24 00:00:00+00:00 112.0 31 181.0570 30.0
2067-01-12 00:00:00+00:00 98.0 49 139.5480 48.0
2067-05-18 00:00:00+00:00 69.0 126 64.9790 125.0
2067-09-01 00:00:00+00:00 82.0 106 89.7947 105.0
In [58]:
evaluation_data.shape
Out[58]:
(100, 4)

Insights - inspecting ML models¶

Simple and extended importances are available for you to see to what extent each predictor contributes to explaining the variance of the target variable.

In [59]:
simple_importances = backtest.predictors_importances['simpleImportances']
simple_importances = sorted(simple_importances, key = lambda i: i['importance'], reverse=True) 

simple_importances = pd.DataFrame.from_dict( simple_importances )
In [60]:
fig = go.Figure()

fig.add_trace(go.Bar( x = simple_importances['predictorName'],
                      y = simple_importances['importance'] )
             )

fig.update_layout(
        title='Simple importances'
)

fig.show()
In [61]:
extended_importances = backtest.predictors_importances['extendedImportances']
extended_importances = sorted(extended_importances, key = lambda i: i['importance'], reverse=True) 

extended_importances = pd.DataFrame.from_dict( extended_importances )

#extended_importances
In [62]:
fig = go.Figure()

fig.add_trace(go.Bar( x = extended_importances[ extended_importances['time'] == '[1]' ]['termName'],
                      y = extended_importances[ extended_importances['time'] == '[1]' ]['importance'] )
             )

fig.update_layout(
        title='Features generated from predictors used by model',
        height = 700
)

fig.show()

Evaluation of results¶

Results for out-of-sample interval.

In [64]:
fig = go.Figure()

fig.add_trace(go.Scatter( x = evaluation_data.index,
                      y = evaluation_data[target_column],
                      name='Actual'
             ))
fig.add_trace(go.Scatter( x = evaluation_data.index,
                      y = evaluation_data[target_column+'_pred'],
                      name='Predicted'
             ))

fig.update_layout(
        title = 'Predicted vs. Actual',
        height = 700,
        width = 1200
)

fig.show()
In [40]:
rmse_last = math.sqrt( mean_squared_error( evaluation_data[target_column], evaluation_data[target_column+'_pred']) )
rmse_last
Out[40]:
25.3202808752172
In [41]:
mae_last = mean_absolute_error( evaluation_data[target_column], evaluation_data[target_column+'_pred'])
mae_last
Out[41]:
19.709900600000005
In [42]:
mape_last = abs( ( evaluation_data[target_column] - evaluation_data[target_column+'_pred'] ) / evaluation_data[target_column] ).mean()
mape_last
Out[42]:
0.36884273224467423

Iteration 2¶

In [43]:
# backtest = api_client.prediction_build_model_predict(data, configuration_backtest)     # running the RTInstantML forecasting using data and defined configuration
# backtest.status                                                                        # status of the job
In [44]:
# backtest.result_explanations

In this iteration we will evaluate all values in the out-of-sample interval.

In [45]:
out_of_sample_predictions = backtest.aggregated_predictions[1]['values']
out_of_sample_predictions.rename( columns = {'Prediction':target_column+'_pred'}, inplace=True)
In [46]:
out_of_sample_timestamps = out_of_sample_predictions.index.tolist()
In [47]:
evaluation_data = data.copy()

evaluation_data[ timestamp_column ] = pd.to_datetime(data[ timestamp_column ]).dt.tz_localize('UTC')
evaluation_data = evaluation_data[ evaluation_data[ timestamp_column ].isin( out_of_sample_timestamps ) ]

evaluation_data.set_index( timestamp_column,inplace=True)

evaluation_data = evaluation_data[ [ target_column,'cycle' ] ]

evaluation_data = evaluation_data.join( out_of_sample_predictions )

evaluation_data.head()
Out[47]:
RUL cycle RUL_pred
timestamp
2066-10-24 00:00:00+00:00 0.0 200 0.000
2066-10-25 00:00:00+00:00 142.0 1 210.700
2066-10-26 00:00:00+00:00 141.0 2 225.228
2066-10-27 00:00:00+00:00 140.0 3 217.445
2066-10-28 00:00:00+00:00 139.0 4 206.110
In [48]:
evaluation_data.shape
Out[48]:
(13096, 3)

Insights - inspecting ML models¶

This iteration uses the very same models/settings as the previous one, so we will skip the visualization.

Evaluation of results¶

Results for out-of-sample interval.

In [49]:
fig = go.Figure()

fig.add_trace(go.Scatter( x = evaluation_data.index,
                      y = evaluation_data[target_column],
                      name='Actual'
             ))
fig.add_trace(go.Scatter( x = evaluation_data.index,
                      y = evaluation_data[target_column+'_pred'],
                      name='Predicted'
             ))

fig.update_layout(
        title = 'Predicted vs. Actual',
        height = 700,
        width = 1200
)

fig.show()
In [50]:
rmse_all = math.sqrt( mean_squared_error( evaluation_data[target_column], evaluation_data[target_column+'_pred']) )
rmse_all
Out[50]:
40.39381016475072
In [51]:
mae_all = mean_absolute_error( evaluation_data[target_column], evaluation_data[target_column+'_pred'])
mae_all
Out[51]:
30.787960507238854
In [52]:
mape_all = abs( ( evaluation_data[target_column] - evaluation_data[target_column+'_pred'] ) / evaluation_data[target_column] ).mean()
mape_all
Out[52]:
0.2475012954893914
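
To compare the two perspectives side by side, here is a minimal sketch using the metric variables computed in the two iterations above:

import pandas as pd

# Collect the metrics from both evaluation perspectives into one overview table.
comparison = pd.DataFrame(
    {
        'last point per engine': [rmse_last, mae_last, mape_last],
        'all out-of-sample points': [rmse_all, mae_all, mape_all],
    },
    index=['RMSE', 'MAE', 'MAPE'],
)
comparison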

Summary¶

In this Use case, we demonstrated how TIM can be used in predictive maintenance scenarios with little effort. Without further optimisation of math settings, TIM delivered its predictions of RUL in a fraction of the time.