Methods

Unsupervised and Supervised Learning Methods in the Context of Predictive Maintenance

Overview

The following sections describe the different methodologies provided by the SysWatch framework.

1 Operating Modes

We use clustering methods to determine operating mode conditions. Subsequently, the cluster centers are assigned to predefined states.

To this end, the number of clusters k has to be set; note that it must be greater than or equal to the number of operating modes. SysWatch calculates the centers of all k clusters and automatically assigns each data point to an operating mode. Note that the found centers then have to be assigned to the modes by the user.
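The clustering step can be sketched with a plain k-means in Python (a generic illustration, not SysWatch's implementation; the data, names and the naive initialization are our assumptions):

```python
import math

def kmeans(points, k, iters=20):
    """Plain k-means sketch: naive initialization with the first k points."""
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # assign each point to its nearest center
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # recompute each center as the mean of its assigned points
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels

# two synthetic operating modes: "idle" near (0, 0) and "load" near (10, 5)
data = [(0.1, 0.2), (0.0, -0.1), (0.2, 0.0), (10.1, 5.2), (9.9, 4.8), (10.0, 5.1)]
centers, labels = kmeans(data, k=2)
# the user then assigns each found center to a predefined mode, e.g.
modes = {j: ("idle" if c[0] < 5 else "load") for j, c in enumerate(centers)}
```

The final assignment of centers to modes remains a manual step, as described above; the `modes` dictionary only illustrates it.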

Training

Step 1: Define your operating modes

Step 2: Choose your training preferences

  • Step Size: Numerical value of the mesh size, dependent on the time unit (default: minutes). The smaller the value, the more reference points arise, which has a negative impact on performance.

  • Time Unit: Time parameter for the aggregation of the data set (seconds, minutes, hours). The time unit sets the bin width.

  • Aggregation: Aggregation for each reference data point in time, dependent on time unit, step size and embedding window. The aggregation has a strong effect on the state space.

  • Training Range: Start and end datetime of the reference set in the training phase.

  • Algorithms: Clustering algorithms (K-Nearest, DBSCAN). The parameter K represents the number of clusters.

Step 3: Assign the calculated centers to the predefined modes

Evaluation

In the monitoring process, every measurement record is flagged with one of the modes defined above. There is also a rule for the case that sensor measurements are not transferred to the server: a default category called “DownTime”. If a gap in the data is more than ten times larger than the sampling rate, the missing mesh points are flagged as “DownTime”.
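The gap rule can be sketched as follows (a minimal illustration; timestamps are given in seconds and the names are ours):

```python
def flag_downtime(timestamps, sampling_rate, factor=10):
    """Return the gaps (start, end) whose length exceeds factor * sampling_rate.
    Mesh points falling inside such a gap are flagged as "DownTime"."""
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        if curr - prev > factor * sampling_rate:
            gaps.append((prev, curr))
    return gaps

# sampling rate 60 s; one outage between t=180 and t=1980 (gap of 1800 s > 600 s)
ts = [0, 60, 120, 180, 1980, 2040]
downtime = flag_downtime(ts, sampling_rate=60)
```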

2 Anomaly Detection

State Space

The appropriate generation of a state space is essential. The point cloud is determined by the following choices:

  • parameters

  • aggregation

  • operating mode filter

Training

Step 0: Choose an anomaly detection approach

Step 1: Choose your parameters and the appropriate aggregation

Step 2: Choose your training preferences

  • Step Size: Numerical value of the mesh size, dependent on the time unit (default: minutes). The smaller the value, the more reference points arise, which has a negative impact on performance.

  • Time Unit: Time parameter for the aggregation of the data set (seconds, minutes, hours). The time unit sets the bin width.

  • Embedding Window: Number of embedded points in time for each datetime. A high value increases the autoregressive influence, which increases the accuracy but decreases the performance.

  • Aggregation: Aggregation for each reference data point in time, dependent on time unit, step size and embedding window. The aggregation has a strong effect on the state space.

  • Training Range: Start and end datetime of the reference set in the training phase.

  • Algorithms: Auto-associative kernel regressions (Closest, K-Nearest, Kernel) and neural networks (autoencoder).

  • Reference Days: Number of days in the reference set for the next evaluation. This is a memory time span that determines how far the algorithm looks into the past.

  • Validation Days: Number of days that get evaluated before retraining.
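The auto-associative kernel regression listed under “Algorithms” can be sketched in its K-Nearest variant as follows (a simplified textbook-style illustration, not SysWatch's implementation; kernel choice, bandwidth and names are our assumptions):

```python
import math

def aakr_estimate(query, references, k=3, bandwidth=1.0):
    """Reconstruct `query` from its k nearest reference vectors, weighted
    by a Gaussian kernel; the residual serves as the anomaly signal."""
    nearest = sorted((math.dist(query, r), r) for r in references)[:k]
    weights = [math.exp(-(d / bandwidth) ** 2) for d, _ in nearest]
    total = sum(weights)
    estimate = tuple(
        sum(w * r[i] for w, (_, r) in zip(weights, nearest)) / total
        for i in range(len(query))
    )
    residual = math.dist(query, estimate)
    return estimate, residual

refs = [(1.0, 2.0), (1.1, 2.1), (0.9, 1.9), (1.0, 2.2)]
_, r_normal = aakr_estimate((1.0, 2.0), refs)   # inside the reference cloud
_, r_anomaly = aakr_estimate((5.0, 7.0), refs)  # far away, large residual
```

A small residual means the point is well explained by healthy reference data; a large residual indicates anomalous behaviour.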

Retraining rhythm

The anomaly detection method is mainly controlled by two parameters: validation days (VD) and history days (HD). Starting at the first element of a time series, every VD steps the process renews its training/reference and validation sets. The first part is the reference set, which includes the data of the last HD days (the maximum size of the reference set is given by the VD value). The second part, the validation set, includes the data of the next VD days. Each data point in this set is evaluated and the results are archived in a database. In a periodic rhythm (determined by the parameter VD) the reference set is updated and the new validation set is evaluated. This routine repeats until the last training record is reached. With this procedure, an automatic retraining is available.

Training and validation process in the case of anomaly detection.
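The rolling reference/validation scheme can be sketched as follows (indices stand for daily mesh points; the names and the handling of the first, still-empty reference window are our assumptions):

```python
def rolling_windows(n_days, hd, vd):
    """Yield (reference, validation) index windows: the reference set holds
    the last `hd` days, the validation set the next `vd` days; the process
    then steps forward by `vd` days and repeats."""
    windows = []
    start = 0
    while start + vd <= n_days:
        reference = list(range(max(0, start - hd), start))
        validation = list(range(start, start + vd))
        windows.append((reference, validation))
        start += vd
    return windows

# ten days of data, HD = 4, VD = 2: five windows, reference capped at 4 days
wins = rolling_windows(n_days=10, hd=4, vd=2)
```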

Monitoring

At a defined monitoring frequency, the service checks whether new records have entered the database. If there are new measurements, the monitoring service starts a new calculation for each active model and automatically evaluates the new results.

3 Pattern Recognition

The pattern recognition technique enables us to learn conspicuous operating occurrences.

  • First, the user has to flag a number of similar patterns in a limited time span (see figure 10).

  • Subsequently, our software shifts the flagged patterns in time until they are synchronized (see figure 11).

  • A pattern recognition model can then be determined (see figure 12).

  • The pattern recognition model learns the pattern that represents pre-critical behaviour. It is possible to transfer the model to other assets (of the same type).

Figure 12: The training results are shown by the red probability curve, which represents the identification of the learned pattern.
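The synchronization step can be sketched as a brute-force shift search (a simplified illustration of the idea, not SysWatch's algorithm; the score, names and data are our assumptions):

```python
def best_shift(reference, candidate, max_shift=3):
    """Find the shift (in samples) that best aligns `candidate` with
    `reference`, scored by the mean squared difference over the overlap."""
    def score(shift):
        pairs = [(reference[i], candidate[i + shift])
                 for i in range(len(reference))
                 if 0 <= i + shift < len(candidate)]
        return sum((a - b) ** 2 for a, b in pairs) / len(pairs)
    return min(range(-max_shift, max_shift + 1), key=score)

ref = [0, 0, 1, 3, 1, 0, 0, 0]
cand = [0, 0, 0, 0, 1, 3, 1, 0]   # same pattern, two samples later
shift = best_shift(ref, cand)
```

Each flagged pattern would be shifted by its best offset before the model is trained on the synchronized set.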

State Space

All properties are the same as in the supervised learning case. The only difference is the property “Horizon”: it has to be zero, so we are not looking for pre-critical states but rather for the flagged behaviour itself.

Training

The training step is the same as for the supervised forecast, with only the one restriction mentioned above.

Monitoring

At a defined monitoring frequency, the monitoring service checks whether new records have been entered into the database. If there are new measurements, it starts a new calculation for each active model and automatically evaluates the new results. The results are stored in its database.

Retraining

Every time new data gets stored, the service retrains a model (using the specific state space) if the period between the last (re)training and the actual time of the data is greater than the size* of the partitions in the training phase.

4 Event Prediction

State Space

Figure 8: State Space Definitions describing the most meaningful model properties.

  • Step Size: Numerical value of the mesh size, dependent on the time unit (default: minutes). The smaller the value, the more reference points arise, which has a negative impact on performance.

  • Time Unit: Time parameter for the aggregation of the data set (seconds, minutes, hours). The time unit sets the bin width.

  • Embedding Window: Number of embedded points in time for each datetime. A high value increases the autoregressive influence, which increases the accuracy but decreases the performance.

  • Training Start: Start datetime of the reference set in the training phase.

  • Training End: Last datetime that gets evaluated in the training phase.

  • Horizon: Prognosis horizon for output signals, dependent on the time unit and actual period. A high value increases the prognosis horizon but decreases the performance.

  • Algorithms: See the description of nonlinear regressions.

  • Cross Validation: Number of cross-validation folds / number of data sets (reference and validation). The separation of the data set prevents the learning algorithm from overfitting.

The definition of the state space is very similar to the case of anomaly detection, but there are three important differences. The first concerns the validation procedure (see subchapter “Training”). The second concerns the relation between inputs and outputs: the goal of the supervised forecast is to find a relation between all chosen inputs and one or more selected outputs. The embedding window influences the input vector, while (and this is the third difference) the horizon only influences the output vector. For every point in time, the algorithm tries to predict the behaviour of the output(s) over a time period (= SS * H) that depends on the time unit (TU), the horizon (H) and the step size (SS).
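How the embedding window and the horizon shape one training sample can be sketched as follows (our own simplified indexing, not SysWatch's exact scheme; EW and H are counted in steps of size SS):

```python
def make_sample(series, t, ew, h):
    """At time index t, the input is the last `ew` values (embedding window)
    and the target is the next `h` values (prognosis horizon)."""
    x = series[t - ew + 1 : t + 1]   # autoregressive input vector
    y = series[t + 1 : t + 1 + h]    # output vector to be predicted
    return x, y

series = list(range(10))             # toy signal: 0, 1, ..., 9
x, y = make_sample(series, t=4, ew=3, h=2)
# x covers time steps 2..4, y covers steps 5..6; with step size SS the
# predicted period spans SS * H time units
```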

Training

Figure 9: Training process in the case of supervised forecast.

The data set is divided into a number of cross-validation (CV) partitions of the same size (note that this parameter is also used for retraining). For a given partition, the algorithm is trained on the CV-1 other ones. Afterwards, the partition is evaluated by the calibrated model and the results are stored in a database. This step is repeated CV times, once for each partition. Finally, a specific key performance indicator (see MSE in chapter “Key Performance Indicators”) is calculated for each model, and the model with the best key performance indicator (lowest MSE) is the one used in the monitoring service.
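The cross-validation loop can be sketched with a toy model (here simply the mean of the training data; a generic illustration, not SysWatch's nonlinear regressions):

```python
def cross_validate(data, cv):
    """Split `data` into cv equal partitions; for each one, "train" a toy
    model (the mean of the other partitions) and score the held-out
    partition by mean squared error (MSE)."""
    size = len(data) // cv
    folds = [data[i * size : (i + 1) * size] for i in range(cv)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        model = sum(train) / len(train)                       # fit step
        mse = sum((x - model) ** 2 for x in held_out) / len(held_out)
        scores.append(mse)
    return sum(scores) / cv                                   # average MSE

score = cross_validate([1.0, 2.0, 3.0, 2.0, 1.0, 2.0], cv=3)
```

In the real training phase, each fold yields a calibrated model; the one with the lowest MSE is kept for monitoring.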

Monitoring

At a defined monitoring frequency, the service checks whether new records have been stored in the database. If there are new measurements, the monitoring service starts a new calculation for each active model and automatically evaluates the new results. The results are stored in its database.

Retraining

Every time new data gets stored, the service retrains a model (using the specific state space) if the period between the last (re)training and the actual time of the data is greater than the size* of the partitions in the training phase.
