Methods
Unsupervised and Supervised Learning Methods in the Context of Predictive Maintenance
The following tables illustrate the different methodologies provided by the SysWatch framework.
We use clustering methods to determine operating mode conditions. Subsequently, the cluster centers are assigned to predefined states.
Therefore, the number of clusters has to be set. Note that this number has to be greater than or equal to the number of operating modes. SysWatch calculates the centers of all k clusters and automatically assigns each data point to an operating mode. Note again that the found centers have to be assigned to the modes by the user (see the sketch after the steps below).
Step 1: Define your operating modes
Step 2: Choose your training preferences
Step 3: Assign the calculated centers to the predefined modes
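The following sketch illustrates the idea with scikit-learn's KMeans. It is a minimal example, not the SysWatch implementation; the sensor data, the number of clusters k and the mode labels are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative sensor data: rows are measurement records, columns are parameters
# (e.g. temperature, pressure, vibration; hypothetical values).
X = np.random.rand(1000, 3)

# Step 2: k has to be greater than or equal to the number of operating modes
# (here: three modes, k = 4).
k = 4
model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)

# Step 3: the user assigns each calculated center to a predefined mode.
# This mapping is a hypothetical manual assignment.
mode_of_cluster = {0: "Idle", 1: "PartLoad", 2: "PartLoad", 3: "FullLoad"}

# Each data point is categorized automatically via its nearest cluster center.
modes = [mode_of_cluster[c] for c in model.predict(X)]
```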
In the monitoring process, every measurement record is flagged with one of the modes defined above. There is also a rule for the case that sensor measurements are not transferred to the server: such records fall into a default category called "DownTime". If a gap is larger than ten times the sampling rate, the non-existent mesh points are flagged as "DownTime".
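As an illustration of the gap rule, the following sketch reconstructs the missing mesh points with pandas and flags them as "DownTime". The ten-times threshold follows the text above; the column names and the sampling rate are assumptions.

```python
import pandas as pd

def flag_downtime(timestamps: pd.Series, sampling_rate: pd.Timedelta) -> pd.DataFrame:
    """Flag missing mesh points inside large gaps as 'DownTime'."""
    ts = timestamps.sort_values().reset_index(drop=True)
    flagged = []
    for prev, curr in zip(ts[:-1], ts[1:]):
        if curr - prev > 10 * sampling_rate:  # gap larger than ten times the sampling rate
            # Reconstruct the non-existent mesh points on the regular grid.
            missing = pd.date_range(start=prev + sampling_rate,
                                    end=curr - sampling_rate, freq=sampling_rate)
            flagged.extend(missing)
    return pd.DataFrame({"timestamp": flagged, "mode": "DownTime"})

# Usage (hypothetical): flag_downtime(df["timestamp"], pd.Timedelta(minutes=1))
```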
The appropriate generation of a state space is fundamental. The point cloud is generated from the following components (a minimal sketch follows the list):
parameters
aggregation
operating mode filter
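The sketch below shows how such a point cloud could be assembled with pandas; the parameter layout, the aggregation function (mean) and the mode column are assumptions for illustration.

```python
import pandas as pd

def build_state_space(df: pd.DataFrame, step_size: str = "5min",
                      mode: str = "FullLoad") -> pd.DataFrame:
    """Aggregate raw measurements into a state-space point cloud.

    df is indexed by timestamp and contains the sensor parameter columns
    plus a 'mode' column (hypothetical layout).
    """
    selected = df[df["mode"] == mode]          # operating mode filter
    params = selected.drop(columns=["mode"])   # chosen parameters
    # Aggregation: one reference point per mesh interval of length step_size.
    return params.resample(step_size).mean().dropna()
```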
Step 0: Choose an anomaly detection approach
Step 1: Choose your parameters and the appropriate aggregation
Step 2: Choose your training preferences
The anomaly detection method is mainly affected by two parameters: validation days (VD) and history days (HD). Starting at the first element of the time series, the process renews its training/reference and validation set every VD days. The first part is the reference set, which includes the data of the last HD days (the maximum size of the reference set therefore corresponds to the HD value). The second part is the validation set, which includes the data of the next VD days. Each data point in this set is evaluated and the results are archived in a database. In a periodic rhythm (depending on the parameter VD), the reference set is updated and the new validation set is evaluated. This routine repeats until the last training record is reached. This procedure provides automatic retraining.
Training and validation process in case of anomaly detection.
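The rolling routine can be pictured as the following loop. The day-based slicing and the model interface (fit/evaluate) are simplified assumptions, not the SysWatch code.

```python
import pandas as pd

def rolling_evaluation(series: pd.DataFrame, model, hd: int, vd: int) -> list:
    """Retrain on the last HD days, evaluate the next VD days, repeat every VD days."""
    results = []
    t = series.index.min()
    while t <= series.index.max():
        reference = series.loc[t - pd.Timedelta(days=hd): t]    # last HD days
        validation = series.loc[t: t + pd.Timedelta(days=vd)]   # next VD days
        if len(reference) and len(validation):
            model.fit(reference)                        # (re-)training on the reference set
            results.append(model.evaluate(validation))  # archive the evaluation results
        t += pd.Timedelta(days=vd)                      # renew both sets every VD days
    return results
```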
With a defined monitoring frequency, the service checks whether new records have entered the database. If there are new measurements, the monitoring service starts a new calculation for each active model and automatically evaluates the new results.
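Conceptually, the monitoring service behaves like the following polling loop; the monitoring frequency and the database helpers (fetch_new_records, active_models, store_results) are hypothetical placeholders.

```python
import time

MONITORING_FREQUENCY = 60  # seconds; hypothetical value

def monitoring_loop(db) -> None:
    while True:
        new_records = db.fetch_new_records()      # hypothetical helper
        if new_records:
            for model in db.active_models():      # one calculation per active model
                results = model.evaluate(new_records)
                db.store_results(model, results)  # results are archived automatically
        time.sleep(MONITORING_FREQUENCY)
```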
The Pattern Recognition technique enables us to learn conspicuous operating occurrences.
First, the user has to flag a number of similar patterns in a limited time span (see figure 10).
Subsequently, our software shifts the relevant temporal patterns until they are synchronized (see figure 11).
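Such a synchronization can be sketched as a cross-correlation shift, assuming equally sampled patterns of the same length; this is an illustrative alignment technique, not necessarily the one used internally.

```python
import numpy as np

def synchronize(reference: np.ndarray, pattern: np.ndarray) -> np.ndarray:
    """Shift 'pattern' in time so that it is best aligned with 'reference'."""
    # Full cross-correlation; the peak position yields the optimal time shift.
    corr = np.correlate(pattern - pattern.mean(),
                        reference - reference.mean(), mode="full")
    shift = corr.argmax() - (len(reference) - 1)
    # np.roll wraps around; real data would be cropped or padded instead.
    return np.roll(pattern, -shift)
```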
Finally, a pattern recognition model can be determined (see figure 12).
A pattern recognition model learns the pattern that represents a pre-critical behaviour. It is possible to transfer the model to other assets (of the same type).
Figure 12: The training results are manifested by the red probability curve, which represents the identification of the learned pattern.
All properties are the same as in the supervised learning case. The only difference is the property "horizon". This property has to be zero; therefore, we are not looking for pre-critical states, but rather for the flagged behaviour itself.
The training step is the same as for the supervised forecast, with only this one restriction.
With a defined monitoring frequency, the monitoring service checks whether new records have been entered into the database. If there are new measurements, it starts a new calculation for each active model and automatically evaluates the new results, which are stored in its database.
Every time new data is stored, the computing service retrains a model (using its specific state space) if the period between the last (re-)training and the actual time of the data is greater than the size of the partitions in the training phase.
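Expressed as a condition, the retraining trigger reads as follows; the names are hypothetical.

```python
import pandas as pd

def needs_retraining(last_training: pd.Timestamp, data_time: pd.Timestamp,
                     partition_size: pd.Timedelta) -> bool:
    """Retrain if the elapsed period exceeds the size of the training partitions."""
    return (data_time - last_training) > partition_size
```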
Figure 8: State Space Definitions describing the most meaningful model properties.
The definition of the state space is very similar to the case of anomaly detection. For the purpose of the supervised forecast, there are three important differences to the anomaly detection. The first difference concerns the validation procedure (see subchapter "Training"). The second concerns the relation between inputs and outputs: the goal of the supervised forecast is to find a relation between all chosen inputs and one or more selected output(s). The embedding window influences the input vector, while (and this is the third difference) the horizon only influences the output vector. For every point in time, the algorithm tries to predict the behaviour of the output(s) in a time period (= SS * H) that depends on the time unit (TU), the horizon (H) and the step size (SS).
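To make the relation concrete: with TU = minutes, SS = 5 and H = 12, the algorithm predicts the behaviour of the output(s) over the next SS * H = 60 minutes.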
Figure 9: Training process in case of supervised forecast.
The data set is divided into a number of cross validation (CV) partitions of the same size (note that this parameter is also used for retraining). For a certain partition, the algorithm is trained on the CV-1 other ones. Afterwards, the partition is evaluated by the calibrated model and the results are stored in a database. This step is repeated CV times, once for each partition. Finally, a specific key performance indicator (see MSE in chapter "Key Performance Indicators") is calculated for each model, and the model with the best key performance indicator (MSE) is the one used in the monitoring service.
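A minimal sketch of this selection with scikit-learn; the candidate models, the features and the number of partitions are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import Ridge
from sklearn.neighbors import KNeighborsRegressor

# Illustrative state space (inputs) and output signal.
X, y = np.random.rand(500, 4), np.random.rand(500)

candidates = {"ridge": Ridge(), "knn": KNeighborsRegressor()}  # hypothetical models
cv = KFold(n_splits=5)  # CV partitions of the same size

scores = {}
for name, model in candidates.items():
    mses = []
    for train_idx, val_idx in cv.split(X):
        model.fit(X[train_idx], y[train_idx])   # train on the CV-1 other partitions
        pred = model.predict(X[val_idx])        # evaluate the held-out partition
        mses.append(mean_squared_error(y[val_idx], pred))
    scores[name] = np.mean(mses)                # key performance indicator (MSE)

best = min(scores, key=scores.get)  # the model with the best MSE is used for monitoring
```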
With a defined monitoring frequency, the service checks whether new records have been stored in the database. If there are new measurements, the monitoring service starts a new calculation for each active model and automatically evaluates the new results, which are stored in its database.
Every time new data is stored, the computing service retrains a model (using its specific state space) if the period between the last (re-)training and the actual time of the data is greater than the size of the partitions in the training phase.
Figure 11: Automatic synchronization of the flagged temporal patterns.
| Properties | Description | Comment |
| --- | --- | --- |
| Step Size | Numerical value of the mesh size, dependent on the time unit (default: minutes) | The smaller the value, the more reference points arise; this has a negative impact on the performance |
| Time Unit | Time parameter for the aggregation of the data set (seconds, minutes, hours) | The time unit sets the bin width |
| Aggregation | Aggregation for each reference data point in time, dependent on time unit, step size and embedding window | The aggregation has a strong effect on the state space |
| Training Range | Start and end datetime of the reference set in the training phase | |
| Algorithms | Clustering algorithms (K-Nearest, DBSCAN) | The parameter K represents the number of clusters |
| Properties | Description | Comment |
| --- | --- | --- |
| Step Size | Numerical value of the mesh size, dependent on the time unit (default: minutes) | The smaller the value, the more reference points arise; this has a negative impact on the performance |
| Time Unit | Time parameter for the aggregation of the data set (seconds, minutes, hours) | The time unit sets the bin width |
| Embedding Window | Number of embedded points in time for each datetime | A high value increases the autoregressive influence, which increases the accuracy and decreases the performance |
| Aggregation | Aggregation for each reference data point in time, dependent on time unit, step size and embedding window | The aggregation has a strong effect on the state space |
| Training Range | Start and end datetime of the reference set in the training phase | |
| Algorithms | Auto-associative kernel regressions (Closest, K-Nearest, Kernel) and neural networks (autoencoder) | |
| Reference Days | Number of days in the reference set for the next evaluation | A memory time span value that determines how far the algorithm looks into the past |
| Validation Days | Number of days that are evaluated before retraining | Retraining rhythm |
| Properties | Description | Comment |
| --- | --- | --- |
| Step Size | Numerical value of the mesh size, dependent on the time unit (default: minutes) | The smaller the value, the more reference points arise; this has a negative impact on the performance |
| Time Unit | Time parameter for the aggregation of the data set (seconds, minutes, hours) | The time unit sets the bin width |
| Embedding Window | Number of embedded points in time for each datetime | A high value increases the autoregressive influence, which increases the accuracy and decreases the performance |
| Training Start | Start datetime for the reference set in the training phase | |
| Training End | Last datetime that is evaluated in the training phase | |
| Horizon | Prognosis horizon for the output signals, dependent on the time unit and the actual period | A high value increases the prognosis horizon but decreases the performance |
| Algorithms | See description of nonlinear regressions | |
| Cross Validation | Number of cross-validation partitions / number of data sets (reference and validation) | The separation of the data set prevents the learning algorithm from overfitting |