Anomaly Detection Service¶
Idea¶
The Anomaly Detection Service aims to automatically detect unexpected behaviour of processes and assets using time series data.
For a given asset and for a specified period, the user is notified if the asset behaves abnormally in any way. Using this information the user is able to monitor their assets (e.g. by setting notification/warning thresholds).
Typical use cases for the Anomaly Detection Service are process and condition monitoring, early warning applications, and fault condition detection without an explicit definition.
Note
This is not applicable for Private Cloud.
Access¶
To access this service, you need the respective roles listed in Analytics roles and scopes.
Basics¶
The Anomaly Detection Service uses a density-based clustering approach (DBSCAN) to train models for anomaly detection (model training). The algorithm works unsupervised and uses historic training data that exhibits mostly normal behaviour. It generates a cluster landscape, which serves as a model of the normal behaviour of an asset.
This model is applied to new data sets by checking whether a specific data point belongs to one of the clusters or not (model application). Data points belonging to a cluster are considered normal. Data points not belonging to a cluster are assigned a score, which represents their distance to the closest cluster. The higher this score is, the more probable it is that the data point is an anomaly. This requires that training and evaluation data have the same features (number, type, name).
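The two steps above (model training and model application) can be illustrated with a minimal, self-contained sketch. It is not the service's implementation: the function names `train_model` and `anomaly_score`, the parameters `eps` and `min_pts`, and the exact scoring rule (distance to the nearest core point beyond `eps`) are assumptions chosen for illustration only.

```python
import math


def train_model(points, eps, min_pts):
    """Cluster the training points with a basic DBSCAN.

    Returns (model, labels). Labels are cluster indices; -1 marks noise.
    The model keeps only what scoring needs: eps and the core points.
    """
    n = len(points)
    # Neighbourhood of each point (a point counts as its own neighbour).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = [None] * n  # None means "not assigned yet"
    cluster = 0
    for i in range(n):
        if labels[i] is not None or not is_core[i]:
            continue
        # Grow a new cluster starting from the unassigned core point i.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not is_core[p]:
                continue  # border points join a cluster but do not expand it
            for q in neighbors[p]:
                if labels[q] is None:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    labels = [-1 if label is None else label for label in labels]
    core_points = [points[i] for i in range(n) if is_core[i]]
    return {"eps": eps, "core_points": core_points}, labels


def anomaly_score(model, point):
    """Return 0.0 for points inside a cluster, else the distance beyond it."""
    if not model["core_points"]:
        raise ValueError("model contains no clusters")
    nearest = min(math.dist(point, c) for c in model["core_points"])
    return max(0.0, nearest - model["eps"])


if __name__ == "__main__":
    # Two dense groups of "normal" operating points with two features each.
    training = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
                (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1)]
    model, labels = train_model(training, eps=0.3, min_pts=3)
    print(labels)
    print(anomaly_score(model, (0.05, 0.05)))  # inside a cluster
    print(anomaly_score(model, (2.5, 2.5)))    # far from both clusters
```

In this sketch a point close to either group scores zero, while a point between the groups receives a positive score that grows with its distance from the nearest cluster. Evaluation points must have the same number and order of features as the training points, which mirrors the feature requirement stated above.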
Modes¶
Anomaly Detection Service can be used in three modes:
Interactive Mode¶
This mode is intended for small data sets. All required configuration and data are provided in the request. The API calls are carried out synchronously, and the results are returned immediately in the response.
The Model Management Service is used for model storage and automatically sets the expiration date of a model to 14 days. This parameter might be changed in the future.
Batch Mode¶
Info
Batch mode is currently only available in region Europe 1.
In contrast to interactive mode, this mode is intended for processing large data sets. API calls are carried out asynchronously to obtain data from the IoT Time Series Service and pass the results to the Data Exchange Service.
The model training job can process up to 1 million items from the IoT Time Series Service. Batch jobs are terminated if they take more than 6 hours to complete.
Direct Interactive Mode¶
This mode is intended for cases where the user does not want to pass time series data in the request body. In direct interactive mode, the Anomaly Detection Service communicates with the IoT Time Series Service to obtain data for model training and reasoning. Instead of passing time series data in the request body, the user provides the asset details and time range to the Anomaly Detection Service APIs.
The Model Management Service is used for model storage and automatically sets the expiration date of a model to 14 days. This parameter might be changed in the future.
In direct interactive mode, the service can process a maximum of 20000 time series records obtained from the IoT Time Series Service.
Features¶
The Anomaly Detection Service exposes its API for realizing the following tasks:
- Unsupervised training of models
- Model application
- Detection of anomalies in time series data
- Communication with the IoT Time Series Service to obtain data for model training and reasoning for the given asset details and time range
Limitations¶
- Time series data must not contain more than 10 variables.
- Information about completed jobs is available for 1 day. After that, it is deleted automatically (this limitation might be changed in the future).
- In Interactive mode, the service can process a maximum of 20000 time series records.
- In Direct Interactive mode, the service can process a maximum of 20000 time series records containing a maximum of 3 variables obtained from the IoT Time Series Service.
- The limitations of the Model Management Service also apply.
- All environments have 100 GB of storage allocated by default, irrespective of the offering.
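The record and variable limits listed above can be checked on the client side before a request is sent. The following is a minimal sketch of such a check. The helper `check_limits`, the mode names, and the record layout (one dictionary per record, with the timestamp stored under a `_time` key) are assumptions made for illustration and are not part of the service API.

```python
# Limits taken from the list above; the dictionary keys are hypothetical names.
MAX_RECORDS = 20_000
MAX_VARIABLES = {"interactive": 10, "direct_interactive": 3}


def check_limits(records, mode, time_key="_time"):
    """Return a list of human-readable limit violations (empty if none).

    records:  list of dicts mapping variable names to values
    mode:     "interactive" or "direct_interactive"
    time_key: key that holds the timestamp and is not counted as a variable
    """
    if mode not in MAX_VARIABLES:
        raise ValueError(f"unknown mode: {mode!r}")
    problems = []
    if len(records) > MAX_RECORDS:
        problems.append(
            f"{len(records)} records exceed the limit of {MAX_RECORDS}"
        )
    # Collect every variable name that appears in any record.
    variables = set()
    for record in records:
        variables.update(key for key in record if key != time_key)
    if len(variables) > MAX_VARIABLES[mode]:
        problems.append(
            f"{len(variables)} variables exceed the limit of "
            f"{MAX_VARIABLES[mode]} for {mode} mode"
        )
    return problems


if __name__ == "__main__":
    sample = [{"_time": "2024-01-01T00:00:00Z",
               "temperature": 20.1, "pressure": 1.0,
               "flow": 3.2, "vibration": 0.4}]
    print(check_limits(sample, "interactive"))         # within limits
    print(check_limits(sample, "direct_interactive"))  # too many variables
```

Returning a list of problems instead of raising on the first one lets a caller report all violations at once.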
Example Scenario¶
The operator of a brewery wants to train a model of the factory's production line using a big dataset and later detect some abnormal characteristics.
Batch Mode¶
The operator uses batch mode to train the model so they can use a big dataset as the training set. They select a dataset stored by the IoT Time Series Service and specify the target asset, aspect and sensors as parameters to be trained. They also define the time range from which to get time series data and the folder of the Data Exchange Service in which to save the results.
After launching the job, the operator can check its status to find out whether the training has been completed. The operator then downloads the results using the Data Exchange API and evaluates them.
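Checking the job status repeatedly until the training has finished can be sketched as a simple polling loop. The status values (`SUCCEEDED`, `FAILED`), the helper `wait_for_job`, and the `get_status` callable are hypothetical. In a real application, `get_status` would wrap the call that queries the job status, and the status names would have to be taken from the API specification. The default timeout mirrors the 6-hour limit for batch jobs.

```python
import time

# Hypothetical names for states in which a job no longer changes.
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_for_job(get_status, poll_seconds=30.0, timeout_seconds=6 * 3600,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns a terminal state.

    get_status: callable without arguments that returns the current status
    sleep/clock: injectable for testing; default to the real time functions
    Raises TimeoutError if no terminal state is seen before the deadline.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(poll_seconds)


if __name__ == "__main__":
    # Simulated status sequence instead of a real service call.
    simulated = iter(["SUBMITTED", "RUNNING", "SUCCEEDED"])
    result = wait_for_job(lambda: next(simulated), poll_seconds=0.01)
    print(result)
```

Passing the status lookup in as a callable keeps the loop independent of any particular HTTP client or authentication scheme.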
Interactive Mode¶
The operator collects time series data of a relevant sensor of the production line using the IoT Time Series Service and feeds the data into the Anomaly Detection Service to compare it with a previously trained model. The response from the Anomaly Detection Service provides candidates for anomalies.