Anomaly Detection Service – Parameter Estimation
The Anomaly Detection Service does not provide functionality for parameter estimation. This section gives an overview of methods that are suitable for this task.
Manual Parameter Estimation
Parameter values can be estimated in a semi-automatic and interactive manner. The user has to find the optimal values, which depend heavily on the data and on the targets the user wants to reach.
Use a dataset D of size N as input and go through the following process:
- Determine eps_nn:
    - Calculate the distance to the nearest neighbor for each data point in D.
    - Sort the distances in descending order, plot them, and visually select an edge in the plot as eps_nn (see Fig. 1).
- Determine the cluster size minPts:
    - For each point, count the number of neighbors that are reachable within distance eps_nn.
    - Plot the distribution of the number of neighbors and select a suitable value for the cluster size, e.g. the maximum (see Fig. 2).
- Determine the distance threshold eps:
    - Calculate the distance to the nearest neighbor for each data point.
    - Sort the distances in descending order and visually select the “break point” eps (see Fig. 3).
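
The three steps can be sketched in a few lines of Python to show which quantities are being plotted and inspected. This is a minimal illustration under stated assumptions, not part of the Anomaly Detection Service: the function names, the toy dataset, the Euclidean distance, and the hand-picked value of eps_nn are all hypothetical. The visual selection is replaced by a value chosen by hand, and "the maximum" of the distribution is read here as the most frequent neighbor count, which is one possible interpretation. The parameter k is also an addition for experimentation; the process above uses the nearest neighbor, i.e. k = 1.

```python
import math
from collections import Counter


def kth_neighbor_distances_desc(points, k=1):
    """Distance from each point to its k-th nearest neighbor, sorted descending.

    k=1 is the plain nearest-neighbor distance used in the steps above;
    other values of k are a hypothetical extension.
    """
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result, reverse=True)


def neighbor_counts(points, radius):
    """Number of other points reachable within `radius` of each point."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


# Hypothetical toy dataset D: one tight cluster and two isolated points.
D = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0), (9.0, 0.0)]

# Step 1: sorted nearest-neighbor distances. In practice this curve is
# plotted and eps_nn is read off at the edge; here it is picked by hand.
nn_sorted = kth_neighbor_distances_desc(D, k=1)
eps_nn = 0.15  # assumption: hand-picked just above the small distances

# Step 2: neighbor counts within eps_nn and their distribution.
counts = neighbor_counts(D, eps_nn)
distribution = Counter(counts)
min_pts = distribution.most_common(1)[0][0]  # most frequent neighbor count

# Step 3: the sorted distance curve from which eps is selected visually.
eps_curve = kth_neighbor_distances_desc(D, k=1)

print("sorted nearest-neighbor distances:", nn_sorted)
print("neighbor-count distribution:", dict(distribution))
print("candidate minPts:", min_pts)
```

On real data, the sorted distance lists would be plotted (for example as a line chart) and the neighbor-count distribution as a histogram, so that the edge and the break point can be chosen by eye as described above. The brute-force distance computation is quadratic in N and is only meant for small samples.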
Last update: November 27, 2023
Except where otherwise noted, content on this site is licensed under the Development License Agreement.