Can anomaly detection be implemented in Python

Anomaly detection and forecasting in Azure Data Explorer

Azure Data Explorer continuously collects telemetry data from cloud services or IoT devices. This data is analyzed for various insights, such as monitoring service health, physical production processes, usage trends, and load forecasts. The analysis is done on time series of selected metrics to locate deviations of a metric from its typical baseline pattern. Azure Data Explorer can create and analyze thousands of time series in seconds, enabling near real-time monitoring solutions and workflows.

This article details the Azure Data Explorer time series anomaly detection and forecasting capabilities. The applicable time series functions are based on a robust, well-known decomposition model, where each original time series is decomposed into seasonal, trend, and residual components. Anomalies are detected as outliers in the residual component, while forecasting is done by extrapolating the seasonal and trend components. The Azure Data Explorer implementation significantly enhances the basic decomposition model with automatic seasonality detection, robust outlier analysis, and a vectorized implementation that processes thousands of time series in seconds.

Prerequisites

Read Time series analysis in Azure Data Explorer for an overview of time series capabilities.

Time series decomposition model

The Azure Data Explorer native implementation for time series prediction and anomaly detection uses a well-known decomposition model. This model is applied to time series of metrics expected to manifest periodic and trend behavior, such as service traffic, component heartbeats, and IoT periodic measurements, to forecast future metric values and detect anomalous ones. The underlying assumption of this regression process is that, other than the previously known seasonal and trend behavior, the time series is randomly distributed. You can then forecast future metric values from the seasonal and trend components, collectively named the baseline, and ignore the residual part. You can also detect anomalous values based on outlier analysis using only the residual part. To create a decomposition model, use the decomposition function, which takes a set of time series and automatically decomposes each time series into its seasonal, trend, residual, and baseline components.

For example, you can decompose traffic of an internal web service by using the following query:

[Click to run query]

  • The original time series is labeled num (in red).
  • The process starts by automatically detecting the seasonality and extracting the seasonal pattern (in purple).
  • The seasonal pattern is subtracted from the original time series, and a linear regression is run on the result to find the trend component (in light blue).
  • The function subtracts the trend, and the remainder is the residual component (in green).
  • Finally, the function adds the seasonal and trend components to generate the baseline (in blue).
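
The steps above can be sketched in plain Python. This is only an illustration of the general decomposition idea, not the Azure Data Explorer implementation: the function name `decompose` is hypothetical, the seasonal period is assumed to be known rather than detected automatically, and the seasonal pattern is estimated with a simple per-phase median.

```python
from statistics import mean, median

def decompose(series, period):
    """Split a series into seasonal, trend, residual, and baseline lists.

    Hypothetical helper: the period is passed in, not auto-detected.
    """
    n = len(series)
    # Seasonal pattern: median of the values at each phase of the period,
    # centered so the pattern averages to zero.
    phase_medians = [median(series[p::period]) for p in range(period)]
    center = mean(phase_medians)
    seasonal = [phase_medians[i % period] - center for i in range(n)]
    # Trend: least-squares line fitted to the deseasonalized series.
    deseason = [y - s for y, s in zip(series, seasonal)]
    x_mean = (n - 1) / 2
    y_mean = mean(deseason)
    slope = (sum((i - x_mean) * (y - y_mean) for i, y in enumerate(deseason))
             / sum((i - x_mean) ** 2 for i in range(n)))
    intercept = y_mean - slope * x_mean
    trend = [intercept + slope * i for i in range(n)]
    # Residual: what is left after removing seasonality and trend.
    residual = [d - t for d, t in zip(deseason, trend)]
    # Baseline: seasonal plus trend.
    baseline = [s + t for s, t in zip(seasonal, trend)]
    return seasonal, trend, residual, baseline

# Synthetic example: a period-4 pattern around a level of 10, with one spike.
pattern = [0, 3, -3, 0]
traffic = [10 + pattern[i % 4] for i in range(40)]
traffic[17] += 20
seasonal, trend, residual, baseline = decompose(traffic, period=4)
print(max(range(len(residual)), key=lambda i: residual[i]))  # index of the largest residual
```

By construction, baseline plus residual reproduces the original series, so anything the baseline does not explain ends up in the residual.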

Time series anomaly detection

The anomaly detection function finds anomalous points on a set of time series. It first builds the decomposition model and then runs outlier analysis on the residual component, calculating an anomaly score for each point of the residual component using Tukey's fence test. Anomaly scores above 1.5 or below -1.5 indicate a mild anomaly rise or decline, respectively. Anomaly scores above 3.0 or below -3.0 indicate a strong anomaly.

The following query allows you to detect anomalies in internal web service traffic:

[Click to run query]

  • The original time series (in red).
  • The baseline (seasonal + trend) component (in blue).
  • The anomalous points (in purple) on top of the original time series. The anomalous points significantly deviate from the expected baseline values.
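
As a rough illustration of scoring residuals with Tukey's fences, the sketch below measures how far each residual lies outside the interquartile range, in units of that range. The names `tukey_scores` and `classify` are hypothetical, and the use of the 25th and 75th percentiles is an assumption based on the textbook form of the test; the percentiles and normalization used by Azure Data Explorer may differ.

```python
from statistics import quantiles

def tukey_scores(residual):
    """Score each residual by its distance outside [Q1, Q3], in IQR units.

    Assumption: textbook quartiles; values inside the quartile range score 0.
    """
    q1, _, q3 = quantiles(residual, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("residuals have no spread; scores are undefined")
    scores = []
    for x in residual:
        if x > q3:
            scores.append((x - q3) / iqr)   # positive score: rise
        elif x < q1:
            scores.append((x - q1) / iqr)   # negative score: decline
        else:
            scores.append(0.0)
    return scores

def classify(score):
    """Map a score to the thresholds described in the text."""
    if abs(score) > 3.0:
        return "strong"
    if abs(score) > 1.5:
        return "mild"
    return "normal"

residual = [-2, -1, 0, 1, 2, -1, 0, 1, 0, 12]
labels = [classify(s) for s in tukey_scores(residual)]
print(labels)
```

Because the fences are derived from quartiles rather than the mean and standard deviation, a few extreme points do not shift the thresholds much.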

Time series forecasting

The forecasting function predicts future values of a set of time series. It first builds the decomposition model and then, for each time series, extrapolates the baseline component into the future.

The following query allows you to predict next week's web service traffic:

[Click to run query]

  • The original metric (in red). Future values are missing and set to 0 by default.
  • The extrapolated baseline component (in blue), which predicts next week's values.
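
Extrapolating the baseline can be sketched as follows. This is a simplified illustration under stated assumptions, not the Azure Data Explorer implementation: the function name `forecast` is hypothetical, the period is assumed to be known, and the seasonal pattern and linear trend are estimated the simple way (per-phase median and least-squares line).

```python
from statistics import mean, median

def forecast(history, period, horizon):
    """Extend the seasonal + trend baseline `horizon` points past the history.

    Hypothetical helper: the period is passed in, not auto-detected.
    """
    n = len(history)
    # Seasonal pattern per phase, centered to average zero.
    phase_medians = [median(history[p::period]) for p in range(period)]
    center = mean(phase_medians)
    seasonal = [m - center for m in phase_medians]
    # Least-squares line through the deseasonalized history.
    deseason = [y - seasonal[i % period] for i, y in enumerate(history)]
    x_mean = (n - 1) / 2
    y_mean = mean(deseason)
    slope = (sum((i - x_mean) * (y - y_mean) for i, y in enumerate(deseason))
             / sum((i - x_mean) ** 2 for i in range(n)))
    intercept = y_mean - slope * x_mean
    # Baseline at future indexes: trend line plus the repeating pattern.
    return [intercept + slope * i + seasonal[i % period]
            for i in range(n, n + horizon)]

pattern = [0, 3, -3, 0]
history = [10 + pattern[i % 4] for i in range(40)]
print(forecast(history, period=4, horizon=8))
```

The residual plays no role here: the forecast is the baseline alone, continued beyond the last observed point.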

Scalability

The Azure Data Explorer query language syntax enables a single call to process multiple time series. Its unique optimized implementation allows for fast performance, which is critical for effective anomaly detection and forecasting when monitoring thousands of counters in near real-time scenarios.

The following query shows the processing of three time series simultaneously:

[Click to run query]

Summary

This document details native Azure Data Explorer functions for time series anomaly detection and forecasting. Each original time series is decomposed into seasonal, trend, and residual components for detecting anomalies and/or forecasting. These functionalities can be used for near real-time monitoring scenarios, such as fault detection, predictive maintenance, and demand and load forecasting.

Next steps

Learn about Machine learning capabilities in Azure Data Explorer.