An anomaly is a deviation from an expected norm or pattern. Now that machine learning, advanced statistics and event processing are all being applied in the IT operations analytics (ITOA) domain, users have, for the first time, a range of different techniques for identifying issues before they become problems.
A recent survey found that only 27 per cent of application problems were detected by application performance monitoring tools, leaving the majority unnoticed. Anomaly detection techniques could help to close this large gap. However, there are several ways of defining an anomaly, so it helps to distinguish the different types with different labels.
Statistical Anomaly Detection
Using standard statistical techniques, it is possible to identify values in a data set which are unusual. For example, if CPU utilisation on a heavily loaded server is expected to stay around a certain value and it suddenly drops significantly, the drop is flagged as an anomaly and could indicate an issue with that server.
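As a rough illustration of the CPU example, the sketch below flags samples that sit more than a chosen number of standard deviations from a baseline mean (a z-score test). The function name `zscore_anomalies`, the threshold of 3 and the sample readings are all assumptions made for this example. A z-score is only one of many statistical tests that could be used here.

```python
from statistics import mean, stdev


def zscore_anomalies(baseline, recent, threshold=3.0):
    """Return (index, value, z) for each recent sample whose z-score
    against the baseline exceeds `threshold` in absolute value."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: treat any different value as anomalous.
        return [(i, v, float("inf")) for i, v in enumerate(recent) if v != mu]
    flagged = []
    for i, v in enumerate(recent):
        z = (v - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, v, z))
    return flagged


# Hypothetical CPU utilisation readings (per cent) for a heavily loaded server.
baseline_cpu = [91, 93, 92, 90, 94, 92, 91, 93, 92, 92]
recent_cpu = [92, 91, 35, 93]

# Only the sudden drop to 35 per cent is reported.
print(zscore_anomalies(baseline_cpu, recent_cpu))
```

In practice the baseline would be recalculated over a rolling window, so that the expected value follows gradual changes in load.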
Pattern Based Anomaly Detection
With modern machine learning capabilities, it is possible to identify a pattern in a stream of data using time series analysis. After a period of learning, the engine can spot how far from ‘normal’ the current pattern is. This is powerful for detecting missing data, which the statistical approach can’t identify. For example, if a client usually trades X amount every Tuesday at 1pm GMT, then if on a given Tuesday no trades are submitted, it might indicate a technical problem or an issue at the client’s end that they are not yet aware of.
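The Tuesday trading example can be sketched in a few lines. Here a simple average per weekday-and-hour slot stands in for a trained time series model, which is a deliberate simplification. The names `learn_profile` and `deviation`, the trade volumes and the dates are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def learn_profile(history):
    """Learn the mean volume for each (weekday, hour) slot.

    `history` is an iterable of (datetime, volume) pairs."""
    buckets = defaultdict(list)
    for ts, volume in history:
        buckets[(ts.weekday(), ts.hour)].append(volume)
    return {slot: mean(values) for slot, values in buckets.items()}


def deviation(profile, ts, observed):
    """Fractional deviation of `observed` from the learned norm for the slot.

    Returns None when the slot has no learned (non-zero) norm."""
    expected = profile.get((ts.weekday(), ts.hour))
    if not expected:
        return None
    return (observed - expected) / expected


# Four hypothetical Tuesdays of trading at 1pm (2 January 2024 was a Tuesday).
history = [
    (datetime(2024, 1, 2, 13), 100),
    (datetime(2024, 1, 9, 13), 110),
    (datetime(2024, 1, 16, 13), 90),
    (datetime(2024, 1, 23, 13), 100),
]
profile = learn_profile(history)

# On the fifth Tuesday nothing arrives, so the observed volume is zero.
print(deviation(profile, datetime(2024, 1, 30, 13), 0))  # -1.0
```

The important design point is that the check is driven by the clock rather than by incoming data: the slot is evaluated even when no trade arrives, which is what allows an absence to be noticed.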
Event Based Anomaly Detection
Complex Event Processing (CEP) engines are able to look for event sequences and detect when an expected sequence hasn’t happened. Suppose event A, when followed by event B, should always trigger event C within 5 minutes. The CEP engine can watch for these events and alert when the expected sequence has not been detected. The missing sequence could be caused by a lull in trade volumes just prior to a big market announcement, such as non-farm payroll, or by a significant market move based on such announcements.
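The A, B, C rule can be illustrated with a small offline check over a time-ordered event log. This is only a sketch of the idea, not how any particular CEP engine works: a real engine would evaluate the rule continuously on a live stream. The function name `missing_followups` is hypothetical, and the example assumes the 5 minute window is measured from event B.

```python
from datetime import datetime, timedelta


def missing_followups(events, now, window=timedelta(minutes=5)):
    """Return the timestamps of B events (each preceded by an A) that were
    not followed by a C within `window`.

    `events` is a list of (datetime, name) pairs sorted by time, and `now`
    is the time at which the check is run."""
    alerts = []
    seen_a = False
    pending = []  # timestamps of B events still waiting for a C
    for ts, name in events:
        # Any B whose deadline passed before this event has been missed.
        alerts.extend(b for b in pending if ts - b > window)
        pending = [b for b in pending if ts - b <= window]
        if name == "A":
            seen_a = True
        elif name == "B" and seen_a:
            pending.append(ts)
            seen_a = False
        elif name == "C" and pending:
            pending.pop(0)  # this C satisfies the oldest waiting B
    # Anything still waiting at check time may also be overdue.
    alerts.extend(b for b in pending if now - b > window)
    return alerts


# Hypothetical event log: the first sequence completes, the second does not.
log = [
    (datetime(2024, 1, 30, 10, 0), "A"),
    (datetime(2024, 1, 30, 10, 1), "B"),
    (datetime(2024, 1, 30, 10, 3), "C"),
    (datetime(2024, 1, 30, 10, 10), "A"),
    (datetime(2024, 1, 30, 10, 11), "B"),
]
print(missing_followups(log, now=datetime(2024, 1, 30, 10, 20)))
```

Run against this log, the check reports the B event at 10:11, since no C arrived within five minutes of it.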
None of these techniques is better than the others, as each suits different situations. However, having all three available when managing large IT estates is a powerful capability.
Guy Warren is CEO of ITRS