For data-driven businesses, the answer to “How well are we doing?” is contained in key performance indicators (KPIs) and various other business metrics. Although those data do contain the answer, it must be teased out with the right tools. Until now, those tools were limited to trend lines, percentiles, traditional analytics dashboards, manual review of time series data, and crude outlier detection systems with manually set thresholds, which are anything but real-time.
Modern outlier detection systems – those that use machine learning and scale to millions of metrics – not only answer that question better but fundamentally change it to something much more useful, along the lines of: “What opportunity can we take advantage of right now?” It’s often tempting to discuss outlier detection in time series data only in terms of its ability to detect serious problems right as they occur, which gives businesses the early warning and clues they need to correct problems that are costing them money by the minute.
The types of outliers found in time series data
Before we discuss how outlier detection can help companies make more money, it’s helpful to step back a bit and talk more about the types of outliers most likely to be found in time series data. Outliers in time series data can take many forms – perhaps two of the most obvious are abrupt changes (spikes or dips), and missing seasonal cycles.
A sudden increase or decrease in a metric stands out visually when the time series data is plotted. The sharp rise or fall of the value being monitored sometimes produces what is called a global outlier (a data value significantly different from the entire rest of the data set). Even if that blip doesn’t break a historical record, it may still be considered a contextual outlier (a data value significantly different from the neighboring data points) if the anomalous data value is outside the expected statistical variance of the most recent data points.
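To make the distinction concrete, here is a minimal sketch in Python. It is not how any particular product works: the function names, the z-score test, the threshold of 3 standard deviations, and the five-point neighbor window are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def z_score(value, reference):
    """How many standard deviations `value` lies from the mean of `reference`."""
    sigma = stdev(reference)
    if sigma == 0:
        return 0.0 if value == mean(reference) else float("inf")
    return abs(value - mean(reference)) / sigma


def classify_point(series, index, window=5, z_threshold=3.0):
    """Label series[index] as 'global', 'contextual', or 'normal'.

    Assumes at least two other points in the series and at least two
    points before `index` (stdev needs two values).
    """
    value = series[index]

    # Global: far from the rest of the entire data set.
    others = series[:index] + series[index + 1:]
    if z_score(value, others) > z_threshold:
        return "global"

    # Contextual: unremarkable overall, but far from its recent neighbors.
    neighbors = series[max(0, index - window):index]
    if z_score(value, neighbors) > z_threshold:
        return "contextual"

    return "normal"


# A metric that alternates between a low and a high regime.
baseline = [10, 11, 9, 10, 12, 100, 101, 99, 100, 102, 10, 11, 9, 10, 12]
print(classify_point(baseline + [1000], 15))  # far beyond anything seen
print(classify_point(baseline + [60], 15))    # ordinary overall, odd right now
print(classify_point(baseline + [11], 15))    # in line with its neighbors
```

A value of 60 is unremarkable against the whole series, which regularly reaches 100, but it is far outside what the last few points would suggest. That is what makes it contextual rather than global.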
Which brings us to the second type of outlier most easily spotted in time series data: missing seasonal cycles. For example, if your metric exhibits a cyclical pattern with a period of 24 hours, but unexpectedly flatlines, those low values would be obvious outliers, even though those exact same values would be normal if they occurred during the predictable low points of your daily cycle. That long period of unexpectedly low values would then be considered a collective outlier (a subset of the data which is anomalous when considered as a group).
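One simple way to catch a missing cycle is to compare each new value with what is typical for that point in the cycle, and flag only sustained runs of deviation. The sketch below does this under several assumptions: the helper names are hypothetical, the seasonal profile is a plain per-phase average, the `tolerance` and `min_run` defaults are arbitrary, and `recent` is assumed to start at the beginning of a cycle.

```python
from statistics import mean


def seasonal_profile(history, period):
    """Average value observed at each position (phase) of the cycle."""
    return [mean(history[phase::period]) for phase in range(period)]


def find_collective_outliers(history, recent, period, tolerance=0.5, min_run=3):
    """Return (start, end) index pairs of runs in `recent` that deviate
    from the seasonal profile for at least `min_run` consecutive points.

    A point deviates when it differs from the expected value for its
    phase by more than `tolerance` times that expected value.
    """
    profile = seasonal_profile(history, period)
    runs, start = [], None
    for i, value in enumerate(recent):
        expected = profile[i % period]
        deviates = abs(value - expected) > tolerance * abs(expected)
        if deviates and start is None:
            start = i  # a new run of deviating points begins
        elif not deviates:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that lasts until the end of the data.
    if start is not None and len(recent) - start >= min_run:
        runs.append((start, len(recent) - 1))
    return runs


# A toy cycle of six samples per "day" instead of 24 hourly values.
day = [10, 10, 50, 100, 50, 10]
history = day * 3
flatlined = [10, 10, 10, 10, 10, 10]
print(find_collective_outliers(history, flatlined, period=6))
```

In the flatlined day, the value 10 is perfectly normal at the start and end of the cycle but anomalous in the middle, so only the middle run is reported. No single point is extreme; the run is what stands out.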
So how does this help companies make more money?
Adding revenue to adtech
Consider the example of an adtech company which sells digital impressions to advertisers. One of the online publishers that works with this adtech company posts an article which quickly goes viral. As a result, there’s a spike in page views for this content, but the CPM (cost per thousand impressions) doesn’t surge along with the increase in traffic. Since this viral content – and all ads on it – is now in front of a lot more eyeballs, the adtech company can increase the CPM on this content to some higher, yet still reasonable amount. The advertisers will still happily pay the higher rate for the increased exposure, and thus both the adtech company and the publisher end up increasing their revenue.
In order for the adtech company and the publisher to take advantage of this viral content, they first must detect an abrupt increase in page views the moment it occurs. Real-time detection of this (great!) outlier is important because every second of delay results in impressions being sold for less than what the market could bear, and thus both the publisher and the adtech company would miss out on the additional revenue.
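As a rough illustration of what detecting this situation might look like, the sketch below checks each new reading against a rolling window and reports the moments when page views spike while CPM does not. Everything here is an assumption made for the example: the `SpikeDetector` class, the `pricing_opportunities` function, the rolling z-score rule, and the sample numbers. A production system would use far more sophisticated models.

```python
from collections import deque
from statistics import mean, stdev


class SpikeDetector:
    """Flags abrupt increases relative to a rolling window of recent values."""

    def __init__(self, window=30, z_threshold=3.0):
        self.values = deque(maxlen=window)
        self.z_threshold = z_threshold

    def update(self, value):
        """Return True if `value` is a spike, then add it to the window."""
        is_spike = False
        if len(self.values) >= 2:
            mu, sigma = mean(self.values), stdev(self.values)
            # A perfectly flat window (sigma == 0) is never flagged here.
            if sigma > 0:
                is_spike = (value - mu) / sigma > self.z_threshold
        self.values.append(value)
        return is_spike


def pricing_opportunities(page_views, cpms, window=30, z_threshold=3.0):
    """Return the indices where page views spike but CPM does not."""
    views_detector = SpikeDetector(window, z_threshold)
    cpm_detector = SpikeDetector(window, z_threshold)
    hits = []
    for i, (views, cpm) in enumerate(zip(page_views, cpms)):
        views_spiked = views_detector.update(views)
        cpm_spiked = cpm_detector.update(cpm)
        if views_spiked and not cpm_spiked:
            hits.append(i)
    return hits


# Made-up readings: traffic jumps in the last interval, CPM stays flat.
page_views = [1000, 1020, 980, 1010, 990, 1005, 5000]
cpms = [2.0, 2.1, 1.9, 2.0, 2.05, 1.95, 2.0]
print(pricing_opportunities(page_views, cpms))
```

Because each reading is checked as it arrives, the signal is available at the interval in which the spike happens rather than after a later manual review. One limitation of this simple version is that a spike, once stored, inflates the window's variance and makes the next few spikes harder to flag.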
Quantifying the opportunities
An advanced, automated real-time outlier detection system would not only identify the outlier in the page view metric, but also determine its significance given the historical page view data. In other words, such a system would quickly determine whether this spike was a global outlier (the highest number of page views ever recorded for this publisher) or a contextual outlier (an unusually high number of page views given the seasonal patterns of web traffic for this particular publisher). This level of analytics would be very useful to our adtech company, since knowing how significant the spike is would let them charge a much higher CPM with confidence.
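A simplified version of that significance check could look like the following. It is a sketch under stated assumptions: `spike_significance` is a hypothetical function, "contextual" is approximated by comparing the new value with past values at the same point in the cycle, and the threshold of 3 standard deviations is arbitrary.

```python
from statistics import mean, stdev


def spike_significance(history, value, period, z_threshold=3.0):
    """Classify a new page view reading against the publisher's history.

    `history` holds past readings at regular intervals, oldest first,
    starting at the beginning of a cycle; `value` is the next reading.
    """
    # Global: the highest number of page views ever recorded.
    if value > max(history):
        return "global"

    # Contextual: unusually high for this point in the seasonal cycle.
    phase = len(history) % period
    same_phase = history[phase::period]
    if len(same_phase) >= 2:
        sigma = stdev(same_phase)
        if sigma > 0 and (value - mean(same_phase)) / sigma > z_threshold:
            return "contextual"

    return "not significant"


# Three toy "days" of six readings each; the next reading falls at phase 0,
# the quiet start of the cycle, where roughly 100 views is typical.
history = [100, 110, 500, 1000, 510, 100,
           105, 100, 490, 990, 500, 95,
           95, 105, 510, 1010, 490, 105]
print(spike_significance(history, 2000, period=6))
print(spike_significance(history, 400, period=6))
print(spike_significance(history, 102, period=6))
```

Here 400 page views would be nothing special at the daily peak, but at the quiet start of the cycle it is far above normal, so it is reported as contextual rather than ignored.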
Far more than merely an indicator of business problems, outlier detection systems can find the revenue-increasing opportunities hidden in your time series data, quantify them, and thereby give you actionable guidance on how best to take advantage of them.
So, what opportunities are you taking advantage of today?