Using Statistical Pattern Learning to Find the Method in Attackers’ Madness

With attackers growing more sophisticated by the day, one of the scariest problems is that some of the most devastating intrusions may not initially look like attacks at all. Such is often the case with IoT-based attacks, which exploit hard-to-secure endpoints and, according to the 451 report Securing the Internet of Things, may threaten the physical safety of thousands. The most dangerous hackers have moved past Pearl Harbor-style assaults to more subtle invasions, where companies don’t know they’ve been compromised until it’s too late: according to the Ponemon Institute, 66% of security breaches remain unnoticed for an average of eight months. Going forward, therefore, one of the most important methods for identifying attacks that might otherwise go unnoticed will be detecting anomalies in data streams that indicate such a compromise.

The operating principle is simple: if streaming data patterns don’t look the way they’re supposed to, there may be trouble. In practice it’s a little more complicated because, unlike your monthly electric bill, anomalies in massive data streams can’t necessarily be eyeballed. Furthermore, even if you can detect hidden anomalies, you could waste a lot of time investigating deviations that turn out to be harmless. More sophisticated statistical methods are needed to identify problematic patterns that might only become evident when multiple, very large streams are analyzed.

Though this be madness, yet there is method in it

To detect suspicious patterns, you must first be able to identify “seasonality”, the repetition that shows up in the data over some period of time, and “trends”, the overall movements of the data, which must also be factored in for an accurate picture. There’s an adage that you’ve got to stand a few feet away from the TV to see the picture clearly, and that holds true for detecting statistical patterns in data. You’ve got to be able to “stand back” and look at very large amounts of data before a story emerges. Detecting deviations in data streamed in real time therefore often requires the ability to process and analyze extremely large amounts of data in very short time periods.
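As a rough illustration of the two ideas, the sketch below pulls a trend and a seasonal profile out of a small made-up series. The function names, the sample numbers and the cycle length of 4 are all assumptions for illustration; averaging over one full cycle is just one simple way to let the repeating pattern cancel out so the trend shows through.

```python
def trend_by_full_cycle_average(series, m):
    """Average each sliding window of one full cycle (length m).

    Every window contains each position in the cycle exactly once,
    so the repeating pattern cancels and the overall movement remains.
    """
    return [sum(series[i:i + m]) / m for i in range(len(series) - m + 1)]


def seasonal_profile(series, m):
    """Average value at each position in the cycle, relative to the overall mean.

    This is a rough estimate: when the data also trends, some of that
    trend leaks into the profile.
    """
    overall = sum(series) / len(series)
    profile = []
    for phase in range(m):
        values = series[phase::m]  # every m-th reading, starting at this phase
        profile.append(sum(values) / len(values) - overall)
    return profile


# Hypothetical readings: a repeating 4-step pattern that drifts upward.
series = [10, 20, 30, 20, 14, 24, 34, 24, 18, 28, 38, 28]
trend = trend_by_full_cycle_average(series, 4)
profile = seasonal_profile(series, 4)
```

Here `trend` climbs steadily while `profile` recovers the low-middle-high-middle shape of the cycle, which is the “stand back” view described above.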

It also requires the ability to look at this data through “time-sliced windows”, or specified time frames. For example, you might look at data streamed from sensors on refrigerators in a certain geographic region from July 1 to December 31, 2016. If you were able to process and statistically analyze enough of this data, clear patterns would emerge.
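A minimal sketch of time slicing, assuming readings arrive as (timestamp, value) pairs: each reading is dropped into a fixed-width window counted from the start of the analysis period, and each window is then summarized by its mean. The function names, the one-hour window and the sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def slice_into_windows(readings, start, window):
    """Group (timestamp, value) readings into fixed-width windows.

    Returns a dict mapping window index (0 = first window after `start`)
    to the list of values that fell inside that window.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        if ts < start:
            continue  # ignore readings before the analysis period
        index = (ts - start) // window  # timedelta // timedelta gives an int
        buckets[index].append(value)
    return buckets


def window_means(buckets):
    """Summarize each window by the mean of its values."""
    return {i: sum(v) / len(v) for i, v in sorted(buckets.items())}


# Hypothetical sensor readings starting July 1, 2016, in one-hour windows.
start = datetime(2016, 7, 1)
readings = [
    (datetime(2016, 7, 1, 0, 10), 4.0),
    (datetime(2016, 7, 1, 0, 50), 6.0),
    (datetime(2016, 7, 1, 1, 5), 10.0),
]
means = window_means(slice_into_windows(readings, start, timedelta(hours=1)))
```

The per-window summaries, rather than the raw readings, are what a seasonal model would typically consume.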

Holt-Winters: mapping out the “expected” future, to plan for the unexpected

One of the best ways to predict what future patterns should look like under normal circumstances is Triple Exponential Smoothing, also known as the Holt-Winters method, an algorithm used to forecast data points in a set that exhibits seasonality. The math behind the algorithm has its roots in a 1957 paper on forecasting industrial trends by MIT graduate and CMU professor Charles C. Holt, which described “double exponential smoothing” of the base level and trend components of a data set. In 1960 his student Peter R. Winters added a third component, seasonality, in “Forecasting sales by exponentially weighted moving averages”.
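For reference, the additive form of triple exponential smoothing is commonly written as below. The notation is an assumption here, not taken from the original papers: y_t is the observed value, ℓ_t the level, b_t the trend, s_t the seasonal component, m the number of points in one season, and α, β, γ are smoothing weights between 0 and 1. A multiplicative variant, in which the seasonal component scales the level instead of being added to it, is also widely used.

```latex
\begin{aligned}
\ell_t &= \alpha\,(y_t - s_{t-m}) + (1-\alpha)\,(\ell_{t-1} + b_{t-1}) \\
b_t &= \beta\,(\ell_t - \ell_{t-1}) + (1-\beta)\,b_{t-1} \\
s_t &= \gamma\,(y_t - \ell_t) + (1-\gamma)\,s_{t-m} \\
\hat{y}_{t+h} &= \ell_t + h\,b_t + s_{t+h-m(k+1)}, \qquad k = \lfloor (h-1)/m \rfloor
\end{aligned}
```

The first three lines update the level, trend and seasonal estimates as each observation arrives; the last line projects them h steps ahead to give the “expected” value.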

This method uses “observed” values to forecast “expected” values, and can be applied to any data set that exhibits seasonality. Thanks to recent improvements in the ability to ingest, store and analyze massive amounts of data, seasonality can now be detected in areas where it might previously have gone unnoticed. Where seasonality is detected, accompanied by an overall trend, Holt-Winters can create a clear picture of what the data should look like under normal future circumstances.
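A minimal sketch of additive triple exponential smoothing in plain Python shows how observed values become expected ones. It is an illustration under stated assumptions, not any vendor’s implementation: the function name, the simple initialization from the first two seasons, the smoothing weights of 0.5 and the sample series are all hypothetical choices.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Additive triple exponential smoothing (illustrative sketch).

    Returns (fitted, forecast): the one-step-ahead expected value for each
    observation after the first season, and `horizon` future values.
    """
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Assumed initialization: level = mean of the first season,
    # trend = average per-step change between the first two seasons,
    # seasonal offsets = first-season deviations from the initial level.
    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    seasonals = [series[i] - level for i in range(m)]

    fitted = []
    for t in range(m, len(series)):
        y = series[t]
        s_prev = seasonals[t % m]  # seasonal estimate from one season ago
        fitted.append(level + trend + s_prev)  # expectation before seeing y
        prev_level = level
        level = alpha * (y - s_prev) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % m] = gamma * (y - level) + (1 - gamma) * s_prev

    n = len(series)
    forecast = [level + h * trend + seasonals[(n + h - 1) % m]
                for h in range(1, horizon + 1)]
    return fitted, forecast


# Hypothetical series: three identical seasons of four readings each.
observed = [10, 20, 30, 20] * 3
fitted, forecast = holt_winters_additive(observed, 4, 0.5, 0.5, 0.5, horizon=4)
```

In practice the three weights are usually tuned against historical data, and an established statistics library would normally be preferred over hand-rolled code.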

Detecting seasonality to stem cyber attacks

With the ability to quickly find patterns in large amounts of data and predict how the data streams should behave in the future, you can quickly identify deviations that may indicate a cyber threat. Going back to the “sensors on refrigerators” example, if your security team were alerted to an unusual amount of activity during a certain time period in East Cincinnati, they could investigate immediately. It doesn’t end there, though. If you had access to a full range of data from other regions, and the patterns associated with previous attacks had been identified, statistical pattern learning algorithms could be employed to correlate those patterns with current deviant patterns in real time. This kind of information would not only greatly speed response by immediately flagging a critical event, but also provide valuable forensic insight into the attack’s origin, method and more.
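One simple way to turn a forecast into an alert, sketched below, is to flag any reading that strays from its expected value by more than a few standard deviations of the errors seen during normal operation. The function name, the threshold of three standard deviations and all sample values are assumptions for illustration; real deployments would tune the band to keep harmless deviations from flooding the team.

```python
from statistics import pstdev


def flag_deviations(observed, expected, baseline_residuals, threshold=3.0):
    """Return indices where observed strays too far from expected.

    The tolerance band is `threshold` standard deviations of the
    residuals (observed minus expected) seen under normal conditions.
    """
    band = threshold * pstdev(baseline_residuals)
    return [i for i, (o, e) in enumerate(zip(observed, expected))
            if abs(o - e) > band]


# Hypothetical forecast errors collected while nothing unusual was happening.
baseline_residuals = [-1.0, 0.5, 1.0, -0.5, 0.0, 1.0, -1.0, 0.0]

# Hypothetical expected values and the readings that actually arrived.
expected = [10.0, 20.0, 30.0, 20.0]
observed = [10.5, 19.0, 55.0, 20.2]

flagged = flag_deviations(observed, expected, baseline_residuals)
```

Only the third reading falls outside the band here; the small wobbles around the other expected values are treated as normal.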

In sum, in cyber security statistical pattern learning can be employed over time-sliced windows to uncover seasonality and trends in data that companies can use to:

  • Discover the relationships between attackers, their methods and their targets
  • Understand the evolution of attack patterns over time
  • Detect threatening deviations in data being streamed in real time
  • Act quickly to stem attacks before they fully materialize, when combined with automatic alerts on deviations that indicate threats

Case in point

Logtrust and its partner Panda have used this technology to address scenarios just like this, providing clients with the ability to automate the storage and correlation of information coming from a full range of data endpoints. They’re able to immediately visualize and automatically pinpoint unusual behaviors that indicate threats, capabilities made possible thanks in no small part to Logtrust’s ability to:

  • Rapidly ingest massive amounts of data coming at high speed from multiple sources
  • Continually apply statistical pattern learning through Holt-Winters-based algorithms to identify worrisome deviations
  • Leverage pre-built correlation libraries that greatly expand the range of problems that can be quickly identified
  • Provide automatic alerts when troublesome patterns emerge
  • Visually present these findings so that they can be rapidly put to use to stem the attack before it materializes

Up to now cyber security has been fairly reactive in nature, as evidenced by industry marketing literature replete with terms like “defense”, “wall”, “shield” and “lock”. As we move into the future, however, companies need to employ every tool available, and one of the most powerful is statistical pattern learning. So while anyone who uses their tremendous gifts and talents in software engineering to perpetrate cyber crime may rightfully be described as “mad”, statistical pattern learning can be a powerful tool for stopping them in their tracks.

June 23rd, 2017