At its simplest level, machine learning is defined as “the ability (for computers) to learn without being explicitly programmed.” Using mathematical techniques across huge datasets, machine learning algorithms essentially build models of behaviors and use those models as a basis for making future predictions based on newly input data. It is Netflix offering up new TV series based on your previous viewing history, and the self-driving car learning about road conditions from a near-miss with a pedestrian.
So, what are the machine learning applications in information security?
Machine learning in security is a fast-growing trend. Analysts at ABI Research estimate that machine learning in cyber security will boost spending in big data, artificial intelligence (AI) and analytics to $96 billion by 2021, while some of the world’s technology giants are already taking a stand to better protect their own customers.
Google is using machine learning to analyze threats against mobile endpoints running on Android, as well as to identify and remove malware from infected handsets, while cloud infrastructure giant Amazon has acquired start-up harvest.AI and launched Macie, a service that uses machine learning to uncover, sort and classify data stored on the S3 cloud storage service.
Simultaneously, enterprise security vendors have been working towards incorporating machine learning into new and old products, largely in a bid to improve malware detection. “Most of the major companies in security have moved from a purely ‘signature-based’ system of a few years ago used to detect malware, to a machine learning system that tries to interpret actions and events and learns from a variety of sources what is safe and what is not,” says Jack Gold, president and principal analyst at J. Gold Associates. “It’s still a nascent field, but it is clearly the way to go in the future. Artificial intelligence and machine learning will dramatically change how security is done.”
Though this transformation won’t happen overnight, machine learning is already emerging in certain areas. “AI -- as a wider definition which includes machine learning and deep learning -- is in its early phase of empowering cyber defense where we mostly see the obvious use cases of identifying patterns of malicious activities whether on the endpoint, network, fraud or at the SIEM,” says Dudu Mimran, CTO of Deutsche Telekom Innovation Laboratories (and also of the Cyber Security Research Center at Israel’s Ben-Gurion University). “I believe we will see more and more use cases, in the areas of defense against service disruptions, attribution and user behavior modification.”
Here, we break down the top use cases of machine learning in security.
1. Using machine learning to detect malicious activity and stop attacks
Machine learning algorithms will help businesses to detect malicious activity faster and stop attacks before they get started. David Palmer should know. As director of technology at UK-based start-up Darktrace, a firm that has seen a lot of success with its machine learning-based Enterprise Immune System since its foundation in 2013, he has seen the impact of such technologies.
Palmer says that Darktrace recently helped one casino in North America when its algorithms detected a data exfiltration attack that used a “connected fish tank as the entryway into the network.” The firm also claims to have prevented a similar attack during the WannaCry ransomware crisis last summer.
“Our algorithms spotted the attack within seconds in one NHS agency’s network, and the threat was mitigated without causing any damage to that organization,” he said of the ransomware, which infected more than 200,000 victims across 150 countries. “In fact, none of our customers were harmed by the WannaCry attack including those that hadn’t patched against it.”
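To make the idea of spotting an exfiltration attempt concrete, here is a minimal sketch of baseline-and-deviation detection. It is not Darktrace's algorithm, which the article does not describe; it assumes a single feature (hourly outbound traffic per device) and a simple standard-deviation test. The device names, traffic figures and the threshold of 3.0 are all hypothetical.

```python
from statistics import mean, stdev


def fit_baseline(history):
    """Learn a per-device baseline (mean, standard deviation) from past traffic."""
    return {device: (mean(samples), stdev(samples))
            for device, samples in history.items()}


def is_anomalous(baseline, device, observed_mb, threshold=3.0):
    """Flag traffic more than `threshold` standard deviations above the device's norm."""
    mu, sigma = baseline[device]
    if sigma == 0:
        # No variation seen in training: anything above the norm is unusual.
        return observed_mb > mu
    return (observed_mb - mu) / sigma > threshold


# Hypothetical hourly outbound traffic, in megabytes, for two devices.
history = {
    "fish-tank-sensor": [1.0, 1.2, 0.9, 1.1, 1.0, 1.3],
    "file-server": [500.0, 650.0, 480.0, 700.0, 560.0, 610.0],
}
baseline = fit_baseline(history)

print(is_anomalous(baseline, "fish-tank-sensor", 1.2))    # typical volume for this device
print(is_anomalous(baseline, "fish-tank-sensor", 400.0))  # far above this device's norm
print(is_anomalous(baseline, "file-server", 400.0))       # same volume, unremarkable here
```

The point of the sketch is that the model is learned per device: 400 MB in an hour is ordinary for a file server but highly unusual for a sensor, which a fixed signature or a single global limit would not capture.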
2. Using machine learning to analyze mobile endpoints
Machine learning is already going mainstream on mobile devices, but thus far most of this activity has been for driving improved voice-based experiences on the likes of Google Now, Apple’s Siri, and Amazon’s Alexa. Yet there is an application for security too. As mentioned above, Google is using machine learning to analyze threats against mobile endpoints, while enterprises are seeing an opportunity to protect the growing number of bring-your-own and choose-your-own mobile devices.
In October, MobileIron and Zimperium announced a collaboration to help enterprises adopt mobile anti-malware solutions incorporating machine learning. MobileIron said it would integrate Zimperium’s machine learning-based threat detection with MobileIron’s security and compliance engine and sell the combined solution, which would address challenges like detecting device, network, and application threats and immediately taking automated actions to protect the company’s data.
Other vendors are looking to bolster their mobile solutions, too. Along with Zimperium, Lookout, Skycure (which has been acquired by Symantec), and Wandera are seen as the leaders in the mobile threat detection and defense market. Each uses its own machine learning algorithm to detect potential threats. Wandera, for example, recently publicly released its threat detection engine MI:RIAM, which reportedly detected more than 400 strains of repackaged SLocker ransomware targeting businesses' mobile fleets.
3. Using machine learning to enhance human analysis
At the heart of machine learning in security is the belief that it helps human analysts with all aspects of the job, including detecting malicious attacks, analyzing the network, endpoint protection and vulnerability assessment. Arguably, though, the most excitement is around threat intelligence.
For example, in 2016, MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) developed a system called AI2, an adaptive machine learning security platform that helped analysts find those ‘needles in the haystack’. Reviewing millions of logins each day, the system was able to filter data and pass it on to the human analyst, reducing alerts down to around 100 per day. The experiment, carried out by CSAIL and start-up PatternEx, showed that the attack detection rate rose to 85 percent with a five-fold decrease in false positives.
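The filtering step described above can be sketched as a fixed daily review budget: score every event, then hand only the highest-scoring ones to the analyst. This is an illustrative sketch, not the AI2 implementation. The scores, the `triage` and `record_verdict` helpers, and the idea of storing analyst verdicts as labels for later retraining are all assumptions made for the example.

```python
import heapq


def triage(events, budget):
    """Return the `budget` highest-scoring events, most suspicious first."""
    return heapq.nlargest(budget, events, key=lambda event: event["score"])


def record_verdict(labels, event_id, is_attack):
    """Store an analyst's verdict so a later training run could use it as a label."""
    labels[event_id] = is_attack
    return labels


# Hypothetical anomaly scores produced by some upstream model.
scores = [0.02, 0.91, 0.15, 0.78, 0.05, 0.99, 0.33]
events = [{"id": i, "score": s} for i, s in enumerate(scores)]

# With a budget of 3, only three of the seven events reach the analyst.
queue = triage(events, budget=3)
print([event["id"] for event in queue])  # -> [5, 1, 3]

# The analyst reviews the queue; the verdicts become training labels.
labels = {}
record_verdict(labels, 5, True)
record_verdict(labels, 1, False)
print(labels)
```

In practice the budget would be set to whatever the analyst team can realistically review, such as the roughly 100 alerts per day cited in the experiment.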
4. Using machine learning to automate repetitive security tasks
The real benefit of machine learning is that it could automate repetitive tasks, enabling staff to focus on more important work. Palmer says that machine learning ultimately should aim to “remove the need for humans to do repetitive, low value decision making activity, like triaging threat intelligence.” “Let the machines handle the repetitive work and the tactical firefighting like interrupting ransomware, so that the humans can free up time to deal with strategic issues -- like modernizing off Windows XP -- instead.”
Booz Allen Hamilton has gone down this route, reportedly using AI tools to more efficiently allocate human security resources, triaging threats so workers could focus on the most critical attacks.
5. Using machine learning to close zero-day vulnerabilities
Some believe that machine learning could help close vulnerabilities, particularly zero-day threats and others that target largely unsecured IoT devices. There has been proactive work in this area: a team at Arizona State University used machine learning to monitor traffic on the dark web to identify data relating to zero-day exploits, according to Forbes. Armed with this type of insight, organizations could potentially close vulnerabilities and stop exploits before they result in a data breach.
Hype and misunderstanding muddy the landscape
However, machine learning is no silver bullet, not least for an industry still experimenting with these technologies in proofs of concept. There are numerous pitfalls. Machine learning systems sometimes report false positives (particularly unsupervised learning systems, where the algorithms infer categories based on data), while some analysts have spoken candidly about how machine learning in security can represent a “black box” solution, where CISOs aren’t totally sure what’s “under the hood.” They are thus forced to place their trust and responsibility on the shoulders of the vendor – and the machines.
This idea of trust isn’t ideal in a world where some security solutions may not even be doing machine learning, after all. “Most of the machine learning inventions that have been touted aren’t really doing any learning ‘on the job’ within the customer’s environment,” said Palmer. “Instead, they have models trained on malware samples in a vendor’s cloud and are downloaded to customer businesses like antivirus signatures. This isn’t particularly progressive in terms of customer security and remains fundamentally backward looking.”
Furthermore, the training data samples that algorithms need to learn their models before being put to use in the ‘real’ world are a concern in themselves: the suggestion is that poor data and poor implementation will produce even poorer results. “Machine learning is only as good as the input information you provide it (garbage in, garbage out),” says Gold. “So, if your machine learning algorithms are not well designed, the results won’t be very useful. Having algorithms that work on training data sets in the lab is one thing, but one of the biggest challenges around machine learning cyber defense is getting it working at scale in live, complex networks.”
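One way to make the gap between lab and live performance measurable is to compute the detection rate and the false positive rate for the same detector on both data sets. The sketch below assumes a detector that outputs a score per event and flags anything at or above a threshold. The scores, labels and the 0.5 threshold are invented for illustration and do not come from any product or study mentioned here.

```python
def rates(scores, labels, threshold):
    """Return (detection rate, false positive rate) for a score threshold."""
    tp = fp = fn = tn = 0
    for score, is_attack in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_attack:
            tp += 1   # attack correctly flagged
        elif flagged:
            fp += 1   # benign event wrongly flagged
        elif is_attack:
            fn += 1   # attack missed
        else:
            tn += 1   # benign event correctly ignored
    detection_rate = tp / (tp + fn) if tp + fn else 0.0
    false_positive_rate = fp / (fp + tn) if fp + tn else 0.0
    return detection_rate, false_positive_rate


# Hypothetical ground truth: True marks a real attack.
labels = [False, False, False, True, True, False, True, False]

# Clean lab data: attacks and benign events separate neatly.
lab_scores = [0.10, 0.20, 0.15, 0.90, 0.95, 0.05, 0.85, 0.10]
# Noisier live data: the same detector produces overlapping scores.
live_scores = [0.60, 0.20, 0.55, 0.90, 0.40, 0.70, 0.45, 0.10]

print(rates(lab_scores, labels, threshold=0.5))
print(rates(live_scores, labels, threshold=0.5))
```

On the invented lab data the detector catches every attack with no false alarms; on the noisier live data it misses two of three attacks and flags three of five benign events. Tracking both numbers on live traffic, rather than relying on lab results, is one way to judge a detector that is otherwise a black box.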