Metadata has been in the news over the last couple of years. The term has been used by businesses and government ministers, but there doesn’t seem to be a consensus about what metadata actually is and how it can be used. Put simply, metadata is data about data.
“Metadata is summary information about data. Think of a photograph shot with your phone. The content of the photo is the data. The metadata is information about where the photo was taken, the time it was taken and how large the file is,” says Johnnie Konstantas, Gigamon’s Director of Security Solutions.
Over recent years, the amount of information moving through our networks has grown exponentially, and with the advent of the Internet of Things that volume is set to explode. That creates significant challenges, not just for network engineers but for security teams.
How can we tell the difference between a malicious payload and one that is necessary for business when there’s so much information moving through the network?
The answer lies in the metadata: not in the payload, which might be encrypted, but in the destination, certificate and other information pertaining to that payload. By looking at the metadata associated with a flow of network traffic, it can be easier to tell the difference between legitimate and bad traffic than by trying to examine the detailed contents of every data packet.
“Why do we want to aggregate raw data, the packets that are flowing inside networks? They give us a lot of information to analyse so we can find out how the network is performing or if there’s a security issue. But there’s a lot of raw data and it takes a long time to analyse. Metadata, because it’s summarized, is smaller so gathering it and analysing it is a lot faster,” says Konstantas.
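To make the idea of summarised traffic concrete, here is a minimal sketch of how per-packet records might be collapsed into one summary per flow. The record fields (`ts`, `src`, `dst`, `dst_port`, `size`), the flow key and the sample values are assumptions chosen for illustration, not the format of any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FlowSummary:
    """Summary (metadata) for one flow; no payload is stored."""
    packets: int = 0
    total_bytes: int = 0
    first_seen: float = float("inf")
    last_seen: float = float("-inf")


def summarise_flows(packets):
    """Collapse per-packet records into one summary per (src, dst, dst_port) flow."""
    flows = defaultdict(FlowSummary)
    for pkt in packets:
        key = (pkt["src"], pkt["dst"], pkt["dst_port"])
        flow = flows[key]
        flow.packets += 1
        flow.total_bytes += pkt["size"]
        flow.first_seen = min(flow.first_seen, pkt["ts"])
        flow.last_seen = max(flow.last_seen, pkt["ts"])
    return dict(flows)


# Hypothetical packet records: timestamp, addresses, destination port, size in bytes.
sample_packets = [
    {"ts": 0.0, "src": "10.0.0.5", "dst": "203.0.113.9", "dst_port": 443, "size": 1500},
    {"ts": 0.4, "src": "10.0.0.5", "dst": "203.0.113.9", "dst_port": 443, "size": 900},
    {"ts": 1.1, "src": "10.0.0.7", "dst": "198.51.100.2", "dst_port": 53, "size": 80},
]

flow_table = summarise_flows(sample_packets)
```

Three packets become two flow records here, each holding only counts and timestamps. On a real network the reduction would depend on how many packets each flow carries.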
With security analytics, starting with metadata can help target investigations.
“Why start with an analysis of metadata for security purposes? Metadata can be used to approximate where you have a problem. Then you can use raw data to dig deeper,” explains Konstantas.
Using metadata in security analysis provides a way of targeting the effort required to secure a network.
Ian Farquhar, Gigamon’s Worldwide Virtual Security Team Lead, explains.
“The traditional approach to detecting threats in networks has been about finding things we already knew were bad. But that approach has never worked particularly well. It doesn’t scale to all the known ways things can be bad. There are attacks that occur once and we’ll never see again. Therefore, they’ll never end up in a database of known bad behaviour. What we need to do is detect anomalies – differences from what we expect – that point to threats. But we have far too much data and it’s flowing far too fast”.
Farquhar says this has driven us to use analytics tools that take advantage of metadata to detect these anomalies.
“Metadata is the perfect tool here. It’s generated at high speeds and low volumes,” he says.
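One simple way to picture anomaly detection on metadata is to compare new measurements with a baseline of what is normally seen. The sketch below flags values that sit more than a chosen number of standard deviations from the baseline mean. The three-deviation threshold, the function name and the hourly connection counts are assumptions for illustration; real analytics tools are likely to use more sophisticated models.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return (index, value) pairs in `observed` that sit more than `threshold`
    sample standard deviations from the mean of `baseline`."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is unexpected.
        return [(i, v) for i, v in enumerate(observed) if v != mu]
    return [(i, v) for i, v in enumerate(observed) if abs(v - mu) / sigma > threshold]


# Hypothetical hourly outbound connection counts for one host.
baseline_counts = [100, 104, 98, 101, 97, 103, 99, 102]
todays_counts = [101, 96, 240, 100]

anomalies = find_anomalies(baseline_counts, todays_counts)
```

With these sample figures only the third reading, 240 connections, is flagged. Nothing about the check depends on a signature of known bad behaviour, which is the point Farquhar makes.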
In a typical attack, a threat actor breaches the perimeter and sits in the network, quietly exploring and looking laterally for weaknesses that can be exploited. When a weakness is detected, the hacker executes some sort of malicious payload resulting in data loss or some other business impact. At each stage, there may be metadata that helps detect the malicious activity and thwart the bad actor’s plans.
“For example, you can look at patterns of access to your website,” says Farquhar. “Access patterns that show increased access to corporate images could point to an attacker downloading logos in order to craft a phishing attack.”
This access would look different to normal usage.
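As a rough illustration of spotting that kind of pattern, the sketch below works through web access metadata and picks out clients whose requests are dominated by image downloads. The list of image suffixes, the minimum request count and the 80% share are assumed values, and the client addresses are made up.

```python
from collections import Counter

# Assumed list of file suffixes treated as images.
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg")


def image_heavy_clients(requests, min_requests=5, max_image_share=0.8):
    """Return clients with at least `min_requests` requests whose share of
    image downloads exceeds `max_image_share`."""
    totals = Counter()
    images = Counter()
    for client, path in requests:
        totals[client] += 1
        if path.lower().endswith(IMAGE_SUFFIXES):
            images[client] += 1
    return sorted(
        client
        for client, total in totals.items()
        if total >= min_requests and images[client] / total > max_image_share
    )


# Hypothetical (client, path) pairs taken from web access metadata.
sample_requests = (
    [("198.51.100.7", f"/assets/logo-{i}.png") for i in range(9)]
    + [("198.51.100.7", "/index.html")]
    + [
        ("203.0.113.20", "/index.html"),
        ("203.0.113.20", "/assets/logo-0.png"),
        ("203.0.113.20", "/products"),
        ("203.0.113.20", "/contact"),
        ("203.0.113.20", "/about"),
    ]
)

suspects = image_heavy_clients(sample_requests)
```

In this sample the first client requests nine images in ten visits and is flagged, while the second browses the site in a more ordinary way and is not.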
Similarly, if an attacker has breached the company’s perimeter, their internal patterns will differ from those of staff as the attacker is unlikely to know how the organisation is structured.
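The article does not say which internal signal to watch, so the following is only one assumed proxy: a machine that suddenly contacts far more distinct internal hosts than its peers may be exploring rather than working. The function name, the limit of three hosts and the addresses are hypothetical.

```python
from collections import defaultdict


def wide_scanners(connections, max_distinct_hosts=3):
    """Return sources that contacted more distinct hosts than `max_distinct_hosts`."""
    contacted = defaultdict(set)
    for src, dst in connections:
        contacted[src].add(dst)
    return sorted(
        src for src, hosts in contacted.items() if len(hosts) > max_distinct_hosts
    )


# Hypothetical (source, destination) pairs from internal flow metadata.
sample_connections = (
    # A staff machine returning to the same two servers.
    [("10.0.1.15", "10.0.2.10"), ("10.0.1.15", "10.0.2.11")] * 4
    # A machine touching six different hosts once each.
    + [("10.0.1.99", f"10.0.2.{n}") for n in range(20, 26)]
)

explorers = wide_scanners(sample_connections)
```

Here only `10.0.1.99` is reported. In practice the limit would need tuning per role, since some legitimate systems, such as backup servers, talk to many hosts.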
The kinds of metadata we’re looking for include transaction volumes, IP addresses, email addresses and new TLS and SSL certificates circulating on the network. All are pointers to potential breach activity.
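Taking certificates as an example, a minimal sketch of the idea is to keep a set of certificate fingerprints already seen on the network and report any that appear for the first time. The short fingerprint strings and addresses below are placeholders, and how fingerprints are actually extracted from traffic is outside the scope of this sketch.

```python
def new_certificates(known_fingerprints, observed):
    """Return observations whose certificate fingerprint has not been seen before.

    `observed` is an iterable of (server_ip, fingerprint) pairs. The known set
    is updated in place, so each new fingerprint is reported only once.
    """
    fresh = []
    for server_ip, fingerprint in observed:
        if fingerprint not in known_fingerprints:
            known_fingerprints.add(fingerprint)
            fresh.append((server_ip, fingerprint))
    return fresh


# Placeholder fingerprints; real ones would be hashes of the certificates.
known = {"aa11", "bb22"}
sightings = [
    ("10.0.0.8", "aa11"),
    ("10.0.0.9", "cc33"),
    ("10.0.0.9", "cc33"),
    ("10.0.0.12", "dd44"),
]

alerts = new_certificates(known, sightings)
```

A new certificate is not proof of a breach, since certificates are renewed routinely, but it gives an analyst a short list of things to check.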
So why do this within the network and not on endpoint devices? Konstantas says, “Retrieving this kind of data from end-point devices like mobile phones, laptops and even servers might require installing agent software on the device. That can be difficult to do and the software requires upkeep. It’s a lot simpler to monitor the network via a visibility fabric and retrieve the metadata from the network”.
This doesn’t mean end-point security is not useful. “You still want to keep attackers out,” says Farquhar.
There’s a balancing act that needs to be achieved. How much raw data should you retain? Farquhar says there’s no single answer that universally applies.
“It depends on your needs,” he says. “You try to keep as much data for as long as you can. This is often a question of economics as much as technology and security. But you want to maximise your ability to detect and prevent an attack”.
Many security reports find that attackers are often inside networks for many months before they are detected.
“The low and slow attack, where the attacker takes their time, is one of the hardest to detect. The longer you can hold data, the more you can analyse to detect their presence,” says Farquhar. “This is the advantage of metadata – some studies say metadata consumes just 10% as much space as raw data – you can retain it for a longer time”.
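The arithmetic behind that point is simple to sketch. If metadata takes 10% of the space of raw data, as the studies Farquhar mentions suggest, the same storage budget holds ten times as many days of history. The storage size and daily volume below are made-up figures used only to show the calculation.

```python
def retention_days(storage_gb, raw_gb_per_day, size_ratio=1.0):
    """Days of history that fit in `storage_gb` when each day's records take
    `raw_gb_per_day * size_ratio` gigabytes."""
    return storage_gb / (raw_gb_per_day * size_ratio)


# Hypothetical figures: a 10,000 GB store and 500 GB of raw packet data per day.
raw_days = retention_days(10_000, 500)
metadata_days = retention_days(10_000, 500, size_ratio=0.10)
```

With these figures, raw data fills the store in 20 days while metadata lasts about 200, which is long enough to cover an intrusion that stays hidden for several months.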
This provides opportunities not only for detection but also forensic investigation after an attack.
The best approach, says Konstantas, is to collect un-sampled metadata.
“Each packet has summary information about it. What you want is a one-to-one ratio – you want to be collecting the metadata for each and every packet on the network. To do that you need a visibility fabric. This is a specialised solution that generates this data in a complete way. You also need a way of collecting and analysing that data,” she says.
Today’s business environment is not contained within the four walls of the corporate office. Cloud providers and web services mean most companies are running a hybrid infrastructure of some type.
“The same risks exist in the public cloud as for on-premises IT,” says Konstantas. “The same means for monitoring data flowing back and forth between critical workloads, applications and systems need to exist in the public cloud. Fortunately, there are solutions that allow you to gain visibility to that traffic”.
This means you need to emulate the same infrastructure for gathering and analysing metadata in the cloud as you do for on-premises systems. Konstantas says the market-leading visibility fabrics are able to do just that – collect metadata on public cloud infrastructure as well as private systems.
So, what are the key takeaways when it comes to metadata and security? Farquhar sums them up:
“You need to see all the data in your organisation. You need access to that data and metadata no matter where it is – private cloud, public cloud, traditional, terrestrial network. All of these matter. We need to be able to get it into a visibility fabric and process it as soon as it’s delivered and then deliver the intelligence into tools that are capable of processing and identifying threats”.