Wanted: Data breach risk ratings, because not all breaches are equal

We need a system for data breaches that rates the real risk associated with the compromised data.

I recently downloaded every known, recorded data breach by the Privacy Rights Clearinghouse, which has been the most thorough and stalwart public recorder of data breaches in the United States for over two decades. The data file contained just over 8,600 data breaches. I found a few dupes and some missing or erroneous information, but overall, it’s the best public, non-profit, and free source you’re going to find.

I applaud the Privacy Rights Clearinghouse for what they’ve done. Their data has been used in hundreds of thousands of news reports since their founding in 1992. I’ve used information from the site for at least a dozen of my columns over the years.

That said, I think a big component that could provide a lot of additional value is missing: how much risk the breach actually poses to the stakeholders of the data. We are often given risk ratings for other types of root-cause events. Most software vulnerabilities, for instance, include risk ratings. There’s even a generally accepted standard of risk rating called the Common Vulnerability Scoring System. This open rating system is fairly comprehensive. It’s not perfect, but at least it’s a starting point for evaluating your own risk.

Real risk ratings for data

As I highlighted in my last book, A Data-Driven Computer Security Defense, there is a gulf of difference between what you are being told you should be most afraid of and what you really should be most afraid of. Knowing the difference will make you a very valuable computer security practitioner.

For example, one-fourth to one-third of all software vulnerabilities are classified in the highest risk categories. That’s across all 5,000 to 7,000 vulnerabilities that we average each year, year after year. Most are easy to exploit (called low complexity in vulnerability circles). So, computer security defenders are being told to be very worried about at least 1,250 separate vulnerabilities each year.
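The arithmetic behind that figure is easy to check. Here is a quick sketch in Python that uses only the numbers quoted above; the variable names are mine.

```python
from fractions import Fraction

# Figures quoted in the text: 5,000 to 7,000 vulnerabilities per year,
# with one-fourth to one-third rated in the highest risk categories.
yearly_low, yearly_high = 5_000, 7_000
share_low, share_high = Fraction(1, 4), Fraction(1, 3)

# Lowest estimate: fewest vulnerabilities times the smallest share.
high_risk_min = int(yearly_low * share_low)
# Highest estimate: most vulnerabilities times the largest share
# (int() truncates the fractional part).
high_risk_max = int(yearly_high * share_high)

print(high_risk_min, high_risk_max)  # 1250 2333
```

So "at least 1,250" is the floor of the range. In a busy year the count of top-rated vulnerabilities could be closer to 2,300.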

As I explained earlier, fewer than 1 percent of these have known exploits taking advantage of them. Far fewer than that are even achievable in your environment, because you either don’t have the software, already have the necessary patch applied, or have some other offsetting control. In reality, only a handful of exploits are used against the average company in a given year. If you perform risk analysis on your company’s data, you’ll find that worrying about the right dozen vulnerabilities, instead of thousands and thousands, is a far better use of time with a far greater chance of success.
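That narrowing-down can be sketched as a simple filter. The records, field names, and identifiers below are invented for illustration. A real inventory would draw on your own scanner, patch, and threat-intelligence data.

```python
from dataclasses import dataclass


@dataclass
class Vulnerability:
    vuln_id: str              # hypothetical identifier, not a real CVE
    exploited_in_wild: bool   # is a known exploit actively in use?
    software_installed: bool  # do we run the affected software at all?
    patched: bool             # is the fix already applied?
    mitigated: bool           # does another control offset the risk?


def worth_worrying_about(v: Vulnerability) -> bool:
    """Keep only vulnerabilities that are exploited and reachable here."""
    return (v.exploited_in_wild and v.software_installed
            and not v.patched and not v.mitigated)


inventory = [
    Vulnerability("VULN-A", True, True, False, False),
    Vulnerability("VULN-B", False, True, False, False),  # no known exploit
    Vulnerability("VULN-C", True, False, False, False),  # software absent
    Vulnerability("VULN-D", True, True, True, False),    # already patched
    Vulnerability("VULN-E", True, True, False, True),    # offsetting control
]

shortlist = [v.vuln_id for v in inventory if worth_worrying_about(v)]
print(shortlist)  # ['VULN-A']
```

Each condition mirrors one of the reasons given above for why a highly rated vulnerability may not matter in a particular environment.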

Data-risk analytics should be applied to every type of root-cause hacking worry. Unfortunately, we’re not getting a risk rating for successful breaches of the very thing we are trying to protect: the data. Today’s vulnerability risk ratings are all about potential risk, meaning the likelihood of some vulnerability being exploited to do bad things. Once a breach actually occurs, it’s as if we go, “It happened! It was bad! We’ll try better next time!” without letting all the stakeholders (the providers, the owners, the custodians, regulators, etc.) know how bad a particular data breach actually was.

Data breach risk score

For example, while analyzing the Privacy Rights Clearinghouse data, I was surprised by how many records were considered compromised simply because they were sent to the wrong location or person, or maybe left behind in an office after a move. Records accidentally thrown into the garbage instead of being shredded are considered “breached.” In a world where hundreds of millions of records are compromised each year, I was astounded by how many scenarios likely didn’t really result from a malicious action or would be very unlikely to be used maliciously in the future.

For example, if my personal financial records were left in a garbage can accidentally, but discovered the same day, the chance of malicious use is far lower than if they were stolen in a deliberate hack. Even hacker activities have different risk severities. For example, many breaches were due to ransomware compromises. Ransomware is a big deal, but in most cases the creators just want to be paid to unlock the data and aren’t looking at the data. Granted, once you have malware taking complete ownership of a computer holding confidential data, you really don’t know what that malware will do or has done. In general, though, most attackers who employ ransomware just want the quick bitcoins and don’t want to be involved in selling data records on the dark web.

It’s like when a desktop admin gives a sigh of relief when they find adware on a computer instead of a backdoor Trojan. Both malware programs could have been installed on the system using the same type of vulnerability, but the adware is far less likely to look at my data than a backdoor Trojan.

How I would love it if every data breach announcement had to include the root-cause event, if known, and an estimated risk score for my “stolen” data. If I heard my records were found the same day in a garbage can or had been left behind for two weeks in an old office, I probably wouldn’t rush to sign up for credit monitoring. At the same time, if my records were found on the dark web or belonged to a larger data set containing other records found to have been used in fraudulent transactions, I probably would step up my credit monitoring game.

Data breach risk rating components

I’m not 100 percent sure which components a data breach risk rating score should include, but at a minimum it should cover the following:

  • Type of data (e.g., PII, HII)
  • Type of breach
  • Number of breached records
  • Whether the records were breached by an unauthorized third party
  • Likelihood of malicious actions against or with the data
  • The way the data was used maliciously, if known
  • How long the data was accessed by an unauthorized party before discovery
  • How long before the breach was reported
  • What offsetting controls (like hashing or encryption) were in place

I’m sure readers can think of more criteria. We can make it as complicated or uncomplicated as desired. The ultimate question to be answered is, “As the stakeholder, how much should I be worried about this breach of my data? Not much? A bunch? We’re not sure?”
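To make the idea concrete, here is a minimal sketch of how a few of the components listed above could roll up into one of those three answers. Everything in it is an assumption for illustration: the field names, the weights, and the thresholds are made up and not calibrated against any real breach data, and it covers only a subset of the criteria.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BreachReport:
    data_type: str                      # e.g. "PII"
    breach_type: str                    # e.g. "improper disposal", "hacking"
    records: int                        # number of breached records
    third_party_access: Optional[bool]  # None means "unknown"
    malicious_use_seen: bool            # data already used maliciously?
    days_exposed: int                   # exposure time before discovery
    encrypted_or_hashed: bool           # offsetting control in place?


def worry_level(r: BreachReport) -> str:
    """Map a report to one of the three answers posed in the text."""
    if r.third_party_access is None:
        return "We're not sure"
    score = 0  # illustrative weights only
    if r.third_party_access:
        score += 3
    if r.malicious_use_seen:
        score += 4
    if r.days_exposed > 7:
        score += 1
    if r.records > 100_000:
        score += 1
    if r.encrypted_or_hashed:
        score -= 3
    return "A bunch" if score >= 4 else "Not much"


trash = BreachReport("PII", "improper disposal", 200, False, False, 0, False)
dark_web = BreachReport("PII", "hacking", 500_000, True, True, 30, False)
unknown = BreachReport("PII", "ransomware", 10_000, None, False, 3, False)

print(worry_level(trash))     # Not much
print(worry_level(dark_web))  # A bunch
print(worry_level(unknown))   # We're not sure
```

The point is not these particular numbers. It is that even a crude rule set separates the records found the same day in a garbage can from the ones that turned up on the dark web.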

We’ve been calculating the risk of potential exploits for a long time. Isn’t it time to start rating the risk once the exploit has actually happened?
