When I talk with clients about minimizing the impact of corporate data theft, I sometimes use the analogy of football. Your most sensitive data is like your quarterback — arguably the single most valuable element on your team, and the one that everyone on the opposing team wants to get their hands on.
Even though you have five huge players on your front line to block the defensive rush, inevitably one or more opponents get through. That’s when your second level of defense comes into play: your running backs. On passing plays, these players stay back to block any opponents who have broken through (or around) the front line.
Too many corporations assume that their front line is all the protection they need against intruders intent on data theft. They essentially leave their data alone in the backfield, completely exposed to any intruders who make it through — generally with unpleasant results.
In contrast, I recommend that clients start by assuming that intruders will break through the first line of defense, and develop a three-part strategy for minimizing their impact once they’re inside. Here are quick descriptions of each element of the strategy.
1. Get your data house in order.
The first and most essential strategy is to get accurate, up-to-date metadata concerning your sensitive data. You need to know exactly where it resides, even down to the level of which specific columns or fields in which databases, and who uses the data for what purposes. This is easier said than done, because many elements of personally identifiable information (PII) reside in multiple locations throughout your organization.
Metadata extracted by technology alone is not rich enough to give an accurate and complete picture of any given data element. In addition to acquiring technical metadata, you must also capture the metadata that resides in people’s heads. To gather this type of metadata, the most effective approach is to hold face-to-face, collaborative meetings with the people who use the data on a daily basis. If you rely solely on your technology team for these insights, you and your organization may be in for a world of hurt.
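To make the idea concrete, here is a minimal sketch of an inventory that combines both kinds of metadata: where a sensitive element physically lives (from a technical scan) and who uses it for what purpose (from conversations with the people who work with it). All of the names here, including `DataElementRecord`, `build_inventory`, and the tuple layouts, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataElementRecord:
    """What is known about one sensitive data element (hypothetical structure)."""
    element: str
    # Physical locations as (database, table, column) tuples.
    locations: set = field(default_factory=set)
    # Business context as (team, purpose) tuples, gathered from people.
    usages: list = field(default_factory=list)


def build_inventory(technical_scan, interview_notes):
    """Merge scanned locations and interview findings into one inventory.

    technical_scan: iterable of (element, database, table, column)
    interview_notes: iterable of (element, team, purpose)
    """
    inventory = {}
    for element, database, table, column in technical_scan:
        record = inventory.setdefault(element, DataElementRecord(element))
        record.locations.add((database, table, column))
    for element, team, purpose in interview_notes:
        record = inventory.setdefault(element, DataElementRecord(element))
        record.usages.append((team, purpose))
    return inventory


def undocumented_elements(inventory):
    """Elements a scan found that nobody has explained yet."""
    return sorted(
        name for name, record in inventory.items()
        if record.locations and not record.usages
    )
```

A report like `undocumented_elements` is one way to decide which meetings to schedule next: any element that a scan can locate but no one has described is a gap in the picture.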
2. Create a data-centric culture.
Fortunately, there are ways to get people across your organization talking and collaborating about the state of your data in order to optimize its quality and accuracy and, ultimately, its security. This means getting people talking about your data all the time, even reaching across organizational divisions and boundaries to improve their understanding.
Through the natural course of developing technical projects, members of your technical staff may update a certain logical data model or spec sheet that describes a data element at one given moment in time. But because your business is constantly moving, by the time they’ve completed their process, there’s almost no chance the metadata they created will still be aligned to the business process.
A far better approach is to be proactive, and create “social” platforms and forums that allow your people to truly collaborate around data. This will help you gain a constantly updated, far more accurate view of how data is actually being used.
3. Develop an early warning system.
By properly accounting for all your data assets as described above, you will gain insights into more than the physical instances of PII in various data assets (e.g. databases, repositories, warehouses or even spreadsheets). You will also learn and document the various contexts in which the data is used. Think of each legitimate usage scenario as a “story.” In the case of an insurance company, for example, one story would be that a claims adjuster types in a policyholder’s Social Security number (SSN) to start working on a claim. There could be many such stories involving different users and uses, but the number is finite.
You can basically convert each of these stories into a mathematical formula: think of it as a three-dimensional, digital depiction that specifies who would be using the SSN, in which programs. By thoroughly documenting each of these legitimate uses, you can create a machine learning program with algorithms that monitor the use of SSNs and other sensitive data elements anywhere in your organization. If it detects them being used in any other way, it raises a red flag.
Initially there may be false positives, and if so, your team (or the algorithm) can correct the metadata so that the next time that scenario comes up, nothing happens. In a sense, you’re creating a system that sees and responds to inappropriate use of PII the same way an immune system responds to the three-dimensional protein structures it recognizes as foreign bodies.
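The core of this idea can be sketched in a few lines. Note that the sketch below swaps the machine learning program for something much simpler, an exact-match allowlist of documented stories, just to show the flag-and-correct loop. The names `UsageStory` and `UsageMonitor`, and the choice of role, application and data element as the fields of a story, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageStory:
    """One documented, legitimate use of a sensitive element (hypothetical)."""
    role: str          # who is using the data
    application: str   # which program they are using it in
    data_element: str  # which sensitive element is involved


class UsageMonitor:
    """Flags any use of sensitive data that matches no documented story."""

    def __init__(self, stories):
        self.allowed = set(stories)

    def check(self, role, application, data_element):
        """Return True for a documented use; False means raise a red flag."""
        return UsageStory(role, application, data_element) in self.allowed

    def approve(self, role, application, data_element):
        """Record a reviewed false positive so it is not flagged again."""
        self.allowed.add(UsageStory(role, application, data_element))


# The insurance example: a claims adjuster enters an SSN to start a claim.
monitor = UsageMonitor([UsageStory("claims_adjuster", "claims_app", "ssn")])
documented = monitor.check("claims_adjuster", "claims_app", "ssn")
suspicious = not monitor.check("contractor", "export_tool", "ssn")
```

Because the number of legitimate stories is finite, the allowlist stays manageable, and calling `approve` after a review is the step that corrects the metadata so the same scenario passes quietly next time.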
Impact across the enterprise
Understanding and protecting your data not only helps you minimize the impact of data theft when it occurs — it also gives you a potential for competitive advantage. It lets you understand the dynamics of how information feeds your organization and allows your people to succeed, helping you to make better decisions as to where to invest in improving data quality and security. Most importantly, when that hacker eventually breaks through your front line of defense, it helps you know which systems to shut down first.