For many organizations, the dimensions of information risk may seem straightforward. The risk is that personally identifiable information (PII) will somehow become public, bringing with it a host of negative public relations, market share, and financial consequences.
These are certainly important aspects of information risk. Indeed, organizations should do everything they can to understand the nature of their vulnerability, and prepare both initial and backup defenses against hackers trying to access PII.
But information risk has other dimensions as well. To appreciate them, it’s helpful to look at the more traditional view of corporate risk, and see what it can teach us about data risk specifically. I spent a number of years working in the insurance and financial services space, and had a ringside view of how leading organizations approach the matter of risk. In fact, insurers and financial services organizations often succeed or fail based on their ability to quantify and analyze risk, so they employ armies of professionals to create risk curves, probability distributions and other analyses to gain insights into the likelihood of bad things happening to themselves or their clients.
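To make the idea of a "risk curve" concrete, here is a minimal Monte Carlo sketch of the kind of analysis an actuarial team might run. Every parameter below (event frequency, severity distribution, dollar scale) is an invented illustration, not real actuarial data:

```python
import random

# Hypothetical illustration: simulate annual losses for a risk whose event
# count is roughly Poisson-distributed and whose per-event severity is
# lognormal. All parameters are invented for this sketch.
random.seed(42)

def simulate_annual_loss(freq=2.0, sev_mu=10.0, sev_sigma=1.0):
    """One simulated year: random number of events, each with a random severity."""
    # Approximate a Poisson(freq) count with many low-probability trials.
    n_events = sum(1 for _ in range(1000) if random.random() < freq / 1000)
    return sum(random.lognormvariate(sev_mu, sev_sigma) for _ in range(n_events))

losses = sorted(simulate_annual_loss() for _ in range(10_000))
expected_loss = sum(losses) / len(losses)
var_95 = losses[int(0.95 * len(losses))]  # 95th-percentile annual loss

print(f"expected annual loss: {expected_loss:,.0f}")
print(f"95th-percentile loss: {var_95:,.0f}")
```

The sorted list of simulated losses is, in effect, the risk curve: from it an analyst can read off the expected loss, tail percentiles, and the probability of exceeding any given threshold.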
Limitations to the art of predicting risk
Despite the point made immediately above, there are two fundamental limitations to humans’ ability to predict. The first is intuitive: to create mathematical models for a certain type of risk, you need to be able to quantify that risk in terms of its financial impact. In the industries I mentioned, doing so isn’t a problem. But when most people try to analyze data risk, they’re far more likely to be making educated guesses, at best.
The second limitation is less obvious. As numerous studies have demonstrated, human beings are, as a rule, rather bad at estimating and guessing. Couple that limitation with our species’ widespread tendency to misunderstand randomness and causality, and it’s easy to see why predicting risk is hard in the best of cases, and harder and less exact still when we don’t know the relative values of what we’re trying to predict.
Which leads us back to the topic of information risk. The most fundamental problem for most organizations is that they lack an effective way to quantify the specific financial value of their data. Even though professionals in the broader data space talk in general about how valuable data is, it seems that no one is able to take the next step and determine the actual value of an organization’s data, let alone the value of specific data elements.
The good news is that such an estimation is possible — one just needs an analytical method and process. As we’ve discussed in an earlier post, such a process starts by creating a standard Data Certification Score (DCS), a holistic indicator about each data element, reflecting the element’s frequency of use, consumption demand, collective organizational understanding (i.e. metadata quality) and overall data quality (based on actual data values). Slightly more detailed steps of that process include creating a thorough data inventory, gathering metadata about that inventory and assigning data stewards, aggregating the DCS, and finally implementing an ongoing system to track the DCS over time. Then you can conduct correlation studies to show in real time exactly where and how your data is adding to (or in other cases, hurting) your bottom line — thus determining the business value of data.
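The aggregation step can be pictured as a weighted roll-up of the four factors named above. The weights, field names, and 0–1 scales in this sketch are illustrative assumptions, not the calibrated method from the earlier post:

```python
from dataclasses import dataclass

@dataclass
class DataElement:
    name: str
    frequency_of_use: float    # 0-1, normalized access frequency
    consumption_demand: float  # 0-1, normalized downstream demand
    metadata_quality: float    # 0-1, completeness/accuracy of documentation
    data_quality: float        # 0-1, profiling score on actual data values

# Illustrative weights; a real program would calibrate these per organization.
WEIGHTS = {
    "frequency_of_use": 0.25,
    "consumption_demand": 0.25,
    "metadata_quality": 0.20,
    "data_quality": 0.30,
}

def dcs(element: DataElement) -> float:
    """Weighted Data Certification Score on a 0-100 scale."""
    raw = (WEIGHTS["frequency_of_use"] * element.frequency_of_use
           + WEIGHTS["consumption_demand"] * element.consumption_demand
           + WEIGHTS["metadata_quality"] * element.metadata_quality
           + WEIGHTS["data_quality"] * element.data_quality)
    return round(100 * raw, 1)

customer_email = DataElement("customer_email", 0.9, 0.8, 0.6, 0.7)
print(dcs(customer_email))  # 75.5
```

Tracked over time, a score like this gives the ongoing system something concrete to trend, and the correlation studies a numeric input to compare against business outcomes.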
Three key types of information risk
Once you understand the true value of your data, you can estimate the impact of various types of risk, which generally fall into three categories. The first is operational risk — a type of risk that could already be costing your organization tremendous amounts of money. There are all kinds of drivers of operational risk: the demand put on specific data elements, the technology used to store them, and rising prices if you purchase data from external sources, to name a few.
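Once a data element carries a dollar value, even a crude annualized estimate of this kind of exposure becomes possible. In this sketch, every probability, impact fraction, and dollar figure is invented for illustration:

```python
# Hypothetical annualized operational-risk estimate for one data element.
# Every number below is an invented illustration, not a benchmark.
element_value = 250_000.0  # assessed business value of the element, in dollars

scenarios = [
    # (annual probability, fraction of value lost if the event occurs)
    (0.10, 0.05),  # storage/technology outage degrades availability
    (0.05, 0.20),  # external vendor raises prices or disrupts supply
    (0.02, 0.50),  # demand spike the current platform cannot serve
]

annualized_loss = sum(p * impact * element_value for p, impact in scenarios)
print(f"annualized expected loss: ${annualized_loss:,.0f}")  # $6,250
```

A figure like this is only as good as its inputs, but it turns an abstract worry into a number that can be weighed against the cost of mitigation.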
A second category is intrinsic risk. This also can be driven by varied factors. For example, the data you use may be of poor quality, or the people using it to make decisions for your organization may not know exactly what the data signifies. Either way, it means that some of the decisions they base on it are wrong. The challenge with this type of risk is that it’s not readily apparent, other than in hindsight. To accurately measure this aspect of risk requires analyzing specific contextual use cases and determining how the data is consumed from a business and operational perspective.
A third type of risk is what I call the “risk of the unknown unknown,” which refers specifically to uncertainties involving how organizations obtain or create data for analysis. There’s a lot of buzz in the data science/predictive analytics space around various data analytic platforms being sold as “silver bullet” solutions. Just buy our product, hire a bunch of data scientists, and you’re good to go — right?
Not so fast. First of all, is your Data Supply Chain up to the task? You could set up the best big data analytics solution possible — but if you haven’t figured out how to optimize the way you get the data, enrich and understand it, and finally supply it to your data scientists, you have a huge amount of work to do before you’re even close to achieving the expected ROI. The reason is that your software development life cycle (SDLC) for creating and delivering this data needs to be as short and as nimble as possible. Business needs change so fast that you can’t forecast today what data you’re going to need six months from now, let alone two years. As a result, your Data Supply Chain must respond faster and more nimbly than ever before, supporting both the supply side and the demand side (data science and consumption).
Better understanding of risk leads to better decisions
The big takeaway here is that today’s organizations face information risk in multiple dimensions, and from multiple directions. By understanding and quantifying these risks, including the potential positive or negative financial impact they represent, an organization will be better positioned to make smart decisions about where to invest technology budgets and allocate resources.