At the core of our quantitative approach is a new way to understand the impact of each data element on a business’ key performance indicators (KPIs).
Before we can get there, however, we need the ability to quantify all the different types of data within an organization — including both its data assets and liabilities. My company has created a proprietary method for doing so — a veritable “Rosetta Stone” that allows us to use a single language to describe text, numeric and alphanumeric data. This allows us to analyze all of an organization’s data through a single lens — an incredibly powerful perspective for understanding how data changes from day to day.
To explain how this works, imagine a single piece of data (data value) in a single data element, in a single data store: for example, the word “Dave” in a data element called customer first name, which lives in a table of a data warehouse. We’re able to break that data into its component pieces by looking at the geometric dimensions of each character, which enables us to examine the data value quite quantitatively and ask lots of interesting questions about it.
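The actual character-level encoding is proprietary, so the following is only a minimal illustrative sketch in Python. It uses two simple stand-in “dimensions” per character, its Unicode code point and a coarse character class, where the real method presumably uses richer geometric features:

```python
def char_features(ch):
    """Return a (code_point, char_class) pair for one character.
    These two features are stand-ins for the proprietary geometric
    dimensions described in the text."""
    if ch.isalpha():
        char_class = 1
    elif ch.isdigit():
        char_class = 2
    else:
        char_class = 0
    return (ord(ch), char_class)

def decompose(value):
    """Break a data value into its per-character feature tuples."""
    return [char_features(ch) for ch in value]

# Each character of "Dave" now has a quantitative representation
# that can be aggregated, compared, and tracked over time.
features = decompose("Dave")
```

Any character-level numeric encoding would slot in here; the point is that once every character maps to numbers, the whole value becomes something we can measure.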
We can then aggregate that data value and create a dimensional shape for the value of “Dave.” In addition, we can aggregate it with all the values of all the other names in that field, in every instance across the enterprise. This process can be replicated again and again, to eventually visualize the vast amount of changes that occur in an organization’s data on any given day.
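As a toy version of that aggregation, assuming a value’s “shape” is just a small fixed-size vector of summary statistics (length, mean code point, letter fraction are my stand-ins, not the proprietary dimensions), the roll-up from one value to a whole field might look like:

```python
def value_shape(value):
    """Collapse one data value into a small fixed-size 'shape' vector:
    (length, mean code point, letter fraction)."""
    if not value:
        return (0, 0.0, 0.0)
    codes = [ord(ch) for ch in value]
    letters = sum(ch.isalpha() for ch in value)
    return (len(value), sum(codes) / len(codes), letters / len(value))

def field_shape(values):
    """Aggregate the shapes of every value in a field (column) by
    averaging component-wise, yielding one shape per field per day.
    The same roll-up can repeat across tables and stores."""
    shapes = [value_shape(v) for v in values]
    n = len(shapes)
    return tuple(sum(s[i] for s in shapes) / n for i in range(3))

# One shape for the value "Dave", and one for a whole first-name field.
dave = value_shape("Dave")
field = field_shape(["Dave", "Ana", "Bo"])
```

Because the aggregation is just component-wise averaging here, the same function can be reapplied at the record, column, table and enterprise levels.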
This ability is analogous to being able to go down to the scale of a molecule and break it into its individual atoms … even its sub-atomic components.
This gives us the ability to measure, in a highly precise way, the extent to which individual pieces of data change from day to day, and to aggregate them into larger and larger groups, mapped to particular records, columns and tables.
Next, we track how those individual shapes change over time and measure the degree of volatility, as illustrated in Figure 1. We call this measurement of data the supply and demand volatility index (SDVI), and it can provide game-changing insights. By tracking the SDVI over time, we can create an index for each data element (see Figure 2) — similar to an index of a single company on the stock market.
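The published material doesn’t give the SDVI formula, but one simple stand-in is to measure the day-over-day distance between a field’s shape vectors and take the dispersion of those changes as the volatility index:

```python
import math
from statistics import pstdev

def shape_distance(a, b):
    """Euclidean distance between two shape vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def sdvi(daily_shapes):
    """A stand-in volatility index: the population standard deviation of
    day-over-day shape changes. The real SDVI formula is proprietary;
    this only illustrates the idea of a per-element volatility series."""
    changes = [shape_distance(daily_shapes[i], daily_shapes[i + 1])
               for i in range(len(daily_shapes) - 1)]
    return pstdev(changes) if len(changes) > 1 else 0.0

# A field whose shape never moves has zero volatility.
flat = sdvi([(3, 80.0, 1.0)] * 5)
```

Computing this for every data element, every day, yields exactly the kind of per-element index series the stock-market analogy describes.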
We can then model each data element’s SDVI against changes in the most important overall KPIs used to run the business — whether they measure sales leads, marketing and campaign results, or other data. As a result, we get real-time insights into which types of data have the strongest alignments with critical KPIs. This provides a model to project impacts to Profit & Loss (P&L).
The next step is to conduct a regression analysis based on the relative strengths of those alignments or relationships. In Figure 3, we’ve created a heat map listing the most essential overall business KPIs in the left-hand column and the volatility indices for various data elements across the top, and color-coded the relationships according to the strength of the correlation.
The process I’ve described here allows an organization to track the volatility in not only one particular data element (and all its dozens or even hundreds of instances in the organization’s various spreadsheets and databases), but also in every one of its hundreds or thousands of data elements.
Then, by analyzing and comparing that volatility with the changes in the organization’s business KPIs over the same time period, business leaders can identify, at a granular level, which data assets can potentially have the most positive impact on future value. Even more powerful insights can be gained by modeling the impact on P&L of various “what if” scenarios. For example, what would be the impact if the current state of the data were enhanced, say by improving quality, deepening understanding, or increasing supply?
For the CEO, knowing which of their organization’s data assets and data liabilities have the strongest correlations with KPIs is huge. Now they can say, with statistical proof, that if we enrich these particular data elements — whether by improving their quality, buying more of the same type of data, or other approaches — we should have a positive impact of x percent on revenue (or conversely, drive down costs by y). In short, they’re able to calculate how changes to the current state of specific data elements will positively impact future value for the firm.