I frequently hear business leaders observe what a valuable asset their data is. Personally, I tend to avoid making such broad statements — or at least, feel the need to qualify them. True, data can be an asset, but it can also be a liability, or worth nothing at all. In reality, data is no more than a commodity — one that can be good, bad or indifferent.
Few people need to be sold on the idea that data can be valuable. Certain data regarding customer buying patterns and affinities, for example, can add directly to your bottom line in terms of customer retention, psycho-demographic-based segmenting, cross-selling and so on. It could also help in new product development, or even in process improvements that benefit your profit and loss position. Of course, some data needs a little work before it can start delivering such dividends — but clearly, certain data is valuable.
A second category is data that represents a liability — in the sense that it will require you to pay for it at some point in the future.
Consider the case of an insurer who maintains a data field for policy status. This field is probably used in different ways by different departments. Billing, for instance, uses it primarily for issuance and collection of billing artifacts, while in marketing, it’s used mainly in customer relationship management, campaigning and brand development. Each month, your finance folks need to roll up this pile of confusing and even contradictory data and report on the number of policyholders. That means that some of your most specialized (and expensive) human resources may be spending a significant amount of their time massaging and making sense of what should be clear-cut data — and this is all caused by your data’s multiple meanings and multiple masters.
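To make the "multiple meanings, multiple masters" problem concrete, here is a minimal sketch. All of the status codes, departmental interpretations and reconciliation rules below are invented for illustration; the point is that finance ends up hand-maintaining a mapping that no single department's usage implies.

```python
# Hypothetical illustration: one "policy status" field, read differently
# by billing, marketing and finance. Every code and rule here is invented.

# Raw status codes as different departments populate them
POLICIES = [
    {"id": "P-001", "status": "ACT"},     # billing: actively billed
    {"id": "P-002", "status": "LAPSE"},   # billing: payment lapsed
    {"id": "P-003", "status": "PEND"},    # marketing: pending campaign follow-up
    {"id": "P-004", "status": "ACT-NB"},  # marketing: active, no-bill promotional
]

# Finance must decide which codes count as a "policyholder" for reporting.
# No department's rules answer this directly, so the mapping is reconciled
# by hand — and revisited whenever a department adds a new code.
FINANCE_COUNTS_AS_ACTIVE = {"ACT", "ACT-NB"}

def active_policyholders(policies):
    """Roll the shared status field up into finance's monthly headline number."""
    return sum(1 for p in policies if p["status"] in FINANCE_COUNTS_AS_ACTIVE)

print(active_policyholders(POLICIES))  # 2 under this month's interpretation
```

The expensive part is not the three-line function; it is the judgment baked into `FINANCE_COUNTS_AS_ACTIVE`, which skilled people must renegotiate every reporting cycle.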
Another example could be a monthly financial reporting process that requires the extraction of legacy data from your mainframe. Perhaps at some point long ago, one of your developers built an extract tool to automate this process. But over time, the original developer moved on, your business changed, and your source and target technologies were tweaked. Now, as a result, it takes a team of 20 to collectively understand the underlying complexity of the data extraction and loading process, and to make sure the extraction runs on time and correctly. Both of these are examples of data that actually costs more to maintain than it adds to the bottom line.
A third category of data is that which is worthless. Consider just one example. Let’s say that back in 2006, your data people developed an extension to your Oracle database to support a new application by adding some technical columns used for referential integrity and for helping the database talk with other applications. But later your business team decided to go in a new direction, and no longer needs that particular data. The extension is still collecting the data, even though no one uses it — and it will keep collecting it in perpetuity until someone turns it off. The data may well appear on your “data ledger” — but it is neither an asset nor a liability.
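One simple heuristic for spotting candidates in this third category is to profile a table and flag columns that are always empty or always hold the same value — data that is being dutifully stored but carries no information. The sketch below is a toy version of that idea; the column names and rows are invented for illustration, and a real effort would also check application and query logs before retiring anything.

```python
# Hypothetical sketch: scan a table export and flag columns that are
# all-null or constant — candidates for the "worthless" category.

def flag_dead_columns(rows):
    """Return column names whose values are all-null or constant across rows."""
    if not rows:
        return []
    flagged = []
    for col in rows[0]:
        values = {row[col] for row in rows}
        if values <= {None}:      # never populated at all
            flagged.append(col)
        elif len(values) == 1:    # populated, but carries no information
            flagged.append(col)
    return flagged

# Invented sample: a live key column, a constant technical column,
# and a column that was abandoned but is still on the schema.
rows = [
    {"policy_id": "P-001", "ref_int_key": "X", "legacy_sync_flag": None},
    {"policy_id": "P-002", "ref_int_key": "X", "legacy_sync_flag": None},
    {"policy_id": "P-003", "ref_int_key": "X", "legacy_sync_flag": None},
]

print(flag_dead_columns(rows))  # ['ref_int_key', 'legacy_sync_flag']
```

A flagged column is only a lead, not a verdict — a constant value might still be contractually required — but it narrows the search from every column to a short list.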
How to separate the valuable data from the rest
Assuming that you agree with the idea that not all data is an asset, your next question might be: how can we find out which data is which? The answer depends partly on what level of insight you need, and partly on the resources you’re willing to invest.
In general, there are two approaches. One is to invest some time and resources into understanding your data at a qualitative level. This can be achieved by having smart people lead a process of extracting the “tribal knowledge” from your data community regarding which data they think is most closely related to your business model (or on the flip side, which data presents the greatest vulnerability or liability).
A second approach is more involved, but also offers a far greater potential for return on your investment. This approach is quantitative in nature, and involves the use of advanced analytic and mathematical resources to analyze your data elements across the enterprise. In fact, this is the business that my company is in, and we often draw on such additional bodies of knowledge as topological mathematics, machine learning and artificial neural networks to build visualizations that represent the high-dimensional problem space that is most companies’ current-state data.
Regardless of which path your organization chooses to take, the old saying, “there is no free lunch,” applies. Gaining insight into data is hard, complex work, and the benefits aren’t realized until the job is done. Consequently, both the qualitative and quantitative approaches require an investment of resources, and a sustained focus as a significant corporate priority. With the job done, however, you gain the ability to visualize the actual relationships between your organization’s data and your business model — the “Rosetta Stone” that is required to leverage data for competitive advantage.
The time to start understanding data is yesterday
The ROI from gaining better insights into your data’s value can be enormous — even game-changing. If you’re a firm that’s been around for just a few years, it can enable you to leverage machine learning and automation to make your operational model highly efficient and capable of operating at incredibly low cost. If you’re a legacy business, it can help you eliminate data that’s not really worth anything — but still costing you a lot to manage.
In either case, it can also allow you to put controls on data that is at high risk for breach, preventing potentially disastrous publicity and associated costs. On a broader level, it will also help you identify where your data management efforts will deliver the greatest ROI.