For people who work in the data industry, recent headlines about data protection breaches and data spills are alarming, but not that surprising. My sense is that these often-preventable crises are beginning to force some CEOs to realize that they’re facing bigger problems than simply porous firewalls.
This realization is partly driven by the fact that when CEOs start digging into how a particular hack occurred, they’re often finding bigger and more complex problems involving their data, including its quality, timeliness and other factors. At the same time, they’re also starting to recognize that their data is a much more valuable asset than they ever thought — and that perhaps they’re not managing it the way they should.
When evolution was revolution
Such insights could be game-changers. As someone who studied biology before becoming a data scientist, I think the best analogy is the impact that Charles Darwin had on science in his time, and the impact he still has on the way we understand the world today.
Before Darwin published On the Origin of Species in 1859, scientists had one way of understanding living things: what they could see and empirically measure. Darwin’s genius was that he looked at the same physical characteristics of diversity within a species (also known as its phenotypes) that other biologists had, but he managed to see beyond them. In fact, he was able to use reason and intuition to theorize that there were other, invisible forces that were not only transferring traits from one generation of a species to the next, but also causing variations to occur when compared with similar species on nearby islands. In essence, he looked at the same data set that fellow scientists were analyzing, and proposed that there was a more granular, completely separate hierarchy of data, the genotype, that was causing the variations.
To put it another way, Darwin discovered that there was another dimension to the data he was looking at, one that had completely eluded not only his contemporaries, but also all the scientists who had come before him.
Getting to the granular
Not unlike Darwin, today’s data scientists are beginning to see that the more closely they look at their organizations’ data, the better they can understand those organizations on an entirely new level. I suggest that data protection breaches are forcing many executives to realize that there is an entirely new, far more granular level of organizational data that they must learn how to manage and leverage.
This change is nothing short of revolutionary. I believe that since the dawn of the data age (roughly 1980), most business leaders have been operating under a false assumption. By and large, CEOs have assumed that if they can solve their constraint models for people, process, and technology, then they’ve covered all the relevant organizational variables. The problem is that this mindset ignores the other independent variable, data: the unique variable on which all the other variables depend.
The reason most business leaders haven’t thought about data in the right way is that business models have never demanded it, since most organizations didn’t use much data, relatively speaking. When leaders did make decisions based on data, they were doing so at the phenotypic level, not the genotypic. Now that we can look at our data at a much more granular level, it’s becoming increasingly clear that a company’s success or failure can be dictated, and often predicted as well, by effective analysis of data. It’s roughly the equivalent of learning to use genetic markers to make predictions about an individual’s health.
Where we go from here
In my opinion, this new ability to essentially “look inside” data should be driving a major shift in funding priorities from technology/IT to data. We need to see an inversion, such that projects executed in data will influence the funding of related projects in technology, rather than vice versa. What’s more, I personally believe that cyber security and data protection breaches are going to be the organizational drivers that get us there.
It’s amazing, and humbling, to think that all of the data that Charles Darwin collected in his lifetime could probably fit on a few thumb drives. Yet the impact of that data — or more precisely, the metadata created by Charles Darwin’s interpretation of his findings — continues to define our view of nature and ourselves more than 150 years later. We may well be on the verge of a similar breakthrough in the understanding of data itself.