Table of contents
- Data and oil need to be refined
- Oil is a finite resource, while more and more data is created every day
- Data and oil are not always available to everyone
The concept of data as a strategic asset has been gaining momentum in recent years. However, most people still struggle to see the real value in data.
We know big tech companies have been collecting data for a long time. We know that year after year new regulations about the use of data are created. That said, most of us still don’t understand the impact data could have on our society.
In 2017, The Economist published an article titled “The world’s most valuable resource is no longer oil, but data.” However, for most people, it’s still hard to understand how data can be the new oil.
Data and oil have some similarities, but also some differences. Here are some of them.
Data and oil need to be refined
Data and oil are rarely used in their raw state.
If oil is unrefined, it cannot be used. For oil to be useful, it has to be extracted, refined, and distributed. The same goes for data. We don’t use data as soon as it’s extracted; we have to process it first so that it’s ready for analysis.
Here’s how Clive Humby, the data science entrepreneur who coined the phrase “data is the new oil,” compares oil and data.
“Data is the new oil. Like oil, data is valuable, but if unrefined, it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity. So, must data be broken down, analysed for it to have value.”
This is true. Once data is collected, it needs to be cleaned and transformed into the desired format. Why? Well, real-world data is messy, so there might be inaccurate or missing values that we need to deal with.
To put it simply, imagine you have collected data from a survey. You can be confident that the results obtained from the multiple choice questions don’t need much preprocessing, but things change with the open-ended questions because people can answer whatever they want (sometimes without following a common pattern) and even leave an answer blank.
Real-world data is sometimes as messy as those open-ended questions.
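To make the survey example concrete, here is a minimal sketch of what “refining” a handful of open-ended answers could look like in Python. The question, the answers, and the `CANONICAL` mapping are all made up for illustration; a real survey would need a much larger mapping and probably some manual review.

```python
import re

# Hypothetical raw answers to an open-ended question such as
# "Which country do you live in?" (invented for this example).
raw_answers = ["  USA", "usa", "U.S.A.", "", None, "United States", "canada ", "Canada"]

# Hypothetical mapping from normalized spellings to one canonical label.
CANONICAL = {
    "usa": "United States",
    "united states": "United States",
    "canada": "Canada",
}

def clean_answer(answer):
    """Return a canonical label, or None if the answer is blank or unrecognized."""
    if answer is None:
        return None
    # Lowercase, trim, and drop everything except letters and spaces.
    text = re.sub(r"[^a-z ]", "", answer.strip().lower())
    # Collapse repeated spaces left over after removing punctuation.
    text = re.sub(r"\s+", " ", text).strip()
    if not text:
        return None
    return CANONICAL.get(text)

cleaned = [clean_answer(a) for a in raw_answers]
missing = sum(1 for c in cleaned if c is None)

print(cleaned)
print(f"{missing} answers are missing or unrecognized")
```

Five different spellings collapse into two labels, and the blank answers are marked as missing instead of silently counted as a category.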
This is why raw data isn’t enough. Only after the data is “refined” can we make the most of it by making reports, doing analysis, and creating something valuable.
Oil is a finite resource, while more and more data is created every day
One of the things that makes oil so valuable is scarcity. There might be undiscovered oil reserves out there, but the thing is, oil is a finite resource. One day there won’t be any oil left on this planet and we will have to find other forms of energy.
That doesn’t happen with data.
Not only is there plenty of data held by companies and publicly available on the Internet, but more and more data is being created by people every day. How? Every time you watch a movie on Netflix, buy a product on Amazon, or listen to a song on Spotify, a new data point is created.
These data points are created every second around the world!
Thanks to millions of data points, big tech companies can develop a good recommender system that can predict what movie or song you might like or suggest products to buy based on purchase history.
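As a rough illustration of how purchase history can turn into suggestions, here is a toy sketch based on counting which products are bought together. This is not how any particular company does it; the users, products, and function names are invented, and real recommender systems use far more data and more sophisticated models.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical purchase histories: user -> set of products bought.
purchases = {
    "ana": {"laptop", "mouse", "keyboard"},
    "ben": {"laptop", "mouse"},
    "cat": {"laptop", "keyboard", "monitor"},
    "dan": {"mouse", "mousepad"},
}

def build_co_counts(histories):
    """Count how often each pair of products appears in the same history."""
    co_counts = defaultdict(Counter)
    for items in histories.values():
        for a, b in combinations(sorted(items), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts

def recommend(user_items, co_counts, top_n=2):
    """Suggest products most often bought together with the user's items."""
    scores = Counter()
    for item in user_items:
        for other, count in co_counts.get(item, Counter()).items():
            if other not in user_items:
                scores[other] += count
    # Sort by score (highest first), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

co_counts = build_co_counts(purchases)
suggestions = recommend(purchases["ben"], co_counts)
print(suggestions)
```

With only four users the suggestions are crude, which is the point: the more data points there are, the more reliable these co-purchase counts become.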
In addition to that, unlike oil, data can be reused without losing its quality. One engineer can use a dataset for one purpose, while another can use the same dataset for a completely different purpose.
But if data is infinite, how can it be so valuable? The value is in the eye of the beholder. A dataset of sports statistics can be utterly useless for an e-commerce company but extremely valuable for a professional football club.
Data and oil are not always available to everyone
Yes, data is infinite, but it’s not available to everyone.
Few companies would share data that probably took them years to collect (at least not for free). Something similar happens with data available on websites. The data is there and you could extract it, but it’s often protected by privacy guidelines and terms and conditions. This means that, although you can extract the data, you should think twice about how you use it.
Let’s consider the HiQ and LinkedIn case as an example.
HiQ extracted publicly available data from LinkedIn. LinkedIn invoked the Computer Fraud and Abuse Act (CFAA) in a cease-and-desist letter to HiQ. Although the US Court of Appeals denied LinkedIn’s request, this didn’t grant HiQ the freedom to use the extracted data for commercial purposes.
As you can see, there are ethical issues around data collection and they’re quite different from oil extraction.
Even companies that collect data from their customers can’t use it as they want. As an example, there’s the General Data Protection Regulation (GDPR) that imposes obligations on organizations that collect data related to people in the European Union. Those who violate its privacy and security standards could pay fines that reach tens of millions of euros.
Like oil, data needs to be refined. Otherwise, it can’t really be used and has little value.
Data is everywhere, but it isn’t always available to everyone. You could extract data, but then you should think twice about how you use it.
Unlike oil, data is an infinite resource that people create every day. It can also be reused without losing its quality.