Data are Oompa Loompas
In 2006, British mathematician and entrepreneur Clive Humby coined the phrase “Data is the new oil”. This analogy has often been repeated since, including by me. It’s not an analogy that stands up to a lot of scrutiny though. Sure, as with oil there’s big money involved, and controlling data gives you power similar to controlling oil, so it is worth putting thought and effort into. There is definitely a whole new swathe of business created in the ecosystem of ‘processing data’, be it curating training data sets, helping people annotate data, or simply storing and organising it. And what data and oil do have in common is that developing data-driven AI can be energy-intensive. That is why we consider the impact of our technology on the planet. At BlueSkeye we specialise in creating energy-efficient, low-compute AI. We also look to the sustainability of our data hosts, using modern data centres that are twice as efficient as a typical enterprise data server. We aim to be carbon-negative by the end of the decade. However, if you start looking even a little bit closer, you’ll see a number of big discrepancies between oil and data:
Oil is used up in every process it’s involved in. Be it creating a plastic toy, powering a jet engine, or being turned into paint, once processed it’s gone, its value transferred to the new thing that was created with it. Data, on the other hand, can be used and re-used infinitely for AI, and often its value grows over time - when you come up with a better ML architecture, you can reuse the old data and get better results. Data is often linked to other (meta) data in its use, for example by annotation or correlation analysis, and in doing so the value of such data points goes up. Delft University’s main library says in big letters that knowledge is the only resource that grows in its use. I think we can now add AI Data to that.
Data is infinitely copiable and the generation of data is never-ending. In fact, data can’t legally be owned (see GDPR on data ownership); instead, companies and individuals have data processing rights. Oil is of course not infinitely copiable. In fact, it is a scarce resource. This means that ownership is much more straightforward, as is its trade. Because data is so easily copied and the rules on its ownership are weak, keeping it secure and hidden is important. Which brings us to visibility:
Ever seen an oil tanker? Big, huh? Same with those storage silos near (air)ports. Can’t miss 'em. All the infrastructure is gigantic. Data, on the other hand, is invisible. This allows secrecy - the amount of data that you can store on a thumb drive is insane. This is a very useful property for companies.
Besides issues around ownership and processing rights, there are many other aspects of data that make it hard to trade. In particular, issues around personal data severely restrict or slow down all business that involves data. Legal costs stack up very quickly. Oil, on the other hand, is very straightforward - a barrel of oil is pretty much tradable anywhere in the world in the same way.
So what is Data if not oil? Well, to keep things light-hearted at what for many people is the end of a well-deserved break from work, I think you could do worse than comparing Data with Willy Wonka’s Oompa Loompas. Like Data, the Oompa Loompas™ are Willy Wonka™’s secret to creating wonderful products such as Ice Creams that Never Melt and Square Sweets that Look Round. As people are finding out, it is becoming ever easier to develop AI systems. They are hard to patent, and patent infringement is hard to detect. Data is your way of staying ahead of the curve, much as Oompa Loompas were Willy Wonka’s way of staving off his competition. Oompa Loompas need to be taken good care of, and you can’t just trade them - they’re intelligent, emotional creatures after all (data, by the way, is neither intelligent nor emotional - I wasn’t going for a perfect analogy). I suspect Oompa Loompas get better at what they do all the time, so they too grow in value in their use. Oompa Loompas can be effectively hidden, just like data. Perhaps the one way in which they differ is that I don’t think they can be arbitrarily copied. That said, we never did find out how Oompa Loompas are made, so…
Call to action: what do you think is a better analogy for data than either Oil or Oompa Loompas?
A very happy 2024 to everyone, and remember to see the lighter side of things from time to time.
Photo generated by Bing
Written by BlueSkeye Founding CEO, Prof Michel Valstar
See original post on LinkedIn HERE