This fall, Harvard will offer a master’s degree in data science for the first time. Data is gushing from everywhere today: TVs, cars, doorbells, pressure valves, and even persimmons and pineapples (hooked up via low-cost RFID tags). But most corporate leaders don’t have time to study their own data; they have quarterly sales targets to meet. What separates the leaders from the laggards when it comes to data, then, is the understanding that this new kind of “black gold” presents greater potential than an R&D breakthrough, thanks to the power of the models it can fuel.
Models made by data scientists are already powering our future. Just this earnings season, the CEOs of Applied Materials, Dropbox (Sequoia is on the board), Electronic Arts, Tencent and Vodafone all extolled the virtues of data science models: equations, rendered in software, that generate predictions and recommendations from historical data. They are among the enlightened businesses that have driven a sixfold increase in open roles for data scientists over the last five years.
We’re surrounded by the benefits and consequences of the 2.5 quintillion bytes of data spewed out around the world every day. Our ability to analyze huge increases in meteorological data has made forecasts twice as accurate as in 2005. The nonprofit organization Thorn uses data models to help combat child sex trafficking, analyzing language from online escort advertisements to alert law enforcement to potential victims. And computer models are now helping predict life expectancy for terminally ill patients, which promises to transform how we approach palliative care.
Models can also help optimize longstanding enterprises. UBS estimates that predictive modeling and AI will generate 4% cost savings for banks by 2020. Research by MIT’s Andrew McAfee and Erik Brynjolfsson found that companies that integrate analytics with their operations are about 6% more profitable and productive than their peers. Those percentages compound.
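To see why a seemingly small edge matters, here is a minimal sketch of the compounding arithmetic. It assumes, purely for illustration, that an edge of about 6% recurs every year and applies to a single profitability index that starts at 100. Neither assumption comes from the research cited above, and the function name `compound_advantage` is hypothetical.

```python
def compound_advantage(annual_edge: float, years: int, base: float = 100.0) -> float:
    """Return the value of `base` after growing by `annual_edge` each year."""
    return base * (1.0 + annual_edge) ** years


# Hypothetical scenario: a 6% edge that repeats every year,
# compared with a peer whose index stays at 100.
for years in (1, 5, 10):
    value = compound_advantage(0.06, years)
    print(f"after {years:>2} years: {value:.1f} versus a peer at 100.0")
```

Under those assumptions the index reaches roughly 134 after five years and roughly 179 after ten, so a single-digit annual edge becomes a gap of nearly 80% within a decade.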
For all their recent progress, predictive models are the continuation of a long trend. Businesses have always tried to better understand market dynamics and maximize profitability in an uncertain world. Life insurance companies employed actuarial tables in the late 17th century (the first table was co-created by Edmond Halley, of Halley’s Comet fame) to set premiums. Quantitative investors have been using models to forecast market movements for decades. And one of the first great breakthroughs in public sanitation, following a widespread cholera outbreak in mid-19th century London, was precipitated by an early application of data science.
If models are so powerful, why isn’t every company using them? To begin with, it’s a hard fight to get access to the best data science talent. The work itself is also taxing. Building a single predictive model, or five, or fifteen, won’t guarantee that a business becomes more efficient or profitable. Nor will hiring a few data scientists. It took more than a webmaster and a website in the late 1990s for a company to be cutting-edge; dotcom founders had to build an entirely new organizational capability. The same holds true for predictive models today. Achieving results takes time, money and a commitment to treat data science with the same level of respect as sales, marketing and engineering.
Data science can also lead to unintended consequences. In August 2012, Knight Capital, a high-frequency trading firm, lost $440 million in less than an hour because of a faulty update to a trading model. Some also question the morality of models. The author and data scientist Cathy O’Neil has argued that models can be biased and discriminatory, automatically denying someone a loan because of their zip code, for example.
Predictive models are, understandably, hard to do well. But companies must make the investment in the short term or face extinction in the long term. Many established companies are rushing to bottle the genie. Boeing is applying data science to improve its massive supply chain. Amazon’s Jeff Bezos wrote in a 2016 letter to shareholders that models drove “demand forecasting, product search ranking, product and deals recommendations, merchandising placements, fraud detection, translations, and much more”: the core Amazon experience, in other words. For Netflix, data science is worth an estimated $1 billion a year and drives 80% of its content consumption. For Uber and Lyft, efficient routing of their fleets has helped them run rings around the taxi companies.
Three years ago, former Cisco Chairman and CEO John Chambers boldly predicted that more than 40% of businesses would disappear over the next decade. The pace of technological change is staggering, and collecting data is easy. Understanding the data is hard. Corporate leaders who fail to build competency in data science risk finding themselves in that 40%.