Data Science — Why Is Something So Obviously Good But So Difficult To Scale?



It should not be complex, complicated or even costly to scale. For some industries, it just needs a different approach.


Many of us have already learnt about the enormous potential of data science and the value it can bring to organizations. However, most companies are still scratching the surface and have not yet captured that potential. Data science has certainly transformed consumer tech companies such as Google, Facebook and Amazon, which have amassed hundreds of millions of users and a massive amount of data. Industries such as manufacturing, oil and gas, and healthcare still have gaps to fill before this capability works for them. And here is the problem: the playbook that those tech companies use to build their data capabilities, a highly centralized, one-size-fits-all approach to developing products or services for a mass audience, won’t work for other industries.

Getting data science right need not be complex or complicated; you just need to find the right approach for your industry

These legacy industries often require many bespoke solutions that can be adapted to their diverse use cases. This doesn’t mean that data science is a holy grail they can never get hold of. It just means that they need to adopt a different approach, and a different approach is not necessarily a more expensive one.

I have previously written a few other articles on the need to adopt a differentiated approach to data science for non-tech companies. You may read more here

Why is adopting data science in a non-tech company so hard?

The value of data is obvious, and there is an insatiable appetite among many commercial leaders to build those capabilities. Below are some of the top challenges faced by many non-tech companies:

  • Limited Dataset: Think of Netflix, with millions of subscribers generating millions of data points, or the Amazon platform, with billions of transactions every month. In other industries, the dataset is much smaller. For example, a pharma company probably records a few hundred to a few thousand sales transactions a month, and a model meant to detect a rare disease may have a sample of only around 100 diagnoses. Techniques built for millions of data points will not work when there are only hundreds or thousands.

You can still reap a whole lot of value from data, even if you do not have the infrastructure put in place yet

  • Cost: Consumer tech companies convene hundreds of skilled data professionals to build products or services that account for a large part of their businesses. The Facebook marketing system, for example, generates more than USD 10 billion in revenue per year. In a non-tech industry, many projects are worth single-digit millions of dollars, and each requires a custom solution. For example, each factory manufacturing a different product requires a different set of systems and processes that cannot simply be standardized. It might not make economic sense to hire a dedicated group of data professionals for projects of this size.

Data models built in your local market will be reusable when it comes to scaling. Do not kill off any projects in the name of scale without understanding the details

  • Communication Gap: Data science, like digital transformation, is an elusive term, and even many data professionals do not share a common understanding of it. For an enterprise to be data-centric, we need to minimize the gap in communication and understanding between its various functions. The gap is smaller in consumer tech companies because the core revenue-generating business is technology, but it becomes wider in other industries.

Naturally, IT colleagues and marketers speak different languages

  • Scaling Local Successes: Even when specific use cases have been developed in one country or one business unit, it takes a huge amount of effort and collaboration across many more stakeholders to scale that success. This problem is exacerbated by the shortage of data talent in this space.

You might be interested in another article on how to build a data science team in the commercial function.

So what can we do about it?

Create a common understanding of what your organization means when it comes to data

Start by asking: what does data-driven mean to your organization? Senior leaders can read about use cases and learn from other companies or industries, but they should seek to understand the broader context behind them. The same use case might require a different approach. For example, consumer tech companies can reap huge success by developing a recommendation system from a centralized team, but similar recommendation systems in other industries might need a little more customization, and more decentralization of those capabilities, to be successful. So while it is good to learn best practices from other industries, it is also important to evaluate and evolve those practices to fit your own context.


Make the use case successful and repeatable. Cost can always be optimized at later stages, given how much advancement there is in the tech space.

Improve data literacy and make people understand how they can tangibly contribute to building that data-driven organization

Data science is a team sport, and it requires more than a few smart cookies running the show. In the past decade, a lot of research in data science was driven by model-centric development, in which teams try to optimize their models to mine insights from a fixed set of data. This has benefitted consumer tech companies that have a rich source of data.


For companies with a limited dataset, it is very important to ensure that reasonably good data quality and sound business assumptions feed into the model. This is the area where the larger part of the organization can play a significant part in the democratization of data. We could build simple processes or tools that let other stakeholders help engineer the data, especially at a time of great data talent shortages. There are already modules and workflows available for adding human review to machine learning predictions. One example is Amazon A2I (Amazon Augmented AI).
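As a rough illustration of the human-review idea above, here is a minimal sketch in Python. It does not use the Amazon A2I API. The `Prediction` class, the `route_predictions` function, the sample records and the 0.8 threshold are all hypothetical, chosen only for this example. The idea is simply that predictions the model is unsure about are queued for a colleague to review, while confident ones pass straight through.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    record_id: str
    label: str
    confidence: float  # model's confidence in the label, between 0 and 1


def route_predictions(predictions, threshold=0.8):
    """Split predictions into auto-accepted ones and ones needing human review."""
    auto_accepted, needs_review = [], []
    for p in predictions:
        if p.confidence >= threshold:
            auto_accepted.append(p)
        else:
            needs_review.append(p)
    return auto_accepted, needs_review


# Made-up example data, for illustration only.
sample = [
    Prediction("order-1", "likely_reorder", 0.95),
    Prediction("order-2", "likely_reorder", 0.55),
    Prediction("order-3", "unlikely_reorder", 0.81),
]
accepted, review_queue = route_predictions(sample)
print("Auto-accepted:", [p.record_id for p in accepted])
print("Needs human review:", [p.record_id for p in review_queue])
```

In practice the threshold would be tuned with the business stakeholders who do the reviewing, since they carry the cost of both missed errors and unnecessary reviews.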


Involving other stakeholders early in the process would also improve data literacy and increase the odds of success in harnessing the value of data.

Data is the new electricity

Data science can thrive in non-tech companies, but most are still scratching the surface. With rapid digitization and exponential growth of data, the ability to harness value from data is a major differentiator from your competitors.

Every industrial revolution produces winners and a disproportionate number of losers. Don’t be left behind in this data revolution.

The opinions expressed in this article are my own and I do not represent or speak on behalf of any organization. If you enjoyed reading this and would like to have a conversation on this topic, please feel free to reach out to me on LinkedIn. Also, stop by to visit some of the videos around digital and data.
