Why machine learning models fail

by Joseph K. Clark

Machine learning is quickly becoming an essential tool for automation, but failing models and insufficient background knowledge can create more problems than they solve.

“I think to build a good machine learning model… if you’re trying to do it repeatedly, you need great talent and an outstanding research process. Then finally, you need technology and tooling that’s up-to-date and modern,” said Matthew Granade, co-founder of machine learning platform provider Domino Data Lab. He explained that all three elements must work together to produce the best possible model. However, Granade placed particular emphasis on the second. “The research process determines how you’re going to identify problems to work on, find data, work with other parts of the business, test your results, and deliver those results to the business,” he explained.

According to Granade, many organizations end up with failing models because they lack that combination. “Companies have high expectations for what data science can do, but they’re struggling to bring those three different ingredients together,” he said. According to a study conducted by Domino Data Lab, 97% of those surveyed say data science is crucial to long-term success. However, many say their organizations lack the staff, skills, and tools to sustain that success. This raises the question: why are organizations investing so much in machine learning models but failing to invest in what would make those models succeed?

Granade traces this problem back to the tendency to look for shortcuts. “I think the mistake a lot of companies make is that they kind of look for a quick fix,” he began. “They look for a point solution or this idea of ‘I’m going to hire three or four smart PhDs, and that’s going to solve my problem.’” According to Granade, these quick fixes never work long-term because the issues run deeper. Strong talent will always be essential, but it cannot succeed in isolation. Without sound processes and technology to back it up, data science efforts become futile.

Domino Data Lab’s study also revealed that 82% of executives polled said leadership needs to be concerned about bad or failing models, because the consequences of those models could be severe. “Those models could lead to bad decisions that produce lost revenue, it could lead to bad key performance indicators, and security risks,” Granade explained.


Granade predicts that companies that find themselves behind the curve on data science and machine learning practices will quickly correct their mistakes. Organizations that have tried and failed to implement this kind of technology will keep an eye on those that have succeeded and take tips where they can get them. In most industries, failing to adopt these practices isn’t an option, since it will inevitably leave a company behind its competitors. Granade returns to a comprehensive approach as the key to remedying the mistakes he has seen. “I think you can say, ‘We’re going to invest as a company to build out this capability holistically. We’re going to hire the right people, we’re going to put a data science process in place, and the right tooling to support that process and those people,’ and I think if you do that, you can see great results,” he said.

Jason Knight, co-founder and CPO at OctoML, believes that another key to building a successful machine learning model is a firm understanding of the data you’re working with. “You can think you have the right data, but because of underlying issues with how it’s collected or annotated or generated in the first place, it can create problems where you can’t generate a model out of it,” he explained. When the underlying data is flawed, the model will not work as intended, no matter what technique an organization uses. This is why it is so important not to skip steps when working with this kind of technology; assuming that the source data will work without adequately understanding its details will cause problems down the line.

Vaibhav Nivargi, founder and CTO of the cloud-based AI platform Moveworks, also emphasized that good data is essential to creating a successful model. “It requires everything from the right data to represent the real world, to the right understanding of this data for a given domain, to the right algorithm for making predictions,” he said. Combining these elements helps ensure that the data used to create the model is dependable and that the model produces the desired results.

OctoML’s Knight also said that while some organizations have not seen the success they initially expected from data science and machine learning, he thinks the future is bright. “In terms of the future, I remain optimistic that people are pushing forward the improvements needed to give solutions for the problems we have seen,” he said.
