From prototype to production - The journey of a successful data science project (Part 2)

In our previous post we covered everything from defining the objective for a data science project through to deployment of a machine learning model. Now we’ll look at post-production model monitoring through MLOps (Machine Learning Operations) and how to maintain model stability and performance.

Post-production model monitoring through MLOps

Deploying a machine learning model to production is just the first step in the journey of creating a successful ML solution. The work continues, as it is important to continuously monitor the model's performance and make adjustments as necessary. It's vital to ensure that the model provides accurate and reliable predictions, addresses the specific business problem it was designed to solve, and meets the desired outcomes established at the outset.

Monitoring can also help identify issues such as data drift, which can occur when the input data distribution changes over time. Without continuous monitoring, detecting and addressing these issues can be difficult, leading to poor model performance and unreliable results.
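As a minimal sketch of what such a check could look like, the snippet below flags drift in a single feature by comparing its live distribution against a training-time reference using a two-sample Kolmogorov-Smirnov test. The feature values here are synthetic and the 0.05 threshold is an illustrative choice; in practice you would run a check like this per feature, on a schedule.

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_drift(reference: np.ndarray, live: np.ndarray, alpha: float = 0.05) -> bool:
    """Flag drift when the live distribution differs significantly
    from the training-time reference distribution."""
    statistic, p_value = ks_2samp(reference, live)
    return p_value < alpha

rng = np.random.default_rng(42)
reference = rng.normal(loc=0.0, scale=1.0, size=5000)  # training-time feature values
drifted = rng.normal(loc=0.5, scale=1.0, size=5000)    # live values with a shifted mean

print(detect_drift(reference, reference))  # identical distributions: no drift
print(detect_drift(reference, drifted))    # shifted mean: drift flagged
```

Alerting on the drift flag (rather than waiting for downstream metrics to degrade) gives the team time to retrain before users notice a problem.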

A robust MLOps practice allows teams to track model performance, detect issues, and make the necessary adjustments to the model to meet business requirements and deliver the desired outcomes. Here are a few tips that have helped us do this effectively:

1. Set up the pipeline infrastructure to compare models

When developing and deploying machine learning models, establish a pipeline infrastructure that allows for concurrent evaluation of multiple model versions. This approach enables iterative improvements and ensures that ultimately the best-performing model is deployed in production.

There are different methods to evaluate new models, but two proven methods include:

Offline experiments: When it's possible to run an experiment without surfacing its output in production, an offline experiment is preferable. For example, for a classifier where you have access to ground-truth labels, a staging flow is sufficient. Running an offline experiment first ensures that any performance degradation won't impact users; for instance, a major update to the model can run in a staging flow for a few weeks before being promoted to production.

Online A/B test: An online A/B test works well in most cases. By exposing a random group of users to the new version of the model, you get a clear view of its impact relative to the baseline. For example, in a recommendation system where the key metric is user engagement, comparing engagement among users exposed to the new model version with that of users seeing the baseline recommendations indicates whether there is a significant improvement.
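As an illustration, a simple significance check for such an A/B test might compare engagement counts between the two groups with a chi-squared test. The counts below are hypothetical, and in a real test you would also fix the sample size and significance level before starting.

```python
from scipy.stats import chi2_contingency

# Hypothetical engagement counts from an A/B test:
# (engaged, not engaged) for the baseline and the new model version.
baseline = (1200, 8800)    # 12.0% engagement
candidate = (1350, 8650)   # 13.5% engagement

table = [list(baseline), list(candidate)]
chi2, p_value, dof, expected = chi2_contingency(table)

significant = p_value < 0.05
print(f"p-value: {p_value:.4f}, significant: {significant}")
```

If the result is significant, the candidate can be promoted; if not, it is worth checking whether the test simply lacked the sample size to detect the effect you care about.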

2. Avoid overfitting

It is crucial to avoid overfitting when optimising a model, and a reliable comparison framework helps with this. However, a fixed test data set carries a risk of bias: the model may become too specialised at performing well on those specific examples. To mitigate this, incorporate several practices into your comparison strategy: cross-validation, multiple test data sets, a holdout approach, regularisation techniques, and repeated runs where random initialisations are involved. These measures help ensure that the model performs well on a wider range of data rather than being over-optimised for one specific set of examples.
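A comparison framework based on cross-validation could look like the following sketch, which scores two candidate models on the same folds instead of a single fixed test set. The dataset and the two candidate models are placeholders for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for your real training data.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Scoring candidates across the same folds, rather than on one fixed
# test set, reduces the chance of over-optimising for a single split.
for name, model in [
    ("logistic_regression", LogisticRegression(max_iter=1000)),
    ("random_forest", RandomForestClassifier(n_estimators=100, random_state=0)),
]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the spread across folds alongside the mean also makes it obvious when an apparent improvement is within the noise.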

Ensuring model stability and performance

Ensuring the stability of a model's predictions over time is a critical factor that is often neglected in the pursuit of improving its performance. While optimising a model's accuracy and precision is a desirable objective, it is equally crucial to assess its ability to produce consistent predictions over extended periods. Failing to do so can undermine the reliability and usefulness of the model, rendering it inadequate for practical applications.

In other words, it is crucial to ensure that the model's predictions remain consistent and reliable for individual subjects, even as the overall performance improves.

  1. Weigh the cost and the improvement: When considering changes to a model, it is important to weigh the potential improvement in performance against the impact that changes may have on the model's predictions. Consider the costs of managing these changes. Avoid major changes to a model unless the performance improvements justify these costs.
  2. Keep it shallow and simple: Shallow models are generally preferred over deep models, particularly in classification problems. This is because changes in the training dataset are more likely to cause a deep model to update its decision boundary in local areas, which can lead to unpredictable changes in the model's predictions. Only use deep models when the potential performance gains are significant.
  3. Evaluate model performance using multiple metrics: A model may perform well on one metric, such as accuracy, but poorly on others. It is important to consider multiple metrics to evaluate its overall performance.
  4. Consider the cost of errors: In some cases, errors may have a significant impact on the outcome. In healthcare, misdiagnosing a patient can have severe consequences. It is important to consider the cost of errors when designing and evaluating models.
  5. Regularisation: It is also important to check for objective function conditions and regularisation. A poorly conditioned model may have a decision boundary that changes drastically, even when the training conditions only change slightly. Regularisation can help to mitigate this issue and ensure the stability of the model's predictions over time.
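One simple way to quantify the stability concern above is to measure prediction churn: the fraction of subjects whose prediction flips between model versions. A minimal sketch, with illustrative prediction arrays:

```python
import numpy as np

def prediction_churn(old_preds: np.ndarray, new_preds: np.ndarray) -> float:
    """Fraction of subjects whose predicted label changed between model versions."""
    return float(np.mean(old_preds != new_preds))

# Hypothetical class labels for the same eight subjects under each version.
old_preds = np.array([1, 0, 1, 1, 0, 1, 0, 0])
new_preds = np.array([1, 0, 0, 1, 0, 1, 1, 0])
print(prediction_churn(old_preds, new_preds))  # 2 of 8 predictions flipped -> 0.25
```

Tracking churn alongside accuracy makes the trade-off in point 1 explicit: a small accuracy gain that flips a quarter of your predictions may not be worth the disruption.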

In conclusion, taking a strategic approach to machine learning can bridge the widening gap between organisations that effectively utilise data science and those that struggle to do so. This involves identifying key business challenges, setting clear goals and objectives, and building a team with the right skills and expertise to execute data science projects.

However, to truly succeed with machine learning, organisations must not only implement a process but also foster a culture of experimentation and innovation. This requires a willingness to invest in data science as a core component of business strategy and a commitment to continuous improvement.

Summing up: How to deliver a successful data science project

  1. Define the business objective for your machine learning project to ensure it addresses a specific problem and has clear objectives.
  2. Use a data-centric approach to collect and assess relevant data to inform model development and training.
  3. Start with a simple baseline solution that can be easily iterated upon and improved over time with more complex algorithms and additional features.
  4. Develop an end-to-end prototype of the model that includes all the necessary components and can be tested with real-world data.
  5. Test and refine the model to ensure it meets performance requirements and solves the business problem effectively.
  6. Deploy the model into production once it has been thoroughly tested and refined.
  7. Monitor the model's performance post-production using MLOps to ensure it continues to meet business requirements and remains stable over time.
  8. Regularly review and update the model to ensure it performs optimally over time.

By following these steps, you can increase the chances of success for your machine learning project and help ensure it effectively delivers the desired business objectives.


Copyright © 2024 Sonalake Limited, registered in Ireland No. 445927. All rights reserved.