MLOps 101

Abhijit Singh
5 min read · Mar 7, 2021

MLOps: a set of best practices aimed at automating the ML lifecycle by bringing together ML system development and ML system operations.

ML Development + ML Operations = MLOps

ML Project Lifecycle

A typical ML workflow includes data ingestion, pre-processing, model building and evaluation, and finally deployment. However, this workflow lacks one key aspect: feedback. The primary motivation of any “model monitoring” framework is therefore to create this all-important feedback loop from post-deployment back to the model-building phase. The loop helps the ML system keep improving by informing the decision to either update the model or continue with the existing one. To enable this decision, the framework should track and report model metrics under the two scenarios described below.

  1. Scenario I: The training data (the 70% of the data I mentioned in the case study above) is available. The framework computes the model metrics on both the training data and the production (inference) data post-deployment, then compares the two to make a decision.
  2. Scenario II: The training data is not available. The framework computes the model metrics using only the data that is available post-deployment.
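
The two scenarios can be sketched as a single decision function. This is a minimal illustration, not a definitive monitoring framework: accuracy stands in for "the said model metrics", the names `accuracy` and `should_update_model` are hypothetical, and the thresholds `max_drop` and `min_metric` are assumed values you would tune for your own model. It also assumes ground-truth labels eventually become available for production data.

```python
from statistics import mean
from typing import Optional, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(1.0 if t == p else 0.0 for t, p in zip(y_true, y_pred))


def should_update_model(
    prod_metric: float,
    train_metric: Optional[float] = None,
    max_drop: float = 0.05,    # assumed tolerance for Scenario I
    min_metric: float = 0.80,  # assumed absolute floor for Scenario II
) -> bool:
    """Return True if the monitored metric suggests updating the model."""
    if train_metric is not None:
        # Scenario I: training data is available, so compare the
        # production metric against the training metric.
        return (train_metric - prod_metric) > max_drop
    # Scenario II: no training data, so judge the production metric
    # on its own against a fixed floor.
    return prod_metric < min_metric


# Example: labels and predictions collected after deployment.
prod_acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
print(should_update_model(prod_acc, train_metric=0.90))  # Scenario I
print(should_update_model(prod_acc))                     # Scenario II
```

In practice the comparison in Scenario I is often made on data distributions as well as on performance metrics, but the shape of the decision (update or keep the existing model) stays the same.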

Why use MLOps?

As you move from running individual artificial intelligence and machine learning (AI/ML) projects to using AI/ML to transform your business at scale, the discipline of ML Operations (MLOps) can help. MLOps is a methodology built on applying DevOps practices to machine learning workloads. It accounts for the unique aspects of AI/ML projects in project management, CI/CD, and quality assurance, helping teams improve delivery time, reduce defects, and make data scientists more productive.

Like DevOps, MLOps relies on a collaborative and streamlined approach to the machine learning development lifecycle, in which the intersection of people, process, and technology is needed to optimize the end-to-end activities involved in developing, building, and operating machine learning workloads.
