Romain Avouac (Insee), Thomas Faria (Insee), Tom Seimandi (Insee)
Difficulty of transitioning from experiments to production-grade machine learning systems
Leverage best practices from software engineering


Reproducibility
Versioning
Automation
Monitoring
Collaboration
Multiple frameworks implement the MLOps principles
Pros of MLflow
1️⃣ Introduction to MLflow
2️⃣ A Practical Example: NACE Code Prediction for French companies
3️⃣ Deploying a ML model as an API
4️⃣ Distributing the hyperparameter optimization
5️⃣ Maintenance of a model in production
Preparation of the working environment
It is assumed that you have a GitHub account and have already created a token. Fork the training repository by clicking here.
Create an account on the SSP Cloud using your professional email address
Launch a MLflow service by clicking this URL
Launch a VSCode-python service by clicking this URL
Open the VSCode-python service and input the service password
In VSCode, open a terminal and clone your forked repository (modify the first two lines):
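For example — a sketch assuming your fork keeps the repository name formation-mlops:

```sh
# Modify the first two lines: your GitHub username (and repo name if you changed it)
GIT_USERNAME=<your_github_username>
GIT_REPO=formation-mlops
git clone https://github.com/${GIT_USERNAME}/${GIT_REPO}.git
cd ${GIT_REPO}
```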
Install the necessary packages for the training (with uv):
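For example, a minimal sketch assuming the dependencies are listed in a requirements.txt file:

```sh
uv pip install -r requirements.txt
```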
You’re all set!




Introduction to MLflow concepts
In VSCode, open the notebook located at formation-mlops/notebooks/mlflow-introduction.ipynb
Open the MLflow UI and try to build your own experiments from the example code provided in the notebook. For example, try to add other hyperparameters in the grid search process.
MLflow simplifies the tracking of model training
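As a reminder, the core tracking API used in the notebook looks like this — parameter and metric names are illustrative:

```python
import mlflow

# Everything logged inside the run is recorded by the tracking server
with mlflow.start_run():
    mlflow.log_param("dim", 100)         # a hyperparameter of the run
    mlflow.log_metric("accuracy", 0.87)  # an evaluation metric of the run
```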
NACE
At Insee, previously handled by an outdated rule-based algorithm
A problem common to many national statistical institutes
“Bag of n-grams model”: embeddings for words, but also for n-grams of words and characters
Very simple and fast model
OVA: One vs. All
Part 1: Using a custom model
The scripts are located in the src folder. Check them out. In particular, the train.py script is responsible for training the model. What are the main differences compared to application 1?
Why can we say that the MLflow model integrates the preprocessing?
Part 2: From notebooks to a package-like project
The train.py script is also responsible for logging experiments in MLflow. Note how the parameters of each experiment are passed to the training function when the script is called.
To make the model training procedure more reproducible, MLflow provides the mlflow run command. The MLproject file specifies the command and the parameters that will be passed to it. Inspect this file.
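For reference, an MLproject file follows this general shape — a generic sketch, not necessarily this project's exact content:

```yaml
name: nace-prediction
entry_points:
  main:
    parameters:
      dim: {type: float, default: 100}
    command: "python src/train.py {dim}"
```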
Run a model training using MLflow. To do this, open a terminal (-> Terminal -> New Terminal) and execute the following command:
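A typical invocation from the project root — the flags here are assumptions that may differ in your setup:

```sh
mlflow run . --env-manager=local
```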
In the MLflow interface, examine the results of your previous run:
Experiments -> nace-prediction -> <run_name>
You trained the model with certain default parameters. In the MLproject file, check the available parameters. Retrain a model with different parameters (e.g., dim = 25).
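For example, overriding a parameter declared in MLproject with the -P flag:

```sh
mlflow run . --env-manager=local -P dim=25
```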
In MLflow, compare the 2 models by plotting the accuracy against the parameter you changed (i.e., dim)
Select the 2 runs -> Compare -> Scatter Plot -> Select your X and Y axis
The custom class (src/fasttext_wrapper.py) wraps the fasttext model to make it easily queryable from Python.
Part 3: Querying the locally trained model
Create a script predict_mlflow.py in the src folder of the project (a sketch follows below). This script should:
load the registered fasttext model;
make a prediction on a list of activity descriptions (e.g., ["vendeur d'huitres", "boulanger"]).
💡 Don’t forget to read the documentation of the predict() function from the custom class (src/fasttext_wrapper.py) to understand the expected input format!
Run your predict_mlflow.py script.
Check that the predictions are identical for "COIFFEUR" and "coiffeur, & 98789".
Increase the k parameter and try to understand how the output structure changes accordingly.
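A minimal sketch of what such a script can look like — the registered model name and version are assumptions, adjust them to yours:

```python
import mlflow

MODEL_NAME = "fasttext"  # assumed registry name
VERSION = 1              # assumed version

# Load the model from the MLflow model registry
model = mlflow.pyfunc.load_model(f"models:/{MODEL_NAME}/{VERSION}")

# Query it on a small list of activity descriptions; the exact expected
# input format (and the k parameter) is documented in src/fasttext_wrapper.py
list_libs = ["vendeur d'huitres", "boulanger"]
results = model.predict(list_libs)
print(results)
```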
MLflow is versatile
Parameterized, reproducible training runs (MLproject)
Why expose a model as an API?
Simplicity: single entry point that hides the underlying complexity of the model
Standardization: HTTP requests -> agnostic to the programming language used
Scalability: adapts to the load of concurrent requests
Modularity: separation of model management and its availability
Container: self-contained and isolated environment that encapsulates the model, its dependencies and the API code
Advantages: portability, isolation, scalability
Technical prerequisites for deploying on Kubernetes
Part 1: Exposing a ML model locally as an API
The underlying files of the API are located in the app folder. Check them.
Deploy the API locally from a terminal in VSCode (see the command sketched below).
Open your API in a browser by appending /docs to your URL.
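With a FastAPI app defined in app/main.py, a typical local launch looks like this sketch — host, port and flags are assumptions:

```sh
uvicorn app.main:app --host 0.0.0.0 --port 8000
```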
Part 2: Deploying manually a machine-learning model as an API
Open the Dockerfile to see how the image is built. The image is automatically rebuilt and published via GitHub Actions; if interested, have a look at .github/workflows/build_image.yml. For this training, we will all use this same image.
Open the kubernetes/deployment.yml file and modify the highlighted lines accordingly:
deployment.yml
Open the kubernetes/ingress.yml file and modify (two times) the URL of the API endpoint to be of the form <your_firstname>-<your_lastname>-api.lab.sspcloud.fr
In a terminal, apply the Kubernetes contracts contained in the kubernetes/ folder to deploy the API (see the command sketched below)
Reach your API using the URL defined in your ingress.yml file
Update the MLFLOW_MODEL_NAME or MLFLOW_MODEL_VERSION (if you didn’t modify the model name) environment variable in the deployment.yml file
Apply the new Kubernetes contracts to update the API
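A sketch of the apply step, assuming kubectl is configured for your namespace:

```sh
kubectl apply -f kubernetes/
```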
Part 3: Continuous deployment of a ML model as an API
⚠️ The previous applications must have been created with the Git option to be able to follow this one.
Previously, you deployed your model manually. Thanks to ArgoCD, it is possible to deploy a model continuously. This means that every modification of a file in the kubernetes/ folder will automatically trigger redeployment, synchronized with your GitHub repository. To convince yourself, follow the steps below:
Launch an ArgoCD service by clicking on this URL. Open the service, enter the username (admin) and the service’s password.
Open the file argocd/template-argocd.yml and modify the highlighted lines:
template-argocd.yml
In ArgoCD, click on New App and then Edit as a YAML. Copy and paste the content of argocd/template-argocd.yml, and click on Create.
Reach your API using the URL defined in your ingress.yml file
Display the documentation of your API by appending /docs to your URL
Update the MLFLOW_MODEL_NAME or MLFLOW_MODEL_VERSION (if you didn’t modify the model name) environment variable in the deployment.yml file
Wait for ArgoCD to automatically synchronize the changes from your GitHub repository, or force synchronization. Refresh your API and check on the homepage that it is now based on the new version of the model.
Part 4: Querying your deployed model
Create a script predict_api.py. This script should:
load the Parquet file of descriptions to classify;
send a request to your API for each description;
display the prediction results.
predict_api.py
```python
import pandas as pd
import requests


# Function to make a request to the API
def make_prediction(api_url: str, description: str):
    params = {"description": description, "nb_echoes_max": 2}
    response = requests.get(api_url, params=params)
    return response.json()


# Data URL
data_path = "https://minio.lab.sspcloud.fr/projet-formation/diffusion/mlops/data/data_to_classify.parquet"

# Load the Parquet file into a pandas DataFrame
df = pd.read_parquet(data_path)

# API URL
api_url = "https://<your_firstname>-<your_lastname>-api.lab.sspcloud.fr/predict"

# Make the requests
responses = df["text"].apply(lambda x: make_prediction(api_url, x))

# Display the DataFrame with prediction results
print(pd.merge(df, pd.json_normalize(responses),
               left_index=True,
               right_index=True))
```
Run your predict_api.py script.
In ArgoCD, open your application and click on your pod that should start with "codification-api-...". Observe the logs.
What information do you have? Is it sufficient?
Important
We performed a series of GET requests here as we have a single entry point to our API. To perform batch queries, it is preferable to use POST requests.
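As an illustration only — this assumes a hypothetical /predict-batch endpoint accepting a JSON payload, which this training's API does not actually expose:

```python
import requests

# Hypothetical batch endpoint and payload shape
api_url = "https://<your_firstname>-<your_lastname>-api.lab.sspcloud.fr/predict-batch"
payload = {"descriptions": ["vendeur d'huitres", "boulanger"], "nb_echoes_max": 2}

# A single POST request carries the whole batch
response = requests.post(api_url, json=payload)
print(response.json())
```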
➡️ Communication between teams is essential to monitor the model in production
⚠️ The term monitoring of an application/model has different definitions depending on the team.
Part 1: Logging business metrics
Add logs to your API in the app/main.py file:
main.py
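A minimal sketch of adding logs to a FastAPI endpoint — the endpoint signature and the prediction logic are placeholders, not the project's actual code:

```python
import logging

from fastapi import FastAPI

# Configure a basic logger (the project may use a more elaborate setup)
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = FastAPI()


@app.get("/predict")
async def predict(description: str, nb_echoes_max: int = 2):
    # Log the incoming query so predictions can be monitored later
    logger.info("Query: %s", description)
    response = {"prediction_1": "placeholder", "proba_1": 0.0}
    logger.info("Response: %s", response)
    return response
```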
Commit your changes and push them to your remote repository.
Whenever you make a change to your API, it needs to be redeployed for the changes to take effect. In theory, we would have to rebuild a new image for our API containing the latest adjustments. To simplify, we have already built two images, with and without logs in the API. Until now you have used the image without logs; now redeploy your API using the image with logs, tagged logs.
In the kubernetes/deployment.yml file, replace the no-logs tag with the logs tag:
deployment.yml
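The relevant fragment might look as follows — the image name is a placeholder, only the tag changes:

```yaml
containers:
  - name: api
    image: <registry>/<api-image>:logs   # previously <registry>/<api-image>:no-logs
```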
Commit your changes and push them to your remote repository.
Wait 5 minutes for ArgoCD to automatically synchronize the changes from your GitHub repository, or force synchronization.
Run your predict-api.py script.
In ArgoCD, open your application and click on your pod that should start with "codification-api-...". Observe the logs.
Note that the logs are also collected and stored as .parquet files.
Part 2: Creating a monitoring dashboard
We will use Quarto Dashboards. Open the dashboard/index.qmd file and inspect the code. To retrieve the data needed to create the dashboard, we use a serverless DBMS: DuckDB. DuckDB allows us to run SQL queries on a .parquet file containing parsed logs. This file contains one row per prediction, with the variables timestamp, text, prediction_1, proba_1, prediction_2, and proba_2.
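To make the mechanism concrete, here is a minimal sketch of querying such a file with DuckDB from Python — the file path is a placeholder:

```python
import duckdb

logs_path = "logs.parquet"  # placeholder: point this at the parsed logs file

# Average probability of the top prediction, computed directly on the Parquet file
result = duckdb.sql(
    f"SELECT AVG(proba_1) AS avg_proba FROM read_parquet('{logs_path}')"
).df()
print(result)
```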
To visualize the dashboard, enter the following commands in a Terminal from the project root and click on the generated link.
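A typical preview invocation, assuming Quarto is installed in the environment:

```sh
quarto preview dashboard/index.qmd
```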
Currently, the percentage of predictions with a probability greater than 0.8 does not correspond to reality. Modify the SQL query used to obtain the pct_predictions variable so that it displays the correct value.
Similarly, modify the SQL query used to obtain the daily_stats variable so that it displays the correct charts.
Argo Workflows: a container-native workflow engine for orchestrating parallel jobs on Kubernetes

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow                  # new type of k8s spec
metadata:
  generateName: hello-world-    # name of the workflow spec
spec:
  entrypoint: whalesay          # invoke the whalesay template
  templates:
    - name: whalesay            # name of the template
      container:
        image: docker/whalesay
        command: [cowsay]
        args: ["hello world"]
```


```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-parameters-
spec:
  entrypoint: whalesay
  arguments:
    parameters:
      - name: message
        value: hello world
  templates:
    - name: whalesay
      inputs:
        parameters:
          - name: message       # parameter declaration
      container:
        image: docker/whalesay
        command: [cowsay]
        args: ["{{inputs.parameters.message}}"]
```
Multi-step workflows can be specified (steps or dag):
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: steps-
spec:
  entrypoint: hello-hello-hello
  # This spec contains two templates: hello-hello-hello and whalesay
  templates:
    - name: hello-hello-hello
      # Instead of just running a container,
      # this template has a sequence of steps
      steps:
        - - name: hello1        # hello1 is run before the following steps
            template: whalesay
        - - name: hello2a       # double dash => run after previous step
            template: whalesay
          - name: hello2b       # single dash => run in parallel with previous step
            template: whalesay
    - name: whalesay            # name of the template
      container:
        image: docker/whalesay
        command: [cowsay]
        args: ["hello world"]
```





Part 1: Introduction to Argo Workflows
Launch an Argo Workflows service by clicking this URL. Open the service and input the service password (either automatically copied or available in the README of the service).
In VSCode, create a file hello_world.yaml at the root of the project with the following content:
hello_world.yaml
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-
  labels:
    workflows.argoproj.io/archive-strategy: "false"
  annotations:
    workflows.argoproj.io/description: |
      This is a simple hello world example.
      You can also run it in Python: https://couler-proj.github.io/couler/examples/#hello-world
spec:
  entrypoint: whalesay
  templates:
    - name: whalesay
      container:
        image: docker/whalesay:latest
        command: [cowsay]
        args: ["hello world"]
```
Submit the Hello world workflow via a terminal in VSCode:
Open the UI of Argo Workflows. Find the logs of the workflow you just launched. You should see the Docker logo.
Part 2: Distributing the hyperparameters optimization
Open the argo_workflows/workflow.yml file. What do you expect will happen when we submit this workflow?
workflow.yml
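The file itself is not reproduced here. As an illustration of the pattern it relies on, a distributed grid search in Argo Workflows can fan out one pod per hyperparameter value with withItems — in this sketch the image, command and parameter names are all assumptions:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: parallel-training-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: train-with-params
            template: train
            arguments:
              parameters:
                - name: dim
                  value: "{{item.dim}}"
            # Fan out: one pod per listed hyperparameter value
            withItems:
              - { dim: 25 }
              - { dim: 100 }
              - { dim: 300 }
    - name: train
      inputs:
        parameters:
          - name: dim
      container:
        image: <training-image>   # hypothetical image containing the project code
        command: [sh, -c]
        args: ["mlflow run . --env-manager=local -P dim={{inputs.parameters.dim}}"]
```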
Submit the workflow and, once it has completed, open the MLflow UI to check what has been done.