By Michael Bitar
Welcome to this getting started guide on deploying and using MLflow locally. While this guide was created on macOS, the concepts translate directly to Windows and Linux.
If you are an MLOps or Machine Learning engineer, you likely spend a vast amount of time experimenting with different parameters, hyperparameters, and preprocessing strategies on your local workstation. Historically, tracking these experiment iterations often involved manual spreadsheets—a tedious, error-prone, and unscalable approach.
The machine learning ecosystem desperately needed a toolchain to bring DevOps-like standard practices to model development, giving rise to MLOps. Among these tools, MLflow stands out as a leading open-source framework designed to organize machine learning development workflows.
MLflow helps orchestrate the ML lifecycle across four core capabilities:
| Capability | Description |
|---|---|
| Experiment Tracking | Log, compare, and visualize runs, metrics, and parameters |
| Reproducible Packaging | Bundle code into reproducible formats for consistent execution |
| Model Versioning | Version model artifacts and metadata in a centralized registry |
| Collaboration & Deployment | Share models and facilitate easy hand-off to production |
While MLflow supports robust centralized, cloud-hosted configurations, you might not always need a remote tracking server. Deploying MLflow locally on your workstation is an incredibly lightweight way to begin. It allows you to reliably log metrics, compare parameters, and save artifacts—guaranteeing you never have to guess which hyperparameter combination produced your best-performing model.
This guide will walk you through a step-by-step local implementation on a laptop. Once you've mastered the basics, you'll be well-prepared to scale into advanced, cloud-hosted deployments for broader team collaboration.
Here is a high-level overview of the local machine learning pipeline we will be executing:
```mermaid
flowchart TD
    subgraph Data Preparation
        D1[(Wine Quality Dataset)] --> D2[Load CSV via Pandas]
        D2 --> D3[Split Train & Test Data]
    end
    subgraph Model Training
        T1[Define Hyperparameters<br/><code>alpha, l1_ratio</code>] --> T2{Start MLflow Run}
        D3 --> T2
        T2 --> T3[Train ElasticNet Model]
        T3 --> T4[Predict on Test Set]
        T4 --> T5[Evaluate: RMSE, MAE, R2]
    end
    subgraph MLflow Tracking
        T5 --> M1(Log Parameters)
        M1 --> M2(Log Performance Metrics)
        M2 --> M3(Log Model Artifact)
    end
    subgraph UI & Analysis
        M3 --> U1[[Launch MLflow UI]]
        U1 --> U2{Analyze Results}
        U2 -->|Tune Params| T1
        U2 -->|Select Best| U3[Register Model]
    end
```
| Requirement | Version | Purpose |
|---|---|---|
| mlflow | 2.x | Experiment tracking and model registry |
| pandas | latest | Data loading and manipulation |
| numpy | latest | Numerical operations |
| scikit-learn | latest | Model training and evaluation metrics |
You can use the code snippets in this guide within a Jupyter Notebook or a standard Python script. First, ensure you have an active internet connection to download the required packages.
```python
# Install mlflow and other necessary packages
!pip install mlflow pandas numpy scikit-learn
```

With our environment ready, let's import the necessary libraries. We'll use pandas for data manipulation, scikit-learn for building our model and calculating metrics, and mlflow to track our experiment runs.
```python
import os
import warnings
import sys
import logging
from urllib.parse import urlparse

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet

import mlflow
import mlflow.sklearn

logging.basicConfig(level=logging.WARN)
logger = logging.getLogger(__name__)
```

To measure our model's performance, we define a helper function that evaluates test predictions against actual data, outputting three standard regression metrics:
| Metric | Symbol | Description |
|---|---|---|
| Root Mean Squared Error | RMSE | Penalizes large errors more than small ones |
| Mean Absolute Error | MAE | Average magnitude of prediction errors |
| R² Score | R2 | Proportion of variance explained by the model |
```python
def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2
```

Next, we download an open-source Wine Quality dataset. We'll then split this data into training sets (for model learning) and test sets (for model evaluation).
```python
warnings.filterwarnings("ignore")
np.random.seed(40)

csv_url = (
    "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
)
try:
    data = pd.read_csv(csv_url, sep=";")
except Exception as e:
    logger.exception(
        "Unable to download training & test CSV, check your internet connection. Error: %s", e
    )

# Split into training and test sets using a (0.75, 0.25) split
train, test = train_test_split(data)

# The predicted column is "quality", an integer score from 3 to 9
train_x = train.drop(["quality"], axis=1)
test_x = test.drop(["quality"], axis=1)
train_y = train[["quality"]]
test_y = test[["quality"]]
```

Now comes the core of the MLOps process. We will train an ElasticNet model, a linear regression model combining L1 and L2 regularization.
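To build intuition for how those two penalties behave before we wire in MLflow, here is a minimal standalone sketch on synthetic data (the feature values and coefficients below are invented for illustration). With `l1_ratio=1.0` the penalty is pure L1 (Lasso), which tends to drive irrelevant coefficients toward exactly zero:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first two features actually drive the target.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# l1_ratio=1.0 makes the ElasticNet penalty pure L1 (Lasso):
# the three irrelevant coefficients are shrunk toward zero.
model = ElasticNet(alpha=0.1, l1_ratio=1.0, random_state=42)
model.fit(X, y)
print(model.coef_)
```

Lowering `l1_ratio` toward 0 blends in the L2 (Ridge) penalty, which shrinks coefficients smoothly instead of zeroing them out.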
Notice the `with mlflow.start_run():` context block below. By executing our training within this block, MLflow will automatically monitor the run. We use `log_param` for hyperparameters and `log_metric` for our evaluation scores.
Tip: Try running the cell below multiple times with different values between 0 and 1 for `alpha` and `l1_ratio`. Every execution will automatically be logged to your local MLflow tracking database.
| Hyperparameter | Suggested Range | Effect |
|---|---|---|
| `alpha` | 0.0 – 1.0 | Multiplies the penalty terms; higher = stronger regularization |
| `l1_ratio` | 0.0 – 1.0 | Mix of L1 vs. L2; 0 = pure Ridge, 1 = pure Lasso |
```python
alpha = 0.35     # change this value for each run
l1_ratio = 0.45  # change this value for each run

with mlflow.start_run():
    lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
    lr.fit(train_x, train_y)

    predicted_qualities = lr.predict(test_x)
    (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

    print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
    print("  RMSE: %s" % rmse)
    print("  MAE: %s" % mae)
    print("  R2: %s" % r2)

    mlflow.log_param("alpha", alpha)
    mlflow.log_param("l1_ratio", l1_ratio)
    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("r2", r2)
    mlflow.log_metric("mae", mae)

    tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme

    # The model registry is unavailable with a plain file store
    if tracking_url_type_store != "file":
        # See https://mlflow.org/docs/latest/model-registry.html#api-workflow
        mlflow.sklearn.log_model(lr, "model", registered_model_name="ElasticnetWineModel")
    else:
        mlflow.sklearn.log_model(lr, "model")
```

Example output:
```
Elasticnet model (alpha=0.350000, l1_ratio=0.450000):
  RMSE: 0.7616514499663437
  MAE: 0.5936841528680933
  R2: 0.17804834226795552
```
After executing the training cell a few times with varying hyperparameters, it's time to visualize our tracking data. MLflow comes with an intuitive, built-in graphical dashboard.
Launch the UI from your terminal:

```shell
mlflow ui
```

Then open your browser and navigate to `http://localhost:5000`.
In the MLflow dashboard, you'll find a visual record of every run you executed.
| Feature | What you can do |
|---|---|
| Compare Performance | Sort the grid by metrics like `rmse` or `r2` to identify the best run |
| Inspect Metadata | Click into any experiment to review the hyperparameter combinations that produced those results |
| Manage Models | Explore rich artifact logs and register models of interest for future deployment |
When you click on one of the experiment runs, you can see a detailed view of its parameters and related artifacts.
In the same view, you can inspect structural artifact metadata such as the registered model schema and its environment dependencies.
By leveraging MLflow directly on your local workstation, you can rapidly prototype models and systematically track every change without relying on spreadsheets or disparate notes. Even when working independently, this discipline brings immediate productivity gains and structure to your Machine Learning initiatives.
What we covered in this guide:
| Step | Action |
|---|---|
| 1 | Set up a local MLflow environment |
| 2 | Embedded metrics and parameter logging into a scikit-learn training pipeline |
| 3 | Used the MLflow UI to compare models and investigate runs visually |
What's Next?
My upcoming MLOps guides will cover remote MLflow deployments — diving into the platform's more advanced features like centralized tracking servers, persistent backend stores, and remote artifact storage to fully support cloud pipelines and cross-functional team collaboration.
Thank you for following along!



