13 changes: 13 additions & 0 deletions examples/machine-learning_ClassicAI/mlflow-server/README.md
@@ -0,0 +1,13 @@
# MLflow Tracking Server Template

This template demonstrates how to use MLflow to track, log, and manage machine learning experiments in a single notebook. It trains a Random Forest model on the Diabetes dataset, logs parameters, metrics, and artifacts, and enables viewing and reloading runs locally or through a remote MLflow tracking server.
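
The choice between local and remote tracking described above can be sketched as follows. This is a minimal illustration and not part of the template: `resolve_tracking_uri` and `configure_mlflow` are hypothetical helper names, and the sketch assumes that a remote server, if any, is announced through the `MLFLOW_TRACKING_URI` environment variable, which MLflow also reads on its own.

```python
import os
from pathlib import Path


def resolve_tracking_uri(local_dir="mlruns"):
    """Return the remote URI from MLFLOW_TRACKING_URI if set, else a local file URI.

    Hypothetical helper for illustration; the name is not part of MLflow.
    """
    remote = os.environ.get("MLFLOW_TRACKING_URI")
    if remote:
        return remote
    # Fall back to a file:// URI for a local directory (assumed default: ./mlruns).
    return Path(local_dir).resolve().as_uri()


def configure_mlflow(experiment_name="mlflow_tracking_demo"):
    """Point MLflow at the resolved URI and select the experiment."""
    import mlflow  # imported lazily so resolve_tracking_uri works without MLflow

    mlflow.set_tracking_uri(resolve_tracking_uri())
    mlflow.set_experiment(experiment_name)
    return mlflow.get_tracking_uri()
```

Calling `configure_mlflow()` once at the top of a notebook would then keep the remaining cells unchanged whether runs land in a local directory or on a remote server.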

You can deploy MLflow-tracked models on platforms such as **Saturn Cloud**; refer to the Saturn Cloud documentation for deployment guidance.

---

## References

* [MLflow Documentation](https://mlflow.org/docs/latest/index.html)
* [Saturn Cloud Docs](https://saturncloud.io/docs/)
* [Scikit-learn RandomForestRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html)
@@ -0,0 +1,237 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "JwE9FH9oFOMb"
},
"source": [
"# MLflow Tracking Server\n",
"\n",
"**MLflow** is an open-source platform that simplifies the tracking, comparison, and deployment of machine learning experiments.\n",
"\n",
"In this example template, you’ll use MLflow to **track training runs**, **log parameters and metrics**, and **store models** for future reuse, all within a single notebook.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oV2djjnvFOMj"
},
"source": [
"Install **MLflow**, **Gradio**, and supporting libraries including **scikit‑learn**, **matplotlib**, and **pandas**.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "b5yEqAXTFOMk"
},
"outputs": [],
"source": [
"!pip install -q mlflow scikit-learn matplotlib pandas gradio\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IppJqmrgFOMm"
},
"source": [
"Import MLflow, perform a quick GPU check with PyTorch, and load helper libraries used throughout.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "F31VfPfuFOMn"
},
"outputs": [],
"source": [
"import mlflow, os, pandas as pd, matplotlib.pyplot as plt, gradio as gr\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.datasets import load_diabetes\n",
"from sklearn.ensemble import RandomForestRegressor\n",
"\n",
"# torch is not installed by the pip cell above, so treat it as optional.\n",
"try:\n",
"    import torch\n",
"    device = 'cuda' if torch.cuda.is_available() else 'cpu'\n",
"except ImportError:\n",
"    device = 'cpu'\n",
"print(f'✅ Using device: {device}')\n",
"if device == 'cpu':\n",
"    print('ℹ️ Running on CPU. That is fine here: scikit-learn trains the Random Forest on the CPU.')\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fyS8Q66nFOMp"
},
"source": [
"By default, MLflow saves runs to the local **`mlruns/`** directory. You can switch to a **remote tracking server** later by setting a different tracking URI.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Le6Ec_F_FOMq"
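},
"outputs": [],
"source": [
"from pathlib import Path\n",
"# Store runs in ./mlruns next to the notebook rather than a Colab-only path.\n",
"mlflow.set_tracking_uri(Path('mlruns').resolve().as_uri())\n",
"mlflow.set_experiment('mlflow_tracking_demo')\n",
"print('🎯 Tracking URI:', mlflow.get_tracking_uri())\n"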
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "eTSpUqXLFOMs"
},
"source": [
"The helper below fetches experiment metadata, parameters, and metrics from your local `mlruns/` directory (or a remote server if configured) and returns them as a table.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "tNSyaQmHFOMu"
},
"outputs": [],
"source": [
"from mlflow.tracking import MlflowClient\n",
"\n",
"def show_mlflow_runs_table(experiment_name=\"mlflow_tracking_demo\"):\n",
"    \"\"\"Display all MLflow runs (similar to MLflow UI Table).\"\"\"\n",
"    client = MlflowClient()\n",
"    experiment = client.get_experiment_by_name(experiment_name)\n",
"\n",
"    if not experiment:\n",
"        return pd.DataFrame({\"Info\": [\"No experiment found. Run a training cell first.\"]})\n",
"    runs = client.search_runs([experiment.experiment_id])\n",
"    if not runs:\n",
"        return pd.DataFrame({\"Info\": [\"No runs logged yet.\"]})\n",
"\n",
"    rows = []\n",
"    for r in runs:\n",
"        row = {\n",
"            \"Run ID\": r.info.run_id,\n",
"            \"Status\": r.info.status,\n",
"            \"Start Time\": pd.to_datetime(r.info.start_time, unit=\"ms\"),\n",
"            \"End Time\": pd.to_datetime(r.info.end_time, unit=\"ms\"),\n",
"            \"Duration (s)\": round((r.info.end_time - r.info.start_time) / 1000, 2)\n",
"            if r.info.end_time else None,\n",
"        }\n",
"        row.update(r.data.params)\n",
"        row.update(r.data.metrics)\n",
"        rows.append(row)\n",
"\n",
"    df = pd.DataFrame(rows)\n",
"    main_cols = [\"Run ID\", \"Status\", \"Start Time\", \"End Time\", \"Duration (s)\"]\n",
"    other_cols = [c for c in df.columns if c not in main_cols]\n",
"    df = df[main_cols + other_cols]\n",
"    print(f\"✅ Showing {len(df)} runs from experiment '{experiment_name}'\")\n",
"    return df\n",
"\n",
"runs_df = show_mlflow_runs_table()\n",
"display(runs_df)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oMeP7H8qFOMw"
},
"source": [
"Let's train a small **Random Forest** on the Diabetes dataset and log parameters, metrics, and the model artifact to MLflow.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "tqqgobq6FOMy"
},
"outputs": [],
"source": [
"from mlflow.models.signature import infer_signature\n",
"\n",
"with mlflow.start_run() as run:\n",
"    db = load_diabetes()\n",
"    X_train, X_test, y_train, y_test = train_test_split(db.data, db.target, test_size=0.2, random_state=42)\n",
"\n",
"    model = RandomForestRegressor(n_estimators=100, max_depth=6, random_state=42)\n",
"    model.fit(X_train, y_train)\n",
"    preds = model.predict(X_test)\n",
"    signature = infer_signature(X_test, preds)\n",
"\n",
"    mlflow.log_params({'n_estimators': 100, 'max_depth': 6})\n",
"    mlflow.log_metric('mean_prediction', float(preds.mean()))\n",
"    mlflow.sklearn.log_model(model, 'model', signature=signature)\n",
"\n",
"    print(f'Run ID: {run.info.run_id}')\n",
"    print('✅ Training and logging complete!')\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KwHrJpK9FOMz"
},
"source": [
"Use the run ID to load the stored model for inference.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "iB6HPf3kFOM0"
},
"outputs": [],
"source": [
"run_id = run.info.run_id\n",
"loaded_model = mlflow.sklearn.load_model(f'runs:/{run_id}/model')\n",
"print('✅ Model loaded successfully!')\n",
"print('Sample predictions:', loaded_model.predict(X_test[:5]))\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7_ULM2cqFOM0"
},
"source": [
"You've now configured MLflow tracking locally (it can also be pointed at a remote MLflow server) and logged parameters, metrics, and model artifacts.\n",
"\n",
"You can also reload a trained model from a specific run using its run ID. For deployment on Saturn Cloud, see the [Saturn Cloud documentation](https://saturncloud.io/docs)."
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
}