
📊 MLflow Flashcards


[Image: MLflow MLOps overview with Tracking, Projects, Models, and the Registry]

Want a fast way to learn MLflow? This flashcard guide highlights the essentials so you can track experiments, package code, version models, and ship to production. As a result, you move from notebooks to reliable workflows without getting stuck in tooling.

Moreover, the platform plays nicely with your stack. You can log runs from Python, use the UI to compare metrics, and store artifacts on cloud or local backends. Consequently, teams align on a single source of truth for parameters, metrics, and models.

Before you start, set up a clean environment. First, create a virtual environment and install the client. Next, launch a tracking server (local or remote). Then, run a small experiment and log a few metrics. Finally, register a model so you can promote it from staging to production with confidence.
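
For example, here is a minimal sketch of that first run in Python, assuming the client is installed with pip install mlflow and a local server is running via mlflow server --host 127.0.0.1 --port 5000 (the experiment name and values below are placeholders):

```python
import mlflow

# Point the client at the local tracking server and pick an experiment.
mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("quickstart")

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)    # hyperparameters
    mlflow.log_metric("rmse", 0.73)            # results
    mlflow.log_metric("rmse", 0.68, step=1)    # metrics can be logged per step

    # Any local file can be stored as an artifact of the run.
    with open("notes.md", "w") as f:
        f.write("baseline run with default features")
    mlflow.log_artifact("notes.md")
```

Open the UI served at http://127.0.0.1:5000 to browse the run, its parameters, metrics, and artifacts.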

Key Concepts at a Glance

📌 What is MLflow?
An open-source platform for the ML lifecycle: experiment tracking, reproducibility, packaging, and deployment.
🧪 Tracking
Record and query experiments, parameters, metrics, and artifacts. Compare runs visually to pick the best model (see the query sketch after this list).
📦 Projects
A packaging format for reproducible code using Git repos and conda or Docker environments.
🏷️ Runs
A single execution of your experiment; it logs parameters, code versions, metrics, and outputs.
🧠 Models
Package and manage models in a standard format with flavors for diverse libraries (pyfunc, sklearn, XGBoost, and more).
🚀 Model Registry
Centralize versions and lifecycle states (e.g., staging, production) with rich metadata and annotations.
🔄 Versioning
Promote and roll back versions with tags and comments. Keep a clean history of deployments.
⚙️ Serving
Expose models via REST, or integrate with platforms such as SageMaker and Azure ML for managed deployment.
🔗 Integrations
Works with TensorFlow, PyTorch, scikit-learn, XGBoost, H2O, Databricks, Kubernetes, and more.
🔍 Visualizing metrics
Use the UI or APIs to plot learning curves, compare runs, and download artifacts for deeper analysis.
🧩 pyfunc format
A standard model interface that wraps different libraries so you can serve and score consistently (see the wrapper sketch after this list).
📁 Artifact storage
Store outputs locally or on S3, GCS, Azure Blob, or remote servers by configuring artifact URIs (see the example after this list).
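
To compare runs programmatically rather than in the UI (the Tracking and Visualizing metrics cards above), you can query them into a DataFrame. A small sketch, assuming the "quickstart" experiment from the earlier example:

```python
import mlflow

# search_runs returns a pandas DataFrame of params, metrics, and tags.
runs = mlflow.search_runs(
    experiment_names=["quickstart"],
    order_by=["metrics.rmse ASC"],
)
print(runs[["run_id", "params.learning_rate", "metrics.rmse"]].head())
```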
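
The pyfunc card describes MLflow's common model interface. Here is a minimal sketch of wrapping arbitrary Python logic so it can be logged, loaded, and served like any other model (the class and scaling logic are purely illustrative):

```python
import mlflow
import mlflow.pyfunc
import pandas as pd

class DoubleModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        # model_input arrives as a pandas DataFrame when served
        return model_input * 2.0

with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(artifact_path="model", python_model=DoubleModel())

# Load the model back through the standard pyfunc interface and score it.
loaded = mlflow.pyfunc.load_model(f"runs:/{run.info.run_id}/model")
print(loaded.predict(pd.DataFrame({"x": [1.0, 2.0]})))
```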
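
And for the Artifact storage card, the artifact URI can be set per experiment. A sketch, assuming an S3 bucket you control (the bucket name is a placeholder):

```python
import mlflow

# Runs in this experiment will write their artifacts to the given S3 prefix.
mlflow.create_experiment(
    "quickstart-s3",
    artifact_location="s3://my-mlflow-artifacts/quickstart",
)
```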

Getting Started & Next Steps

First, install the client and spin up a local tracking server. Next, log a simple run and capture metrics and artifacts. Then, register your best model and set its stage to “staging.” Finally, add a deployment target and write a quick README so others can reproduce your steps.
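
A sketch of that registration step, assuming a run that already logged a model under the artifact path "model" (the registry name is a placeholder, and the run id must come from your own tracking server):

```python
import mlflow
from mlflow.tracking import MlflowClient

# Register the logged model; each call to register_model with the same
# name creates a new version in the Model Registry.
result = mlflow.register_model("runs:/<run_id>/model", "quickstart-model")

# Move that version into the "Staging" stage.
client = MlflowClient()
client.transition_model_version_stage(
    name="quickstart-model",
    version=result.version,
    stage="Staging",
)
```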

As your workflow grows, consider a remote backend, role-based access, and automated CI/CD. In addition, pin package versions, cache datasets, and use model signatures to avoid breaking changes in downstream services.
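
For the model-signature tip in particular, here is a small sketch using scikit-learn (the toy data and model are assumptions for illustration). infer_signature records the input and output schema alongside the model, so serving layers can reject incompatible payloads instead of failing silently:

```python
import numpy as np
import mlflow
import mlflow.sklearn
from mlflow.models import infer_signature
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
model = LinearRegression().fit(X, y)

# Capture the input/output schema so downstream consumers can validate payloads.
signature = infer_signature(X, model.predict(X))

with mlflow.start_run():
    mlflow.sklearn.log_model(model, artifact_path="model", signature=signature)
```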