Monitoring in MLOps: Reflections on Module 5 of the MLOps Zoomcamp

In today’s fast‑paced ML landscape, deploying a model is only half the story. Continuous monitoring ensures your system stays healthy, accurate, and reliable long after go‑live. I’ve just wrapped up Module 5: Model Monitoring in the DataTalksClub MLOps Zoomcamp, and here’s what I learned—and how you can apply it to your own projects.


Why Model Monitoring Matters

  • Drift Detection: Data distributions evolve. What trained your model yesterday may not reflect today’s reality.

  • Quality Assurance: Spot issues like missing values or unexpected outliers before they impact end‑users.

  • Reliability & Trust: Stakeholders need confidence that your predictions remain valid and service levels remain high.


Core Components of the Monitoring Stack


1. Docker Compose Services

  • PostgreSQL for storing time‑series metrics

  • Adminer for lightweight database management

  • Grafana for rich, interactive dashboards
    Spinning these up with a single docker-compose up command made the setup a breeze (a quick sanity check of the running services is sketched just below).
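
One way to confirm the stack came up is a tiny Python check along the lines below; the ports are the defaults, and the Postgres credentials are placeholders to swap for whatever your docker-compose.yml defines.

    import psycopg
    import requests

    # Grafana exposes a simple health endpoint on its default port 3000
    print(requests.get("http://localhost:3000/api/health").json())

    # Connect to the metrics database; the credentials here are placeholders,
    # so use the values from your own docker-compose.yml
    with psycopg.connect(
        "host=localhost port=5432 user=postgres password=example dbname=test"
    ) as conn:
        print(conn.execute("SELECT version();").fetchone())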


2. Evidently for Data & Concept Drift

  • Compute drift metrics (e.g., column-level drift on input features and on the model's predictions)

  • Monitor data quality (missing values, data ranges, quantiles)

  • Generate ad‑hoc test suites and reports to debug anomalies on demand (a minimal report sketch follows below)
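
To make that concrete, here is a minimal report sketch using the Evidently 0.4-style API (newer releases changed the imports); the file paths and column names are placeholders patterned on the NYC taxi data used in the course.

    import pandas as pd
    from evidently import ColumnMapping
    from evidently.report import Report
    from evidently.metrics import (
        ColumnDriftMetric,
        DatasetDriftMetric,
        DatasetMissingValuesMetric,
        ColumnQuantileMetric,
    )

    # Reference and current data must share a schema; the paths are illustrative
    reference_df = pd.read_parquet("data/reference.parquet")
    current_df = pd.read_parquet("data/current.parquet")

    column_mapping = ColumnMapping(
        prediction="prediction",   # column holding the model's output
        numerical_features=["fare_amount", "trip_distance"],
        categorical_features=["PULocationID", "DOLocationID"],
        target=None,
    )

    report = Report(metrics=[
        ColumnDriftMetric(column_name="prediction"),   # drift on predictions
        DatasetDriftMetric(),                          # share of drifted columns
        DatasetMissingValuesMetric(),                  # missing-value checks
        ColumnQuantileMetric(column_name="fare_amount", quantile=0.5),  # median fare
    ])

    report.run(reference_data=reference_df, current_data=current_df,
               column_mapping=column_mapping)
    result = report.as_dict()   # individual values live under result["metrics"]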


3. Prefect for Automation

  • Automate batch metric collection at regular intervals

  • Orchestrate data loading, metric computation, and database writes in a managed flow
    (In this module, Prefect was used for demonstration—keep in mind it’s optional in some course editions.)
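
A bare-bones version of such a flow might look like the sketch below (Prefect 2.x decorators); the task bodies, start date, and helper names are placeholders for your own loading, Evidently computation, and database writes.

    import datetime
    import time
    from prefect import flow, task

    @task
    def load_batch(day: datetime.date):
        # Load the "current" data for one simulated day (placeholder body)
        ...

    @task
    def compute_metrics(batch):
        # Run the Evidently report on the batch and extract the values to store
        ...

    @task
    def save_metrics(metrics, day: datetime.date):
        # Write one row per batch into the PostgreSQL metrics table
        ...

    @flow
    def batch_monitoring(start: datetime.date = datetime.date(2024, 3, 1),
                         days: int = 30):
        # The start date and number of days are illustrative
        for i in range(days):
            day = start + datetime.timedelta(days=i)
            batch = load_batch(day)
            metrics = compute_metrics(batch)
            save_metrics(metrics, day)
            time.sleep(10)   # 10-second pause standing in for a daily schedule

    if __name__ == "__main__":
        batch_monitoring()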


4. Grafana Dashboards

  • Pre‑configured panels visualize your metrics over time:

    • Missing‑Value Counts

    • Data Drift Scores

    • Quantile Trends (e.g., median fare amount)

  • Dashboards are exported as JSON and saved under 05-monitoring/dashboards/ for version control and easy reloads.


Step‑by‑Step Workflow

1. Prepare Data

  • Train a baseline model and generate a reference dataset.

  • Simulate “current” batches (e.g., sliding daily or monthly windows), as in the pandas sketch below.
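
In pandas terms, that split can be as simple as the sketch below; the file name, date boundary, and pickup column assume one month of green taxi data, so adjust them to your dataset.

    import pandas as pd

    # One month of trips (file name is illustrative)
    df = pd.read_parquet("green_tripdata_2024-03.parquet")

    # Reference window: the slice the baseline model was validated on
    reference = df[df.lpep_pickup_datetime < "2024-03-08"]

    # "Current" batches: one DataFrame per later day, fed to monitoring in turn
    later = df[df.lpep_pickup_datetime >= "2024-03-08"]
    current_batches = [
        batch for _, batch in later.groupby(later.lpep_pickup_datetime.dt.date)
    ]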


2. Compute Metrics

  • Run a Python script that calculates Evidently metrics in a loop (every 10 seconds in the demo, with each iteration standing in for a daily batch).

  • Insert the results into PostgreSQL (the core of that loop is sketched below).
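
Stripped down, and reusing reference, current_batches, report, and column_mapping from the earlier sketches, the core of that script looks roughly like this; the table layout, connection string, and exact dictionary keys depend on your setup and Evidently version.

    import datetime
    import time
    import psycopg

    CREATE_TABLE = """
    CREATE TABLE IF NOT EXISTS metrics (
        run_time TIMESTAMP,
        prediction_drift FLOAT,
        num_drifted_columns INTEGER,
        share_missing_values FLOAT
    );
    """

    with psycopg.connect(
        "host=localhost port=5432 user=postgres password=example dbname=test",
        autocommit=True,
    ) as conn:
        conn.execute(CREATE_TABLE)
        for batch in current_batches:
            report.run(reference_data=reference, current_data=batch,
                       column_mapping=column_mapping)
            m = report.as_dict()["metrics"]
            conn.execute(
                "INSERT INTO metrics VALUES (%s, %s, %s, %s)",
                (datetime.datetime.now(),
                 m[0]["result"]["drift_score"],                          # ColumnDriftMetric
                 m[1]["result"]["number_of_drifted_columns"],            # DatasetDriftMetric
                 m[2]["result"]["current"]["share_of_missing_values"]),  # missing values
            )
            time.sleep(10)   # 10 seconds per iteration stands in for one day per batch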

3. Visualize & Alert

  • Open Grafana at localhost:3000 (default admin/admin credentials).

  • Browse the dashboard list (Home → Dashboards) and open the pre‑built monitoring dashboard.

  • Review panels for drift, data quality, and test failures.

4. Debug on Demand

  • Use the ad‑hoc debugging_nyc_taxi_data.ipynb notebook to run Evidently TestSuites (a minimal suite is sketched after this list).

  • Drill into unexpected metric spikes or failing tests for root‑cause analysis.
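
For reference, a TestSuite can be as small as the sketch below; it again uses 0.4-style Evidently imports, reference and column_mapping come from the earlier sketches, and suspicious_batch stands for whichever batch you want to inspect.

    from evidently.test_suite import TestSuite
    from evidently.tests import (
        TestNumberOfMissingValues,
        TestNumberOfDriftedColumns,
        TestColumnDrift,
    )

    suite = TestSuite(tests=[
        TestNumberOfMissingValues(),                 # too many missing values?
        TestNumberOfDriftedColumns(),                # too many drifted columns?
        TestColumnDrift(column_name="prediction"),   # did the model output drift?
    ])

    suite.run(reference_data=reference, current_data=suspicious_batch,
              column_mapping=column_mapping)
    suite.show(mode="inline")   # pass/fail details rendered in the notebook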



Key Takeaways

  • Modular Architecture: Decouple metric computation (Evidently) from storage (PostgreSQL) and visualization (Grafana).

  • Automation Is Crucial: Even a simple Prefect flow ensures metrics are fresh and consistent.

  • Version‑Controlled Dashboards: Saving dashboard JSON alongside code makes reproducibility and collaboration seamless.

  • Proactive Debugging: Integrating TestSuites lets you catch issues before they cascade into production incidents.
