Monitoring in MLOps: Reflections on Module 5 of the MLOps Zoomcamp

In today’s fast‑paced ML landscape, deploying a model is only half the story. Continuous monitoring ensures your system stays healthy, accurate, and reliable long after go‑live. I’ve just wrapped up Module 5: Model Monitoring in the DataTalksClub MLOps Zoomcamp, and here’s what I learned—and how you can apply it to your own projects.


Why Model Monitoring Matters

  • Drift Detection: Data distributions evolve. What trained your model yesterday may not reflect today’s reality.

  • Quality Assurance: Spot issues like missing values or unexpected outliers before they impact end‑users.

  • Reliability & Trust: Stakeholders need confidence that your predictions remain valid and service levels remain high.


Core Components of the Monitoring Stack


1. Docker Compose Services

  • PostgreSQL for storing time‑series metrics

  • Adminer for lightweight database management

  • Grafana for rich, interactive dashboards
    Spinning these up with a single docker-compose up command made the setup a breeze.
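    After docker-compose up, I like to sanity‑check that the services are actually reachable before wiring anything else together. Below is a minimal sketch of that check, assuming the default ports (Postgres on 5432, Grafana on 3000) and example credentials; swap in whatever your own docker-compose.yml defines:

```python
import psycopg   # psycopg 3 ("pip install psycopg[binary]")
import requests

# Assumed connection details -- replace with the values from your docker-compose.yml
PG_DSN = "host=localhost port=5432 dbname=test user=postgres password=example"

# 1. Can we open a connection and run a trivial query against Postgres?
with psycopg.connect(PG_DSN) as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        assert cur.fetchone() == (1,)
print("PostgreSQL is reachable")

# 2. Is Grafana answering on its health endpoint?
resp = requests.get("http://localhost:3000/api/health", timeout=5)
resp.raise_for_status()
print("Grafana is reachable:", resp.json().get("database"))
```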


2. Evidently for Data & Concept Drift

  • Compute drift metrics (e.g., column‑level drift on individual features and on model predictions)

  • Monitor data quality (missing values, data ranges, quantiles)

  • Generate ad‑hoc test suites and reports to debug anomalies on demand 
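Here is a minimal sketch of how those metrics come together, assuming the Evidently 0.4‑style API the course used at the time (newer releases changed the interface) and placeholder column and file names you would replace with your own:

```python
import pandas as pd
from evidently import ColumnMapping
from evidently.report import Report
from evidently.metrics import (
    ColumnDriftMetric,
    DatasetDriftMetric,
    DatasetMissingValuesMetric,
    ColumnQuantileMetric,
)

# Placeholder column lists -- substitute the features your model actually uses
num_features = ["trip_distance", "fare_amount"]
cat_features = ["pickup_zone"]

column_mapping = ColumnMapping(
    prediction="prediction",
    numerical_features=num_features,
    categorical_features=cat_features,
    target=None,
)

report = Report(metrics=[
    ColumnDriftMetric(column_name="prediction"),   # drift on model predictions
    DatasetDriftMetric(),                          # share of drifted features
    DatasetMissingValuesMetric(),                  # data quality: missing values
    ColumnQuantileMetric(column_name="fare_amount", quantile=0.5),  # median trend
])

# Hypothetical file names for the reference set and one "current" batch
reference = pd.read_parquet("data/reference.parquet")
current = pd.read_parquet("data/current_batch.parquet")

report.run(reference_data=reference, current_data=current,
           column_mapping=column_mapping)
result = report.as_dict()

prediction_drift = result["metrics"][0]["result"]["drift_score"]
num_drifted_cols = result["metrics"][1]["result"]["number_of_drifted_columns"]
share_missing = result["metrics"][2]["result"]["current"]["share_of_missing_values"]
median_fare = result["metrics"][3]["result"]["current"]["value"]
```

These four values are exactly the kind of compact, per‑batch numbers that get written to PostgreSQL and plotted in Grafana later on.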


3. Prefect for Automation

  • Automate batch metric collection at regular intervals

  • Orchestrate data loading, metric computation, and database writes in a managed flow
    (In this module, Prefect was used for demonstration—keep in mind it’s optional in some course editions.)
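    For illustration, a bare‑bones Prefect 2 flow might look like the sketch below; compute_batch_metrics and write_metrics_to_db are hypothetical stubs standing in for the Evidently computation and the PostgreSQL insert:

```python
import datetime
from prefect import task, flow

@task(retries=2, retry_delay_seconds=10)
def compute_batch_metrics(batch_date: datetime.date) -> dict:
    # Placeholder: in the real flow this loads the day's batch, runs the
    # Evidently report, and returns the metric values to store.
    return {"prediction_drift": 0.0, "num_drifted_columns": 0, "share_missing_values": 0.0}

@task
def write_metrics_to_db(batch_date: datetime.date, metrics: dict) -> None:
    # Placeholder: in the real flow this inserts one row into PostgreSQL.
    print(batch_date, metrics)

@flow(name="batch-monitoring-backfill")
def batch_monitoring(start: datetime.date, days: int = 30) -> None:
    for i in range(days):
        batch_date = start + datetime.timedelta(days=i)
        metrics = compute_batch_metrics(batch_date)
        write_metrics_to_db(batch_date, metrics)

if __name__ == "__main__":
    batch_monitoring(datetime.date(2022, 2, 1))
```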


4. Grafana Dashboards

  • Pre‑configured panels visualize your metrics over time:

    • Missing‑Value Counts

    • Data Drift Scores

    • Quantile Trends (e.g., median fare amount)

  • Dashboards are exported as JSON and saved under 05-monitoring/dashboards/ for version control and easy reloads.


Step‑by‑Step Workflow

1. Prepare Data

  • Train a baseline model and generate a reference dataset.

  • Simulate “current” batches (e.g., sliding daily or monthly windows).
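As a rough sketch of that preparation step (hypothetical file names, and a pickup‑timestamp column in the spirit of the NYC taxi data used in the course):

```python
import pandas as pd

# Hypothetical file and column names -- adjust to your own dataset
raw = pd.read_parquet("data/green_tripdata_2022-02.parquet")
timestamp_col = "lpep_pickup_datetime"

# Reference dataset: scored once with the baseline model and saved to disk
reference = pd.read_parquet("data/reference.parquet")

# "Current" batches: one slice per day of the new month
daily_batches = {
    day: batch for day, batch in raw.groupby(raw[timestamp_col].dt.date)
}

print(f"{len(daily_batches)} daily batches vs. {len(reference)} reference rows")
```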


2. Compute Metrics

  • Run a Python script that calculates Evidently metrics in a loop (every 10 seconds for the demo, with each iteration representing a daily batch).

  • Insert results into PostgreSQL.
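Condensed, the metric‑collection script has roughly this shape. It is a hedged sketch: the DSN, table layout, and calculate_metrics() helper are assumptions to adapt to your own setup, with the helper wrapping the Evidently report shown earlier:

```python
import datetime
import time
import psycopg

SEND_TIMEOUT = 10  # seconds between iterations; each one stands in for a daily batch

CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS metrics (
    ts TIMESTAMP,
    prediction_drift FLOAT,
    num_drifted_columns INTEGER,
    share_missing_values FLOAT
);
"""

def calculate_metrics(batch_date: datetime.date) -> tuple[float, int, float]:
    # Placeholder: run the Evidently report for this batch and pull the
    # three values out of report.as_dict(), as sketched in the Evidently section.
    return 0.0, 0, 0.0

def main() -> None:
    # Assumed DSN for the Postgres service from docker-compose; adjust as needed.
    with psycopg.connect("host=localhost port=5432 dbname=test "
                         "user=postgres password=example", autocommit=True) as conn:
        conn.execute(CREATE_TABLE)
        start = datetime.date(2022, 2, 1)  # hypothetical first batch
        for i in range(30):
            batch_date = start + datetime.timedelta(days=i)
            drift, n_drifted, share_missing = calculate_metrics(batch_date)
            conn.execute(
                "INSERT INTO metrics VALUES (%s, %s, %s, %s)",
                (batch_date, drift, n_drifted, share_missing),
            )
            time.sleep(SEND_TIMEOUT)  # 10 s per loop keeps the demo watchable

if __name__ == "__main__":
    main()
```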

3. Visualize & Alert

  • Open Grafana at localhost:3000 (default admin/admin credentials).

  • Browse the “Home → New Dashboard” folder for the pre‑built monitoring dashboard.

  • Review panels for drift, data quality, and test failures.

4. Debug on Demand

  • Use the ad‑hoc debugging_nyc_taxi_data.ipynb notebook to run Evidently TestSuites.

  • Drill into unexpected metric spikes or failing tests for root‑cause analysis.
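A minimal version of that ad‑hoc check might look like the snippet below, again assuming the Evidently 0.4‑style TestSuite API and the same hypothetical inputs as in the report example:

```python
import pandas as pd
from evidently import ColumnMapping
from evidently.test_suite import TestSuite
from evidently.test_preset import DataDriftTestPreset, DataQualityTestPreset

# Same hypothetical inputs as in the report example above
reference = pd.read_parquet("data/reference.parquet")
current = pd.read_parquet("data/current_batch.parquet")
column_mapping = ColumnMapping(prediction="prediction", target=None)

suite = TestSuite(tests=[
    DataDriftTestPreset(),     # per-column drift tests
    DataQualityTestPreset(),   # missing values, ranges, duplicates, ...
])
suite.run(reference_data=reference, current_data=current,
          column_mapping=column_mapping)

# In a notebook, suite.show(mode="inline") renders the interactive view.
# Programmatically, pull out the failures for root-cause analysis:
results = suite.as_dict()
failed = [t["name"] for t in results["tests"] if t["status"] == "FAIL"]
print("Failed tests:", failed)
```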



Key Takeaways

  • Modular Architecture: Decouple metric computation (Evidently) from storage (PostgreSQL) and visualization (Grafana).

  • Automation Is Crucial: Even a simple Prefect flow ensures metrics are fresh and consistent.

  • Version‑Controlled Dashboards: Saving dashboard JSON alongside code makes reproducibility and collaboration seamless.

  • Proactive Debugging: Integrating TestSuites lets you catch issues before they cascade into production incidents.
