Posts

Model Deployment in an MLOps Workflow: The Various Ways

In MLOps pipelines, deployment is the pivotal phase where machine learning models transform from development artifacts into production-ready assets. The MLOps Zoomcamp Module 4: Deployment outlines three primary deployment strategies:

1. Web services: Flask + Docker 🐍
   - A Flask app loads model artifacts from local disk or cloud storage (see the sketch below).
   - Containerization ensures identical environments across dev and prod.
   - Key course tool: Docker for dependency isolation.
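
For illustration, here is a minimal sketch of that web-service pattern: a Flask app that loads a pickled model at startup and serves predictions over HTTP. The file name model.bin, the DictVectorizer/model pair, and the /predict endpoint are assumptions for this sketch, not the course's exact code.

```python
# Minimal Flask prediction service (illustrative; model.bin and the
# DictVectorizer/model pair are assumed, not the course's exact artifacts).
import pickle

from flask import Flask, jsonify, request

app = Flask("prediction-service")

# Load the model artifact once at startup (from local disk here; it could
# equally be pulled from cloud storage such as S3).
with open("model.bin", "rb") as f_in:
    dv, model = pickle.load(f_in)


@app.route("/predict", methods=["POST"])
def predict():
    record = request.get_json()      # raw feature payload from the client
    X = dv.transform([record])       # vectorize with the saved DictVectorizer
    pred = model.predict(X)          # score with the saved model
    return jsonify({"prediction": float(pred[0])})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=9696, debug=True)
```

A Dockerfile around an app like this pins the Python version and dependencies, which is what gives the identical dev/prod environments mentioned above.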

My Tryst with Out of Memory (OOM) Error: Taming High-Volume ML Pipelines on Limited Hardware

How I Fixed Memory Bloat in a Prefect-Orchestrated Workflow Without RAM Upgrades

My Rendezvous with Experiment Tracking & Model Management at DataTalksClub's MLOps Zoomcamp

I recently finished Module 2 of the MLOps Zoomcamp (hands-on with experiment tracking and model management). The homework was intense – a real grind – but very educational. Rather than sifting through disorganized files for metrics and models, we used MLflow to automatically log and organize all experiment runs. Hyperopt handled our search space, and the best model got neatly registered. Below I share how each step helped turn chaotic experimentation into a clear, reproducible process.

Experiment Tracking with MLflow

Experiment tracking is about systematically recording every training run so you can reproduce and compare results. MLflow makes this easy. In practice we wrapped our training code (in train.py) with MLflow’s run API and enabled MLflow’s autologging (mlflow.sklearn.autolog()). This meant every model parameter, metric, and artifact was captured automatically. For example, once MLflow autologging was on, we tracked all hyperparameters and metrics without manual logging. Wit...
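
As a rough illustration of that setup, the sketch below wraps a scikit-learn training run with MLflow autologging. The synthetic dataset, experiment name, and model choice are assumptions for the example, not the homework's actual data or code.

```python
# Illustrative MLflow autologging sketch (synthetic data and experiment name
# are assumed; the homework used its own dataset and train.py).
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data stands in for the real training set.
X, y = make_regression(n_samples=1_000, n_features=10, noise=0.1, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)

mlflow.set_experiment("experiment-tracking-demo")  # hypothetical experiment name
mlflow.sklearn.autolog()                           # params, metrics, and artifacts logged automatically

with mlflow.start_run():
    model = RandomForestRegressor(max_depth=10, n_estimators=50, random_state=0)
    model.fit(X_train, y_train)
    rmse = mean_squared_error(y_val, model.predict(X_val)) ** 0.5
    mlflow.log_metric("val_rmse", rmse)            # extra manual metric alongside autologging
```

Every run then shows up in the MLflow UI with its parameters, metrics, and model artifact, which is what makes later comparison and model registration straightforward.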

Why MLOps?: Automating the Machine Learning Lifecycle

Introduction

A few months ago, I completed the Machine Learning Zoomcamp by DataTalksClub—an intensive five-month journey that transformed me from a curious novice into someone confident in building, evaluating, and deploying machine learning models. But as I soon discovered, the real world of production-grade AI isn’t just about training a high-accuracy model. It’s about ensuring that model survives—and thrives—in the chaotic, ever-changing landscape of real-world data. This realization led me to enroll in DataTalksClub’s MLOps Zoomcamp, a course designed to tackle the very challenges that kept me awake after my first foray into ML. In this blog post, I’ll share why I’m diving into MLOps, the gaps it fills in my knowledge, and what I hope to achieve through this journey.

From Notebook to Production: The Challenges

The ML Zoomcamp taught me the fundamentals of machine learning and machine learning engineering, including deployment of trained models. But if these...

Bridging the Gap: How Analytics Engineering Transforms Raw Data into Business Insight

In today’s data-driven world, turning raw data into actionable business insights is more critical than ever. Analytics engineering plays a pivotal role in this transformation, serving as the bridge between data ingestion and meaningful analytics. In this article, we’ll explore how analytics engineering—using modern tools like BigQuery and dbt—can streamline your data workflow and empower organizations to make informed decisions.

Data Ingestion From APIs to Warehouses and Data Lakes with dlt

In today’s data-driven world, building efficient and scalable data ingestion pipelines is more critical than ever. Whether you’re streaming data from public APIs or consolidating data into warehouses and data lakes, having a robust system in place is key to enabling quick insights and reliable reporting. In this blog, we’ll explore how dlt (a Python library that automates much of the heavy lifting in data engineering) can help you construct these pipelines with ease and best practices built in, as sketched in the example after this overview.

Why dlt?

dlt is designed to help you build robust, scalable, and self-maintaining data pipelines with minimal fuss. Here are a few reasons why dlt stands out:

- Rapid Pipeline Construction: With dlt, you can automate up to 90% of the routine data engineering tasks, allowing you to focus on delivering business value rather than wrangling code.
- Built-In Data Governance: dlt comes with best practices to ensure clean, reliable data flows, reducing the headaches associated with data quality an...
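
To make that concrete, here is a minimal sketch of a dlt pipeline loading a public API into a local DuckDB destination. The endpoint, pipeline name, and table names are assumptions chosen for the example, not taken from the original post.

```python
# Illustrative dlt pipeline: public API -> DuckDB (endpoint and names are
# assumptions for this sketch).
import dlt
import requests


@dlt.resource(name="pokemon", write_disposition="replace")
def pokemon_list():
    # Pull a single page from a public API; dlt infers the table schema
    # from the yielded records.
    resp = requests.get(
        "https://pokeapi.co/api/v2/pokemon", params={"limit": 100}, timeout=30
    )
    resp.raise_for_status()
    yield resp.json()["results"]


pipeline = dlt.pipeline(
    pipeline_name="api_ingestion_demo",
    destination="duckdb",        # could just as well be a warehouse such as BigQuery
    dataset_name="raw_data",
)

load_info = pipeline.run(pokemon_list())
print(load_info)                 # summary of what was loaded and where
```

Pointing the same pipeline at a warehouse instead of a local file is then mostly a matter of changing the destination and supplying credentials, which is a large part of dlt's self-maintaining appeal.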