My midterm project at MLZoomcamp led by Alexey Grigorev for DataTalksClub
Predicting Patient No-Shows: A Data-Driven Approach

Hospital no-shows significantly disrupt healthcare systems, wasting resources and delaying care for those in need. My midterm project for the MLZoomcamp, led by Alexey Grigorev and hosted by DataTalksClub, tackles this challenge using machine learning to predict no-show probabilities for appointments in Brazilian hospitals. Here's how I approached the problem:

The Challenge

The dataset, sourced from Kaggle, includes over 110,000 appointments and diverse features such as patient demographics, appointment details, and medical history. However, achieving reliable predictions is complex due to:

- Imbalanced Data: About 80% of appointments were attended, while 20% were no-shows.
- Dependence on Feature Engineering: Key predictors like patient history (previous/missed appointments) were engineered from the raw data.
- Bias Mitigation: Socioeconomic factors, such as neighborhood, required careful handling to ensure fairness.

The Solution ...
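
To make the patient-history features from The Challenge more concrete, here is a minimal sketch of how counts of previous and missed appointments might be derived from raw records. It is an illustration under stated assumptions, not the project's actual code: the field names (`patient_id`, `day`, `no_show`), the toy records, and the helper `add_history_features` are all hypothetical. The one design point it tries to show is that each appointment only sees the patient's earlier appointments, so the target of the current row does not leak into its own features.

```python
from collections import defaultdict
from datetime import date

# Toy records for illustration only; the field names are hypothetical
# and are not the column names of the Kaggle dataset.
appointments = [
    {"patient_id": 1, "day": date(2016, 5, 2), "no_show": False},
    {"patient_id": 2, "day": date(2016, 5, 3), "no_show": True},
    {"patient_id": 1, "day": date(2016, 5, 10), "no_show": True},
    {"patient_id": 1, "day": date(2016, 5, 20), "no_show": False},
    {"patient_id": 2, "day": date(2016, 5, 25), "no_show": False},
]


def add_history_features(rows):
    """Return copies of rows, in chronological order, with per-patient history counts."""
    previous = defaultdict(int)  # appointments seen so far, per patient
    missed = defaultdict(int)    # no-shows seen so far, per patient
    enriched = []
    for row in sorted(rows, key=lambda r: r["day"]):
        pid = row["patient_id"]
        # Record the counts BEFORE updating them, so the current
        # appointment's outcome is never part of its own features.
        enriched.append({
            **row,
            "previous_appointments": previous[pid],
            "missed_appointments": missed[pid],
        })
        previous[pid] += 1
        missed[pid] += int(row["no_show"])
    return enriched


features = add_history_features(appointments)
for row in features:
    print(row["patient_id"], row["day"],
          row["previous_appointments"], row["missed_appointments"])
```

In this sketch a patient's first appointment always gets zero for both counts, which a model would have to treat as "no history yet" rather than "never misses".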