Predicting Patient No-Shows at a Hospital

The goal of this post is to investigate potential features for predicting patient no-shows. Below are the details of the No-Show Prediction Challenge and the Clinic Booking Log Optimization Solution:

Business Pain Definition:

  1. No-Show Patients are defined as “patients who did not keep nor cancelled scheduled appointments”.
  2. According to academic research, no-show rates have been shown to range from 11% to 31% in general medicine clinics.
  3. No-show appointments lead to a reduction in the quality of patient care, reduction in productivity, financial losses to the hospital and impaired outcomes of patient care.
  4. A clinic with 15K no-shows in one year is looking at approximately $1 million in annual losses, and this scenario is an optimistic one.
  5. With an average of 62 no-shows per day, the annual cost of missed appointments is 3 million USD per hospital.

Proposed Solution:
A machine learning model that predicts the probability of a patient no-show. It will be based on various patterns of no-show cases and on patient data such as:
age, gender, previous admission days, appointment age, appointment type, address, marital status, previous no-show rate, demographic data, date, etc.
In addition, we will examine whether to factor in external datasets (such as weather and public transportation).

The ML model scoring results will be part of a smart MVP solution that will optimize clinics’ appointment lists.

We collected electronic medical record (EMR) data and appointment data, which include patient, provider and clinical visit characteristics, over a 6-year period from 12 clinics:

  1. Patients Demographic Data (2013–2019)
  2. Visits (2013–2019)
  3. No Shows (2013–2019)
  4. External data source (2013–2019).

High Level Architecture:

A hybrid architecture that enables optimized appointment listing based on the probability obtained from the real-time machine learning prediction model.

A hybrid architecture for optimized clinic reservation list

The above diagram describes a hybrid architecture that integrates with the existing clinic reservation application. Each time a patient asks to reserve a new slot, the predictive ML model finds the optimal slot based on the score returned by the API.
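
The slot selection step can be illustrated with a minimal sketch. It assumes that "optimal" means the candidate slot with the lowest predicted no-show probability; the `Slot` type, `pick_slot` and `fake_score` are hypothetical names, and `fake_score` only stands in for the call to the deployed scoring API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Slot:
    clinic: str
    weekday: int  # 0 = Monday
    hour: int


def pick_slot(slots: Iterable[Slot], score: Callable[[Slot], float]) -> Slot:
    """Return the candidate slot with the lowest predicted no-show probability."""
    return min(slots, key=score)


# Hypothetical stand-in for the scoring API (assumption, not the real model).
def fake_score(slot: Slot) -> float:
    penalty = 0.05 if slot.weekday == 0 else 0.0
    return 0.10 + 0.02 * abs(slot.hour - 10) + penalty


candidates = [Slot("A", 0, 8), Slot("A", 2, 10), Slot("A", 4, 16)]
best = pick_slot(candidates, fake_score)
print(best)
```

In a real integration the reservation application would pass the patient's features along with each candidate slot, and other criteria (clinic capacity, patient preference) would likely be weighed together with the score.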

Technology Stack:

Azure Machine Learning Studio: AutoML, Notebooks, Designer
RapidMiner

NoShow Classification Model Steps:

  1. Data Preparation & Data Exploration.
  2. Feature Engineering.
  3. Imbalanced Data Techniques.
  4. Feature Importance & Feature Selection.
  5. Training & Cross Validation.
  6. Scoring & Model Deployment.

In the first step we combined the three datasets.
We then extracted several date fields and arranged all columns into a single dataset.
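
As a rough illustration of this step, here is a minimal pandas sketch. The three toy frames and every column name (`patient_id`, `visit_id`, `appointment_time` and so on) are assumptions made for the example, not the real schema.

```python
import pandas as pd

# Toy stand-ins for the three sources; all column names are assumptions.
patients = pd.DataFrame({
    "patient_id": [1, 2],
    "age": [34, 71],
    "marital_status": ["single", "married"],
})
visits = pd.DataFrame({
    "visit_id": [10, 11, 12],
    "patient_id": [1, 2, 1],
    "clinic": ["A", "A", "B"],
    "appointment_time": pd.to_datetime(
        ["2019-03-04 09:30", "2019-03-05 14:00", "2019-04-01 11:15"]
    ),
})
no_shows = pd.DataFrame({"visit_id": [11]})  # one row per missed visit

# Left joins keep every scheduled visit, including those that were attended.
df = visits.merge(patients, on="patient_id", how="left")
df = df.merge(no_shows.assign(no_show=1), on="visit_id", how="left")
df["no_show"] = df["no_show"].fillna(0).astype(int)

# Date parts extracted from the appointment timestamp.
ts = df["appointment_time"].dt
df["year"] = ts.year
df["month"] = ts.month
df["day_of_week"] = ts.dayofweek  # Monday = 0
df["hour"] = ts.hour
```

The left joins matter here: an inner join against the no-show table would silently drop every attended visit and leave only the positive class.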


The next step is to explore our dataset and visualize the aggregated data with column charts.

We can see normal growth in total appointments per month over the 2013–2019 period.


Marital status doesn’t seem to correlate with the no-show rate.

Marital Status vs NoShow
NoShow Frequency per Day of Week and per Month
NoShow Rate By Hour per Day
NoShow Rate By Clinic by Year
Total visits by hour per day for Specific Clinic
Age Histogram
Group By Show(0) / No-Show(1)
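
The aggregations behind charts like these can be sketched with a pandas group-by. The toy data and the column names (`day_of_week`, `hour`, `no_show`) are assumptions for illustration only.

```python
import pandas as pd

# Toy appointment log; `no_show` is 1 for a missed visit and 0 otherwise.
df = pd.DataFrame({
    "day_of_week": ["Mon", "Mon", "Tue", "Tue", "Tue", "Wed"],
    "hour": [9, 14, 9, 9, 14, 9],
    "no_show": [1, 0, 0, 1, 1, 0],
})

# The mean of a 0/1 column is the no-show rate within each group.
rate_by_day = df.groupby("day_of_week")["no_show"].agg(["mean", "size"])
rate_by_day = rate_by_day.rename(columns={"mean": "no_show_rate", "size": "visits"})

# Two-way view (rows: hour, columns: day) of the same rate.
rate_by_hour_day = df.pivot_table(
    index="hour", columns="day_of_week", values="no_show", aggfunc="mean"
)
```

Keeping the visit count next to the rate helps when reading the charts, since a high rate in a group with very few visits is weak evidence.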

Feature Engineering:

Following the exploration step, we are ready to engineer our attributes. The aim is to find strongly relevant features (those that hold information not found in any other feature) as well as other relevant features, such as each patient's total number of previous no-shows.

Previous no-show per patient
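
One possible way to build such a feature is sketched below, under the assumption that the combined dataset has `patient_id`, `appointment_time` and a 0/1 `no_show` column (all hypothetical names). The sketch counts only strictly earlier appointments, so the outcome being predicted does not leak into its own feature.

```python
import pandas as pd

df = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "appointment_time": pd.to_datetime(
        ["2018-01-05", "2018-02-10", "2018-03-15", "2018-01-20", "2018-04-02"]
    ),
    "no_show": [1, 0, 1, 0, 0],
})

# Sort chronologically so "previous" means strictly earlier appointments.
df = df.sort_values(["patient_id", "appointment_time"]).reset_index(drop=True)
grouped = df.groupby("patient_id")["no_show"]

# Running total of no-shows minus the current row, so the label of the
# appointment being predicted never leaks into its own feature.
df["prev_no_shows"] = grouped.cumsum() - df["no_show"]
df["prev_visits"] = grouped.cumcount()

# Rate over earlier appointments; 0.0 for a patient's first appointment.
df["prev_no_show_rate"] = (
    df["prev_no_shows"] / df["prev_visits"].where(df["prev_visits"] > 0)
).fillna(0.0)
```

Filling the first appointment with 0.0 is a simplification; a separate "first visit" indicator or the overall base rate may be a better default.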

Imbalanced Data:
“Imbalanced data typically refers to a problem with classification problems where the classes are not represented equally”.

You can read more about tactics to combat imbalanced data in “8 Tactics to Combat Imbalanced Classes in Your Machine Learning Dataset”.

In any case, this is the average no-show rate prior to the imbalanced data fix.

NoShow vs Show rate

The tactic that I chose was Random Under-Sampling for the majority class (‘show’ records).
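
A minimal sketch of random under-sampling follows. It assumes a 1:1 target ratio between the classes and a 0/1 `no_show` label column; both are assumptions for the example, and `undersample_majority` is a hypothetical helper name. Only the training split is resampled, so the validation and test data keep the real class distribution.

```python
import pandas as pd

# Toy training split: 1 = no-show (minority), 0 = show (majority).
train = pd.DataFrame({
    "age": [25, 40, 63, 31, 55, 47, 38, 70, 29, 52],
    "no_show": [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
})


def undersample_majority(frame: pd.DataFrame, label: str, seed: int = 42) -> pd.DataFrame:
    """Randomly drop majority-class rows until both classes have equal counts."""
    counts = frame[label].value_counts()
    minority_size = int(counts.min())
    parts = [
        frame[frame[label] == cls].sample(n=minority_size, random_state=seed)
        for cls in counts.index
    ]
    # Shuffle so the classes are not stored in contiguous blocks.
    return pd.concat(parts).sample(frac=1, random_state=seed).reset_index(drop=True)


balanced_train = undersample_majority(train, "no_show")
```

Because rows are discarded, predicted probabilities from a model trained on the balanced sample will be shifted toward the minority class and may need recalibration before being used as real no-show probabilities.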

Under-sampling our training data only
Our Model Performance
Predictions vs True Results

Confusion Matrix & ROC:

Confusion Matrix

The receiver operating characteristic (ROC) curve illustrates the performance of our binary classifier. It plots the true positive rate against the false positive rate at each decision threshold.
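
To make the two views concrete, the sketch below computes confusion-matrix counts and ROC points in plain Python. The labels and scores are made-up toy values, not results from our model, and the function names are hypothetical.

```python
from typing import List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) with no-show (1) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def roc_points(y_true: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (fpr, tpr) pairs, one per distinct threshold, strictest first."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= threshold else 0 for s in scores]
        tp, fp, _, _ = confusion_counts(y_true, y_pred)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy labels and predicted no-show probabilities (illustrative values only).
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off is an assumption
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
curve = roc_points(y_true, scores)
print(tp, fp, fn, tn, round(auc(curve), 3))
```

The confusion matrix describes a single threshold, while the ROC curve sweeps over all of them, which is why the two are usually read together.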


Cross Validation:

Cross Validation results
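
The mechanics of cross validation on imbalanced labels can be sketched as a stratified k-fold split. This is an illustration in plain Python, not necessarily the procedure used by the tools above; `stratified_kfold` is a hypothetical helper, and the toy labels are made up.

```python
import random
from typing import Iterator, List, Tuple


def stratified_kfold(
    labels: List[int], k: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, valid_idx) pairs with a similar class mix in every fold."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal each class round-robin so every fold gets its share.
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    for f in range(k):
        valid = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, valid


# Toy labels: 8 shows (0) and 4 no-shows (1).
labels = [0] * 8 + [1] * 4
splits = list(stratified_kfold(labels, k=4))
for train_idx, valid_idx in splits:
    print(len(train_idx), valid_idx)
```

Within each fold, any resampling such as under-sampling would be applied to the training indices only, so that every validation fold keeps the original class balance.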

Baseline Comparison (Azure AutoML):
