Predict hospital readmissions
Reducing hospital readmissions

    📖 Background

    You work for a consulting company helping a hospital group better understand patient readmissions. The hospital gave you access to ten years of information on patients readmitted to the hospital after being discharged. The doctors want you to assess if initial diagnoses, number of procedures, or other variables could help them better understand the probability of readmission.

    They want to focus follow-up calls and attention on those patients with a higher probability of readmission.

    The Problem

    A hospital readmission is an episode in which a patient who has been discharged from a hospital is admitted again within a specified time interval. Readmission rates have increasingly been used as an outcome measure in health services research and as a quality benchmark for health systems. Generally, a higher readmission rate indicates that treatment during past hospitalizations was ineffective.

    💾 The data

    You have access to ten years of patient information (source):

    Information in the file
    • "age" - age bracket of the patient
    • "time_in_hospital" - days (from 1 to 14)
    • "n_procedures" - number of procedures performed during the hospital stay
    • "n_lab_procedures" - number of laboratory procedures performed during the hospital stay
    • "n_medications" - number of medications administered during the hospital stay
    • "n_outpatient" - number of outpatient visits in the year before a hospital stay
    • "n_inpatient" - number of inpatient visits in the year before the hospital stay
    • "n_emergency" - number of visits to the emergency room in the year before the hospital stay
    • "medical_specialty" - the specialty of the admitting physician
    • "diag_1" - primary diagnosis (Circulatory, Respiratory, Digestive, etc.)
    • "diag_2" - secondary diagnosis
    • "diag_3" - additional secondary diagnosis
    • "glucose_test" - whether the glucose serum came out as high (> 200), normal, or not performed
    • "A1Ctest" - whether the A1C level of the patient came out as high (> 7%), normal, or not performed
    • "change" - whether there was a change in the diabetes medication ('yes' or 'no')
    • "diabetes_med" - whether a diabetes medication was prescribed ('yes' or 'no')
    • "readmitted" - if the patient was readmitted at the hospital ('yes' or 'no')

    Acknowledgments: Beata Strack, Jonathan P. DeShazo, Chris Gennings, Juan L. Olmo, Sebastian Ventura, Krzysztof J. Cios, and John N. Clore, "Impact of HbA1c Measurement on Hospital Readmission Rates: Analysis of 70,000 Clinical Database Patient Records," BioMed Research International, vol. 2014, Article ID 781670, 11 pages, 2014.

    # Load the tidyverse and read the patient data
    suppressPackageStartupMessages(library(tidyverse))
    df <- readr::read_csv('data/hospital_readmissions.csv', show_col_types = FALSE)

    # Install any missing packages, then load the libraries used in the analysis
    suppressPackageStartupMessages({
      library(skimr)
      library(tidymodels)
      if (!requireNamespace("ggcorrplot", quietly = TRUE)) install.packages("ggcorrplot")
      if (!requireNamespace("vip", quietly = TRUE)) install.packages("vip")
      if (!requireNamespace("patchwork", quietly = TRUE)) install.packages("patchwork")
      library(ggcorrplot)
      library(vip)
      library(patchwork)
    })

    # Set the default ggplot2 theme
    theme_set(theme_classic())

    💪 Competition challenge

    Create a report that covers the following:

    1. What is the most common primary diagnosis by age group?
    2. Some doctors believe diabetes might play a central role in readmission. Explore the effect of a diabetes diagnosis on readmission rates.
    3. On what groups of patients should the hospital focus their follow-up efforts to better monitor patients with a high probability of readmission?
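
One possible way to approach question 1 is to count diagnoses within each age bracket and keep the most frequent one. The sketch below is only an illustration under stated assumptions: the data frame `toy`, its age labels, and its rows are invented stand-ins, not the hospital data. Only the column names `age` and `diag_1` come from the data description.

```r
library(dplyr)

# Toy stand-in for the real data; labels and rows are invented for illustration
toy <- data.frame(
  age    = c("40-50", "40-50", "40-50", "70-80", "70-80", "70-80"),
  diag_1 = c("Other", "Other", "Circulatory",
             "Circulatory", "Circulatory", "Respiratory")
)

# Count each primary diagnosis within each age bracket,
# then keep the single most frequent diagnosis per bracket
top_diag <- toy %>%
  count(age, diag_1, name = "n_patients") %>%
  group_by(age) %>%
  slice_max(n_patients, n = 1, with_ties = FALSE) %>%
  ungroup()

print(top_diag)
```

Setting `with_ties = FALSE` returns exactly one row per age bracket; with real data it may be worth keeping ties to see whether two diagnoses are equally common.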

    Findings

    • 47% of patients were readmitted
    • Most of the patients were aged between 60 and 80 years old
    • The proportion of patients with prior inpatient visits is higher among readmitted patients than among those who were not readmitted.
    • The readmission rate is above the overall mean for patients aged 60 to 70, 70 to 80 and 80 to 90 years old.
    • Only 10% of patients are aged 40 to 50, but this group has a readmission rate of 45%.
    • Among patients diagnosed with diabetes, 54% were readmitted.
    • Patients with diabetes, digestive or circulatory problems as the primary diagnosis have high readmission rates (above the mean).
    • For the secondary diagnosis, patients with circulatory and respiratory problems tend to have high readmission rates.
    • The same pattern is observed for the additional secondary diagnosis.
    • Patients prescribed diabetes medication have a higher readmission rate.
    • Patients with a normal or high glucose test result tend to have a high readmission rate.
    • Patients with no A1C test tend to have a high readmission rate.
    • Patients with diabetes as the primary diagnosis tend to have a higher readmission rate than those without it. The difference is confirmed by a chi-squared test (p < 0.001).
    • The most common primary diagnosis is circulatory, chiefly for patients aged 50 to 80 years old.
    • Patients with a high number of days in hospital tend to have a high number of medications.
    • Similarly, patients with a high number of procedures tend to have a high number of medications.
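
The comparison of readmission rates between patients with and without a diabetes diagnosis, together with the chi-squared test, can be sketched in base R as follows. This is a minimal illustration, not the exact code behind the findings: the counts are invented, and the `diabetes_diag` flag is a hypothetical helper column (with the real data it could be derived from `diag_1`, assuming diabetes appears there as its own category). The toy counts will not reproduce the reported p-value.

```r
# Toy data: invented counts, not the hospital data
toy <- data.frame(
  diabetes_diag = rep(c("yes", "no"), times = c(100, 400)),
  readmitted    = c(rep(c("yes", "no"), times = c(54, 46)),
                    rep(c("yes", "no"), times = c(180, 220)))
)

# Contingency table: diabetes diagnosis (rows) by readmission (columns)
tab <- table(toy$diabetes_diag, toy$readmitted)

# Row-wise proportions give the readmission rate within each group
rates <- prop.table(tab, margin = 1)[, "yes"]

# Pearson chi-squared test of independence
# (R applies Yates' continuity correction by default for 2x2 tables)
test <- chisq.test(tab)

print(round(rates, 2))
print(test$p.value)
```

A small p-value suggests that readmission is not independent of the diagnosis, but it says nothing about the size of the effect, so it is worth reporting the group rates alongside the test.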

    Recommendations

    The hospital may focus its follow-up efforts on monitoring:

    • patients aged 60 to 70, 70 to 80 and 80 to 90 years old;
    • patients with a diabetes diagnosis;
    • patients with digestive and circulatory problems;
    • patients on diabetes medication.

    Exploratory data analysis

    The variables from medical_specialty through readmitted were converted to factors. medical_specialty, diag_1, diag_2 and diag_3 contained missing values recorded as "Missing". medical_specialty had about 50% missing values, and the other three variables had less than 1% each.

    I decided to remove medical_specialty from the data frame and to drop the rows with missing values in the other three variables. After cleaning, the dataset contained 24,779 observations and 16 variables (one variable and 221 observations were removed).
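
The cleaning steps described above could look roughly like the following sketch. It is an assumption-laden illustration rather than the exact code used for the report: `toy` is an invented stand-in with only a few of the columns, and `missing_share` and `cleaned` are hypothetical names.

```r
library(dplyr)

# Toy data mimicking the layout described above; all values are invented
toy <- data.frame(
  medical_specialty = c("Missing", "Cardiology", "Missing", "Missing"),
  diag_1     = c("Circulatory", "Missing", "Diabetes", "Other"),
  diag_2     = c("Other", "Other", "Respiratory", "Circulatory"),
  diag_3     = c("Other", "Diabetes", "Other", "Missing"),
  readmitted = c("yes", "no", "no", "yes")
)

# Share of entries recorded as "Missing" in each column
missing_share <- sapply(toy, function(x) mean(x == "Missing"))

cleaned <- toy %>%
  # Drop the column that is mostly missing
  select(-medical_specialty) %>%
  # Keep only rows where none of the diagnosis columns is "Missing"
  filter(if_all(c(diag_1, diag_2, diag_3), ~ .x != "Missing")) %>%
  # Convert the remaining text columns to factors
  mutate(across(where(is.character), as.factor))

print(missing_share)
print(dim(cleaned))
```

Dropping rows is reasonable here because less than 1% of the diagnosis values are missing; for a column like medical_specialty, removing rows would discard about half the data, which is why the whole column is dropped instead.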

    glimpse(df)