Title: MACHINE LEARNING TO PREDICT HIGH-COST HOSPITALIZATIONS
Institution: PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO - PUC-RIO
Author(s): ADRIAN MANRESA PEREZ
Contributor(s): FERNANDA ARAUJO BAIAO AMORIM - Advisor
SILVIO HAMACHER - Co-advisor
Cataloging date: 25/08/2020 11:10:20
Type: THESIS
Language(s): ENGLISH - UNITED STATES
Reference [pt]: https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/DEI/serieConsulta.php?strSecao=resultado&nrSeq=49137@1
Reference [en]: https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/DEI/serieConsulta.php?strSecao=resultado&nrSeq=49137@2
DOI reference: https://doi.org/10.17771/PUCRio.acad.49137

Abstract:
Healthcare providers are evolving their management models, developing proactive programs that draw on available historical information to improve the quality and efficiency of their health services. Proactive strategies seek not only to prevent and detect diseases but also to enhance hospitalization outcomes. In this context, one of the most challenging tasks is identifying which patients should be included in proactive health programs. Forecasting and modeling cost-related variables are among the most widely used approaches for identifying such patients, since these variables are potential indicators of patients' hospitalization risk, severity, and consumption of medical resources. Most existing research in this area models cost variables from an overall perspective and predicts cost variations for specific periods. In contrast, this work focuses on predicting the costs of a particular event. Specifically, this thesis prescribes a solution for identifying high-cost hospitalizations, to support health service managers in their proactive actions. To this end, the Design Science Research (DSR) methodology was combined with the Data Science life cycle in a real scenario at a health consulting company. The data provided describe patients' hospitalizations through their demographic characteristics and medical resource consumption. Several statistical and Machine Learning techniques were used to predict high-cost hospitalizations: Ridge Regression (RR), Least Absolute Shrinkage and Selection Operator (LASSO), Classification and Regression Trees (CART), Random Forest (RF), and Extreme Gradient Boosting (XGB). The experimental results showed that RF and XGB performed best, reaching an Area Under the Precision-Recall Curve (AUCPR) of 0.732 and 0.644, respectively.
In the case of RF, the model detected, on average, 72 percent of the high-cost hospitalizations with a Precision of 33 percent; the detected hospitalizations account for 78.7 percent of the total cost generated by high-cost hospitalizations. Moreover, the results showed that using prior cost and aggregated resource-consumption variables increased the model's ability to predict high-cost hospitalizations.
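
The abstract reports three kinds of figures: Precision, the share of high-cost hospitalizations detected (recall), the area under the precision-recall curve, and the share of high-cost spending captured by the detected cases. The following is a minimal sketch of how such quantities can be computed, not the thesis's own code. The toy labels, scores, costs, the 0.5 threshold, and all function names are illustrative assumptions, and the area is approximated by step-wise average precision, which may differ from the estimator used in the thesis.

```python
from typing import List, Tuple


def precision_recall(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Precision and recall for the positive (high-cost) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(y_true: List[int], scores: List[float]) -> float:
    """Step-wise approximation of the area under the precision-recall curve.

    Tied scores are not treated specially in this sketch.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    # Rank hospitalizations from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            area += tp / rank  # precision at the rank where recall increases
    return area / n_pos


def captured_cost_share(y_true: List[int], y_pred: List[int], costs: List[float]) -> float:
    """Share of total high-cost spending that falls in detected hospitalizations."""
    total = sum(c for t, c in zip(y_true, costs) if t == 1)
    caught = sum(c for t, p, c in zip(y_true, y_pred, costs) if t == 1 and p == 1)
    return caught / total if total else 0.0


# Hypothetical toy data: 1 marks a high-cost hospitalization.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]
costs = [50000.0, 3000.0, 20000.0, 1000.0, 2000.0, 10000.0, 500.0, 800.0]

# Assumed decision threshold of 0.5 on the predicted score.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall = precision_recall(y_true, y_pred)
aucpr = average_precision(y_true, scores)
cost_share = captured_cost_share(y_true, y_pred, costs)
print(f"precision={precision:.3f} recall={recall:.3f} "
      f"aucpr={aucpr:.3f} cost_share={cost_share:.3f}")
```

In this toy example the captured cost share is higher than the recall because the missed high-cost case is the cheapest of the three, the same kind of relationship as the abstract's 72 percent of cases accounting for 78.7 percent of cost.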