Comparison of Predictive Models for Transferring Stroke In-Patients to Intensive Care Unit
DOI:
https://doi.org/10.14738/tmlai.43.2051
Keywords:
Predictive Model, Artificial Neural Network, Support Vector Machine, Decision Tree, Generalized Boosted Model, Random Forest
Abstract
Intensive Care Unit (ICU) resources are extremely costly, and that cost strains the healthcare budgets that fund quality care for patients. A predictive model that supports the decision to transfer stroke in-patients to the ICU is therefore important for using these resources effectively. Such a model can also help lower morbidity and mortality rates through earlier detection and intervention. In this research, a Decision Tree (DT) model, an Artificial Neural Network (ANN) model, a Support Vector Machine (SVM) model, and a Logistic Regression (LR) model are first evaluated for predicting whether a stroke in-patient needs to be transferred to the ICU. The study is conducted on a clinical dataset of 1,415 observations, using six vital-sign variables. The original dataset was imbalanced, so the initial results were misleading. To overcome this, the Synthetic Minority Oversampling Technique (SMOTE) was applied to the dataset, and the DT, ANN, SVM, and LR models were evaluated again on the balanced dataset. Tree-based ensemble approaches, namely Generalized Boosted Model, Adaptive Boosting (AdaBoost.M1), Random Forest, and Bagged AdaBoost (Adabag), were then used to improve the accuracy and performance of the models. These methods were trained and tested on the balanced stroke in-patient dataset. The boosting model AdaBoost.M1 and the bagging model Random Forest achieved better accuracy than the other models. These two models could therefore be used to support healthcare professionals in decision-making.
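The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random position on the segment between a minority point and one of its k nearest minority neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy data are assumptions made for illustration only, and the sketch uses only the Python standard library.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority-class neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Invented toy minority-class rows (two features each), not clinical data.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
new_rows = smote(minority, n_synthetic=4, k=2)
print(len(new_rows))
```

Because every synthetic row is an interpolation between two existing minority rows, it always lies within the range of the observed minority values. In practice the synthetic rows would be appended to the training set only, so that the test set still reflects the original class distribution.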