An improved method of early diagnosis of smoking-induced respiratory changes using machine learning algorithms |
| |
Authors: | Jorge LM Amaral, Agnaldo J Lopes, José M Jansen, Alvaro CD Faria, Pedro L Melo |
| |
Affiliation: | 1. Department of Electronics and Telecommunications Engineering, State University of Rio de Janeiro, Rio de Janeiro, Brazil; 2. Pulmonary Function Laboratory, Pedro Ernesto University Hospital, State University of Rio de Janeiro, Rio de Janeiro, Brazil; 3. Biomedical Instrumentation Laboratory, Institute of Biology Roberto Alcantara Gomes and Laboratory of Clinical and Experimental Research in Vascular Biology (BioVasc), State University of Rio de Janeiro, Rio de Janeiro, Brazil |
| |
Abstract: | The purpose of this study was to develop an automatic classifier to increase the accuracy of the forced oscillation technique (FOT) for diagnosing early respiratory abnormalities in smoking patients. The data consisted of FOT parameters obtained from 56 volunteers: 28 healthy subjects and 28 smokers with low tobacco consumption. Several supervised learning techniques were investigated, including logistic linear classifiers, k-nearest neighbor (KNN), neural networks and support vector machines (SVM). To evaluate performance, the ROC curve of the most accurate FOT parameter was established as the baseline. To determine the best input features and classifier parameters, we used genetic algorithms and 10-fold cross-validation, with the average area under the ROC curve (AUC) as the selection criterion. In the first experiment, the original FOT parameters were used as input. We observed a significant improvement in accuracy (KNN = 0.89 and SVM = 0.87) compared with the baseline (0.77). In the second experiment, we performed feature selection on the original FOT parameters. This selection did not yield any significant improvement in accuracy, but it was useful in identifying more adequate FOT parameters. In the third experiment, we performed feature selection on the cross products of the FOT parameters. This selection resulted in a further increase in AUC (KNN = SVM = 0.91), which allows for high diagnostic accuracy. In conclusion, machine learning classifiers can help identify early smoking-induced respiratory alterations. The use of FOT cross products and the search for the best features and classifier parameters can markedly improve the performance of machine learning classifiers. |
| |
Keywords: | Clinical decision support; Early diagnosis; Artificial intelligence; Forced oscillation technique; Smoking; Chronic obstructive pulmonary disease |
This article is indexed in ScienceDirect and other databases. |
|
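The evaluation scheme outlined in the abstract (pairwise cross products of the input parameters, a KNN classifier, and the average AUC over 10-fold cross-validation) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the data are synthetic random numbers standing in for FOT parameters, the function names (`cross_products`, `auc`, `knn_score`, `cv_auc`), the choice of k = 3, and the stratified fold assignment are all assumptions, and the genetic-algorithm search over features and classifier parameters is omitted.

```python
import random
from itertools import combinations


def cross_products(row):
    """Return the original features followed by all pairwise products."""
    return list(row) + [a * b for a, b in combinations(row, 2)]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def knn_score(train_X, train_y, x, k=3):
    """Fraction of the k nearest training points (Euclidean) with label 1."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_X, train_y)
    )
    return sum(y for _, y in dists[:k]) / k


def cv_auc(X, y, folds=10, k=3, seed=0):
    """Average AUC over folds; each class is spread evenly across folds."""
    rng = random.Random(seed)
    fold_of = {}
    for label in (0, 1):
        idx = [i for i, lab in enumerate(y) if lab == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % folds
    aucs = []
    for f in range(folds):
        train = [i for i in range(len(y)) if fold_of[i] != f]
        test = [i for i in range(len(y)) if fold_of[i] == f]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        scores = [knn_score(train_X, train_y, X[j], k) for j in test]
        aucs.append(auc(scores, [y[j] for j in test]))
    return sum(aucs) / folds


# Synthetic stand-in data (NOT real FOT measurements): 28 + 28 subjects,
# three hypothetical parameters each, with shifted means between groups.
rng = random.Random(42)
healthy = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(28)]
smokers = [[rng.gauss(1.0, 1.0) for _ in range(3)] for _ in range(28)]
X = healthy + smokers
y = [0] * 28 + [1] * 28

auc_original = cv_auc(X, y)
auc_expanded = cv_auc([cross_products(row) for row in X], y)
print("mean AUC, original features:", round(auc_original, 3))
print("mean AUC, with cross products:", round(auc_expanded, 3))
```

On real measurements the features would normally be rescaled before computing distances, and a search procedure (a genetic algorithm in the study) would choose which original parameters and cross products to keep; the sketch simply uses all of them.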