Publications:Random forests based monitoring of human larynx using questionnaire data

Title Random forests based monitoring of human larynx using questionnaire data
Author Marija Bacauskiene and Antanas Verikas and Adas Gelzinis and Aurelija Vegiene
Year 2012
PublicationType Journal Paper
Journal Expert Systems with Applications
DOI http://dx.doi.org/10.1016/j.eswa.2011.11.070
Diva url http://hh.diva-portal.org/smash/record.jsf?searchId=1&pid=diva2:461859
Abstract This paper is concerned with noninvasive monitoring of the human larynx based on soft computing techniques applied to subjects’ questionnaire data. Using random forests (RF), questionnaire data are categorized into a healthy class and several classes of disorders: cancerous, noncancerous, diffuse, nodular, paralysis, and an overall pathological class. The most important questionnaire statements are determined using RF variable importance evaluations. To explore the data represented by the variables used by RF, t-distributed stochastic neighbor embedding (t-SNE) and multidimensional scaling (MDS) are applied to the RF data proximity matrix. When the developed tools were tested on data collected from 109 subjects, a classification accuracy of 100% was obtained on unseen data in binary classification into the healthy and pathological classes. An accuracy of 80.7% was achieved when classifying the data into the healthy, cancerous, and noncancerous classes. The t-SNE and MDS mappings yield two-dimensional maps of the data and facilitate data exploration aimed at identifying subjects belonging to a “risk group”. The developed tools are expected to be of great help in preventive health care in laryngology.
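
The RF data proximity matrix mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; it assumes the common definition in which the proximity of two subjects is the fraction of trees where both end up in the same leaf. The function names, the toy leaf indices, and the choice of 1 - proximity as the dissimilarity fed to a mapping such as MDS or t-SNE are all assumptions made for illustration. In practice the leaf indices might come from a fitted forest (for example, scikit-learn's `RandomForestClassifier.apply`); here they are hard-coded.

```python
from itertools import combinations


def rf_proximity(leaf_ids):
    """Return the proximity matrix for a list of per-sample leaf indices.

    leaf_ids[i][t] is the index of the leaf that sample i reaches in tree t.
    Proximity of two samples = fraction of trees in which they share a leaf
    (an assumed, commonly used definition).
    """
    n = len(leaf_ids)
    n_trees = len(leaf_ids[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        prox[i][i] = 1.0  # a sample always shares a leaf with itself
    for i, j in combinations(range(n), 2):
        shared = sum(1 for a, b in zip(leaf_ids[i], leaf_ids[j]) if a == b)
        prox[i][j] = prox[j][i] = shared / n_trees
    return prox


def proximity_to_dissimilarity(prox):
    """Convert proximities to dissimilarities as 1 - proximity (assumption)."""
    return [[1.0 - p for p in row] for row in prox]


# Hypothetical leaf indices: 3 subjects, 4 trees.
toy_leaves = [
    [0, 2, 1, 3],
    [0, 2, 1, 1],
    [4, 5, 1, 0],
]
toy_prox = rf_proximity(toy_leaves)
toy_dissim = proximity_to_dissimilarity(toy_prox)
```

A dissimilarity matrix of this kind is what a two-dimensional mapping would take as input; subjects that the forest treats alike get small dissimilarities and would be placed close together on the map.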