Publications:Learning Low-Dimensional Representation of Bivariate Histogram Data
| Field | Value |
|---|---|
| Title | Learning Low-Dimensional Representation of Bivariate Histogram Data |
| Author | Evaldas Vaiciukynas and Matej Uličný and Sepideh Pashami and Sławomir Nowaczyk |
| Year | 2018 |
| PublicationType | Journal Paper |
| Journal | IEEE Transactions on Intelligent Transportation Systems (Print) |
| HostPublication | |
| Conference | |
| DOI | http://dx.doi.org/10.1109/TITS.2018.2865103 |
| DiVA URL | http://hh.diva-portal.org/smash/record.jsf?searchId=1&pid=diva2:1260571 |
| Abstract | With an increasing amount of data in intelligent transportation systems, methods are needed to automatically extract general representations that accurately predict not only known tasks but also similar tasks that may emerge in the future. Creation of low-dimensional representations can be unsupervised, or it can exploit various labels in multi-task learning (when the goal tasks are known) or transfer learning (when they are not) settings. Finding a general, low-dimensional representation suitable for multiple tasks is an important step toward knowledge discovery in aware intelligent transportation systems. This paper evaluates several approaches for mapping high-dimensional sensor data from Volvo trucks into a low-dimensional representation that is useful for prediction. The original data are bivariate histograms of two types: turbocharger and engine. Low-dimensional representations were evaluated in a supervised fashion by mean equal error rate (EER), using a random forest classifier on a set of 27 1-vs-Rest detection tasks. Results from unsupervised learning experiments indicate that using an autoencoder to create an intermediate representation, followed by t-distributed stochastic neighbor embedding (t-SNE), is the most effective way to create a low-dimensional representation of the original bivariate histogram. Individually, t-SNE offered the best results for 2-D or 3-D representations, and a classical autoencoder for 6-D or 10-D ones. Using multi-task learning to combine unsupervised and supervised objectives on all 27 available tasks resulted in 10-D representations with a significantly lower EER than the original 400-D data. In the transfer learning setting, with the most diverse tasks used for representation learning, 10-D representations achieved an EER comparable to that of the original representation. |
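
The abstract outlines a concrete pipeline: compress the 400-D histogram vectors with an autoencoder, embed the autoencoder codes with t-SNE, and score the resulting representation by the mean EER of random-forest 1-vs-Rest detectors. Below is a minimal sketch of that pipeline, not the authors' code: the data are synthetic stand-ins for the truck histograms, and the layer sizes, epoch count, and other hyperparameters are illustrative assumptions. EER is also computed in-sample here for brevity, whereas a proper evaluation would use held-out data.

```python
# Minimal sketch of: autoencoder -> t-SNE -> random-forest 1-vs-Rest EER.
# Synthetic data; all sizes and hyperparameters are illustrative guesses.
import numpy as np
import torch
import torch.nn as nn
from sklearn.manifold import TSNE
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
X = rng.random((500, 400)).astype(np.float32)   # stand-in for 20x20 histograms
y = rng.integers(0, 27, size=500)               # stand-in for 27 task labels

# --- Autoencoder: 400 -> 10 -> 400 (bottleneck size is an assumption) ---
enc = nn.Sequential(nn.Linear(400, 64), nn.ReLU(), nn.Linear(64, 10))
dec = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 400))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
xt = torch.from_numpy(X)
for _ in range(200):                            # illustrative epoch count
    opt.zero_grad()
    loss = nn.functional.mse_loss(dec(enc(xt)), xt)
    loss.backward()
    opt.step()

codes = enc(xt).detach().numpy()                # intermediate 10-D representation

# --- t-SNE on the autoencoder codes gives the final low-D embedding ---
emb = TSNE(n_components=2, random_state=0).fit_transform(codes)

# --- Supervised scoring: mean EER over 1-vs-Rest detection tasks ---
def eer(y_true, scores):
    fpr, tpr, _ = roc_curve(y_true, scores)
    fnr = 1 - tpr
    return fpr[np.argmin(np.abs(fpr - fnr))]    # point where FPR ~= FNR

eers = []
for task in range(27):
    target = (y == task).astype(int)            # 1-vs-Rest binary labels
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(emb, target)
    eers.append(eer(target, clf.predict_proba(emb)[:, 1]))
print("mean EER:", np.mean(eers))
```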