Publications:Learning Low-Dimensional Representation of Bivariate Histogram Data

From ISLAB/CAISR
Revision as of 18:17, 24 July 2019 by Slawek


Do not edit this section

Keep all hand-made modifications below

Title Learning Low-Dimensional Representation of Bivariate Histogram Data
Author Evaldas Vaiciukynas and Matej Uličný and Sepideh Pashami and Sławomir Nowaczyk
Year 2018
PublicationType Journal Paper
Journal IEEE Transactions on Intelligent Transportation Systems (Print)
HostPublication
Conference
DOI http://dx.doi.org/10.1109/TITS.2018.2865103
Diva url http://hh.diva-portal.org/smash/record.jsf?searchId=1&pid=diva2:1260571
Abstract With an increasing amount of data in intelligent transportation systems, methods are needed to automatically extract general representations that accurately predict not only known tasks but also similar tasks that can emerge in the future. Creation of low-dimensional representations can be unsupervised or can exploit various labels in multi-task learning (when goal tasks are known) or transfer learning (when they are not) settings. Finding a general, low-dimensional representation suitable for multiple tasks is an important step toward knowledge discovery in aware intelligent transportation systems. This paper evaluates several approaches for mapping high-dimensional sensor data from Volvo trucks into a low-dimensional representation that is useful for prediction. The original data are bivariate histograms, of which two types, turbocharger and engine, are considered. Low-dimensional representations were evaluated in a supervised fashion by mean equal error rate (EER) using a random forest classifier on a set of 27 1-vs-Rest detection tasks. Results from unsupervised learning experiments indicate that using an autoencoder to create an intermediate representation, followed by t-distributed stochastic neighbor embedding (t-SNE), is the most effective way to create a low-dimensional representation of the original bivariate histograms. Individually, t-SNE offered the best results for 2-D or 3-D representations, and the classical autoencoder for 6-D or 10-D representations. Using multi-task learning, combining unsupervised and supervised objectives on all 27 available tasks, resulted in 10-D representations with a significantly lower EER compared to the original 400-D data. In the transfer learning setting, with the most diverse tasks used for representation learning, 10-D representations achieved an EER comparable to that of the original representation.
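A minimal sketch of the pipeline the abstract describes, not the authors' code: flattened 400-D histograms are compressed by an autoencoder-style network, optionally embedded further with t-SNE, and a 1-vs-Rest random forest detector is scored by EER. The data here are synthetic stand-ins (the Volvo truck histograms are not public), the 10-D code size follows the abstract, and the use of scikit-learn's MLPRegressor as a crude single-hidden-layer autoencoder is an assumption for illustration.

```python
# Illustrative sketch only: synthetic stand-ins for 20x20 bivariate histograms.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.manifold import TSNE
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
n = 300
X = rng.random((n, 400))                      # flattened 20x20 histograms (synthetic)
X = X / X.sum(axis=1, keepdims=True)          # normalise each histogram to sum to 1
y = (rng.random(n) < 0.5).astype(int)         # one hypothetical 1-vs-Rest label

# Crude autoencoder: an MLP trained to reconstruct its own input;
# the 10-unit hidden layer is the learned low-dimensional code.
ae = MLPRegressor(hidden_layer_sizes=(10,), activation="tanh",
                  max_iter=500, random_state=0).fit(X, X)
code = np.tanh(X @ ae.coefs_[0] + ae.intercepts_[0])   # 10-D representation

# Optional second stage: t-SNE on the autoencoder code (2-D embedding).
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(code)

# Supervised evaluation: random forest detector, EER read off the ROC curve
# at the operating point where false-accept and false-reject rates meet.
Xtr, Xte, ytr, yte = train_test_split(code, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(Xtr, ytr)
scores = clf.predict_proba(Xte)[:, 1]
fpr, tpr, _ = roc_curve(yte, scores)
i = np.nanargmin(np.abs(fpr - (1 - tpr)))
eer = (fpr[i] + (1 - tpr[i])) / 2
print(f"EER: {eer:.3f}")
```

In the paper's setting this evaluation would be repeated over all 27 detection tasks and the mean EER reported; with random labels as above, the EER is expectedly near chance.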