Data Mining In a Warehouse Inventory

Title Data Mining In a Warehouse Inventory
Summary A study of feature selection and distance measures for clustering a large number of categories (>1000) and for novelty detection in a warehouse environment.
Keywords object recognition, signal processing, feature selection, unsupervised clustering, large-scale many-class classification, data mining.
TimeFrame October 2017 to June 2018, with possible extension to September 2018
References Zeynep Akata, Florent Perronnin, Zaid Harchaoui, Cordelia Schmid. Good Practice in Large-Scale Learning for Image Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 2014, 36 (3), pp.507-520. <10.1109/TPAMI.2013.146>. <hal-00835810>

Florent Perronnin, Zeynep Akata, Zaid Harchaoui, Cordelia Schmid. Towards Good Practice in Large-Scale Learning for Image Classification. CVPR 2012 - IEEE Computer Vision and Pattern Recognition, Jun 2012, Providence (RI), United States. IEEE, pp.3482-3489, 2012. <10.1109/CVPR.2012.6248090>. <hal-00690014>

Raphael Puget, Nicolas Baskiotis, Patrick Gallinari. Sequential Dynamic Classification for Large Scale Multi-class Problems. Extreme Classification Workshop at ICML, Jul 2015, Lille, France. 2015. <hal-01207428>

Prerequisites Programming skills, Machine Learning, Computer Vision, Data Mining.
Supervisor Björn Åstrand
Level Master
Status Open

Object recognition in problems entailing many classes is a challenging task. One example of such a problem is the inventory list of a warehouse. The inventory of a typical warehouse often contains up to 10K different classes of objects. In this project we intend to develop an inventory list maintenance method that is able to learn the number of object classes and train a classifier from the data. Towards this objective, we employ background knowledge (e.g. from the Warehouse Management System, WMS) to constrain the complexity of the problem.
The goal is to develop an incremental clustering algorithm that learns new classes of objects through novelty detection. The background knowledge (e.g. from the WMS), which is an important source of information for constraining the problem, should be exploited towards a more robust system design.
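As a rough illustration of this goal, the following Python sketch shows one simple way incremental clustering with novelty detection could work. It is not the project's method: the class name `IncrementalClusterer`, the `novelty_threshold` parameter, the Euclidean distance, and the optional `candidates` argument (standing in for a list of items the WMS expects) are all assumptions made for this example.

```python
import math


class IncrementalClusterer:
    """Nearest-centroid clustering that opens a new cluster for novel samples.

    Hypothetical sketch: the threshold rule and Euclidean distance are
    illustrative choices, not a method prescribed by the project.
    """

    def __init__(self, novelty_threshold):
        self.novelty_threshold = novelty_threshold  # assumed tuning parameter
        self.centroids = []  # running-mean feature vector per cluster
        self.counts = []     # number of samples absorbed by each cluster

    def update(self, x, candidates=None):
        """Assign feature vector x to a cluster; return (cluster_id, is_novel).

        candidates: optional cluster ids permitted by background knowledge
        (e.g. items the WMS expects at this location). Unknown ids are ignored.
        """
        if candidates is None:
            ids = range(len(self.centroids))
        else:
            ids = [c for c in candidates if 0 <= c < len(self.centroids)]

        # Find the nearest permitted centroid.
        best_id, best_dist = None, math.inf
        for i in ids:
            d = math.dist(x, self.centroids[i])
            if d < best_dist:
                best_id, best_dist = i, d

        # Novelty: nothing permitted, or the nearest centroid is too far away.
        if best_id is None or best_dist > self.novelty_threshold:
            self.centroids.append(list(x))
            self.counts.append(1)
            return len(self.centroids) - 1, True

        # Otherwise absorb x into the cluster with a running-mean update.
        self.counts[best_id] += 1
        n = self.counts[best_id]
        c = self.centroids[best_id]
        self.centroids[best_id] = [ci + (xi - ci) / n for ci, xi in zip(c, x)]
        return best_id, False


# Usage on a toy 2-D feature stream (made-up values).
model = IncrementalClusterer(novelty_threshold=1.0)
stream = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (0.0, 0.2)]
labels = [model.update(x)[0] for x in stream]
print(labels)
```

Restricting the search to `candidates` is one possible way to let background knowledge shrink the set of classes considered for each observation. A fixed distance threshold is the simplest novelty criterion, and a real system would likely need a per-class or learned one.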
Research Questions
What are the optimal feature space and clustering technique for object identification in large-scale, many-class problems? How can background knowledge be used as a clustering cue? How can novelty detection be employed to learn new classes incrementally?
A dataset from a real-world warehouse.