Zero-Shot Learning for Semantic Segmentation

From ISLAB/CAISR
Revision as of 19:07, 8 October 2020 by Tiago (Talk | contribs)

Title Zero-Shot Learning for Semantic Segmentation
Summary Zero-Shot Learning for Semantic Segmentation
Keywords Deep Learning, Computer Vision, Semantic Segmentation
TimeFrame
References https://arxiv.org/pdf/1707.00600.pdf
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41473.pdf
https://arxiv.org/pdf/1906.00817.pdf
https://openaccess.thecvf.com/content_ICCVW_2019/papers/MDALC/Kato_Zero-Shot_Semantic_Segmentation_via_Variational_Mapping_ICCVW_2019_paper.pdf
https://github.com/daooshee/Few-Shot-Learning

Prerequisites Excellent Programming Skills; Excellent Grasp of Deep Learning and PyTorch
Author Tiago Cortinhal
Supervisor Tiago Cortinhal, Eren Erdal Aksoy
Level Master
Status Internal Draft


We are currently working on a GAN approach that generates segmentation maps from one sensor modality to another (from/to LiDAR and RGB).

The currently available datasets (SemanticKITTI and Cityscapes) are not perfectly aligned. Techniques such as Zero-Shot Learning could be applied to try to overcome this problem.


Zero-Shot Learning can be seen as a type of Domain Adaptation: one set of classes is used for training, and there is another set of unseen classes that we also wish to segment.

To do this, the classes need to be embedded in some way, so that the pre-trained model can be adapted to unseen classes by combining the embeddings of the learnt classes.
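The idea of combining the embeddings of the learnt classes can be sketched as follows. This is only an illustration under stated assumptions, not the method used in the project: the 3-dimensional embeddings, the class names, and the helper functions are all hypothetical, and the combination rule (a probability-weighted average of seen-class embeddings followed by a cosine nearest-neighbour lookup) is one possible choice among several. In practice the embeddings would more likely come from word vectors or a learnt mapping, and the probabilities from the per-pixel output of the pre-trained model.

```python
import math

# Hypothetical toy embeddings; real ones would be e.g. word vectors.
CLASS_EMBEDDINGS = {
    "car":    [1.0, 0.0, 0.0],
    "road":   [0.0, 1.0, 0.0],
    "person": [0.0, 0.0, 1.0],
    "truck":  [0.9, 0.3, 0.0],  # assumed to be unseen during training
}
SEEN_CLASSES = ["car", "road", "person"]


def cosine(a, b):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combine_seen_embeddings(seen_probs):
    """Probability-weighted average of the seen-class embeddings."""
    dim = len(CLASS_EMBEDDINGS[SEEN_CLASSES[0]])
    combined = [0.0] * dim
    for name, prob in zip(SEEN_CLASSES, seen_probs):
        for i, value in enumerate(CLASS_EMBEDDINGS[name]):
            combined[i] += prob * value
    return combined


def predict_label(seen_probs, candidates):
    """Pick the candidate class whose embedding is closest to the combined point."""
    point = combine_seen_embeddings(seen_probs)
    return max(candidates, key=lambda c: cosine(point, CLASS_EMBEDDINGS[c]))


# One pixel whose seen-class probabilities are split between "car" and "road".
label = predict_label([0.7, 0.3, 0.0], list(CLASS_EMBEDDINGS))
```

Here a pixel that the seen-class model finds mostly car-like but partly road-like lands closest to the "truck" embedding, even though no truck example was seen during training. A pixel predicted as "car" with full confidence still maps to "car".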

The project consists of taking the pre-trained GAN generator model and applying Zero-Shot Learning to it.


 Research Questions:
   Besides learning unseen classes, can Zero-Shot Learning also improve the overall IoU (Intersection over Union)?
   Should Zero-Shot or N-Shot learning be applied in this scenario (N-Shot: N examples are shown during training)?
   Compared with domain transfer/fine-tuning techniques, which approach could provide faster/better results?
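Since the first research question is stated in terms of IoU, a minimal sketch of how per-class and mean IoU could be computed is given below. It assumes that predictions and ground truth are flattened lists of integer class ids, one per pixel (or per LiDAR point). The function names are illustrative, and skipping classes that appear in neither map is one common convention rather than the only one.

```python
def per_class_iou(pred, target, num_classes):
    """Return {class_id: IoU} for every class present in pred or target."""
    ious = {}
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # skip classes that appear in neither map
            ious[c] = intersection / union
    return ious


def mean_iou(pred, target, num_classes):
    """Average the per-class IoU over the classes that are present."""
    ious = per_class_iou(pred, target, num_classes)
    return sum(ious.values()) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
scores = per_class_iou(pred, target, num_classes=3)
overall = mean_iou(pred, target, num_classes=3)
```

Reporting the mean separately over seen and unseen class ids (by filtering the dictionary returned by `per_class_iou`) would make it possible to tell whether a gain in overall IoU comes from the newly covered classes or from the classes used in training.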