Interactive Anomaly Detection

From ISLAB/CAISR
Title Interactive Anomaly Detection
Summary Anomalies can be relevant or irrelevant to the end user. The goal of this thesis is to propose a new interactive anomaly detection method that leverages user feedback to learn to suggest more relevant anomalies.
Keywords Interactive Anomaly Detection, Deviation Detection, Streaming Data, Data Mining, Machine Learning
TimeFrame 4 November 2019 to 29 May 2020
References - Lamba, H. and Akoglu, L., 2019, May. Learning On-the-Job to Re-rank Anomalies from Top-1 Feedback. In Proceedings of the 2019 SIAM International Conference on Data Mining (SDM), pp. 612-620. Society for Industrial and Applied Mathematics. https://epubs.siam.org/doi/pdf/10.1137/1.9781611975673.69

- Ding, K., Li, J. and Liu, H., 2019, January. Interactive anomaly detection on attributed networks. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 357-365. ACM. http://www.public.asu.edu/~jundongl/paper/WSDM19_GraphUCB.pdf

- Arnaldo, I., Veeramachaneni, K. and Lam, M., 2019. ex2: a framework for interactive anomaly detection. In ACM IUI Workshop on Exploratory Search and Interactive Data Analytics (ESIDA). http://ceur-ws.org/Vol-2327/IUI19WS-ESIDA-2.pdf

- Zhu, Y. and Yang, K., 2019. Tripartite Active Learning for Interactive Anomaly Discovery. IEEE Access. https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8707963

Prerequisites Requires a very good understanding of ML and data mining techniques (especially for anomaly detection). Good programming skills (preferably in Python) are also required.
Author
Supervisor Mohamed-Rafik Bouguelia, Onur Dikmen
Level Master
Status Ongoing


NOTE: this thesis *requires* strong prior knowledge of ML and data mining techniques (especially for anomaly detection). Good programming skills (e.g. in Python) are also required.

Contacts:

  • mohamed-rafik.bouguelia@hh.se
  • onur.dikmen@hh.se

Description:

Anomaly detection finds patterns that deviate significantly from the majority of reference data and may indicate, for example, a system fault. Conventional anomaly detection methods focus on statistical features of the data; they are unsupervised (because labeling ground-truth anomalies is expensive) and do not interact with a human expert. However, from the user's perspective, a detected anomaly can be either relevant (i.e. an actual anomaly) or irrelevant (e.g. an atypical but normal system behavior). In interactive anomaly detection, the goal is to maximize the number of true (relevant) anomalies presented to the human expert (user). This is done by allowing the algorithm to proactively communicate with the user and leverage user feedback to refine its results and learn to suggest more relevant anomalies. Several challenges come with this. For example, one wants to learn to distinguish relevant from irrelevant anomalies without presenting many irrelevant anomalies to the user. This raises the question of how to handle the exploration-exploitation dilemma when querying anomalies of different kinds.
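
To make the exploration-exploitation trade-off concrete, here is a minimal sketch of one possible query loop. It is not the method of any of the papers listed on this page, and it is not the method the thesis is expected to produce. It assumes that candidate anomalies have already been scored by some unsupervised detector and grouped into "kinds", and it uses a UCB1-style rule to decide which kind to show to the user next. All names (ucb1_query_loop, ask_user, toy_candidates, simulated_expert) and all data values are hypothetical and chosen only for illustration.

```python
import math


def ucb1_query_loop(candidates, ask_user, budget, c=1.0):
    """Show up to `budget` candidates to the user, picking the kind with UCB1.

    candidates: dict mapping kind -> list of (score, item) pairs
    ask_user:   callable(item) -> bool, stands in for the human expert
    Returns a list of (item, is_relevant) pairs in presentation order.
    """
    # Within each kind, present the highest unsupervised score first.
    queues = {k: sorted(v, key=lambda p: p[0], reverse=True)
              for k, v in candidates.items()}
    pulls = {k: 0 for k in queues}  # queries spent on each kind
    hits = {k: 0 for k in queues}   # "relevant" answers per kind
    shown = []

    for t in range(1, budget + 1):
        available = [k for k in queues if queues[k]]
        if not available:
            break

        def ucb(k):
            if pulls[k] == 0:
                return float("inf")  # explore: try every kind once
            mean = hits[k] / pulls[k]  # exploit: observed relevance rate
            return mean + c * math.sqrt(2.0 * math.log(t) / pulls[k])

        kind = max(available, key=ucb)
        _, item = queues[kind].pop(0)
        relevant = bool(ask_user(item))
        pulls[kind] += 1
        hits[kind] += int(relevant)
        shown.append((item, relevant))
    return shown


# Toy data invented for illustration: the "rare_but_normal" kind gets higher
# unsupervised scores than the actual faults, which is the situation
# described above (atypical but normal behavior).
toy_candidates = {
    "sensor_fault": [(0.50 + 0.01 * i, ("sensor_fault", i)) for i in range(20)],
    "rare_but_normal": [(0.80 + 0.01 * i, ("rare_but_normal", i)) for i in range(20)],
}


def simulated_expert(item):
    # Hypothetical stand-in for user feedback: only sensor faults are relevant.
    return item[0] == "sensor_fault"


shown = ucb1_query_loop(toy_candidates, simulated_expert, budget=10)
interactive_hits = sum(relevant for _, relevant in shown)

# Non-interactive baseline: the 10 highest-scoring candidates, no feedback.
all_scored = sorted((p for v in toy_candidates.values() for p in v),
                    key=lambda p: p[0], reverse=True)
baseline_hits = sum(simulated_expert(item) for _, item in all_scored[:10])

print("relevant shown (interactive):", interactive_hits)
print("relevant shown (score only): ", baseline_hits)
```

In this toy setting the score-only ranking keeps presenting the atypical but normal items, while the feedback loop shifts most of its queries to the kind the user marks as relevant and only occasionally re-checks the other kind. A real method would also have to cope with noisy feedback, kinds that are not known in advance, and streaming data, which this sketch ignores.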

The goal of this thesis is to:

  1. Perform a literature review of state-of-the-art methods for interactive anomaly detection.
  2. Identify the advantages and limitations of these existing methods.
  3. Propose a new interactive anomaly detection method which addresses some of the identified limitations.
  4. Perform extensive experiments on artificial and real-world datasets to show the advantage of the proposed method over existing methods for interactive anomaly detection.
