Thesis MYahya – Knowledge Discovery and Intelligent Systems – KDIS

NEW DEEP LEARNING APPROACHES IN ANOMALY DETECTION. APPLICATIONS.

BASIC INFORMATION

Ph.D. Student: Mohammed Yahya
Advisor: Sebastián Ventura
Started on: December 2019
Keywords: Deep learning, Time series, Anomaly detection

THESIS PROPOSAL

In real-world applications, it is often necessary to examine a dataset and find the instances that stand out as different from all others. Such instances are known as anomalies, and identifying them in a data-driven way is the purpose of anomaly detection. An anomaly may be caused by errors in the data, but it is sometimes indicative of a new, previously unknown underlying process. An anomaly can be defined as a deviation of specific observations from the rest of the data that raises the suspicion that those observations were generated by a different mechanism. In the statistics and data mining literature, anomalies are also referred to by other terms, including but not limited to outliers, deviants, and abnormalities. They have many causes, such as system failures, intentional fraud, and malicious actions. Anomalies can reveal important insights and often convey valuable information about the data. Therefore, anomaly detection is considered a primary step in many decision-making systems.
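As a minimal illustration of the "deviation from the other observations" idea, the sketch below flags values whose modified z-score (based on the median absolute deviation) exceeds a cutoff. This is a simple classical baseline, not a method proposed in the thesis; the function name `mad_outliers`, the sample `readings`, and the 3.5 threshold are assumptions chosen for the example.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    Illustrative baseline only: the threshold and the 0.6745 scaling
    constant follow a common convention for the modified z-score.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # All values (or at least half of them) equal the median:
        # the score is undefined, so report no outliers.
        return []
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Hypothetical sensor readings with one value that clearly deviates.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2]
print(mad_outliers(readings))  # [5]
```

A baseline of this kind treats each value independently, so it would not be expected to capture the temporal or structural patterns that motivate the deep learning methods discussed in this proposal.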

Deep learning is a subset of machine learning that achieves good performance and flexibility by learning to represent the data as nested hierarchies of concepts within the layers of a neural network. In the past few years, anomaly detection algorithms based on deep learning have become increasingly common, and they have been applied to a wide variety of tasks. Research has shown that deep learning can outperform traditional methods for detecting anomalies.

The main goal is the development of new anomaly detection methods and their application to different real-world problems. The working hypothesis is that deep learning is an excellent methodology to reach this objective, so our primary interest will be the development of deep learning methods for that purpose. The second objective of this research will be the validation of the proposed models on a series of real-world problems.

More specifically, the following objectives are detailed:

Conventional algorithms perform poorly at detecting outliers in sequence and image datasets (e.g. medical images) because they fail to capture complex structures in the data.
Requirements for large-scale anomaly detection: as data volumes grow to gigabytes, it becomes nearly impossible to scale traditional methods to detect anomalies in data of that size.
A deep anomaly detection mechanism learns hierarchical discriminative features from the data. This automatic feature learning removes the need for domain experts to develop features manually, and therefore supports solving the problem end-to-end from raw input data in fields such as speech and text recognition.
The boundary between normal and abnormal behavior is often not precisely defined in many data domains and is continually evolving. This lack of a well-defined normal boundary poses challenges for both conventional and deep learning-based algorithms.