Thesis GAhmed – Knowledge Discovery and Intelligent Systems – KDIS

GRAPH REPRESENTATION BY USING MACHINE LEARNING.

BASIC INFORMATION

Ph.D. Student: Ghaidaa Ahmed Ali
Advisor: Sebastián Ventura
Started on: February 2021
Keywords: Machine Learning, Graphs, Bipartite Graph Neural Networks

THESIS PROPOSAL

Graphs, which describe pairwise relations between entities, are essential representations for real-world data from many domains, including social science, linguistics, chemistry, biology, and physics. Unlike typical tables or matrices, a graph imposes no inherent order on its elements; its structure comes from the relationships that connect them. These relationships are the basis for every hypothesis and prediction drawn from the graph. Graphs come in many variants, such as directed graphs, heterogeneous graphs, and graphs with edge information. In this thesis we will use bipartite graph inference to predict the students or materials in our dataset.

A convenient way to represent a graph is through an adjacency matrix. To build it, we order the nodes of the graph so that every node indexes a particular row and column of the matrix. Graphs are a unique non-Euclidean data structure for machine learning, and graph analysis focuses on tasks such as node classification, link prediction, and clustering. Graph neural networks (GNNs) are deep learning-based methods that operate on the graph domain. Owing to their convincing performance and high interpretability, GNNs have recently become a widely applied graph analysis method. GNNs can be viewed as a process of representation learning on graphs. For node-focused tasks, GNNs aim to learn good features for each node so that those tasks are easier to solve. For graph-focused tasks, they aim to learn representative features for the entire graph, and learning node features is typically an intermediate step. The process of learning node features usually leverages both the input node features and the graph structure. Standard neural networks such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) cannot handle graph input properly because they stack the node features in a specific order, whereas the nodes of a graph have no natural order.
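The adjacency-matrix representation and the ordering issue described above can be illustrated with a minimal sketch. It is not the method proposed in this thesis: the toy graph, the feature values, and the helper names `adjacency_matrix` and `mean_aggregate` are all hypothetical, and the aggregation is a deliberately simplified stand-in for one GNN message-passing step (a plain average, with no learned weights).

```python
# Hypothetical toy graph: node names, edges and features are illustrative only.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]


def adjacency_matrix(nodes, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph.

    The position of each node in `nodes` fixes its row and column index.
    """
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1
    return matrix


def mean_aggregate(matrix, features):
    """Simplified message-passing step (assumption: no learned weights).

    Each node's new feature vector is the average of its own vector
    and the vectors of its neighbours.
    """
    n = len(matrix)
    dim = len(features[0])
    updated = []
    for i in range(n):
        members = [i] + [j for j in range(n) if matrix[i][j]]
        updated.append(
            [sum(features[j][k] for j in members) / len(members) for k in range(dim)]
        )
    return updated


A = adjacency_matrix(nodes, edges)
X = [[1.0], [2.0], [3.0], [4.0]]  # one scalar feature per node, in `nodes` order
H = mean_aggregate(A, X)

# Reordering the nodes changes the matrix but not the graph it describes.
reordered = ["d", "c", "b", "a"]
A_perm = adjacency_matrix(reordered, edges)
X_perm = [X[nodes.index(name)] for name in reordered]
H_perm = mean_aggregate(A_perm, X_perm)

# Per-node results agree once they are matched by node name.
by_name = dict(zip(nodes, H))
by_name_perm = dict(zip(reordered, H_perm))
```

The two matrices `A` and `A_perm` differ, yet every node receives the same aggregated feature under both orderings. A model that simply stacked the rows in a fixed order would see two different inputs for the same graph, which is the limitation of order-dependent networks noted above.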

The motivations for graph neural networks are the following. Firstly, most of the objects around us are explicitly or implicitly connected with each other; in other words, we live in a world of graphs. Secondly, graph learning (GL) is able to learn complex relations. As one of the most promising machine learning techniques, GL has shown great potential for deriving the knowledge embedded in different kinds of graphs. Specifically, many GL techniques, such as random walks and graph neural networks, have been developed to learn the particular type of relations modelled by graphs, and they have been shown to be quite effective. Lastly, a graph can be formalized in various ways depending on the data type, for example as homogeneous sequences or as heterogeneous networks.

The main goal of the thesis is the development of new graph methods and their application to different real problems. Bipartite graph inference for predicting nodes with graph neural networks is a promising methodology for reaching this objective, so our primary interest will be the development of machine learning methods for that purpose. The second objective of this research is to validate the proposed models on a series of real-world problems.