Professor Athanasios Liavas (supervisor)
Professor George Karystinos
Professor Michalis Zervakis
Graph alignment is the computational problem of finding a correspondence between the vertices of graphs that minimizes their node and edge disagreements. The problem arises in many fields: in computational biology, where finding conserved functional components between species can lead to gene-disease associations or drug discoveries; in the social sciences, where identifying the same user across different platforms can help remove bots; and in computer vision, where it is used for object recognition. In contrast to the exact graph (sub)isomorphism problem, which is usually studied in a theoretical setting, the inexact graph alignment problem is often cast as a Quadratic Assignment Problem (QAP), which has attracted significant research interest. This thesis presents a comprehensive review of recent research on global, pairwise, one-to-one alignment, detailing the methodologies and formulations of four state-of-the-art graph alignment algorithms.
Specifically, Umeyama's method solves the orthogonal relaxation of the QAP by using spectral embeddings, IsoRank applies random walks with restarts on the normalized Kronecker product graph, LowRank-Align solves a rank-k approximation of the orthogonal relaxation of the QAP using an alternating optimization framework, and CONE-Align uses embeddings obtained with the NetMF method and aligns their subspaces using a Wasserstein-Procrustes framework.
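To make the random-walk idea behind IsoRank concrete, the following is a minimal sketch under stated assumptions, not the implementation evaluated in this thesis. It assumes dense 0/1 adjacency lists with no isolated nodes, a uniform prior similarity, and a fixed number of power iterations; the function names `isorank` and `greedy_match` and the parameters `alpha` and `iters` are illustrative. The final matching step is a simple greedy selection used here in place of an optimal assignment solver.

```python
def isorank(adj1, adj2, alpha=0.85, iters=100):
    """Power iteration for a random walk with restarts on the normalized
    Kronecker product graph. Assumes symmetric 0/1 adjacency matrices
    (lists of lists) in which every node has at least one neighbour."""
    n1, n2 = len(adj1), len(adj2)
    deg1 = [sum(row) for row in adj1]
    deg2 = [sum(row) for row in adj2]
    prior = 1.0 / (n1 * n2)  # uniform restart distribution (an assumption)
    R = [[prior] * n2 for _ in range(n1)]
    for _ in range(iters):
        # T[i][v] = sum_u A1[i][u] / deg1[u] * R[u][v]
        T = [[sum(adj1[i][u] * R[u][v] / deg1[u] for u in range(n1))
              for v in range(n2)] for i in range(n1)]
        # R[i][j] = alpha * sum_v T[i][v] * A2[j][v] / deg2[v] + restart term
        R = [[alpha * sum(T[i][v] * adj2[j][v] / deg2[v] for v in range(n2))
              + (1 - alpha) * prior
              for j in range(n2)] for i in range(n1)]
    return R


def greedy_match(R):
    """Greedy one-to-one matching: repeatedly take the highest remaining score.
    Assumes R has at least as many columns as rows."""
    pairs = sorted(((R[i][j], i, j)
                    for i in range(len(R)) for j in range(len(R[0]))),
                   reverse=True)
    mapping, used = {}, set()
    for _, i, j in pairs:
        if i not in mapping and j not in used:
            mapping[i] = j
            used.add(j)
    return [mapping[i] for i in range(len(R))]


# Two star graphs on four nodes: the centre is node 0 in G1 and node 2 in G2.
adj1 = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
adj2 = [[0, 0, 1, 0], [0, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
scores = isorank(adj1, adj2)
alignment = greedy_match(scores)
print(alignment[0])  # index of the G2 node matched to the centre of G1
```

With a uniform prior the scores depend only on graph structure, so structurally equivalent nodes (the leaves in this example) receive tied scores and are matched arbitrarily.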
To evaluate these algorithms, a standard experimental setup is adopted. A graph, G1, is aligned with a “noisy” and permuted version of itself, G2, in order to create an instance of the graph alignment problem. The noise level is the number of extra edges in G2, expressed as a percentage of the total number of edges in G1. Only simple graphs (unweighted, undirected, with no loops or multiple edges) are considered. Several widely used quality measures are reported, namely the Node Correctness (NC), the Edge Correctness (EC), the Induced Conserved Structure (ICS), the Symmetric Substructure Score (S3), and the Matched Neighborhood Consistency (MNC), along with the wall time.
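The setup described above can be sketched as follows. This is an illustrative sketch rather than the code used for the experiments: the function names `noisy_permuted_copy` and `quality` are hypothetical, the extra edges are assumed to be drawn uniformly from the non-edges of G1, and the measures follow their commonly used definitions (NC as the fraction of nodes mapped to their true counterpart, EC as conserved edges over the edges of G1, ICS as conserved edges over the edges of the subgraph of G2 induced by the image, and S3 as conserved edges over the union of the two). MNC and wall time are omitted for brevity.

```python
import random


def noisy_permuted_copy(n, edges, noise, rng):
    """Build G2 from G1: add round(noise * |E1|) extra edges drawn from the
    non-edges of G1 (an assumption), then relabel the nodes with a random
    permutation. Returns (edges2, perm), where perm[u] is the true image of u."""
    edge_set = {frozenset(e) for e in edges}
    non_edges = [frozenset((u, v)) for u in range(n) for v in range(u + 1, n)
                 if frozenset((u, v)) not in edge_set]
    extra = rng.sample(non_edges, round(noise * len(edge_set)))
    perm = list(range(n))
    rng.shuffle(perm)
    edges2 = {frozenset(perm[u] for u in e) for e in edge_set | set(extra)}
    return edges2, perm


def quality(edges1, edges2, mapping, truth):
    """Compute NC, EC, ICS and S3 for a one-to-one mapping (list: u -> mapping[u])."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    image = set(mapping)
    mapped = {frozenset(mapping[u] for u in e) for e in e1}
    conserved = len(mapped & e2)                   # G1 edges that land on G2 edges
    induced = sum(1 for e in e2 if e <= image)     # edges of G2 induced by the image
    return {
        "NC": sum(m == t for m, t in zip(mapping, truth)) / len(truth),
        "EC": conserved / len(e1),
        "ICS": conserved / induced,
        "S3": conserved / (len(e1) + induced - conserved),
    }


# G1 is a 5-cycle; a noise level of 0.2 adds one extra edge to G2.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
g2_edges, true_perm = noisy_permuted_copy(5, cycle, 0.2, random.Random(0))
print(quality(cycle, g2_edges, true_perm, true_perm))
```

When the ground-truth permutation is used as the alignment in this example, NC and EC both equal 1, while ICS and S3 equal 5/6 because the single extra edge in G2 has no counterpart in G1. This illustrates why measures that account for the edges of both graphs can differ from EC even for a perfect alignment.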
The experiments suggest that CONE-Align performs well on both synthetic and real-world networks, LowRank-Align performs well on some occasions, Umeyama's method delivers moderate but consistent performance across datasets, and IsoRank performs worst overall. As for the quality measures, MNC and S3 appear to be the fairest.