Efficient processing of complex highly selective queries over large datasets: There and back again
Modern so-called big data systems have popularised the idea of massively parallel access to and processing of large datasets over distributed infrastructures, and have dominated the (big) data storage and processing scene over the last 15 years. Although these paradigms are not without merit, they typically fail to deliver on their promises when processing highly selective complex queries. In these cases, the key to efficient and timely query processing is the ability to quickly identify data items of interest while filtering out those that would not contribute to the final result, thus reducing storage, network and processing overheads. This talk will explore a number of such queries over a multitude of data types, examine the issues arising during their processing, and describe novel semi-centralised but scalable methods that clearly outperform the state of the art (often by orders of magnitude).
Nikos Ntarmos is the Director of the Database Lab at the Edinburgh Research Centre of Huawei Technologies R&D (UK) Ltd, and also a Senior Lecturer (Associate Professor) at the School of Computing Science, University of Glasgow, where he leads the Information, Data, Events, Analytics, at Scale (IDEAS) research lab. He received his PhD and MSc from the University of Patras, Greece, and his undergraduate diploma from the Technical University of Crete, Greece. He is a member of the IEEE and the ACM, and a Fellow of the UK Higher Education Academy (HEA). His research interests lie in the areas of distributed computing and large-scale data management systems, with a special interest in storage, indexing and query processing in distributed NoSQL data stores, graph databases, geo-distributed data management infrastructures, and joint at-rest and streaming data systems.