Can Portability Improve Performance?

Varbanescu, A.L.; Verstraaten, Merijn; de Laat, Cees; Penders, A.B.; Iosup, A.; Sips, H.J.

Can Portability Improve Performance?

An Empirical Study of Parallel Graph Analytics

Conference paper (2015)

Authors

A.L. Varbanescu Universiteit van Amsterdam

Merijn Verstraaten Universiteit van Amsterdam

Cees de Laat Universiteit van Amsterdam

A.B. Penders Data-Intensive Systems -

A. Iosup Data-Intensive Systems -

H.J. Sips Data-Intensive Systems -

Research Group

Data-Intensive Systems () (TU Delft)

To reference this document use:

http://resolver.tudelft.nl/uuid:f875aa05-184b-40df-a35d-24fea086e305

More Info

expand_more

Published Date

2015

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Faculty

Electrical Engineering, Mathematics and Computer Science

Department

Software Technology

Research Group

Data-Intensive Systems

Abstract

Due to increasingly large datasets, graph analytics - traversals, all-pairs shortest path computations, centrality measures, etc. - are becoming the focus of high-performance computing (HPC). Because HPC is currently dominated by many-core architectures (both CPUs and GPUs), new graph processing solutions have to be defined to efficiently use such computing resources. Prior work focuses on platform-specific performance studies and on platform-specific algorithm development, successfully proving that algorithms highly tuned to GPUs or multi-core CPUs can provide high performance graph analytics. However, the portability of such algorithms remains an important concern for many users, especially the many companies without the resources to invest in HPC or concerned about lock-in in single-use parallel techniques.
In this work, we investigate the functional portability and performance of graph analytics algorithms. We conduct an empirical study measuring the performance of 3 graph analytics algorithms (a single code implemented in OpenCL and targeted at many-core CPUs and GPUs), on 3 different platforms, using 11 real-world and synthetic datasets. Our results show that the code is functionally portable, that is, applications can run unchanged on both CPUs and GPUs. The large variation in their observed performance indicates that portability is necessary not only for productivity, but, surprisingly, also for performance. We conjecture that the impact of datasets on performance is too high to allow platform-specific algorithms to outperform the portable algorithms by large margins, in all cases. Our conclusion is that portable parallel graph analytics is feasible without significant performance loss, and provides a productive alternative to the expensive trial-and-error selection of one algorithm for each (graph,platform) pair.