GPGPU Linear Complexity t-SNE Optimization

Pezzotti, N.; Thijssen, J.G.L.; Mordvintsev, Alexander; Höllt, T.; Van Lew, B.; Lelieveldt, Boudewijn; Eisemann, Elmar; Vilanova Bartroli, A.

doi:10.1109/TVCG.2019.2934307

Journal article (2020)

Authors

N. Pezzotti (Computer Graphics and Visualisation, Google AI)

J.G.L. Thijssen (Computer Graphics and Visualisation)

Alexander Mordvintsev (Google AI)

T. Höllt (Computer Graphics and Visualisation, Leiden University Medical Center)

B. Van Lew (Leiden University Medical Center)

Boudewijn Lelieveldt (Leiden University Medical Center, Pattern Recognition and Bioinformatics)

Elmar Eisemann (Computer Graphics and Visualisation)

A. Vilanova Bartroli (Computer Graphics and Visualisation)

Research Group

Computer Graphics and Visualisation (TU Delft)

DOI: https://doi.org/10.1109/TVCG.2019.2934307

Keywords

GPGPU, Dimensionality Reduction, Approximate Computation, High Dimensional Data, Progressive Visual Analytics

To reference this document use:

http://resolver.tudelft.nl/uuid:52823dd9-a8c1-470e-a4ec-77a1a59577dc

Published Date

2020

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Faculty

Electrical Engineering, Mathematics and Computer Science

Department

Intelligent Systems

Research Group

Computer Graphics and Visualisation

Abstract

In recent years, the t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm has become one of the most widely used and insightful techniques for exploratory analysis of high-dimensional data. It reveals clusters of high-dimensional data points at different scales while requiring only minimal tuning of its parameters. However, the computational complexity of the algorithm limits its application to relatively small datasets. To address this problem, several evolutions of t-SNE have been developed in recent years, mainly focusing on the scalability of the similarity computations between data points. These contributions are nevertheless insufficient to achieve interactive rates when visualizing the evolution of the t-SNE embedding for large datasets. In this work, we present a novel approach to the minimization of the t-SNE objective function that relies heavily on graphics hardware and has linear computational complexity. Our technique decreases the computational cost of running t-SNE by orders of magnitude and matches or improves on the accuracy of earlier approximate techniques. We propose to approximate the repulsive forces between data points by splatting a kernel texture for each data point. This approximation allows us to reformulate the t-SNE minimization problem as a series of tensor operations that can be executed efficiently on the graphics card. An efficient implementation of our technique is integrated into the widely used TensorFlow.js library and is also available as an open-source C++ library.
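
The field-based idea in the abstract can be illustrated with a small CPU sketch. It is not the authors' GPU implementation. It assumes a 2D embedding and, as an assumption of this sketch, defines a scalar field S(p) = Σ_j 1/(1 + ||p − y_j||²) and a vector field V(p) = Σ_j (p − y_j)/(1 + ||p − y_j||²)², accumulated on a regular grid and then sampled at each point with bilinear interpolation. All function names, the grid bounds, and the resolution are hypothetical. For simplicity each point is splatted over the whole grid, so this version is not linear in cost; a bounded-support kernel texture would be needed for that.

```python
import random

def student_t_terms(dx, dy):
    """Return the scalar kernel t and the vector kernel t^2 * (dx, dy)."""
    t = 1.0 / (1.0 + dx * dx + dy * dy)
    return t, t * t * dx, t * t * dy

def exact_fields(points):
    """O(N^2) reference: evaluate S and V directly at every embedding point."""
    out = []
    for (xi, yi) in points:
        s = vx = vy = 0.0
        for (xj, yj) in points:
            t, gx, gy = student_t_terms(xi - xj, yi - yj)
            s, vx, vy = s + t, vx + gx, vy + gy
        out.append((s, vx, vy))
    return out

def splat_fields(points, res, lo, hi):
    """Accumulate every point's kernel onto a res x res grid (the 'texture')."""
    h = (hi - lo) / (res - 1)  # grid spacing; assumes a square domain [lo, hi]^2
    grid = [[[0.0, 0.0, 0.0] for _ in range(res)] for _ in range(res)]
    for (xj, yj) in points:
        for r in range(res):
            dy = lo + r * h - yj
            for c in range(res):
                t, gx, gy = student_t_terms(lo + c * h - xj, dy)
                cell = grid[r][c]
                cell[0] += t
                cell[1] += gx
                cell[2] += gy
    return grid, h

def sample(grid, h, lo, x, y):
    """Bilinearly interpolate the three field channels at position (x, y)."""
    res = len(grid)
    fx, fy = (x - lo) / h, (y - lo) / h
    c, r = min(int(fx), res - 2), min(int(fy), res - 2)
    ax, ay = fx - c, fy - r
    vals = []
    for k in range(3):
        top = grid[r][c][k] * (1 - ax) + grid[r][c + 1][k] * ax
        bot = grid[r + 1][c][k] * (1 - ax) + grid[r + 1][c + 1][k] * ax
        vals.append(top * (1 - ay) + bot * ay)
    return tuple(vals)

def repulsive_forces(fields):
    """Turn per-point (S, Vx, Vy) samples into normalized repulsive forces."""
    z = sum(s - 1.0 for (s, _, _) in fields)  # subtract each point's self term
    return z, [(vx / z, vy / z) for (_, vx, vy) in fields]

random.seed(0)
points = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(40)]

grid, h = splat_fields(points, res=96, lo=-6.0, hi=6.0)
approx = [sample(grid, h, -6.0, x, y) for (x, y) in points]
exact = exact_fields(points)

z_exact, f_exact = repulsive_forces(exact)
z_approx, f_approx = repulsive_forces(approx)
print("normalization (exact, grid):", z_exact, z_approx)
print("first force (exact, grid):", f_exact[0], f_approx[0])
```

Once the fields are stored on a grid, the per-point work is a constant-time lookup, which is what makes it natural to express the step as texture or tensor operations. The accuracy of this sketch depends on the grid resolution relative to the kernel width.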

Files

08811606.1.pdf

(pdf | 14.3 Mb)

Unknown license

08811606.pdf

(pdf | 2.33 Mb)

Unknown license