Robert Birke | TU Delft Repository

Cross-Facility Federated Learning

Journal article (2024) - Iacopo Colonnelli (author), Robert Birke (author), Giulio Malenza (author), Gianluca Mittone (author), Alberto Mulone (author), Jeroen Galjaard (author), Lydia Y. Chen (author), Sanzio Bassini (author), Gabriella Scipione (author), More authors...

In a decade, AI frontier research transitioned from the researcher's workstation to thousands of high-end hardware-accelerated compute nodes. This rapid evolution shows no signs of slowing down in the foreseeable future. While top cloud providers may be able to keep pace with thi ...

FCT-GAN

Enhancing Global Correlation of Table Synthesis via Fourier Transform

Conference paper (2023) - Zilong Zhao (author), Robert Birke (author), Lydia Y. Chen (author)

Synthetic tabular data is emerging as an alternative method for sharing knowledge while complying with strict data access regulations, such as the European General Data Protection Regulation (GDPR). Mainstream table synthesizers utilize methodologies derived from Generative ...

GDTS: GAN-Based Distributed Tabular Synthesizer

Conference paper (2023) - Zilong Zhao (author), Robert Birke (author), Lydia Y. Chen (author)

Generative Adversarial Networks (GANs) are typically trained to synthesize data, from images to, more recently, tabular data, under the assumption of directly accessible training data. While learning image GANs on Federated Learning (FL) and Multi-Discriminator (MD) systems has ju ...

CTAB-GAN+

Enhancing tabular data synthesis

Journal article (2023) - Zilong Zhao (author), Aditya Kunar (author), Robert Birke (author), Hiek Van der Scheer (author), Lydia Y. Chen (author)

The usage of synthetic data is gaining momentum in part due to the unavailability of original data due to privacy and legal considerations and in part due to its utility as an augmentation to the authentic data. Generative adversarial networks (GANs), a paragon of generative mode ...

Robust Learning via Golden Symmetric Loss of (un)Trusted Labels

Conference paper (2023) - Amirmasoud Ghiassi (author), Robert Birke (author), Lydia Y. Chen (author)

Learning robust deep models against noisy labels becomes ever more critical when today's data is commonly collected from open platforms and subject to adversarial corruption. The information on the label corruption process, i.e., corruption matrix, can greatly enhance the robustness o ...

Memory-aware and context-aware multi-DNN inference on the edge

Journal article (2022) - Bart Cox (author), Robert Birke (author), Lydia Y. Chen (author)

Deep neural networks (DNNs) are becoming the core components of many applications running on edge devices, especially for real-time image-based analysis. Increasingly, multi-faceted knowledge is extracted by executing multiple DNN inference models, e.g., identifying objects, faces ...

Trusted Loss Correction for Noisy Multi-Label Learning

Journal article (2022) - Amirmasoud Ghiassi (author), Cosmin Octavian Pene (author), Robert Birke (author), Lydia Y. Chen (author)

Noisy and corrupted labels are shown to significantly undermine the performance of multi-label learning, which has multiple labels in each image. Correcting the loss via a label corruption matrix is effective in improving the robustness of single-label classification against nois ...

Multi Label Loss Correction against Missing and Corrupted Labels

Journal article (2022) - Amirmasoud Ghiassi (author), Robert Birke (author), Lydia Y. Chen (author)

Missing and corrupted labels can significantly ruin the learning process and, consequently, the classifier performance. Multi-label learning, where each instance is tagged with a variable number of labels, is particularly affected. Although missing labels (false-negatives) is a well- ...

Permutation-Invariant Tabular Data Synthesis

Conference paper (2022) - Yujin Zhu (author), Zilong Zhao (author), Robert Birke (author), Lydia Y. Chen (author)

Tabular data synthesis is an emerging approach to circumvent strict regulations on data privacy while discovering knowledge through big data. Although state-of-the-art AI-based tabular data synthesizers, e.g., table-GAN, CTGAN, TVAE, and CTAB-GAN, are effective at generating synt ...

Masa

Responsive Multi-DNN Inference on the Edge

Conference paper (2021) - Bart Cox (author), Jeroen Galjaard (author), Amirmasoud Ghiassi (author), Robert Birke (author), Lydia Y. Chen (author)

Deep neural networks (DNNs) are becoming the core components of many applications running on edge devices, especially for real-time image-based analysis. Increasingly, multi-faceted knowledge is extracted via executing multiple DNN inference models, e.g., identifying objects, face ...

Preface

Journal article (2021) - Lydia Y. Chen (author), Robert Birke (author)

TrustNet

Learning from Trusted Data Against (A)symmetric Label Noise

Conference paper (2021) - Amirmasoud Ghiassi (author), Robert Birke (author), Lydia Y. Chen (author)

Big Data systems allow collecting massive datasets to feed data-hungry deep learning. Labelling these ever-bigger datasets is increasingly challenging and label errors affect even highly curated sets. This makes robustness to label noise a critical property for weakly-supervi ...

Artifact

Masa: Responsive Multi-DNN Inference on the Edge

Conference paper (2021) - Bart Cox (author), Jeroen Galjaard (author), Amirmasoud Ghiassi (author), Robert Birke (author), Lydia Y. Chen (author)

This artifact is a guide to using the Edgecaffe framework presented in [1]. Edgecaffe is an open-source Deep Neural Network framework for efficient multi-network inference on edge devices. This framework enables layer-by-layer execution and fine-grained control d ...

Enhancing Robustness of On-line Learning Models on Highly Noisy Data

Journal article (2021) - Zilong Zhao (author), Robert Birke (author), Rui Han (author), Bogdan Robu (author), Sara Bouchenak (author), Sonia Ben Mokhtar (author), Lydia Y. Chen (author)

Classification algorithms have been widely adopted to detect anomalies for various systems, e.g., IoT, cloud and face recognition, under the common assumption that the data source is clean, i.e., features and labels are correctly set. However, data collected from the wild can be ...

Online label aggregation

A variational bayesian approach

Conference paper (2021) - Chi Hong (author), Amirmasoud Ghiassi (author), Yichi Zhou (author), Robert Birke (author), Lydia Y. Chen (author)

Noisily labeled data is more the norm than a rarity for crowdsourced content. It is effective to distill noise and infer correct labels by aggregating results from crowd workers. To ensure the time relevance and overcome slow responses of workers, online label aggregation is i ...

LABELNET

Recovering Noisy Labels

Conference paper (2021) - Amirmasoud Ghiassi (author), Robert Birke (author), Rui Han (author), Lydia Y. Chen (author)

Today's available datasets in the wild, e.g., from social media and open platforms, present tremendous opportunities and challenges for deep learning, as there is a significant portion of tagged images, but often with noisy, i.e. erroneous, labels. Recent studies improve the robu ...

MemA

Fast Inference of Multiple Deep Models

Conference paper (2021) - Jeroen Galjaard (author), Bart Cox (author), Amirmasoud Ghiassi (author), Lydia Y. Chen (author), Robert Birke (author)

The execution of deep neural network (DNN) inference jobs on edge devices has become increasingly popular. Multiple such inference models can concurrently analyse the on-device data, e.g. images, to extract valuable insights. Prior art focuses on low-power accelerators, compre ...

Pipetune

Pipeline parallelism of hyper and system parameters tuning for deep learning clusters

Conference paper (2020) - Isabelly Rocha (author), Nathaniel Morris (author), Lydia Y. Chen (author), Pascal Felber (author), Robert Birke (author), Valerio Schiavoni (author)

DNN learning jobs are common in today's clusters due to the advances in AI-driven services such as machine translation and image recognition. The most critical phase of these jobs for model performance and learning cost is the tuning of hyperparameters. Existing approaches make u ...

Chisel

Reshaping Queries to Trim Latency in Key-Value Stores

Conference paper (2019) - Robert Birke (author), Juan F. Pérez (author), Sonia Ben Mokhtar (author), Navaneeth Rameshan (author), Lydia Y. Chen (author)

It is challenging for key-value data stores to trim user (tail) latency of requests, as the workloads are observed to have a skewed number of key-value pairs that are commonly retrieved via multiget operations, i.e., all keys at the same time. In this paper we present Chisel, a novel clie ...

Differential approximation and sprinting for multi-priority big data engines

Conference paper (2019) - Robert Birke (author), Isabelly Rocha (author), Juan F. Pérez (author), Valerio Schiavoni (author), Pascal Felber (author), Lydia Y. Chen (author)

Today’s big data clusters based on the MapReduce paradigm are capable of executing analysis jobs with multiple priorities, providing differential latency guarantees. Traces from production systems show that the latency advantage of high-priority jobs comes at the cost of severe l ...