J.W. Peltenburg

14 records found

Tydi is an open specification for streaming dataflow designs in digital circuits, allowing designers to express how composite and variable-length data structures are transferred over streams using clear, data-centric types. These data types are extensively used in many applicat ...

FPGA Acceleration for Big Data Analytics

Challenges and Opportunities

The big data revolution has ushered in an era with ever-increasing volumes and complexity of data requiring ever faster computational analysis. During this very same era, CPU performance growth has been stagnating, pushing the industry to either scale their computation horizontall ...
As big data analytics systems are squeezing out the last bits of performance of CPUs and GPUs, the next near-term and widely available alternative that industry is considering for higher performance in the data center and cloud is the FPGA accelerator. We discuss several challenges a ...
In the domain of big data analytics, the bottleneck of converting storage-focused file formats to in-memory data structures has shifted from the bandwidth of storage to the performance of decoding and decompression software. Two widely used formats for big data storage and in-mem ...
JSON is a popular data interchange format for many web, cloud, and IoT systems due to its simplicity, human readability, and widespread support. However, applications must first parse and convert the data to a native in-memory format before being able to perform useful computatio ...
Because of fundamental limitations of CMOS technology, computing researchers and the computing industry are focusing on using transistors in integrated circuits more efficiently towards obtaining a computational goal. At the architectural level, this has led to an era of heteroge ...

Tydi

An open specification for complex data structures over hardware streams

Streaming dataflow designs describe hardware by connecting components through streams that transport data structures. We introduce a stream-oriented specification and type system that provides a clear and intuitive way to map complex, dynamically-sized data structures onto hardwa ...

ArrowSAM

In-Memory Genomics Data Processing Using Apache Arrow

The rapidly growing size of genomics databases, driven by advances in sequencing technologies, demands fast and cost-effective processing. However, processing this data creates many challenges, particularly in selecting appropriate algorithms and computing platforms. Computing s ...

Fletcher

A framework to efficiently integrate FPGA accelerators with Apache Arrow

Modern big data systems are highly heterogeneous. The components found in their many layers of abstraction are often implemented in a wide variety of programming languages and frameworks. Due to language implementation differences, interfaces between these components, including h ...
The newly proposed posit number format uses a significantly different approach to represent floating point numbers. This paper introduces a framework for posit arithmetic in reconfigurable logic that maintains full precision in intermediate results. We present the design and impl ...

Supporting Columnar In-memory Formats on FPGA

The Hardware Design of Fletcher for Apache Arrow

As a columnar in-memory format, Apache Arrow has seen increased interest from the data analytics community. Fletcher is a framework that generates hardware interfaces based on this format, to be used in FPGA accelerators. This allows efficient integration of FPGA accelerators wit ...
Convolutional Neural Networks (CNNs) are a class of widely used deep artificial neural networks. However, training large CNNs to produce state-of-the-art results can take a long time. In addition, we need to reduce compute time of the inference stage for trained networks to make ...

Pushing Big Data into Accelerators

Can the JVM Saturate Our Hardware?

Advancements in the field of big data have led to an increasing interest in accelerator-based computing as a solution for computationally intensive problems. However, many prevalent big data frameworks are built and run on top of the Java Virtual Machine (JVM), which does not e ...
In the analysis of next-generation DNA sequencing data, Hidden Markov Models (HMMs) are used to perform variant calling between DNA sequences and a reference genome. The PairHMM model is solved by the Forward Algorithm, for which the performance and power efficiency can be increa ...