A. Katsifodimos
168 records found
Authored
BlockJoin
Efficient Matrix Partitioning Through Joins
In our data-centric society, online services, decision making, and other aspects are increasingly becoming heavily dependent on trends and patterns extracted from data. A broad class of societal-scale data management problems requires system support for processing unbounded da ...
Parallel collection processing based on second-order functions such as map and reduce has been widely adopted for scalable data analysis. Initially popularized by Google, over the past decade this programming paradigm has found its way into the core APIs of parallel dataflow eng ...
Cutty
Aggregate sharing for user-defined windows
Aggregation queries on data streams are evaluated over evolving and often overlapping logical views called windows. While the aggregation of periodic windows was extensively studied in the past through the use of aggregate sharing techniques such as Panes and Pairs, little to ...
Bridging the Gap
Towards optimization across linear and relational Algebra
Advanced data analysis typically requires some form of preprocessing in order to extract and transform data before processing it with machine learning and statistical analysis techniques. Preprocessing pipelines are naturally expressed in dataflow APIs (e.g., MapReduce, Flink ...
Emma in action
Declarative Dataflows for scalable data analysis
Apache Flink™
Stream and Batch Processing in a Single Engine
Delta
Scalable data dissemination under capacity constraints
The efficient processing of XQuery still poses significant challenges. A particularly effective technique to improve XQuery processing performance consists of using materialized views to answer queries. In this work, we consider the problem of choosing the best views to materi ...