Past Projects

Accelerating Real-time Financial Market Data Processing

Financial market data volumes are growing continuously. The increasing volume of data, combined with real-time processing constraints, puts a heavy burden on the processing capabilities of market data processing systems. More...


Crescando

Crescando is a scalable, distributed relational table implementation designed to perform large numbers of queries and updates with guaranteed access latency and data freshness. To this end, Crescando leverages a number of modern query processing techniques and hardware trends. Specifically, Crescando is based on parallel, collaborative scans in main memory and so-called "query-data" joins known from data-stream processing. More...
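To make the core idea concrete, the following is a minimal, single-threaded sketch of a shared (collaborative) scan in main memory: a batch of pending queries is evaluated against every record during one scan pass, i.e. each row is "joined" with the whole query batch. The data layout, the `Query` class, and the `shared_scan` function are simplifications for illustration, not Crescando's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable

# Illustrative sketch of a shared ("collaborative") scan: instead of running
# each query as its own table scan, a batch of pending queries is evaluated
# against every record during a single pass over an in-memory table.
# This is a simplified assumption of the technique, not Crescando's design.

@dataclass
class Query:
    qid: int
    predicate: Callable[[dict], bool]   # row -> bool

def shared_scan(table: list[dict], queries: list[Query]) -> dict[int, list[dict]]:
    """One scan pass; each row is 'joined' with the whole query batch."""
    results: dict[int, list[dict]] = {q.qid: [] for q in queries}
    for row in table:                 # single sequential pass over the data
        for q in queries:             # query-data join: probe all queries per row
            if q.predicate(row):
                results[q.qid].append(row)
    return results

if __name__ == "__main__":
    table = [{"id": i, "price": i * 1.5} for i in range(1000)]
    batch = [
        Query(1, lambda r: r["price"] > 1400),
        Query(2, lambda r: r["id"] % 100 == 0),
    ]
    out = shared_scan(table, batch)
    print(len(out[1]), len(out[2]))
```

The appeal of this pattern is that the cost of one scan pass is amortized over the entire query batch, so access latency can be bounded by the scan period regardless of how many queries are pending.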


Global Distributed Dictionary

Snapshot Isolation is a widely adopted technique for transaction handling in database systems. This project explores the possibilities of this technique in two directions: in a distributed setting and on column stores. More...


MaxStream Federated Stream Processing System

Despite the availability of several commercial data stream processing engines (SPEs), it remains hard to develop and maintain streaming applications. More...


Semantic Data Warehouse Search

During the financial crisis in 2008, several financial institutions needed to search their data warehouses for investment products related to Lehman Brothers. Often that information was not readily available. The goal of this project is to design and implement novel (semantic) search strategies that enable easy-to-use keyword search over the data warehouse. One of the challenges is to combine (semantic) search technology on metadata with base data stored in terabyte-scale relational databases. More...
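As a rough illustration of that combination, the hypothetical sketch below first matches keywords against a small metadata catalog (table and column descriptions) and then turns the matches into SQL against the base tables. The catalog layout, table names, and the `search` function are assumptions made for the example only, not the project's actual design.

```python
import sqlite3

# Hypothetical illustration: keywords are matched against a metadata catalog
# (table/column descriptions), and the matches are translated into queries
# over the relational base data. Schema and catalog are invented for the sketch.

CATALOG = [
    # (table, column, description)
    ("positions", "issuer", "issuer / counterparty of the investment product"),
    ("positions", "product_name", "name of the structured investment product"),
    ("trades", "counterparty", "trading counterparty"),
]

def match_metadata(keywords):
    """Return (table, column) pairs whose description mentions a keyword."""
    return [(t, c) for t, c, desc in CATALOG
            if any(kw.lower() in desc.lower() for kw in keywords)]

def search(conn, keywords, term):
    """Probe every matched column in the base data for the search term."""
    results = []
    for table, column in match_metadata(keywords):
        # Identifiers come from the trusted catalog, only the term is user input.
        sql = f"SELECT * FROM {table} WHERE {column} LIKE ?"
        results.extend(conn.execute(sql, (f"%{term}%",)).fetchall())
    return results

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE positions (issuer TEXT, product_name TEXT)")
    conn.execute("CREATE TABLE trades (counterparty TEXT)")
    conn.execute("INSERT INTO positions VALUES ('Lehman Brothers', 'LB Note 2010')")
    conn.execute("INSERT INTO trades VALUES ('Lehman Brothers Intl')")
    print(search(conn, ["issuer", "counterparty"], "Lehman"))
```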


Snapshot Isolation in Distributed Column Stores

Snapshot Isolation is a widely adopted technique for transaction handling in database systems. This project explores the possibilities of this technique in two directions: in a distributed setting and on column stores. More...
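For readers unfamiliar with the technique itself, the following is a minimal, single-node sketch of Snapshot Isolation semantics: writes create new versions stamped with a commit timestamp, reads see the newest version committed before the transaction started, and a transaction aborts on a write-write conflict with a later committer ("first committer wins"). The `SIStore` class illustrates plain SI only; it says nothing about the project's distributed or column-store extensions.

```python
import itertools

# Minimal, single-node sketch of Snapshot Isolation semantics. This is an
# illustration of the general technique, not the project's implementation.

class SIStore:
    def __init__(self):
        self.versions = {}           # key -> list of (commit_ts, value)
        self.clock = itertools.count(1)

    def begin(self):
        return {"start_ts": next(self.clock), "writes": {}}

    def read(self, txn, key):
        # A transaction sees its own uncommitted writes first.
        if key in txn["writes"]:
            return txn["writes"][key]
        # Otherwise: the newest version committed before the transaction started.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= txn["start_ts"]:
                return value
        return None

    def write(self, txn, key, value):
        txn["writes"][key] = value

    def commit(self, txn):
        # First-committer-wins: abort on a write-write conflict with any
        # version committed after this transaction started.
        for key in txn["writes"]:
            for commit_ts, _ in self.versions.get(key, []):
                if commit_ts > txn["start_ts"]:
                    raise RuntimeError(f"write-write conflict on {key!r}")
        commit_ts = next(self.clock)
        for key, value in txn["writes"].items():
            self.versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts

if __name__ == "__main__":
    store = SIStore()
    t1, t2 = store.begin(), store.begin()
    store.write(t1, "x", 1)
    store.commit(t1)                       # succeeds
    store.write(t2, "x", 2)
    try:
        store.commit(t2)                   # conflicts with t1's later commit
    except RuntimeError as e:
        print("aborted:", e)
```

A transaction is just a start timestamp plus a private write set; the write set becomes visible to others atomically at commit time, which is what gives every reader a consistent snapshot.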


Xadoop

Due to legal requirements, all queries against the DWH databases in production need to be audited. Hence, all DWH queries are logged and written out to compressed XML files. The log files are currently kept for a certain number of days before they are archived. The total volume of the log files before archiving is estimated at some 6 TB. Due to the large data volume, processing these query logs is not straightforward. More...
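To make the processing step concrete, the sketch below shows the kind of per-file "map" task a Hadoop-style job could run over the compressed XML logs: each gzipped file is stream-parsed and per-user query counts are aggregated, so memory stays bounded regardless of file size. The element and attribute names (`<query user="...">`), the directory path, and the aggregation are assumptions for illustration; the real audit-log schema is not described here.

```python
import glob
import gzip
import xml.etree.ElementTree as ET
from collections import Counter

# Sketch of a per-file "map" step over gzipped XML query logs. The log
# schema (<query user="...">) and the path below are hypothetical.

def count_queries(path: str) -> Counter:
    counts: Counter = Counter()
    with gzip.open(path, "rb") as f:
        # iterparse keeps memory bounded even for very large log files.
        for _, elem in ET.iterparse(f, events=("end",)):
            if elem.tag == "query":
                counts[elem.get("user", "unknown")] += 1
            elem.clear()   # free the parsed subtree
    return counts

if __name__ == "__main__":
    total: Counter = Counter()
    for log_file in glob.glob("dwh_audit_logs/*.xml.gz"):   # hypothetical path
        total.update(count_queries(log_file))
    for user, n in total.most_common(10):
        print(user, n)
```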