
Prof. Hasso Plattner

A Course in In-Memory Data Management
The Inner Mechanics of In-Memory Databases
September 4, 2015

This learning material is part of the reading material for Prof. Plattner's online lecture "In-Memory Data Management" taking place at www.openHPI.de. If you have any questions or remarks regarding the online lecture or the reading material, please send us a note at openhpiimdb@hpi.uni-potsdam.de. We are glad to further improve the material.
Research Group "Enterprise Platform and Integration Concepts",
http://epic.hpi.uni-potsdam.de

Chapter 5

A Blueprint of SanssouciDB

SanssouciDB is a prototypical database system for unified analytical and transactional processing. The concepts of SanssouciDB build on prototypes developed at the HPI and an existing SAP database system. SanssouciDB is an SQL database and contains components similar to those of other databases, such as a query builder, a plan executor, metadata, a transaction manager, etc.

5.1 Data Storage in Main Memory


In contrast to traditional database management systems, the primary persistence of SanssouciDB is main memory. Yet logging and recovery still require disks as non-volatile data storage to ensure data consistency in case of failures. All operators, e.g., find, join, or aggregation, can assume that data resides in main memory. Thus, operators are implemented differently, moving the focus from optimizing for disk access towards optimizing for main memory access and CPU utilization (see Chapter 4).
This apparently subtle difference of moving the primary persistence has a vast impact on performance, even when disk-based databases are completely memory resident. Ailamaki et al. investigated such fully cached disk-based databases and found that a large portion of query execution time is spent on memory and resource stalls [ADHW99]. Those stalls are mainly caused by in-page data placements that do not utilize the CPU caches properly. In many cases, the actual computation accounts for less than 40% of the execution time. Besides, Harizopoulos et al. found that the buffer management of disk-based databases alone contributes 31% to the overall instruction count [HAMS08].
Consequently, the performance advantage of in-memory over disk-based databases derives from optimized data structures and algorithms that avoid memory and resource stalls, together with the removal of additional indirections.


5.2 Column-Orientation
Another concept used in SanssouciDB was invented more than two decades ago, that is, storing data column-wise [CK85] instead of row-wise. In column-orientation, complete columns are stored in adjacent blocks. This can be contrasted with row-oriented storage, where complete tuples (rows) are stored in adjacent blocks. Column-oriented storage, in contrast to row-oriented storage, is well suited for reading consecutive entries from a single column. This can be useful for aggregation and column scans. More details on column-orientation and its differences from row-orientation can be found in Chapter 8.
To minimize the amount of data that needs to be transferred between storage and processor, SanssouciDB uses several different data compression techniques, which will be discussed in Chapter 7.
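The difference between the two layouts can be illustrated with a minimal Python sketch. The table, its attribute names (id, country, amount), and the helper functions are hypothetical and chosen only for illustration; plain Python lists stand in for adjacent memory blocks and do not reflect how SanssouciDB lays out data internally.

```python
# Hypothetical three-row table with attributes (id, country, amount).
rows = [
    (1, "DE", 100.0),
    (2, "US", 250.0),
    (3, "DE", 75.0),
]

# Row-oriented layout: complete tuples are stored one after another.
row_store = [value for row in rows for value in row]

# Column-oriented layout: all values of one attribute are stored together.
column_store = {
    "id": [r[0] for r in rows],
    "country": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def sum_amount_row_store(store, width=3, offset=2):
    """Aggregate 'amount' by striding over every stored tuple."""
    return sum(store[i] for i in range(offset, len(store), width))


def sum_amount_column_store(store):
    """Aggregate 'amount' by reading one contiguous column."""
    return sum(store["amount"])


total_row = sum_amount_row_store(row_store)
total_col = sum_amount_column_store(column_store)
```

Both functions compute the same sum, but the row-oriented variant has to step over the id and country values of every tuple, whereas the column-oriented variant reads only consecutive amount entries.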

5.3 Implications of Column-Orientation


Column-oriented storage has become widespread in database systems specifically developed for OLAP, as the advantage of column-oriented storage is clear in the case of quasi-sequential scanning of single attributes and set processing thereof. If not all fields of a table are queried, column-orientation can be exploited in transactional processing as well (avoiding "SELECT *"). An analysis of enterprise applications showed that there is actually no application that uses all fields of a given tuple. For example, in dunning only 17 attributes are necessary out of a table that contains 300 attributes. If only the 17 needed attributes are queried instead of the full tuple representation of all 300 attributes, the amount of data to be scanned is immediately reduced by a factor of 8 to 20.
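The order of magnitude of this factor can be checked with a short calculation. The following Python sketch assumes that the scanned data volume is simply the sum of the attribute widths; the byte widths used are hypothetical, since the text does not state the actual widths of the dunning table's attributes.

```python
def scan_reduction_factor(all_widths, needed_widths):
    """Ratio of bytes per tuple for all attributes vs. only the needed ones."""
    return sum(all_widths) / sum(needed_widths)


# Assumption: all 300 attributes have the same width (4 bytes each).
uniform = scan_reduction_factor([4] * 300, [4] * 17)

# Assumption: the 17 needed attributes are twice as wide as the other 283.
skewed = scan_reduction_factor([8] * 17 + [4] * 283, [8] * 17)

print(f"{uniform:.1f}")  # 17.6
print(f"{skewed:.1f}")   # 9.3
```

With uniform widths the factor equals the attribute ratio 300/17, roughly 17.6. If the needed attributes are wider than average, the factor drops, which is one plausible reason for quoting a range rather than a single number.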
As disk is no longer the bottleneck, but access to main memory has to be considered, an important aspect is to work on a minimal set of data. So far, application programmers have been fond of "SELECT *" statements. The difference in runtime between selecting specific fields or all fields in row-oriented storage is often insignificant, and if changes to an application required more fields, the data was already there (which, incidentally, is a weak argument for using SELECT * and retrieving unnecessary data). However, in the case of column-orientation, the penalty for "SELECT *" statements grows with table width. Especially if tables grow in width during production use, the actual runtimes of applications cannot be anticipated during programming.
With the column-store approach, the number of indices can be significantly reduced. In a column store, every attribute can be used as an index.
Because all data is available in memory and the data of a column is stored
consecutively, the scanning speed is high enough that a full sequential scan
of an attribute is sufficient in most cases. If this is not fast enough, dedicated
indices can still be used in addition for further speedup.
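The idea that every attribute can act as an index can be sketched as a full sequential scan over one column that yields row positions, which are then used to fetch values from other columns. This is a minimal Python illustration; the table contents and the function names scan_column and materialize are hypothetical, not part of SanssouciDB.

```python
def scan_column(column, predicate):
    """Full sequential scan: return positions of all matching entries."""
    return [pos for pos, value in enumerate(column) if predicate(value)]


def materialize(columns, positions, attributes):
    """Fetch only the requested attributes at the given row positions."""
    return [tuple(columns[a][p] for a in attributes) for p in positions]


# Hypothetical column-oriented table.
columns = {
    "id": [1, 2, 3],
    "country": ["DE", "US", "DE"],
    "amount": [100.0, 250.0, 75.0],
}

# Any attribute can serve as the search key; no separate index is built.
positions = scan_column(columns["country"], lambda v: v == "DE")
result = materialize(columns, positions, ["id", "amount"])
```

Here the scan over country returns positions 0 and 2, and only the id and amount columns are touched to build the result.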


Storing data in columns instead of rows is challenging for workloads with many data-modifying operations. Therefore, the concept of a differential buffer was introduced, where new entries are written to a differential buffer first. In contrast to the main store, the differential buffer is optimized for inserts. At a later point in time and depending on thresholds, e.g. the frequency of changes and new entries, the data in the differential buffer is merged into the main store. More details about the differential buffer and the merge process will be provided later in Chapter 25 and Chapter 27.
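The interplay of main store, differential buffer, and merge can be sketched for a single column as follows. This is a simplified Python illustration under stated assumptions: the class name ColumnWithDelta is hypothetical, the merge is triggered by a fixed size threshold on the buffer, and a sorted list stands in for the read-optimized main store. The actual structures and merge criteria of SanssouciDB are described in the later chapters.

```python
class ColumnWithDelta:
    """Single column split into a main store and a differential buffer."""

    def __init__(self, merge_threshold=3):
        self.main = []    # read-optimized part (assumption: kept sorted)
        self.delta = []   # write-optimized part (append-only)
        self.merge_threshold = merge_threshold  # assumed trigger: buffer size

    def insert(self, value):
        # New entries always go to the differential buffer first.
        self.delta.append(value)
        if len(self.delta) >= self.merge_threshold:
            self.merge()

    def merge(self):
        # Fold the buffered entries into the main store and empty the buffer.
        self.main = sorted(self.main + self.delta)
        self.delta = []

    def scan(self):
        # Queries must consider both parts to see all entries.
        return self.main + self.delta


col = ColumnWithDelta(merge_threshold=3)
col.insert(5)
col.insert(1)
before_merge = (list(col.main), list(col.delta))
col.insert(3)  # third insert reaches the threshold and triggers the merge
after_merge = (list(col.main), list(col.delta))
```

Inserts stay cheap because they only append to the buffer, while the cost of reorganizing the main store is paid once per merge rather than once per insert.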

5.4 Architecture Overview


The architecture shown in Figure 5.1 gives an overview of the components of SanssouciDB.
SanssouciDB is split into three logical layers, each fulfilling specific tasks inside the database system. The Management Layer handles the communication with applications, creates query execution plans, stores metadata, and contains the logic for database transactions. The main working set of SanssouciDB is located in the main memory of a specific machine. That working set is accessed during query execution and is stored in a row-, column-, or hybrid-oriented data layout, depending on the specific type of queries sent to the database tables. The non-volatile memory in the durable storage area is used for logging and recovery purposes, as well as for data aging and time travel. All those concepts will be described in the subsequent sections.
[Figure: architecture diagram. Labels: Applications (Financials, Manufacturing, Logistics; OLTP & OLAP); Management Layer (SQL Interface, Stored Procedures, Query Execution, Metadata, Sessions, Transactions); Main Memory Storage (Hot Store (Master) with Main and Delta, Attribute Vectors, Dictionaries, Index, Aggregate Cache, Merge; Cold Store - 1, Cold Store - 2; History; Read-only Replicas); Durable Storage (Log, Checkpoints).]

Fig. 5.1: Schematic Architecture of SanssouciDB



5.5 References
[ADHW99] Anastassia Ailamaki, David J. DeWitt, Mark D. Hill, and David A. Wood. DBMSs on a modern processor: Where does time go? In Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, and Michael L. Brodie, editors, VLDB, pages 266-277, San Francisco, CA, USA, 1999. Morgan Kaufmann.
[CK85] George P. Copeland and Setrag N. Khoshafian. A Decomposition Storage Model. SIGMOD Rec., 14(4):268-279, May 1985.
[HAMS08] Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, and Michael Stonebraker. OLTP through the looking glass, and what we found there. In Jason Tsong-Li Wang, editor, SIGMOD Conference, pages 981-992. ACM, 2008.
