Programming Project

Introduction

This is the documentation of the programming project for the exercises of the lecture Databases II - Implementation Techniques (DB2) in the summer term 2020. You can find general information about the lecture here.

Tasks

The programming tasks are meant to deepen your knowledge of selected aspects of the lecture. This year, the focus is on compression techniques in column-oriented database management systems. We chose C++ as the programming language because, besides C, it is the language most frequently used to implement database management systems. Your task is to implement compression techniques in our framework. We provide a set of classes as a starting point, to which you add an implementation of a given interface. You can download the sources here. A set of unit tests will help you identify errors during development. The same unit tests will be used at the end of the term to validate your solution. A working implementation is a necessary prerequisite for participating in the exam!

You may choose between the following compression techniques (you may also suggest other compression techniques):

  • Run Length Encoding
  • Delta Coding
  • Bit-Vector Encoding
  • Dictionary Encoding
  • Frequency Partitioning

All compression techniques are explained in the lecture. You can find the slides here.
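To give a feel for the kind of logic involved, the following is a minimal sketch of Run Length Encoding on a plain integer vector. It is independent of the framework: the names Run, rle_encode and rle_decode are hypothetical helpers chosen for this illustration, and the lecture slides remain the reference for the exact variant expected.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper names for illustration; not part of the course framework.
// Each run is stored as (value, run length).
using Run = std::pair<int, std::size_t>;

// Collapse consecutive equal values into (value, count) pairs.
std::vector<Run> rle_encode(const std::vector<int>& values) {
    std::vector<Run> runs;
    for (int v : values) {
        if (!runs.empty() && runs.back().first == v) {
            ++runs.back().second;                 // extend the current run
        } else {
            runs.emplace_back(v, std::size_t{1}); // start a new run
        }
    }
    return runs;
}

// Expand the runs back into the original sequence.
std::vector<int> rle_decode(const std::vector<Run>& runs) {
    std::vector<int> values;
    for (const Run& r : runs) {
        values.insert(values.end(), r.second, r.first);
    }
    return values;
}
```

For the input 7, 7, 7, 3, 3, 9 the encoder produces the three runs (7, 3), (3, 2) and (9, 1). The technique pays off only when the column contains long runs of equal values, for example after sorting.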

Organization

Students will form teams of two.
Please register your team via Moodle by 11.05.2020.

Solutions are to be submitted via Moodle.
The deadline is 06.07.2020 at 23:59.
Note that the deadline is strict; there will be no extension.

Solutions will be presented and discussed by each team in the last exercise.

Teams consisting of bachelor students have to implement two compression techniques and will receive 5 credit points when they pass the exam.
Teams consisting of master students have to implement three compression techniques, because they will receive 6 credit points when they pass the exam.

We will check the quality of your submitted solution. It has to pass the unit tests, implement the compression technique it represents, and must not be a copy of a solution submitted by another team or of any third-party implementation. Teams whose solution fails even one of these requirements will not be admitted to the exam.

Setup and Tools

The framework runs on Linux and Windows (Cygwin) with common C++ compilers (g++, clang). You need to install the Boost libraries (Serialization, Any), which are easy to install on both platforms.

Setup in Ubuntu

Open a terminal and type:

sudo apt-get install build-essential libboost-all-dev doxygen

Then go to the directory into which you unpacked the source archive and type the following commands to build the program, build the documentation, and run the program:

cd db2_programming_project; make; make documentation; make run

Setup for Windows (Cygwin) - Unsupported

On Windows (Cygwin), you need to install the necessary packages using the GUI of the Cygwin setup program, which you can download from the official website. You should install the latest version of the Boost libraries, the compiler you wish to use (e.g., g++, clang), the make program, and a tool to unpack the source archive. The build steps are the same as for Ubuntu.

Setup for macOS - Unsupported

On macOS, you need to install Xcode, MacPorts, and the necessary libraries to compile the project. You can find more details and the installer for Xcode on the official website. Details on MacPorts are given on its official website. Next, install the Boost libraries by typing:

sudo port install boost

Once these components are installed, the build steps are the same as for Ubuntu. Note: If you have trouble installing some of these components or setting the header path to Boost, you can follow the steps in this video.

Getting Started

To implement your selected compression technique, you have to inherit from the base class CoGaDB::CompressedColumn and implement its pure virtual methods (similar to abstract methods in Java). You can test your class by creating an instance and passing a pointer to it to the unittest function. We prepared an example in the project, CoGaDB::DictionaryCompressedColumn, which is stored in the file compression dictionary_compressed_column.hpp.
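The pattern of inheriting from an abstract base class and handing a base-class pointer to a test routine can be sketched as follows. Everything in the sketch namespace is a hypothetical stand-in: IntColumn is a deliberately tiny interface invented for this illustration, and the real CoGaDB::CompressedColumn declares a different and larger set of methods, so consult the framework headers and the dictionary example for the actual signatures. The derived class uses Delta Coding purely as an example technique.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

namespace sketch {  // hypothetical namespace, not part of CoGaDB

// Stand-in for an abstract column interface (assumption for illustration).
class IntColumn {
public:
    virtual ~IntColumn() = default;
    virtual void insert(int value) = 0;            // pure virtual
    virtual int get(std::size_t index) const = 0;  // pure virtual
    virtual std::size_t size() const = 0;          // pure virtual
};

// Delta coding: each entry stores the difference to its predecessor;
// the first entry stores the value itself (difference to 0).
// Note: differences of extreme int values could overflow; ignored here.
class DeltaColumn : public IntColumn {
public:
    void insert(int value) override {
        deltas_.push_back(value - last_);
        last_ = value;
    }
    int get(std::size_t index) const override {
        if (index >= deltas_.size()) throw std::out_of_range("DeltaColumn::get");
        int value = 0;
        for (std::size_t i = 0; i <= index; ++i) value += deltas_[i];  // prefix sum
        return value;
    }
    std::size_t size() const override { return deltas_.size(); }

private:
    std::vector<int> deltas_;
    int last_ = 0;  // most recently inserted value
};

// Hypothetical test routine that only sees the base-class pointer.
bool round_trips(IntColumn* column, const std::vector<int>& reference) {
    for (int v : reference) column->insert(v);
    if (column->size() != reference.size()) return false;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        if (column->get(i) != reference[i]) return false;
    }
    return true;
}

}  // namespace sketch
```

A caller would create the object on the heap, for example with std::unique_ptr<sketch::IntColumn> col(new sketch::DeltaColumn()), and pass col.get() to round_trips. Because the test routine depends only on the interface, the same routine can check any other derived column class.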

C++ Background

You should familiarize yourself with the following features of the C++ language:

  • pointers, references, and smart pointers
  • creating objects on the heap with new
  • call by value and call by reference
  • public inheritance
  • basic STL containers, such as std::vector and std::list
  • basic templates and how to use them

You can find a lot of useful examples in the framework code, e.g., the unit tests.
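As a quick self-check of the features listed above, here is a small stand-alone sketch that touches call by value versus call by reference, heap allocation with new, a smart pointer, two STL containers, and a basic function template. All function names are made up for this illustration and do not appear in the framework.

```cpp
#include <list>
#include <memory>
#include <vector>

// Call by value: the function works on its own copy of the vector.
void append_by_value(std::vector<int> v) { v.push_back(42); }

// Call by reference: the caller's vector is modified.
void append_by_reference(std::vector<int>& v) { v.push_back(42); }

// Basic function template: sums the elements of any container that
// supports range-based for, accumulating into a value of type T.
template <typename T, typename Container>
T sum(const Container& c) {
    T total = T();
    for (const auto& x : c) total += x;
    return total;
}

// Heap allocation with raw new requires a matching delete ...
int raw_pointer_demo() {
    int* p = new int(5);
    int result = *p + 1;
    delete p;  // forgetting this would leak memory
    return result;
}

// ... whereas a smart pointer frees the object automatically.
int smart_pointer_demo() {
    std::shared_ptr<std::vector<int>> v = std::make_shared<std::vector<int>>();
    v->push_back(1);
    v->push_back(2);
    return sum<int>(*v);
}
```

Calling append_by_value leaves the caller's vector unchanged, while append_by_reference grows it by one element. The sum template works unchanged on std::vector and std::list, which is the main reason templates appear throughout the STL.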

Recommended (selected) sources of information about C++ are:

  • http://www.cplusplus.com/
  • Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley, 4th edition, 2013
  • Scott Meyers. Effective C++: 55 Specific Ways to Improve Your Programs and Designs, Addison-Wesley, 3rd edition, 2005

Last modified: 14.06.2021