Theses/Jobs

Please contact the respective supervisor if you have questions about a specific topic. Further topics can usually be found through direct contact with our group members. We are open to new ideas: send your own topics to one of our members or a mailing list (for projects, theses, and internships).

Topics for theses

The following open topics are currently offered for Bachelor, Master and Diploma theses.

Database Topics

  • Elf: An efficient index structure for multi-column selection predicates
Supervisor:   David Broneske
Abstract: As analytical queries grow more complex, the number of selection predicates evaluated per query rises as well, and these predicates often span several columns. Our idea is to address this new requirement with Elf, a multi-dimensional index and storage structure. Elf exploits the relation between data of several columns to accelerate multi-column selection predicate evaluation. Elf features cache sensitivity, an optimized storage layout, fixed search paths, and slight data compression. However, many points still have to be researched. These include, but are not limited to, efficient insert/update/delete algorithms and a mechanism for merging two Elfs. Furthermore, we are interested in how far SIMD can be used to accelerate multi-column predicate evaluation in Elf.
Goals and results: 
  • Implementation of efficient build, update, or search algorithms for Elf
  • Critical evaluation against state-of-the-art competitors
  • Relational Algebra Translator for SQLValidator (Bachelor)
Supervisor:   Victor Obionwu
Abstract: This translator will allow students to execute relational algebra statements on a database. The relational algebra translator will read a relational algebra statement as input and perform the following basic steps.
Goals and results: 
  • Syntax Validation: The syntax of the query is verified in this step.
  • Semantics Verification: Here, type checking and verification of valid column references are performed.
  • Query Evaluation: The query is evaluated using a database engine. The relational algebra statement is translated to an SQL statement and then executed against a database.
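The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SQLValidator implementation: the classes `Relation`, `Selection`, and `Projection` are a hypothetical mini-AST, the sketch starts from an already parsed statement (so syntax validation is assumed to have happened), and SQLite stands in for the database engine.

```python
import sqlite3
from dataclasses import dataclass
from typing import Union

# Hypothetical mini-AST for three relational algebra operators.
@dataclass
class Relation:
    name: str

@dataclass
class Selection:
    condition: str          # e.g. "semester > 2", passed through to SQL as-is
    source: "Expr"

@dataclass
class Projection:
    columns: list
    source: "Expr"

Expr = Union[Relation, Selection, Projection]

def known_columns(conn, table):
    """Return the column names of a table, or raise if it does not exist."""
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    if not rows:
        raise ValueError(f"unknown relation: {table}")
    return [row[1] for row in rows]

def output_columns(conn, expr):
    """Semantic check: compute the columns an expression produces."""
    if isinstance(expr, Relation):
        return known_columns(conn, expr.name)
    cols = output_columns(conn, expr.source)
    if isinstance(expr, Projection):
        missing = [c for c in expr.columns if c not in cols]
        if missing:
            raise ValueError(f"unknown columns: {missing}")
        return list(expr.columns)
    return cols  # a selection keeps all columns of its source

def to_sql(expr):
    """Translate the AST into a nested SQL query."""
    if isinstance(expr, Relation):
        return f'SELECT * FROM "{expr.name}"'
    inner = to_sql(expr.source)
    if isinstance(expr, Selection):
        return f"SELECT * FROM ({inner}) WHERE {expr.condition}"
    cols = ", ".join(f'"{c}"' for c in expr.columns)
    # DISTINCT mirrors the set semantics of relational algebra.
    return f"SELECT DISTINCT {cols} FROM ({inner})"

def evaluate(conn, expr):
    """Verify column references, then run the translated SQL."""
    output_columns(conn, expr)
    return conn.execute(to_sql(expr)).fetchall()
```

A projection over a selection, for example `Projection(["name"], Selection("semester > 2", Relation("students")))`, is translated into nested subqueries; a reference to a column that does not exist raises a `ValueError` before any SQL is executed.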
  • Analytical report on primitives-based execution of TPC-H queries
Supervisor:   Bala Gurumurthy
Abstract: TPC-H provides compute-intensive queries as a benchmark for comparing different data processing mechanisms. In this topic, we explore the different aspects of primitives that can be tuned for efficient processing of all these queries and provide an extensive analytical report on the impact of the tuning opportunities on these queries.
Goals and results: 
  • General primitive-based execution of TPC-H queries
  • Variants of the primitives
  • Analytical report on the primitive-based execution
Notes or requirements: 
  • Programming in C++ and OpenCL, plus Python or R (for charting)

Software Engineering Topics

  • Semi-automatic approaches to support systematic literature reviews (Master)
Supervisor:   Yusra Shakeel
Abstract: Systematic Literature Review (SLR) is a research methodology that aims to gather and evaluate all available evidence regarding a specific research topic. The process is composed of three phases: planning, conducting, and reporting. Although SLRs have gained immense popularity among evidence-based researchers, conducting the entire process manually can be very time-consuming. Hence, software engineering researchers are currently proposing semi-automatic approaches to support different phases of an SLR. In this thesis, you analyze the current state of research related to the reporting phase of the SLR process. Based on the findings, you develop an approach to support researchers with the steps involved in reporting the results of an SLR.
Goals and results: 
  • Determine the current state of the art in approaches for reporting an SLR
  • Propose and evaluate your concept to semi-automate the steps involved in this phase
  • Automate quality assessment of studies to support literature analysis (Bachelor/Master)
Supervisor:   Yusra Shakeel
Abstract: The number of empirical studies reported in software engineering has increased significantly over the past years. However, some of them have problems; for example, the approach used to conduct the study is unclear or the conclusions are incomplete. This makes it difficult for evidence-based researchers to conduct an effective and valid literature analysis. To overcome this problem, a methodology to assess the quality and validity of empirical studies is important. Because manually assessing the quality of empirical studies is quite challenging, we propose a semi-automatic approach. In this thesis, you improve the existing prototype for assessing the quality of articles. The aim is to provide the most promising studies relevant to answering a research question.
Goals and results: 
  • Extend existing prototype to assess quality of empirical studies
  • Evaluate the proposed approach
  • Cleaning Feature Models (Bachelor)
Supervisor:

Elias Kuiter

Keywords:

Software Product Lines, Feature Modeling

Context:

Software product lines (SPLs) are collections of many software products (called configurations) that share certain characteristics (called features). Thus, SPLs can be used to develop and maintain variant-rich, highly-customizable software systems (e.g., the Linux kernel). The variability of an SPL is made explicit in a feature model, which is typically visualized as a tree-like feature diagram. The tree structure of this diagram makes it hard to express complex dependencies between features, which are therefore expressed as additional propositional formulas (called cross-tree constraints). Such complex feature dependencies may cause inconsistencies in a feature model like dead features, which cannot be selected in any configuration.

Abstract:

A feature model of a configurable system can contain several kinds of inconsistencies (besides dead features, also false-optional features or redundant cross-tree constraints). There are analyses for detecting these inconsistencies; however, user action is still required to actually fix them. The aim of this thesis is to investigate how and to what degree this process can be automated; that is, whether feature models can be “cleaned” with a simple and intuitive procedure.

Goals:
  • Compare and discuss suitable strategies for cleaning (e.g., which redundant constraints to remove)
  • Implement promising strategies in FeatureIDE
  • Evaluate usability in a user study
Requirements:

Completion of ISP is strongly recommended.
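To make the notion of a dead feature from the context above concrete, here is a minimal brute-force sketch. It is not how FeatureIDE performs the analysis (state-of-the-art tools rely on solvers); the toy model, the names `FEATURES` and `CONSTRAINTS`, and the encoding of constraints as Python predicates are assumptions made purely for illustration.

```python
from itertools import product

def valid_configurations(features, constraints):
    """Enumerate all feature selections that satisfy every constraint."""
    for values in product([False, True], repeat=len(features)):
        config = dict(zip(features, values))
        if all(check(config) for check in constraints):
            yield config

def dead_features(features, constraints):
    """A feature is dead if no valid configuration selects it."""
    selectable = set()
    for config in valid_configurations(features, constraints):
        selectable.update(f for f, on in config.items() if on)
    return [f for f in features if f not in selectable]

# Hypothetical toy model: Root is mandatory, A and B are optional children,
# and two cross-tree constraints together make B unselectable.
FEATURES = ["Root", "A", "B"]
CONSTRAINTS = [
    lambda c: c["Root"],                  # the root is always selected
    lambda c: not c["A"] or c["Root"],    # a child implies its parent
    lambda c: not c["B"] or c["Root"],
    lambda c: not c["B"] or c["A"],       # cross-tree: B requires A
    lambda c: not (c["A"] and c["B"]),    # cross-tree: A excludes B
]

print(dead_features(FEATURES, CONSTRAINTS))
```

In this toy model, removing either of the two cross-tree constraints makes B selectable again, which is exactly the kind of choice a cleaning strategy has to make. Enumeration is exponential in the number of features, so the sketch only works for very small models.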

  • Configuring Attributed Feature Models (Bachelor/Master)
Supervisor:

Elias Kuiter

Keywords:

Software Product Lines, Highly-Configurable Software, Constraint Programming

Context:

Software product lines (SPLs) are collections of many software products (called configurations) that share certain characteristics (called features). Thus, SPLs can be used to develop and maintain variant-rich, highly-customizable software systems (e.g., the Linux kernel). The variability of an SPL is made explicit in a feature model, which is typically visualized as a tree-like feature diagram. This feature model is then used to create concrete configurations, a process that is typically supported within an advanced configuration editor. Such editors use state-of-the-art analysis techniques that are based on constraint programming with SAT/CSP solvers.

Abstract:

Attributed feature models extend common feature models with attributes for certain features, such as costs, memory consumption, or performance impact. In state-of-the-art tooling, such attributes cannot be used in configurations, although this is one of the main reasons to use attributes in the first place (for example, to set a maximum cost allowance). The aim of this thesis is to extend the configuration editor of FeatureIDE (a tool for developing SPLs) so that users can limit the value range of attributes during configuration. Additionally, common analyses for configurations (such as checking validity, auto-completion, or decision propagation) should be considered.

Goals:
  • Develop a concept and implementation to support configuration of attributed feature models
  • Adapt existing analyses for the configuration editor
  • Evaluate performance and/or usability
Requirements:

Completion of ISP is strongly recommended.

  • Cross-Tree Constraints for Attributed Feature Models (Bachelor/Master)
Supervisor:

Elias Kuiter

Keywords:

Software Product Lines, Feature Modeling, Constraint Programming

Context:

Software product lines (SPLs) are collections of many software products (called configurations) that share certain characteristics (called features). Thus, SPLs can be used to develop and maintain variant-rich, highly-customizable software systems (e.g., the Linux kernel). The variability of an SPL is made explicit in a feature model, which is typically visualized as a tree-like feature diagram. The tree structure of this diagram makes it hard to express complex dependencies between features, which are therefore expressed as additional propositional formulas (called cross-tree constraints). Such complex feature dependencies may cause inconsistencies in a feature model like dead features, which cannot be selected in any configuration. To avoid these inconsistencies, state-of-the-art tooling uses analysis techniques that are based on constraint programming with SAT/CSP solvers.

Abstract:

Attributed feature models extend common feature models with attributes for certain features, such as costs, memory consumption, or performance impact. In state-of-the-art tooling, such attributes cannot be used in cross-tree constraints, although this is one of the main reasons to use attributes in the first place (for example, to set ranges for integer attributes). The aim of this thesis is to extend the feature model editor of FeatureIDE (a tool for developing SPLs) to allow for cross-tree constraints over attributes. Additionally, common feature model analyses (such as checking the validity of a feature model, finding dead/core features, or finding redundant constraints) should be considered.

Goals:
  • Develop a concept and implementation to support constraints over (at least) integer and enumeration attributes
  • Adapt existing feature model analyses to such constraints
  • Evaluate performance on several feature models
Requirements:

Completion of ISP is strongly recommended.

  • Efficient Detection of Indeterminate Hidden Features in Feature Models (Bachelor/Master)
Supervisor:

Elias Kuiter

Keywords:

Software Product Lines, Feature Modeling, Propositional Logic

Context:

Software product lines (SPLs) are collections of many software products (called configurations) that share certain characteristics (called features). Thus, SPLs can be used to develop and maintain variant-rich, highly-customizable software systems (e.g., the Linux kernel). The variability of an SPL is made explicit in a feature model, which is typically visualized as a tree-like feature diagram. This feature model is then used to create concrete configurations, a process that is typically supported within an advanced configuration editor. To support the feature modeling and configuration processes, state-of-the-art tooling uses analysis techniques that are based on SAT solver technology.

Abstract:

Hidden features can be used to mark features that are not configurable by end users, but are automatically (de-)selected by the configuration editor according to additional constraints in the feature model. A hidden feature is called indeterminate if there is at least one configuration in which all regular features are defined but a value for the hidden feature cannot be deduced. Indeterminate hidden features can cause problems during configuration and should therefore be detected beforehand, which is a time-consuming task. The aim of the thesis is to optimize the current analysis in FeatureIDE (a tool for developing SPLs) for finding indeterminate hidden features to improve its performance and therefore applicability in the field.

Goals:
  • Develop an improved analysis for finding indeterminate hidden features in FeatureIDE
  • Evaluate the new analysis by comparing it to previous algorithms
Requirements:

Completion of ISP is strongly recommended.
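The definition of an indeterminate hidden feature given in the abstract can be written down directly as a brute-force check. The sketch below is an assumption-laden illustration, not the existing FeatureIDE analysis: constraints are encoded as Python predicates, and the function name `indeterminate_hidden` is hypothetical.

```python
from itertools import product

def indeterminate_hidden(features, hidden, constraints):
    """Return hidden features whose value is not fixed by the regular features."""
    regular = [f for f in features if f not in hidden]
    hidden_list = [f for f in features if f in hidden]
    result = set()
    for reg_values in product([False, True], repeat=len(regular)):
        base = dict(zip(regular, reg_values))
        # For each hidden feature, collect the values it takes in valid
        # completions of this assignment of the regular features.
        seen = {h: set() for h in hidden_list}
        for hid_values in product([False, True], repeat=len(hidden_list)):
            config = {**base, **dict(zip(hidden_list, hid_values))}
            if all(check(config) for check in constraints):
                for h in hidden_list:
                    seen[h].add(config[h])
        # Both values possible means the editor cannot deduce the value.
        result.update(h for h, values in seen.items() if len(values) == 2)
    return sorted(result)

# Hypothetical example: H1 is fully determined by A, whereas H2 is only
# restricted by B and stays open whenever B is selected.
example = indeterminate_hidden(
    ["A", "B", "H1", "H2"],
    {"H1", "H2"},
    [
        lambda c: c["H1"] == c["A"],        # H1 if and only if A
        lambda c: not c["H2"] or c["B"],    # H2 requires B
    ],
)
print(example)
```

The nested enumeration is exponential in the number of features, which hints at why the task is time-consuming in practice and why an optimized analysis is worth investigating.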

  • Efficiently Counting Feature Model Configurations (Bachelor/Master)
Supervisor:

Elias Kuiter

Keywords:

Software Product Lines, Feature Modeling, Propositional Logic

Context:

Software product lines (SPLs) are collections of many software products (called configurations) that share certain characteristics (called features). Thus, SPLs can be used to develop and maintain variant-rich, highly-customizable software systems (e.g., the Linux kernel). The variability of an SPL is made explicit in a feature model, which is typically visualized as a tree-like feature diagram. This feature model is then used to create concrete configurations, a process that is typically supported within an advanced configuration editor. To support the feature modeling and configuration processes, state-of-the-art tooling uses analysis techniques that are based on SAT solver technology.

Abstract:

Counting how many configurations a feature model comprises can be a time-consuming task. However, sometimes an estimate of the number of configurations is sufficient. There are already algorithms for precisely and approximately counting the number of configurations that a feature model represents. The aim of this thesis is to create a systematic overview of all available methods for exact and approximate configuration counting. The overview should compare the researched methods with respect to efficiency and accuracy. At least one approximate counting approach should be implemented in FeatureIDE and evaluated against contemporary SAT and #SAT approaches.

Goals:
  • Systematic research and survey of methods for (approximate) configuration counting for feature models
  • Evaluate and compare the researched methods
Requirements:

Completion of ISP is strongly recommended.
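The difference between exact and approximate counting can be illustrated with a small sketch. The exact counter below simply enumerates all assignments, and the estimator is a naive Monte Carlo sampler; neither is one of the #SAT or approximate counting algorithms the thesis would survey. The validity predicate and the toy model are assumptions chosen for illustration.

```python
import random
from itertools import product

def count_exact(n_features, is_valid):
    """Count valid configurations by enumerating all 2^n assignments."""
    return sum(
        1 for bits in product([False, True], repeat=n_features) if is_valid(bits)
    )

def count_estimate(n_features, is_valid, samples=10_000, seed=0):
    """Estimate the count from the fraction of valid uniform random assignments."""
    rng = random.Random(seed)
    hits = sum(
        1
        for _ in range(samples)
        if is_valid(tuple(rng.random() < 0.5 for _ in range(n_features)))
    )
    return hits / samples * 2 ** n_features

# Hypothetical toy model with 10 features: feature 0 is mandatory and
# features 1 and 2 exclude each other.
def toy_model(bits):
    return bits[0] and not (bits[1] and bits[2])

exact = count_exact(10, toy_model)
estimate = count_estimate(10, toy_model)
print(exact, round(estimate))
```

The estimator's cost depends on the number of samples rather than on the size of the configuration space, but its accuracy degrades sharply when valid configurations are rare, which is typical for real feature models. This trade-off between efficiency and accuracy is the kind of comparison the overview should make systematic.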

  • Evolution-Aware Configuration Sampling for Software Product Lines (Bachelor/Master)
Supervisor:

Elias Kuiter

Keywords:

Software Product Lines, Software Testing, Evolution

Context:

Software product lines (SPLs) are collections of many software products (called configurations) that share certain characteristics (called features). Thus, SPLs can be used to develop and maintain variant-rich, highly-customizable software systems (e.g., the Linux kernel). A key challenge in managing SPLs is the potentially large (exponential) number of products, which makes it hard to test all products in an SPL — for example, there are easily more variants of the Linux kernel than there are atoms in the universe. In practice, this means that testing an SPL often amounts to testing some number of configurations and hoping that such a configuration sample covers most potential faults in the SPL. One class of faults in SPLs are so-called feature interactions, which are bugs that only occur for very specific combinations of selected features. For example, two device drivers in the Linux kernel may work fine on their own, but be incompatible with each other, which causes crashes when both drivers are loaded at the same time.

Abstract:

A contemporary technique for testing SPLs is to generate a representative set of configurations (i.e., a configuration sample) and run tests for every configuration in the sample. However, generating such samples can be time-intensive, and if the SPL is changed, samples become invalid and must be generated completely from scratch. The aim of this thesis is to find an efficient technique to update an existing sample such that it can accommodate changes to the SPL. Thus, after changes to the feature model (which expresses the valid configurations of an SPL) have been made, parts of an existing sample should be reused to avoid complete re-calculation.

Goals:
  • Develop and implement a technique for efficiently updating a configuration sample after feature model evolution
  • Evaluate the proposed approach
Requirements:

Completion of ISP is strongly recommended.
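One simple way to reuse an existing sample is sketched below. It is a naive baseline under stated assumptions, not the technique the thesis is expected to produce: configurations are dictionaries, newly added features default to deselected, the coverage goal is only feature-wise (every feature value that can occur does occur in the sample), and missing coverage is restored by brute-force enumeration. The function names are hypothetical.

```python
from itertools import product

def valid_configs(features, is_valid):
    """Enumerate all valid configurations of the (evolved) feature model."""
    for values in product([False, True], repeat=len(features)):
        config = dict(zip(features, values))
        if is_valid(config):
            yield config

def update_sample(old_sample, features, is_valid):
    """Reuse still-valid configurations, then restore feature-wise coverage."""
    sample = []
    for old in old_sample:
        # Assumption: features added by the change start out deselected,
        # and features removed by the change are dropped.
        config = {f: old.get(f, False) for f in features}
        if is_valid(config) and config not in sample:
            sample.append(config)
    reused = len(sample)
    covered = {pair for config in sample for pair in config.items()}
    for config in valid_configs(features, is_valid):
        # Add a configuration only if it covers a new (feature, value) pair.
        if set(config.items()) - covered:
            sample.append(config)
            covered |= set(config.items())
    return sample, reused

# Hypothetical evolution step: feature C is added and requires A.
old = [{"A": True, "B": False}, {"A": False, "B": True}]
sample, reused = update_sample(old, ["A", "B", "C"], lambda c: not c["C"] or c["A"])
print(reused, len(sample))
```

In this example both old configurations survive the change, and a single new configuration is added to cover the selection of C. A realistic approach would target stronger coverage criteria and avoid enumerating the configuration space.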

  • Identifying Feature Interactions in Software Product Lines (Bachelor/Master)
Supervisor:

Elias Kuiter

Keywords:

Software Product Lines, Software Testing

Context:

Software product lines (SPLs) are collections of many software products (called configurations) that share certain characteristics (called features). Thus, SPLs can be used to develop and maintain variant-rich, highly-customizable software systems (e.g., the Linux kernel). A key challenge in managing SPLs is the potentially large (exponential) number of products, which makes it hard to test all products in an SPL — for example, there are easily more variants of the Linux kernel than there are atoms in the universe. In practice, this means that testing an SPL often amounts to testing some number of configurations and hoping that such a configuration sample covers most potential faults in the SPL. One class of faults in SPLs are so-called feature interactions, which are bugs that only occur for very specific combinations of selected features. For example, two device drivers in the Linux kernel may work fine on their own, but be incompatible with each other, which causes crashes when both drivers are loaded at the same time.

Abstract:

A contemporary technique for testing SPLs is to generate a representative set of configurations (i.e., a configuration sample) and run tests for every configuration in the sample. In case one configuration fails some tests, the developers have to fix the corresponding fault in the SPL. The aim of this thesis is to detect the feature interaction(s) responsible for such a fault. That is, given a configuration sample for an SPL and some configuration(s) in the sample that fail a test, your technique should find the smallest set of selected features that is necessary to fail the test. This set then indicates the feature interaction to the developer, who can use this information to fix the corresponding bug.

Goals:
  • Investigate techniques to find faulty feature interactions for a given configuration sample
  • Evaluate the performance of these techniques
Requirements:

Completion of ISP is strongly recommended.
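The search for the smallest set of selected features that still fails a test can be sketched as follows. The sketch assumes that the test can be rerun on arbitrary subsets of the failing configuration, which in a real SPL only holds if those subsets are valid configurations; the oracle `fails` and the feature names are hypothetical.

```python
from itertools import combinations

def smallest_failing_subset(failing_config, fails):
    """Return a smallest subset of the selected features that still fails."""
    features = sorted(failing_config)
    # Try subsets in order of increasing size, so the first hit is minimal.
    for size in range(len(features) + 1):
        for subset in combinations(features, size):
            if fails(set(subset)):
                return set(subset)
    return None

# Hypothetical oracle: the test fails whenever both drivers are selected.
def fails(selected):
    return {"DriverA", "DriverB"} <= selected

interaction = smallest_failing_subset(
    {"Core", "DriverA", "DriverB", "Net"}, fails
)
print(interaction)
```

The exhaustive search needs exponentially many test runs in the worst case, so a thesis would investigate techniques that need far fewer runs or that infer interactions from the results already available for the sample.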

Working students and open positions

Currently, there are no open job positions.

Scientific team projects

For scientific team projects, we offer a lecture: 

In the first lecture, several topics are presented for students to work on during the semester.

Software Projects

Currently, we offer the following topics for software projects:

  • Analytic Dashboard for SQLValidator
Supervisor:   Victor Obionwu
Description: Desired features: An admin page where the respective survey statistics are analyzed and classified, and a student page that shows statistics about the students’ work patterns over the past weeks and months with respect to the exercise tasks, self-checks, and team activities. The dashboard will also be able to provide recommendations based on a student’s survey and task submissions.
  • Improvements for SQLValidator (Bachelor)
Contact person:  David Broneske
Description: In this software project, the existing tool SQLValidator is to be extended with additional functionality. The functionality to be implemented is agreed upon with the supervisor and can be extended or restricted as needed. Possible tasks are:
  • User statistics on completed exercises
  • User account management
  • Support for multiple student cohorts
  • Duplication of exercises
  • Checking the correctness of exercises when they are created
  • Submission of ER exercises
Goals and results: 
  • Implementation of additional functions in SQLValidator
  • Data Quality in the Data Center: The Next Level Is Within Reach (Bachelor)
Contact person:  David Broneske, Marcus Pöhls
Description: Implementation of an application for improving the data quality of CPU data from a data center. First, the data quality is analyzed; it is then improved with the help of the APIs from Intel and AMD. The data on the hardware infrastructure will be provided for the project. A concrete example: for a machine without any information on its CPU cores, the gap can be closed via the processor model (e.g., Intel Xeon Processor E5-2650 v3). The software project also includes duplicate detection; that is, for machines that appear several times in the data set, possibly even with different CPU data, the "best" (most plausible) record is used.
Goals and results: 
  • Research and implementation of algorithms for data quality analysis
  • API integration with Intel/AMD
  • Finding and cleaning up duplicates
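The two cleaning steps described for the data center project, filling in missing core counts from the processor model and keeping the most plausible record per machine, can be sketched as follows. Everything specific here is an assumption: the lookup table `CORES_BY_MODEL` merely stands in for vendor data (it does not call any Intel or AMD API, and its model name and core count are made up), the record fields are hypothetical, and "most plausible" is approximated by "most fields filled in".

```python
# Hypothetical lookup table standing in for vendor data; the model name and
# core count are illustrative and not taken from an Intel or AMD API.
CORES_BY_MODEL = {"Example CPU X1": 8}

def fill_missing_cores(record, cores_by_model):
    """Fill in the core count from the processor model if it is missing."""
    if record.get("cores") is None:
        cores = cores_by_model.get(record.get("cpu_model"))
        if cores is not None:
            return {**record, "cores": cores}
    return record

def completeness(record):
    """Number of fields that carry a value (a crude plausibility score)."""
    return sum(value is not None for value in record.values())

def deduplicate(records):
    """Keep, per machine, the record with the most filled-in fields."""
    best = {}
    for record in records:
        key = record["machine"]
        if key not in best or completeness(record) > completeness(best[key]):
            best[key] = record
    return list(best.values())

records = [
    {"machine": "m1", "cpu_model": "Example CPU X1", "cores": None},
    {"machine": "m1", "cpu_model": None, "cores": None},
    {"machine": "m2", "cpu_model": "Unknown", "cores": 4},
]
cleaned = deduplicate([fill_missing_cores(r, CORES_BY_MODEL) for r in records])
print(cleaned)
```

Filling gaps before deduplicating means that a record repaired from the lookup table can win over a sparser duplicate. Conflicting values between duplicates would need a more careful plausibility measure than the field count used here.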
  • Carbon2Json: Improve Conversion Time from CARBON to JSON
About Profile the current implementation to find bottlenecks in the multi-step conversion routine, design and implement new concepts, and improve existing ones.
Topic type  Team Project
Supervisor   Marcus Pinnecke
Fundamentals  See guest lecture A Gentle Introduction to Document Stores and Querying with the SQL/JSON Path Language in Advanced Topics of Database Systems
Context  See our open source project Protobase/libcarbon on GitHub in which you will add and evaluate the required functionality. You may read a brief introduction to the system first: Pinnecke et al. Protobase: It's About Time for Backend/Database Co-Design. BTW 2019
Requirements  In addition to formal requirements for a thesis (e.g., number of credit points), you match the following profile.

You have had courses on practical application of computer science concepts including programming languages, and knowledge of data structures along with their implementation. Moreover, you have good abstraction skills, and the drive to work with a non-trivial code base (35k lines of C code). Finally, you have the following focus during your studies:
  1. You have had a course for Database Implementations (e.g., Databases II) or similar
  2. You have had a course for Modern Database Concepts (e.g., Advanced Topics in Databases) or similar
  3. You have had a course for C (C++) programming (e.g., Architecting and Engineering Main Memory Database Systems in Modern C) or practice of C programming
A passion for the topic is a precondition.
Does it fit?   Before you start working on the topic, we typically have a first meeting to check whether the topic and requirements fit your profile. As preparation, you may have a look at the following programming task: Optimizing Memory Pooling for In-Memory Database Systems.
  • Quality: Testing of Several Components in Libcarbon, Protobase and NG5
About Design and implement unit and integration tests for several components in the library.
Topic type  Software Project
Supervisor   Marcus Pinnecke
Fundamentals  See guest lecture A Gentle Introduction to Document Stores and Querying with the SQL/JSON Path Language in Advanced Topics of Database Systems
Context  See our open source project Protobase/libcarbon on GitHub in which you will add and evaluate the required functionality. You may read a brief introduction to the system first: Pinnecke et al. Protobase: It's About Time for Backend/Database Co-Design. BTW 2019
Requirements  In addition to formal requirements for a thesis (e.g., number of credit points), you match the following profile.

You have had courses on practical application of computer science concepts including programming languages, and knowledge of data structures along with their implementation. Moreover, you have good abstraction skills, and the drive to work with a non-trivial code base (35k lines of C code). Finally, you have the following focus during your studies:
  1. You have had a course for Database Implementations (e.g., Databases II) or similar
  2. You have had a course for Modern Database Concepts (e.g., Advanced Topics in Databases) or similar
  3. You have had a course for C (C++) programming (e.g., Architecting and Engineering Main Memory Database Systems in Modern C) or practice of C programming
A passion for the topic is a precondition.
Does it fit?   Before you start working on the topic, we typically have a first meeting to check whether the topic and requirements fit your profile. As preparation, you may have a look at the following programming task: Optimizing Memory Pooling for In-Memory Database Systems.
  • Split&Merge: Efficient Splitting and Merging of CARBON Archives
About Currently, CARBON archives are constructed from a user-empowered JSON collection and are read-only afterwards. In preparation for physical optimizations (such as undo archiving) and defragmentation, archives must be splittable and mergeable. This thesis is about these operations.
Topic type  Software Project
Supervisor   Marcus Pinnecke
Fundamentals  See guest lecture A Gentle Introduction to Document Stores and Querying with the SQL/JSON Path Language in Advanced Topics of Database Systems
Context  See our open source project Protobase/libcarbon on GitHub in which you will add and evaluate the required functionality. You may read a brief introduction to the system first: Pinnecke et al. Protobase: It's About Time for Backend/Database Co-Design. BTW 2019
Requirements  In addition to formal requirements for a thesis (e.g., number of credit points), you match the following profile.

You have had courses on practical application of computer science concepts including programming languages, and knowledge of data structures along with their implementation. Moreover, you have good abstraction skills, and the drive to work with a non-trivial code base (35k lines of C code). Finally, you have the following focus during your studies:
  1. You have had a course for Database Implementations (e.g., Databases II) or similar
  2. You have had a course for Modern Database Concepts (e.g., Advanced Topics in Databases) or similar
  3. You have had a course for C (C++) programming (e.g., Architecting and Engineering Main Memory Database Systems in Modern C) or practice of C programming
A passion for the topic is a precondition.
Does it fit?   Before you start working on the topic, we typically have a first meeting to check whether the topic and requirements fit your profile. As preparation, you may have a look at the following programming task: Optimizing Memory Pooling for In-Memory Database Systems.
  • StringIdRewrite: Embedding of String ID Resolution w/o Indexes in CARBON
About In the current form, resolving a fixed-length string reference in a CARBON archive requires, in case of a cache miss, resolving the reference (string id) to the offset inside the string table on disk. This thesis is about rewriting archives by replacing string ids with their offsets.
Topic type  Software Project
Supervisor   Marcus Pinnecke
Fundamentals  See guest lecture A Gentle Introduction to Document Stores and Querying with the SQL/JSON Path Language in Advanced Topics of Database Systems
Context  See our open source project Protobase/libcarbon on GitHub in which you will add and evaluate the required functionality. You may read a brief introduction to the system first: Pinnecke et al. Protobase: It's About Time for Backend/Database Co-Design. BTW 2019
Requirements  In addition to formal requirements for a thesis (e.g., number of credit points), you match the following profile.

You have had courses on practical application of computer science concepts including programming languages, and knowledge of data structures along with their implementation. Moreover, you have good abstraction skills, and the drive to work with a non-trivial code base (35k lines of C code). Finally, you have the following focus during your studies:
  1. You have had a course for Database Implementations (e.g., Databases II) or similar
  2. You have had a course for Modern Database Concepts (e.g., Advanced Topics in Databases) or similar
  3. You have had a course for C (C++) programming (e.g., Architecting and Engineering Main Memory Database Systems in Modern C) or practice of C programming
A passion for the topic is a precondition.
Does it fit?   Before you start working on the topic, we typically have a first meeting to check whether the topic and requirements fit your profile. As preparation, you may have a look at the following programming task: Optimizing Memory Pooling for In-Memory Database Systems.
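The idea behind the StringIdRewrite topic, replacing string ids with offsets so that no id-to-offset lookup is needed when reading, can be illustrated with a toy string table. The layout below (a 4-byte little-endian length followed by UTF-8 bytes) and all function names are assumptions for illustration only; they do not describe the actual CARBON archive format or the libcarbon API.

```python
def build_string_table(strings):
    """Lay strings out back to back; return the table and each id's offset."""
    table = bytearray()
    offset_of = {}
    for string_id, text in strings.items():
        offset_of[string_id] = len(table)
        data = text.encode("utf-8")
        # Hypothetical entry layout: 4-byte length prefix, then the bytes.
        table += len(data).to_bytes(4, "little") + data
    return bytes(table), offset_of

def rewrite_references(references, offset_of):
    """Replace every string id with the offset of its table entry."""
    return [offset_of[ref] for ref in references]

def read_string(table, offset):
    """Read a string directly at an offset, without any id lookup."""
    length = int.from_bytes(table[offset:offset + 4], "little")
    return table[offset + 4:offset + 4 + length].decode("utf-8")

table, offset_of = build_string_table({101: "alpha", 202: "beta"})
rewritten = rewrite_references([202, 101, 202], offset_of)
print(rewritten, [read_string(table, off) for off in rewritten])
```

After the rewrite, the mapping `offset_of` is only needed once, at rewrite time; readers follow the stored offsets directly. A stored offset stays fixed-length like the id it replaces, but it becomes invalid whenever the string table is reorganized.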
  • JSON Check Tool as Separate Tool
About Currently, in the CARBON Tool (carbon-tool) there is a sub module to check whether a particular JSON file is parsable and satisfies the criteria for conversion into CARBON archives (checkjs). Since this logic is shared with the BISON Tool (bison-tool), the task is to move the module in carbon-tool to a dedicated new tool called checkjs.
Topic type  Software Project
Supervisor   Marcus Pinnecke
Fundamentals  See guest lecture A Gentle Introduction to Document Stores and Querying with the SQL/JSON Path Language in Advanced Topics of Database Systems
Context  See our open source project Protobase/libcarbon on GitHub in which you will add and evaluate the required functionality. You may read a brief introduction to the system first: Pinnecke et al. Protobase: It's About Time for Backend/Database Co-Design. BTW 2019
Requirements  In addition to formal requirements for a thesis (e.g., number of credit points), you match the following profile.

You have had courses on practical application of computer science concepts including programming languages, and knowledge of data structures along with their implementation. Moreover, you have good abstraction skills, and the drive to work with a non-trivial code base (35k lines of C code). Finally, you have the following focus during your studies:
  1. You have had a course for Database Implementations (e.g., Databases II) or similar
  2. You have had a course for Modern Database Concepts (e.g., Advanced Topics in Databases) or similar
  3. You have had a course for C (C++) programming (e.g., Architecting and Engineering Main Memory Database Systems in Modern C) or practice of C programming
A passion for the topic is a precondition.
Does it fit?   Before you start working on the topic, we typically have a first meeting to check whether the topic and requirements fit your profile. As preparation, you may have a look at the following programming task: Optimizing Memory Pooling for In-Memory Database Systems.

Templates

For theses and presentation templates, have a look at the German version of this page.

 

Last Modification: 02.03.2021 - Contact Person:
