Carnegie Mellon University
Browse
CMU-CS-24-107.pdf (3.38 MB)

On Embedding Database Management System Logic in Operating Systems via Restricted Programming Environments

Download (3.38 MB)
thesis
posted on 2024-06-25, 19:39 authored by Matthew ButrovichMatthew Butrovich

 The ever-increasing improvement in computer storage and network performance means that disk I/O and network communication are often no longer bottlenecks in database management systems (DBMSs). Instead, the overheads associated with operating system (OS) services (e.g., system calls, thread scheduling, and data movement from kernel-space) limit query processing responsiveness. User-space applications like DBMSs can elide these overheads with a kernel-bypass design. However, extracting benefits from kernel-bypass frameworks is challenging, and the libraries are incompatible with standard deployment and debugging tools. 

This dissertation presents an alternative implementation strategy for systems called user-bypass—a design that extends OS behavior for DBMS-specific features, including observability, networking, and query execution. Historically, DBMS developers avoid kernel extensions for safety and security reasons, but recent improvements in OS extensibility present new opportunities. Developers write safe, event-driven programs with user-bypass to push DBMS logic into the kernel and avoid user-space overheads. When a DBMS in user-space invokes these programs, user-bypass provides behavior similar to a new OS system call, albeit without kernel modifications. Alternatively, when an OS thread or interrupt triggers these programs in kernel-space, user-bypass inserts DBMS logic into the kernel stack. 

In this dissertation, we will introduce three systems that use the user-bypass method in DBMSs. First, we present an observability framework that employs userbypass to collect training data for self-driving DBMSs that reduces the number of round trips to kernel-space to retrieve performance counters and other system metrics. Next, we present a database proxy that applies user-bypass to support features like connection pooling and workload replication while reducing data copying and user-space thread scheduling. Lastly, we present an embedded that provides ACID transactions over multi-versioned data in kernel-space. 

The techniques in this dissertation show user-bypass benefits across multiple DBMS design disciplines and provide a template for future DBMS and OS co-design. 

History

Date

2024-05-01

Degree Type

  • Dissertation

Department

  • Computer Science

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Andrew Pavlo

Usage metrics

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC