Abstract
The evolution of distributed data management systems, especially a class of systems developed during the last 15-20 years commonly known as NoSQL data stores, has led to a multitude of designs optimised for different application types, data formats, and workload characteristics. Given the complexity of the environments they operate in, as parts of multi-tier software stacks driven by Internet workloads, data stores face significant challenges during their operation. An important objective for service operators is to ensure that data store performance levels and guarantees are maintained despite the internal or external changes the stores face. This objective can be reached via automated adaptation mechanisms, by which data stores adapt to changes automatically and transparently while maintaining efficiency and performance goals as they transition to new configurations.

In this dissertation we explore adaptation mechanisms in distributed data stores facing internally or externally induced changes, with a focus on workload variations, occasional background activities, and the evolution of an external middleware component that interoperates with a distributed data store. We propose novel adaptation mechanisms and improvements to existing mechanisms in three different contexts (data store elasticity, masking background activities, and alignment with external distributed middleware). The aim is to improve overall performance in these contexts during the lifecycle of scalable data stores, targeting challenges that had not been addressed so far.

First, this dissertation focuses on the expansion phase of a data store, when the need arises to adapt its capacity as workload demands increase and the system tries to improve its performance by incorporating more resources. We study the performance impact of data transfers over the network during this phase and propose a mechanism that schedules data transfers in a fine-grained manner, reducing their performance impact while incrementally increasing processing capacity. The proposed method realizes early benefits from data transfers during the elasticity action, as it incorporates new resources and makes data sub-sections available before the full data transfers complete.

Next, we study the overhead of background activities that often impact data store performance. We propose replica-group reconfiguration as a way to mask performance bottlenecks in replicated data stores and investigate the benefits of changing replica-group leadership prior to resource-intensive background tasks (e.g. internal data reorganization, garbage collection, or data backup tasks). We observed an occasional performance glitch during reconfiguration actions, caused by cold-cache misses at a new leader that was not adequately prepared for the transition to the new configuration. This observation led us to propose a new mechanism that maintains up-to-date read caches across replicas, without affecting data consistency and availability, by disseminating read-hints within the replica group.

Finally, we investigate the benefits of automatically aligning data stores with the distributed middleware systems that rely on those data stores to maintain their state. We do so by appropriately co-locating data partitions of the data store with processing tasks of the distributed middleware systems. We propose a system that continuously strives to discover such alignment opportunities across systems and improve data locality. The alignment actions combine multiple data store mechanisms in common use, such as data replication and migration, with the adaptation of partitioning schemes across systems, a mechanism that has not been studied before in this context.

The evaluation of the proposed mechanisms over widely deployed systems confirms their performance improvements, advancing the state of the art in distributed data stores in the direction of systems that adapt more efficiently, and in new ways, to internal and external changes in their lifecycle.