Abstract
Concurrent search trees are one of the most popular and widely used family of data structures. They are used in applications where it is necessary to store a large volume of sorted data with the ability to efficiently search, insert, remove, as well as more advanced operations, such as range queries. Due to their importance, a large amount of research has led to many different types of search trees with different characteristics such as, for example, the max allowed length of a path of the tree. Each search tree provides different performance guarantees for each tree operation and each tree is chosen based on the needs of the specific application.With the proliferation of multicores, where multiple threads execute concurrently and access shared data, concurrent data structures have become a critical component of parallel applications. In concurrent data structures it is necessary to coordinate the concurrent accesses by multiple threads in a way that guarantees the integrity of the dat ...
Concurrent search trees are one of the most popular and widely used family of data structures. They are used in applications where it is necessary to store a large volume of sorted data with the ability to efficiently search, insert, remove, as well as more advanced operations, such as range queries. Due to their importance, a large amount of research has led to many different types of search trees with different characteristics such as, for example, the max allowed length of a path of the tree. Each search tree provides different performance guarantees for each tree operation and each tree is chosen based on the needs of the specific application.With the proliferation of multicores, where multiple threads execute concurrently and access shared data, concurrent data structures have become a critical component of parallel applications. In concurrent data structures it is necessary to coordinate the concurrent accesses by multiple threads in a way that guarantees the integrity of the data structure and the correctness of the operations. This coordination is achieved using some kind of synchronization mechanism such as, locks, hardware-provided atomic operations, Read-Copy-Update (RCU) and Transactional Memory (TM).Despite the high amount of prior work, it still remains challenging to implement highly efficient concurrent search trees. This is mainly due to the fact that both traditional synchronization methods (i.e., locks and atomic operations) and more novel ones (i.e., Read-Copy-Update and Transactional Memory) fail to provide solutions that are generic and at the same time able to attain high performance under diverse execution scenarios.Until recently, TM was mainly implemented in software and used through a library. However, recently two of the biggest processor manufacturers, Intel and IBM, have added support for Transactional Memory in the hardware level, allowing TM to be used without the large overheads imposed by the software implementations. In this work, we explore how HTM can be used to implement highly efficient concurrent search trees. More specifically, we present RCU-HTM, a synchronization mechanism that combines RCU and HTM, and: a) supports the implementation of a concurrent version of any type of search tree, and b) achieves high performance across all execution scenarios.In RCU-HTM threads that modify the tree structure in any way work in copies of the affected part of the tree. Once their local copy is ready, they use HTM to validate that the part of the tree that will be replaced has not been modified in the meanwhile and, if this is true, to replace the old part of the tree with their new modified version.To showcase the capabilities of our technique, we implement and evaluate multiple RCU-HTM trees and compare their performance with several state-of-the-art competitors. More specifically, we apply RCU-HTM to 12 different types of binary, B+-trees and (a-b)-trees and compare against several state-of-the-art implementations that use 4 different synchronization mechanisms, namely locks, atomic operations, RCU, and HTM. We evaluate the trees under multiple different execution scenarios by varying the size of the keys stored in the tree, the size of the trees, the operations mix, and the number of threads, for a total of 630 execution scenarios for each implementation. We also evaluate the search trees using two well-known real-life benchmarks, namely TPCC and YCSB, which are widely used for the evaluation of database management systems. Our evaluation shows that in the majority of executions, RCU-HTM trees outperform their state-of-the-art alternatives, and even in the few cases where they do not, their performance is very close to that of the best performing implementation.
show more