Improving RRB-Tree Performance through Transience by Jean Niklas L’orange.
Abstract:
The RRB-tree is a confluently persistent data structure based on the persistent vector, with efficient concatenation and slicing, and effectively constant time indexing, updates and iteration. Although efficient appends have been discussed, they have not been properly studied.
This thesis formally describes the persistent vector and the RRB-tree, and presents three optimisations for the RRB-tree which have been successfully used in the persistent vector. The differences between the implementations are discussed, and the performance is measured. To measure the performance, the C library librrb is implemented with the proposed optimisations.
Results show that the optimisations improve the append performance of the RRB-tree considerably, and suggest that its performance is comparable to mutable array lists in certain situations.
Jean’s thesis is available at: http://hypirion.com/thesis.pdf
Although immutable data structures are obviously better suited for parallel programming, years of hacks on mutable data structures have set a high bar for performance. Unreasonably, parallel programmers want the same level of performance from immutable data structures as from their current mutable ones. 😉
Research such as Jean’s moves functional languages one step closer to being the default for parallel processing.
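For readers who have not met the persistent vector that the abstract builds on, here is a minimal sketch of the idea in Python. It is not taken from the thesis and is not librrb's API: the names `build`, `get` and `update` are hypothetical, and the branching factor of 4 is an assumption chosen for readability (real implementations typically use a wider fan-out). It only illustrates why indexing is "effectively constant time" and how an update copies a single root-to-leaf path while sharing everything else. Concatenation, slicing and transience are left out.

```python
BITS = 2                  # assumption: 4-way branching, kept small for readability
WIDTH = 1 << BITS
MASK = WIDTH - 1


def build(items):
    """Build a dense trie from a non-empty list; returns (root, shift)."""
    # Leaves are tuples of up to WIDTH items.
    level = [tuple(items[i:i + WIDTH]) for i in range(0, len(items), WIDTH)]
    shift = 0
    # Group nodes WIDTH at a time until a single root remains.
    while len(level) > 1:
        level = [tuple(level[i:i + WIDTH]) for i in range(0, len(level), WIDTH)]
        shift += BITS
    return level[0], shift


def get(root, shift, i):
    """Radix lookup: BITS bits of the index pick the child at each level."""
    node = root
    while shift > 0:
        node = node[(i >> shift) & MASK]
        shift -= BITS
    return node[i & MASK]


def update(node, shift, i, value):
    """Return a new root with index i set to value, copying only one path."""
    slot = (i >> shift) & MASK
    if shift == 0:
        new_child = value
    else:
        new_child = update(node[slot], shift - BITS, i, value)
    # The new tuple reuses every sibling subtree of the old node unchanged.
    return node[:slot] + (new_child,) + node[slot + 1:]


root, shift = build(list(range(10)))
root2 = update(root, shift, 5, "x")
```

After the update, `root` still describes the original ten elements, while `root2` sees the new value at index 5 and shares the untouched leaves with `root`. Transience, as used in the thesis title, is roughly the idea of skipping that copy when no other version can observe the node being changed.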