Paper ID: 2204.00470
Continuous Integration of Data Histories into Consistent Namespaces
Mark Burgess, Andras Gerlits
We describe a policy-based approach to the scaling of shared data services, using a hierarchy of calibrated data pipelines to automate the continuous integration of data flows. While there is no unique solution to the problem of time order, we show how to use a fair interleaving to reproduce reliable 'latest version' semantics in a controlled way, by trading locality for temporal resolution. We thus establish an invariant global ordering from a spanning tree over all shards, with controlled scalability. This forms a versioned coordinate system (or versioned namespace) with consistent semantics and self-protecting rate-limited versioning, analogous to the publish-subscribe addressing schemes of Content Delivery Networks (CDN) or Named Data Networking (NDN).
Submitted: Mar 30, 2022
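
The fair interleaving mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a flat round-robin over shards rather than the hierarchical spanning tree of the paper, and the names fair_interleave, shards, and quota are hypothetical. It shows only how a deterministic visiting order plus a per-round quota can give every shard a bounded turn and assign each update a single global version number.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple


def fair_interleave(
    shards: Dict[str, Iterable[str]], quota: int = 1
) -> List[Tuple[int, str, str]]:
    """Merge per-shard update streams into one globally versioned sequence.

    Illustrative assumption: shards are visited in sorted name order, and
    each shard may emit at most `quota` updates per round (a crude rate limit).
    """
    if quota < 1:
        raise ValueError("quota must be at least 1")

    # Sorting the shard names makes the visiting order independent of
    # how the caller happened to build the dictionary.
    queues = {name: deque(items) for name, items in sorted(shards.items())}

    merged: List[Tuple[int, str, str]] = []
    version = 0
    while any(queues.values()):  # an empty deque is falsy
        for name, queue in queues.items():
            for _ in range(quota):
                if not queue:
                    break
                version += 1
                merged.append((version, name, queue.popleft()))
    return merged


if __name__ == "__main__":
    history = fair_interleave({"b": ["b1", "b2", "b3"], "a": ["a1"]})
    for version, shard, update in history:
        print(version, shard, update)
```

Raising quota lets each shard emit longer local runs before yielding, which loosely mirrors the trade between locality and temporal resolution described in the abstract: larger quotas preserve more local order per turn, while the global sequence interleaves shards more coarsely.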