Paper ID: 2204.00470

Continuous Integration of Data Histories into Consistent Namespaces

Mark Burgess, Andras Gerlits

We describe a policy-based approach to the scaling of shared data services, using a hierarchy of calibrated data pipelines to automate the continuous integration of data flows. While there is no unique solution to the problem of time order, we show how to use a fair interleaving to reproduce reliable "latest version" semantics in a controlled way, by trading locality for temporal resolution. We thus establish an invariant global ordering from a spanning tree over all shards, with controlled scalability. This forms a versioned coordinate system (or versioned namespace) with consistent semantics and self-protecting rate-limited versioning, analogous to the publish-subscribe addressing schemes used in Content Delivery Networks (CDN) or Named Data Networking (NDN).
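
As a rough illustration of what a fair interleaving of shard histories could look like, the sketch below merges per-shard sequences round-robin and stamps each entry with a global version number. It is a simplified, flat stand-in and not the authors' pipeline hierarchy or spanning-tree construction; the function name `fair_interleave`, the tuple layout, and the round-robin policy are all assumptions made for this example.

```python
from itertools import zip_longest

# Sentinel marking "this shard has no entry in this round" (hypothetical helper).
_MISSING = object()


def fair_interleave(shards):
    """Round-robin merge of per-shard histories into one global sequence.

    `shards` is a list of lists, each already in its own local order.
    Returns a list of (global_version, shard_index, item) tuples.
    Assumption: one entry per shard per round is "fair"; the paper's
    actual policy may differ.
    """
    merged = []
    version = 0
    # Each round takes at most one entry from every shard, in shard order.
    for round_items in zip_longest(*shards, fillvalue=_MISSING):
        for shard_index, item in enumerate(round_items):
            if item is _MISSING:
                continue  # this shard is exhausted; skip it
            version += 1
            merged.append((version, shard_index, item))
    return merged
```

Calling `fair_interleave` on the same shard histories always yields the same merged order, and each shard's local order is preserved within it. The global version number plays the role of a coordinate in the shared namespace, at the cost of resolving time only to the granularity of one round.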

Submitted: Mar 30, 2022