Skip to content

Checkpointing

Checkpointing needs atomic metadata publication even when bytes are stored in an external object store.

Current Primitive

PublishArtifact uploads the object body first, then publishes metadata with a single metadata command:

text
body bytes -> object store
metadata command:
  - inode attr
  - dentry projection
  - body descriptor

The namespace entry appears only after the metadata command commits.

Current And Remaining Semantics

  • atomic replace returns the old body descriptor;
  • durable object GC queue stores old body refs from remove/replace;
  • snapshot pins protect old body refs until retired;
  • chunk manifests describe large checkpoint files;
  • read-only snapshot FUSE mounts can expose a pinned subtree as /;
  • service-level typed watch replay lets checkpoint consumers observe publish and replace events;
  • live FUSE mounts consume typed watch replay to invalidate kernel entry/inode caches;

Remaining work:

  • SDK watch consumer integration and broader node-local cache invalidation;

Failure Handling

The product contract should be explicit:

  • object upload failure means no metadata publish;
  • metadata publish failure returns staged object refs for explicit cleanup;
  • metadata remove/replace success persists old body refs in the metadata GC queue and returns the old body descriptor to the caller;
  • snapshot pins protect old body refs from object cleanup until retired;
  • snapshot pins protect metadata history before history GC;
  • typed watch replay uses the durable watch log rather than MVCC history.