Object Layout

NoKV stores file bodies outside the metadata service. Metadata stores compact body descriptors; durability of bytes is delegated to the configured object store.

Chunk Layout

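Files are split into chunks, and each chunk is stored as immutable object blocks. The inode reaches those blocks through a body descriptor and per-chunk manifests: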

```text
file inode
  -> body descriptor
  -> chunk manifests
  -> object blocks
```

Default sizes:

```text
chunk_size = 64 MiB
block_size = 4 MiB
```
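
With these defaults, one chunk holds 16 blocks. The sketch below maps a logical file offset to a chunk index, a block index within that chunk, and an offset within the block. It assumes chunks and blocks are fixed-size and aligned to multiples of their sizes, which the defaults suggest but the text does not state; `locate` is a hypothetical helper name.

```python
MIB = 1024 * 1024
CHUNK_SIZE = 64 * MIB  # default chunk_size from the text
BLOCK_SIZE = 4 * MIB   # default block_size from the text

# Number of blocks needed to cover one full chunk.
BLOCKS_PER_CHUNK = CHUNK_SIZE // BLOCK_SIZE


def locate(offset, chunk_size=CHUNK_SIZE, block_size=BLOCK_SIZE):
    """Map a logical file offset to (chunk index, block index in chunk, offset in block).

    Assumption: chunks and blocks are aligned, fixed-size units.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    chunk_index, within_chunk = divmod(offset, chunk_size)
    block_index, within_block = divmod(within_chunk, block_size)
    return chunk_index, block_index, within_block
```

For example, an offset of 70 MiB falls 6 MiB into the second chunk, which is 2 MiB into that chunk's second block.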

Object block keys are generated by the metadata service:

```text
blocks/<mount>/<inode>/<generation>/<chunk>/<block>
```

Blocks are never modified in place. A replace or overwrite creates a new inode generation and atomically publishes a new manifest in metadata.
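
A minimal sketch of the key template, showing why an overwrite never collides with existing objects: bumping the generation changes every derived key. The function name `block_key` is hypothetical, and rendering numeric components as unpadded decimals is an assumption; the real encoding is not specified here.

```python
def block_key(mount, inode, generation, chunk, block):
    """Build a physical object key following the documented template."""
    # Assumption: numeric components are rendered as unpadded decimals.
    return f"blocks/{mount}/{inode}/{generation}/{chunk}/{block}"


# The same logical block before and after an overwrite of inode 42.
old_key = block_key("vol0", 42, 3, 0, 1)
new_key = block_key("vol0", 42, 4, 0, 1)
```

Because the two keys differ, the old object can stay readable until garbage collection removes it.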

Body Descriptor

```text
producer
digest_uri
size
content_type
generation
manifest_id
chunk_size
block_size
```

manifest_id is provider-neutral and stable for the artifact publish request. It is not the physical object key. Physical object keys are derived from mount, inode, generation, chunk, and block.
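
As an illustration only, the descriptor fields could be modeled as an immutable record. The field names come from the list above; the Python types, the sample values, and the `chunk_count` helper are assumptions, not the service's actual schema.

```python
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass(frozen=True)
class BodyDescriptor:
    # Field names follow the documented descriptor; types are assumed.
    producer: str
    digest_uri: str
    size: int
    content_type: str
    generation: int
    manifest_id: str
    chunk_size: int = 64 * MIB
    block_size: int = 4 * MIB

    def chunk_count(self):
        """Chunks needed to cover the body (ceiling division; 0 for an empty body)."""
        return -(-self.size // self.chunk_size)


desc = BodyDescriptor(
    producer="example-uploader",         # hypothetical value
    digest_uri="sha256:<hex digest>",    # placeholder; the URI format is not specified
    size=150 * MIB,
    content_type="application/octet-stream",
    generation=1,
    manifest_id="manifest-example",      # hypothetical value; not an object key
)
```

The descriptor carries no object keys: a reader would combine `generation` with the mount, inode, chunk, and block to derive them.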

The same object boundary works for AWS S3, RustFS, MinIO, and Ceph RGW.

Use --object-backend rustfs for a local RustFS deployment or --object-backend s3 for another S3-compatible provider. See RustFS Backend for the local RustFS shape.

Publish Rule

Artifact publish is staged:

```text
upload object bytes
  -> split into blocks and PUT immutable objects
  -> commit inode + dentry projection + body summary + chunk manifests
  -> expose namespace entry
```

If the object upload succeeds and the metadata publish then fails, the uploaded blocks stay staged in the object store and are not reachable from the namespace. The caller can pass the staged object set to the explicit cleanup helper.
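
The staged flow and its failure path can be sketched with in-memory stand-ins. Everything here is hypothetical: plain dicts play the object store and the namespace, `commit_ok` simulates the metadata commit outcome, and `cleanup_staged` stands in for the explicit cleanup helper, whose real name and signature are not given in this document.

```python
class MetadataCommitError(Exception):
    """Simulated metadata publish failure; carries the staged object keys."""

    def __init__(self, staged_keys):
        super().__init__("metadata publish failed")
        self.staged_keys = staged_keys


def stage_blocks(store, prefix, data, block_size):
    """Split data into blocks and PUT each one under a fresh key."""
    keys = []
    for start in range(0, len(data), block_size):
        key = f"{prefix}/{start // block_size}"
        store[key] = data[start:start + block_size]
        keys.append(key)
    return keys


def publish(store, namespace, path, prefix, data, block_size, commit_ok=True):
    """Upload first, then commit metadata; the entry is visible only after commit."""
    keys = stage_blocks(store, prefix, data, block_size)
    if not commit_ok:
        # Blocks exist in the store but nothing in the namespace points to them.
        raise MetadataCommitError(keys)
    namespace[path] = keys
    return keys


def cleanup_staged(store, keys):
    """Delete staged objects that were never published."""
    for key in keys:
        store.pop(key, None)
```

A caller would catch the failure and hand the staged keys back for cleanup:

```python
store, namespace = {}, {}
try:
    publish(store, namespace, "/a", "blocks/vol0/1/1/0", b"abcdefghij", 4, commit_ok=False)
except MetadataCommitError as err:
    cleanup_staged(store, err.staged_keys)
```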

When a metadata remove or replace succeeds, the old body objects are recorded in a durable metadata GC queue in the same metadata commit that removes their namespace reachability. The current local service exposes an explicit cleanup API and a background object GC worker; live FUSE mounts start the worker by default. Active snapshot pins conservatively block object cleanup, so artifact reads at a snapshot version can still fetch the old blocks. Once the snapshot is retired, later cleanup passes can consume the queued records.
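
A toy model of the pin-gated queue, under the assumption that "conservatively" means any active pin blocks the whole queue rather than only the records it needs. The class and method names are hypothetical, and a Python list stands in for the durable queue.

```python
class ObjectGC:
    """In-memory stand-in for the object GC worker (illustrative only)."""

    def __init__(self, store):
        self.store = store   # dict acting as the object store
        self.queue = []      # stand-in for the durable metadata GC queue
        self.pins = set()    # identifiers of active snapshot pins

    def enqueue(self, keys):
        """Record old body objects; in the real system this shares the removing commit."""
        self.queue.extend(keys)

    def run_once(self):
        """Delete queued objects unless a snapshot pin is active; return the count."""
        if self.pins:
            # Assumption: any active pin blocks all cleanup.
            return 0
        deleted = 0
        while self.queue:
            key = self.queue.pop(0)
            self.store.pop(key, None)
            deleted += 1
        return deleted
```

Queued records survive a blocked pass, so retiring the pin and running the worker again is enough to reclaim the space.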

Metadata history uses the same retention boundary. Active snapshot pins define the oldest read version that must remain reconstructible. History cleanup keeps the per-key anchor needed by that oldest snapshot and removes older versions; when no snapshot pins remain, history cleanup may remove all historical records. Live FUSE mounts start the history GC worker alongside object GC.
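
The retention rule can be illustrated for a single key. This sketch assumes versions are increasing integers, that a snapshot at version `v` reads the newest record with version at most `v`, and that the newest record is the live one and is always kept. `prune_history` is a hypothetical name.

```python
from bisect import bisect_right


def prune_history(versions, oldest_pin):
    """Return the versions to keep for one key.

    versions: ascending commit versions; the last entry is the live record.
    oldest_pin: version of the oldest active snapshot, or None if no pins remain.
    """
    if not versions:
        return []
    if oldest_pin is None:
        # No snapshot needs history; only the live record remains.
        return versions[-1:]
    # The anchor is the newest version visible to the oldest snapshot.
    anchor = bisect_right(versions, oldest_pin) - 1
    if anchor < 0:
        # The key was first written after the snapshot; nothing is older than it.
        return list(versions)
    return versions[anchor:]
```

With versions `[1, 4, 7, 9]` and an oldest pin at version 5, version 4 is the anchor, so only version 1 is removed.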

Chunk Manifest

Each chunk_manifest record stores the blocks that cover one logical chunk:

```text
chunk_index
logical_offset
len
blocks:
  object_key
  logical_offset
  object_offset
  len
  digest_uri
```
  digest_uri

Readers construct a range read plan from the manifests, fetch object ranges, and assemble the requested file range. The first cache layer is a read-through block cache keyed by object range.
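
The read path can be sketched as follows. The block fields mirror the manifest record above, but treating `logical_offset` as file-relative is an assumption, as are the helper names. A dict stands in for the object store, and slicing stands in for a ranged GET.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockRef:
    object_key: str
    logical_offset: int  # assumed file-relative offset of the first covered byte
    object_offset: int   # where the covered bytes start inside the object
    len: int


def plan_range(blocks, offset, length):
    """Return (object_key, object_offset, length) reads covering [offset, offset + length)."""
    end = offset + length
    plan = []
    for b in sorted(blocks, key=lambda b: b.logical_offset):
        lo = max(offset, b.logical_offset)
        hi = min(end, b.logical_offset + b.len)
        if lo < hi:  # the block overlaps the requested range
            plan.append((b.object_key, b.object_offset + (lo - b.logical_offset), hi - lo))
    return plan


def read_range(store, cache, blocks, offset, length):
    """Assemble the requested bytes through a read-through cache keyed by object range."""
    out = bytearray()
    for key, obj_off, n in plan_range(blocks, offset, length):
        cache_key = (key, obj_off, n)
        if cache_key not in cache:
            # Cache miss: fetch the object range (a ranged GET in a real store).
            cache[cache_key] = store[key][obj_off:obj_off + n]
        out += cache[cache_key]
    return bytes(out)
```

A read that spans three 4-byte blocks touches only the overlapping part of each:

```python
blocks = [BlockRef("k0", 0, 0, 4), BlockRef("k1", 4, 0, 4), BlockRef("k2", 8, 0, 4)]
store = {"k0": b"abcd", "k1": b"efgh", "k2": b"ijkl"}
data = read_range(store, {}, blocks, 3, 6)  # bytes 3..8 of the file
```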