DATAVERKET 030: Capacity planning and overcommit policy

Proposed Infrastructure Policy Capacity Scheduling Overcommit Placement

Defines how Dataverket models capacity, placement pressure, and overcommit boundaries across resource classes and sites.

Author
Lars Solem

Status

Proposed on 2026-03-14 by Lars Solem.

Context

Dataverket already models quotas, placement, and multi-datacenter resources, but those models alone are not enough to make safe scheduling decisions.

The platform still needs a clear policy for:

  • placement under capacity pressure
  • spread versus bin-packing behavior
  • whether CPU, memory, and storage may be overcommitted
  • how scarcity is surfaced to operators and users

Decision

Dataverket treats capacity planning and overcommit as explicit policy inputs to scheduling and allocation.

The platform must define:

  • resource accounting rules
  • placement defaults
  • overcommit boundaries by resource type
  • operator visibility into scarcity and risk

Resource classes

The platform should treat at least these resources separately:

  • CPU
  • memory
  • storage capacity
  • network capacity where it materially affects placement

The same overcommit policy should not be assumed for all of them.

Placement posture

The platform should support both:

  • spread-oriented placement: for resilience, anti-affinity, or fault-domain awareness
  • packing-oriented placement: for efficient infrastructure use where risk is acceptable

The default should be explicit per product or workload class.
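
As a rough illustration of what an explicit per-class default could look like, the sketch below maps workload classes to a posture and picks a host accordingly. The class names, the `choose_host` helper, and the use of free capacity as the only signal are assumptions made for this example. The ADR does not choose a scheduling algorithm, and a real spread policy would also consider fault domains and anti-affinity.

```python
from enum import Enum
from typing import Dict, Optional


class Posture(Enum):
    SPREAD = "spread"
    PACK = "pack"


# Hypothetical workload classes and defaults, for illustration only.
DEFAULT_POSTURE: Dict[str, Posture] = {
    "database": Posture.SPREAD,
    "batch": Posture.PACK,
}


def choose_host(
    free_by_host: Dict[str, float], demand: float, posture: Posture
) -> Optional[str]:
    """Pick a host for `demand` units, or None if nothing fits."""
    candidates = {h: free for h, free in free_by_host.items() if free >= demand}
    if not candidates:
        return None
    # Iterate hosts in name order so ties resolve deterministically.
    names = sorted(candidates)
    if posture is Posture.SPREAD:
        # Spread: prefer the host with the most free capacity.
        return max(names, key=candidates.get)
    # Pack: prefer the tightest host that still fits.
    return min(names, key=candidates.get)
```

With free capacity `{"a": 10, "b": 4, "c": 7}` and a demand of 3, the spread posture selects host `a` and the packing posture selects host `b`.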

Overcommit model

The platform should define overcommit policy by resource type rather than as a single boolean.

Examples:

  • CPU may allow some degree of overcommit
  • memory should be treated more conservatively
  • storage overcommit should be tightly controlled and highly visible if allowed at all

These are policy questions, not hidden scheduler heuristics.
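
One way to express overcommit per resource type rather than as a single boolean is a small policy table that the allocator consults. The sketch below is an assumption-laden illustration: the `OvercommitPolicy` type, the `fits` helper, and above all the ratio values are placeholders, since this ADR deliberately does not choose exact ratios.

```python
from dataclasses import dataclass
from enum import Enum


class Resource(Enum):
    CPU = "cpu"
    MEMORY = "memory"
    STORAGE = "storage"


@dataclass(frozen=True)
class OvercommitPolicy:
    # Allocatable capacity as a multiple of physical capacity.
    # A ratio of 1.0 means no overcommit.
    ratio: float

    def allocatable(self, physical: float) -> float:
        return physical * self.ratio


# Placeholder ratios for illustration; the ADR does not set these.
POLICIES = {
    Resource.CPU: OvercommitPolicy(ratio=2.0),
    Resource.MEMORY: OvercommitPolicy(ratio=1.0),
    Resource.STORAGE: OvercommitPolicy(ratio=1.0),
}


def fits(resource: Resource, physical: float, allocated: float, requested: float) -> bool:
    """Return True if the request stays within the policy for this resource."""
    return allocated + requested <= POLICIES[resource].allocatable(physical)
```

Keeping the ratios in a declared table, rather than inside scoring code, is what makes them reviewable as policy.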

Scarcity behavior

When capacity becomes constrained, the platform must have explicit behavior for:

  • rejecting new allocations
  • preferring one site over another
  • honoring placement and residency constraints
  • surfacing operator-visible risk before outright failure

Scarcity must not show up only as surprising scheduler behavior.
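
The sketch below shows one possible shape for explicit scarcity behavior: every allocation attempt returns a named outcome and a reason, residency constraints are never relaxed, and a warning is raised before capacity is exhausted. The site names, the `decide` function, and the 0.8 warning threshold are hypothetical; the ADR leaves alerting thresholds and the scheduling algorithm to later decisions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Set


class Outcome(Enum):
    ADMIT = "admit"
    ADMIT_WITH_WARNING = "admit_with_warning"
    REJECT = "reject"


@dataclass(frozen=True)
class Decision:
    outcome: Outcome
    site: Optional[str]
    reason: str


WARN_UTILIZATION = 0.8  # placeholder threshold, not set by the ADR


def decide(
    request: float,
    allowed_sites: Set[str],
    site_preference: List[str],
    capacity: Dict[str, float],
    allocated: Dict[str, float],
) -> Decision:
    """Try sites in preference order and return an explicit decision."""
    for site in site_preference:
        if site not in allowed_sites:
            continue  # residency constraints are not relaxed under pressure
        used_after = allocated[site] + request
        if used_after > capacity[site]:
            continue  # this site cannot take the request
        if used_after / capacity[site] >= WARN_UTILIZATION:
            return Decision(
                Outcome.ADMIT_WITH_WARNING, site,
                f"{site} is at or above the warning threshold after allocation",
            )
        return Decision(Outcome.ADMIT, site, "capacity available")
    return Decision(Outcome.REJECT, None, "no allowed site has capacity")
```

In this sketch a preferred site that would cross the warning threshold is still chosen over a less preferred site with more headroom. Whether preference or headroom should win is itself a policy choice that would need to be stated.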

Operator visibility

Operators should be able to see:

  • current capacity by site and pool
  • overcommit posture by resource class
  • which workloads are consuming scarce capacity
  • when placement policy and capacity policy are in tension
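
A minimal sketch of an operator-facing summary follows, assuming a hypothetical `PoolUsage` record per site, pool, and resource class. It reports allocation against both physical and allocatable capacity and flags any pool that is allocated beyond physical capacity. Field names and the output format are illustrative only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PoolUsage:
    site: str
    pool: str
    resource: str
    physical: float
    overcommit_ratio: float  # 1.0 means no overcommit
    allocated: float


def summarize(rows: List[PoolUsage]) -> List[str]:
    """Return one human-readable line per site, pool, and resource."""
    lines = []
    for r in sorted(rows, key=lambda r: (r.site, r.pool, r.resource)):
        allocatable = r.physical * r.overcommit_ratio
        # Flag pools whose allocations exceed what physically exists.
        flag = " OVER-PHYSICAL" if r.allocated > r.physical else ""
        lines.append(
            f"{r.site}/{r.pool} {r.resource}: "
            f"{r.allocated:g} of {allocatable:g} allocatable "
            f"({r.physical:g} physical, ratio {r.overcommit_ratio:g}){flag}"
        )
    return lines
```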

Explicit non-decisions for now

This ADR intentionally does not yet choose:

  • exact overcommit ratios
  • exact scheduling algorithm
  • exact forecasting method
  • exact reservation model for every product

Those require later implementation decisions.

Consequences

  • quotas no longer have to carry the whole burden of placement and safety
  • scheduling becomes policy-driven rather than accidental
  • capacity risk becomes an explicit operator concern

Decision Outcome

Proposed. This ADR records the current preferred direction and still needs acceptance before it becomes binding.

More Information

  • scheduler strategy
  • capacity reservation model
  • site saturation alerting thresholds

Audit

  • 2026-03-14: ADR proposed.