Status
Proposed on 2026-03-14 by Lars Solem.
Context
Dataverket needs a repeatable way to bring up physical servers for Kubernetes and compute without depending on manual installation flows or a mutable host operating system.
The repository already establishes Talos as an expected operating model, including break-glass integration for Talos-based systems. The open question is how servers should be provisioned and whether Talos should run diskless or be installed onto local storage.
Decision
Dataverket provisions bare-metal servers through network boot with iPXE, and installs Talos Linux onto local disk for normal operation.
The platform does not adopt diskless Talos as the default runtime model.
Provisioning flow
The standard server lifecycle is:
- Maskin discovers the server through BMC and inventory data.
- Maskin sets a one-time network boot override through the BMC when provisioning is requested.
- The server boots into iPXE from the provisioning network.
- Maskin serves a node-specific or role-specific Talos installer profile.
- The Talos installer writes the target system to local disk.
- The server reboots from local disk into installed Talos.
- Plattform applies cluster bootstrap or join operations.
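The lifecycle above can be read as a strictly ordered state machine. The following is a minimal sketch of that reading, not Maskin's actual model: the stage names and the `advance` helper are hypothetical and exist only to show that each step has exactly one successor.

```python
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names, one per step of the standard lifecycle.
    DISCOVERED = "discovered"                # found through BMC and inventory data
    NETBOOT_ARMED = "netboot_armed"          # one-time network boot override set
    IPXE_BOOTED = "ipxe_booted"              # server booted into iPXE
    PROFILE_SERVED = "profile_served"        # Talos installer profile delivered
    INSTALLED = "installed"                  # installer wrote Talos to local disk
    RUNNING_FROM_DISK = "running_from_disk"  # rebooted into installed Talos
    CLUSTER_MEMBER = "cluster_member"        # bootstrap or join applied


# Enum members iterate in definition order, which matches the lifecycle order.
ORDER = list(Stage)


def advance(current: Stage) -> Stage:
    """Return the stage that follows `current`, or raise if it is the last one."""
    index = ORDER.index(current)
    if index == len(ORDER) - 1:
        raise ValueError(f"{current.name} is the final stage")
    return ORDER[index + 1]
```

Modelling the flow as a linear sequence keeps skipped steps, such as joining a cluster before the disk install has completed, unrepresentable.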
Why iPXE
iPXE is chosen because it supports a practical modern provisioning model:
- HTTP-based asset delivery
- dynamic boot scripting
- easier per-node logic than legacy PXE alone
- compatibility with BMC-driven one-shot network boot workflows
Legacy PXE and TFTP are permitted only as a chainload path into iPXE, and only where the hardware requires it.
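To illustrate what dynamic boot scripting with HTTP asset delivery could look like, here is a minimal sketch of a function that renders a per-node iPXE script. The URL layout (`/talos/...`, `/configs/<node_id>.yaml`) and the function name are assumptions made for this example. `talos.platform=metal` and `talos.config=` are Talos kernel parameters. Any further kernel arguments a real deployment needs are omitted.

```python
def render_ipxe_script(base_url: str, node_id: str, arch: str = "amd64") -> str:
    """Render an iPXE script that boots the Talos kernel and initramfs over HTTP.

    `base_url` and the path layout below are hypothetical; they stand in for
    whatever HTTP image and config hosting the platform provides.
    """
    config_url = f"{base_url}/configs/{node_id}.yaml"  # assumed per-node config path
    lines = [
        "#!ipxe",
        # The kernel line carries the per-node logic: each node gets its own config URL.
        f"kernel {base_url}/talos/vmlinuz-{arch} "
        f"talos.platform=metal talos.config={config_url}",
        f"initrd {base_url}/talos/initramfs-{arch}.xz",
        "boot",
    ]
    return "\n".join(lines) + "\n"
```

Because the script is generated per request, the same boot endpoint can serve node-specific or role-specific profiles without maintaining static files per MAC address.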
Why installed Talos instead of diskless by default
Talos installed to local disk is the default because it gives the platform:
- a stable and supported Talos lifecycle
- predictable reboot behavior
- simpler upgrades and rollback handling
- local persistence for kubelet state and the container image cache
- lower operational complexity during initial platform bring-up
Purely diskless Talos remains an optional mode for:
- rescue environments
- hardware diagnostics
- installer environments
- specialized stateless worker experiments
It is not the baseline contract for compute or Kubernetes products.
Identity and configuration model
Provisioning identity should be derived from hardware-backed attributes available before the OS is installed, such as:
- BMC identity
- serial number
- MAC address
- rack and port placement from inventory
Maskin generates the installer configuration and binds it to that hardware identity. Plattform owns higher-level cluster intent and post-install Talos cluster lifecycle.
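One way to bind an installer configuration to hardware identity is to derive a stable key from the pre-install attributes. The sketch below is illustrative only: the field names, the choice to hash the BMC identity, serial number, and MAC address while leaving placement out of the key, and the `bind_profile` helper are all assumptions, not Maskin's actual data model.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareIdentity:
    bmc_id: str
    serial_number: str
    mac_address: str
    rack: str          # placement from inventory
    switch_port: str   # placement from inventory

    def node_key(self) -> str:
        """Stable key built from attributes available before the OS is installed.

        Assumption: rack and port are excluded so that physically moving a
        server does not change its identity.
        """
        parts = [
            self.bmc_id.strip().lower(),
            self.serial_number.strip().lower(),
            self.mac_address.strip().lower().replace("-", ":"),
        ]
        digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
        return digest[:16]


def bind_profile(identity: HardwareIdentity, role: str) -> dict:
    """Associate a role-specific installer profile with one hardware identity."""
    key = identity.node_key()
    return {"node_key": key, "role": role, "config_path": f"configs/{key}.yaml"}
```

Normalizing case and MAC separators before hashing means the same server yields the same key regardless of how the BMC or inventory source formats those values.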
Required provisioning components
The platform must provide:
- a dedicated provisioning network
- DHCP for provisioning
- iPXE boot endpoint
- HTTP image and config hosting
- Talos image cache
- BMC control for power and one-shot boot order
- hardware discovery and reconciliation
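As an illustration of BMC control for one-shot boot order, the sketch below builds a one-time PXE boot override request. It assumes the BMCs expose the Redfish API, which this ADR does not state; supported BMC vendors are left to a later decision. The `Boot` property names and values come from the Redfish `ComputerSystem` schema, while the host name, system ID, and helper names are hypothetical. The code only constructs the request and does not send it.

```python
import json


def one_shot_pxe_override() -> dict:
    """Redfish body asking the system to network boot on the next boot only."""
    return {
        "Boot": {
            "BootSourceOverrideEnabled": "Once",  # reverts after one boot
            "BootSourceOverrideTarget": "Pxe",
        }
    }


def build_boot_override_request(bmc_host: str, system_id: str) -> dict:
    """Describe the PATCH request a client would send to the BMC (assumed Redfish)."""
    return {
        "method": "PATCH",
        "url": f"https://{bmc_host}/redfish/v1/Systems/{system_id}",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(one_shot_pxe_override()),
    }
```

Using `Once` rather than a persistent boot order change is what lets the server fall back to local disk on the reboot that follows installation.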
Operational model
Bare-metal nodes are treated as disposable infrastructure:
- reprovisioning must be routine
- no manual host customization is allowed
- drift is resolved by re-applying config or reinstalling the node
- cluster membership is an orchestrated state transition, not a handcrafted procedure
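The rule that drift is resolved by re-applying config or reinstalling can be expressed as a small decision function. This is a sketch under assumed policy: the ADR does not say which kinds of drift warrant which response, so the criteria below (an unreachable node is reinstalled, a config mismatch is re-applied) and all names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    REAPPLY_CONFIG = "reapply_config"
    REINSTALL = "reinstall"


@dataclass(frozen=True)
class ObservedNode:
    api_reachable: bool  # whether the node answers on its management API
    config_hash: str     # hash of the machine config the node reports


def resolve_drift(desired_config_hash: str, observed: ObservedNode) -> Action:
    """Pick the remediation for a node; manual host fixes are never an option."""
    if not observed.api_reachable:
        # Assumed policy: a node that cannot be managed is reprovisioned.
        return Action.REINSTALL
    if observed.config_hash != desired_config_hash:
        return Action.REAPPLY_CONFIG
    return Action.NONE
```

The function deliberately has no "repair by hand" branch, which mirrors the rule that no manual host customization is allowed.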
Consequences
- Talos becomes the only supported host OS for the primary bare-metal path
- local disks are required for normal Talos operation
- provisioning network design becomes a core dependency for the platform
- Maskin must integrate with BMCs early
- diskless Talos is deferred to an explicit optional mode rather than shaping the primary architecture
Decision Outcome
Proposed. This ADR records the current preferred direction and still needs acceptance before it becomes binding.
Related Decisions
- This ADR complements the Talos break-glass architecture by defining the normal provisioning path.
- Nett must provide the provisioning network and related switch configuration required by this flow.
- A later ADR should define supported BMC vendors and Talos hardware assumptions.
Audit
- 2026-03-14: ADR proposed.