
linstor-provision-disk

Overview

linstor-provision-disk is a utility designed to run as an init container for LINSTOR satellite pods. It automatically detects when a satellite's ID has changed and reinitializes the storage layer accordingly—cleaning up old physical volumes, volume groups, and logical volumes before provisioning fresh storage.

The Problem

When managing a Kubernetes homelab with frequent cluster resets, the storage layer on each satellite node needs to be reinitialized. Without automation, this requires manual intervention on every node:

  1. Manually identify the satellite ID and detect changes
  2. Log into each node and clean up old storage artifacts (PVs, VGs, LVs)
  3. Reinitialize the LVM layer for LINSTOR to consume
  4. Repeat for every node after each cluster reset

After repeated cluster resets, this becomes tedious and error-prone. This utility automates the entire process.

How It Works

Architecture

linstor-provision-disk manages the storage layer lifecycle for LINSTOR satellites by tracking satellite ID changes and reinitializing LVM structures accordingly.

The utility operates on a storage hierarchy:

Partition (by-partlabel)
└─ Physical Volume (PV)
   └─ Volume Group (VG)
      ├─ Thin Pool (for LINSTOR storage)
      └─ Metadata LV (tracks satellite ID)

The key insight is the metadata logical volume, which stores the satellite ID on disk. By comparing the stored satellite ID with the current ID on each startup, the utility can detect node resets and trigger cleanup.
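The comparison itself is simple; a minimal sketch in shell, assuming the stored ID has already been read off the metadata LV (in the real utility this involves temporarily mounting the LV):

```shell
#!/bin/sh
# Sketch of the reset-detection decision. The stored ID would normally be
# read from a file on the temporarily mounted metadata LV; here both IDs
# are passed in directly for illustration.
needs_reset() {
  stored_id="$1"   # ID previously written to the metadata LV
  current_id="$2"  # ID provided via SATELLITE_ID
  [ "$stored_id" != "$current_id" ]
}

if needs_reset "old-node" "node-01"; then
  echo "satellite reset detected: LVM state will be cleaned up"
else
  echo "satellite id unchanged: cleanup skipped"
fi
```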

Workflow

  1. Init container starts
  2. Resolve the partition by label → PV path
  3. Query existing PVs, VGs, and LVs
  4. Read the satellite ID from the metadata LV
  5. Compare it with the provided SATELLITE_ID
     ├─ Match → no cleanup needed; proceed to provisioning
     └─ Mismatch → satellite reset detected: remove all LVs and VGs, then remove all PVs
  6. Provision the storage layer:
     • Create/verify PV from partition
     • Create/verify VG
     • Create/verify thin pool
     • Create/verify metadata LV with satellite ID
  7. Init container exits successfully
  8. LINSTOR satellite pod starts
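In plain LVM terms, the provisioning step corresponds roughly to the commands below. This is a dry-run sketch using the example names from this page; the thin-pool size and the metadata LV name are illustrative assumptions, and the utility skips any object that already exists:

```shell
#!/bin/sh
# Dry-run sketch: run() prints each command instead of executing it;
# replace the echo with "$@" to execute for real.
run() { echo "+ $*"; }

PART=/dev/disk/by-partlabel/linstor-storage  # resolved from PARTITION_LABEL
VG=linstor-vg                                # VOLUME_GROUP
POOL=linstor-pool                            # POOL

run pvcreate "$PART"        # create/verify the PV
run vgcreate "$VG" "$PART"  # create/verify the VG
run lvcreate --type thin-pool --extents 90%FREE --name "$POOL" "$VG"  # thin pool (size assumed)
run lvcreate --size 100M --name metadata "$VG"  # metadata LV holding the satellite ID (name assumed)
```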

Key Concepts

  • Satellite ID: A unique identifier for each LINSTOR satellite node. When the cluster resets, this ID may change or a new node may be reused. The utility uses this to detect resets.

  • Metadata Logical Volume: A 100MB LV mounted temporarily to read/write the satellite ID. This is the mechanism for detecting satellite ID changes across container restarts.

  • Thin Pool: The LVM thin pool that LINSTOR uses to provision logical volumes for storage. This is created once and extended as needed.

  • Partition Label: A GPT partition label on the underlying disk used to consistently identify the storage device (e.g., linstor-storage). This avoids relying on device names which can change.
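"Extended as needed" for the thin pool maps onto a single lvextend call; a dry-run sketch with an illustrative size:

```shell
#!/bin/sh
# Dry-run sketch: run() prints the command; swap the echo for "$@" to execute.
run() { echo "+ $*"; }

# Grow the thin pool by 10G (example size) inside the existing volume group.
run lvextend --size +10G linstor-vg/linstor-pool
```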

Deployment

Prerequisites

  • LINSTOR satellite nodes running in your Kubernetes cluster
  • A dedicated storage device with a GPT partition labeled for LINSTOR (e.g., /dev/sdb1 with label linstor-storage)
  • Access to mount the storage device (requires privileged: true in the init container)
  • The init container must run before the LINSTOR satellite pod starts
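If the labeled partition does not exist yet, it can be created with sgdisk. A dry-run sketch, assuming /dev/sdb as the target disk:

```shell
#!/bin/sh
# Dry-run sketch: run() prints each command; replace the echo with "$@"
# to execute. WARNING: the real commands repartition the disk.
run() { echo "+ $*"; }

DEV=/dev/sdb   # example device; adjust for your node

run sgdisk --new=1:0:0 "$DEV"                      # one partition spanning the disk
run sgdisk --change-name=1:linstor-storage "$DEV"  # set the GPT partition label
run lsblk --output NAME,PARTLABEL "$DEV"           # verify the label is visible
```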

Usage as Init Container

Privileged Containers

The init container must run with privileged: true to access and modify the host's LVM storage layer. This is required and expected for storage provisioning utilities.

Add linstor-provision-disk as an init container in your LINSTOR satellite pod specification. The utility requires four parameters via environment variables or CLI flags:

apiVersion: v1
kind: Pod
metadata:
  name: linstor-satellite
  namespace: linstor
spec:
  initContainers:
    - name: provision-disk
      image: benfiola/homelab-helper:latest
      command:
        - homelab-helper
        - linstor-provision-disk
      env:
        - name: PARTITION_LABEL
          value: "linstor-storage"
        - name: POOL
          value: "linstor-pool"
        - name: SATELLITE_ID
          value: "node-01"
        - name: VOLUME_GROUP
          value: "linstor-vg"
      securityContext:
        privileged: true
      volumeMounts:
        - name: sys
          mountPath: /sys
        - name: dev
          mountPath: /dev
        - name: run
          mountPath: /run
  containers:
    - name: satellite
      image: drbd.io/linstor-satellite:latest
      # ... rest of satellite configuration
  volumes:
    - name: sys
      hostPath:
        path: /sys
    - name: dev
      hostPath:
        path: /dev
    - name: run
      hostPath:
        path: /run

Configuration

CLI Flags and Environment Variables

Stable Satellite ID

Choose a stable SATELLITE_ID that persists across pod restarts—typically the node name or a fixed per-node identifier. Avoid randomly generated values, as ID changes trigger storage reinitialization.

See the Troubleshooting section for stable ID patterns.

The utility is invoked as:

homelab-helper linstor-provision-disk [flags]

All parameters are required:

  • --partition-label / PARTITION_LABEL: GPT partition label for the storage device (e.g., linstor-storage)
  • --pool / POOL: Name of the LVM thin pool to create/manage (e.g., linstor-pool)
  • --satellite-id / SATELLITE_ID: Unique identifier for this satellite node (e.g., node-01)
  • --volume-group / VOLUME_GROUP: Name of the LVM volume group to create/manage (e.g., linstor-vg)
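Spelled out as a flag-based invocation with the same example values as the pod spec above (the command is assembled and printed rather than executed, so the sketch runs without the binary installed):

```shell
#!/bin/sh
# Flag-based equivalent of the env-var configuration; printed for illustration.
cmd="homelab-helper linstor-provision-disk \
  --partition-label linstor-storage \
  --pool linstor-pool \
  --satellite-id node-01 \
  --volume-group linstor-vg"
echo "$cmd"
```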

Operational Guidance

Verifying Successful Provisioning

Check init container logs:

kubectl logs <pod-name> -c provision-disk -n linstor

Successful completion shows:

disk provisioning completed successfully

Monitoring for Satellite ID Changes

If the satellite ID changes (detected via mismatch with stored metadata), the logs will show:

satellite id mismatch, resetting lvm configuration existing=<old-id> expected=<new-id>
removing logical volumes ...
removing volume group ...
removing physical volume ...

This indicates the utility is cleaning up and reprovisioning. This is expected behavior and requires no intervention.

Troubleshooting

Provisioning Fails

Symptom: Init container logs show errors indicating a failed provisioning attempt.

Root Causes and Solutions:

  • Partition label not found
    • Verify: lsblk --output NAME,PARTLABEL
    • Fix: Ensure label matches PARTITION_LABEL exactly
  • Device not accessible
    • Verify: Check volumeMounts include /dev
    • Fix: Update pod spec with proper hostPath volumes
  • LVM tools missing
    • Verify: Container image includes lvm2
    • Fix: Use the benfiola/homelab-helper image, which includes the LVM tools

General Recovery:

  1. Check init container logs: kubectl logs <pod-name> -c provision-disk -n linstor
  2. Identify the specific error from logs
  3. Fix the configuration or infrastructure issue
  4. Remove the retry marker: rm /tmp/.disk-provisioner-wait
  5. Restart the pod to retry

Satellite ID Mismatch Causing Repeated Cleanup

Symptom: Each pod restart triggers storage cleanup.

Cause: The SATELLITE_ID value changes between pod restarts, so every startup looks like a satellite reset. This typically happens when satellite IDs are randomly generated.

Resolution: Ensure SATELLITE_ID is stable and derived from the node name or a fixed identity. Example for a DaemonSet:

env:
  - name: SATELLITE_ID
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName

Or, for a StatefulSet, use the stable pod name via the downward API (note that Kubernetes does not shell-expand values such as "$(hostname)" in env entries):

env:
  - name: SATELLITE_ID
    valueFrom:
      fieldRef:
        fieldPath: metadata.name

Enable Debug Logging

Run with debug log level. Environment variables on a running pod cannot be changed in place, so set LOG_LEVEL on the owning workload (for example, a DaemonSet) and let it roll the pods:

kubectl set env -n linstor daemonset/<daemonset-name> -c provision-disk LOG_LEVEL=debug

Or specify in the pod spec:

env:
  - name: LOG_LEVEL
    value: "debug"

This provides detailed logs of each step (device resolution, LV queries, provisioning operations).

Limitations

  • The utility operates at the LVM level and does not manage partitioning. The underlying partition must already exist with the correct label.
  • Storage reinitialization is destructive—all LVs and VGs are removed on satellite ID mismatch. Ensure critical data is backed up before changing satellite IDs.
  • The utility is not suitable for managing shared storage across multiple satellites; each satellite must have dedicated storage.

See Also