
bucket-sync

Overview

bucket-sync is a Kubernetes controller that automates periodic synchronization between object storage buckets using scheduled or on-demand sync jobs. It eliminates manual coordination both within the cluster and from external systems by managing the complete lifecycle of sync operations: acquiring exclusive locks, executing rclone-based transfers, and cleaning up resources.

The Problem

Synchronizing data between object storage buckets (e.g., S3, GCS, MinIO) traditionally requires:

  1. Manually scheduling sync operations (e.g., via cron jobs within the cluster or external systems)
  2. Ensuring only one sync runs at a time (preventing concurrent access conflicts)
  3. Configuring credentials and managing authentication for both source and destination buckets
  4. Monitoring sync status and handling failures
  5. Cleaning up temporary resources after completion

This workflow is error-prone and operationally complex whether run inside or outside the cluster. The controller eliminates this friction by automating the entire workflow through Kubernetes resources, enabling bucket syncs to be managed declaratively alongside other cluster infrastructure.

How It Works

Architecture

bucket-sync uses three reconcilers working together:

  1. BucketSyncPolicyReconciler: Watches BucketSyncPolicy resources. On schedule (via cron expression) or manual trigger, it creates a BucketSync resource to execute the sync operation. Also manages cleanup of old BucketSync resources based on configured history limit.

  2. BucketSyncReconciler: Manages individual sync operations through a state machine. It orchestrates acquiring locks on both source and destination buckets, creating and monitoring a Kubernetes Job that runs rclone, and releasing locks after completion.

  3. LockCleanupReconciler: Periodically cleans up stale locks from failed syncs, preventing deadlock scenarios where a bucket is locked indefinitely.

Workflow

  1. Create/Update a BucketSyncPolicy with a cron schedule
  2. BucketSyncPolicyReconciler checks if a sync is due (or manually triggered via annotation)
  3. Creates a BucketSync resource
  4. BucketSyncReconciler enters the Initialize phase
  5. Acquires an exclusive lock on the source bucket
  6. Acquires an exclusive lock on the destination bucket
  7. Transitions to the Sync phase
  8. Creates a Kubernetes Job with an rclone sync container
  9. Waits for the Job to complete
  10. Transitions to the Finalize phase
  11. Deletes the Job
  12. Releases the locks on both buckets
  13. Transitions to Finished (terminal state)

Key Concepts

  • BucketSyncPolicy Resource: Defines a recurring sync operation with source/destination buckets, credentials, and cron schedule. The policy always defines a schedule; manual triggers via annotation are available when a sync is needed outside the regular schedule.

  • BucketSync Resource: Represents a single execution of a sync operation. Created by a BucketSyncPolicy (on schedule or manual annotation trigger). Tracks phase, job status, error state, and timing.

  • Phase-Based State Machine: Sync progresses through phases (Initialize → Sync → Finalize → Finished). Each phase completes before moving to the next, enabling resumption after transient failures.

  • Distributed Locking: Prevents concurrent syncs on the same bucket. Uses ConfigMap resources to coordinate across the cluster. If a sync fails or the controller crashes, locks are automatically cleaned up after 5 minutes.

  • Flexible Scheduling: Supports standard cron expressions (minute, hour, day, month, day-of-week) or special schedules like @daily, @hourly, etc. Manual triggers via annotation run immediately, independent of the schedule.

  • Job Labels: Optional labels defined in BucketSyncPolicy.spec.jobLabels or BucketSync.spec.jobLabels are applied to the pods running the rclone Job. Useful for pod affinity, topology spread constraints, or workload classification. Labels from BucketSyncPolicy are inherited by auto-created BucketSync resources.

  • Sync History Management: Optional retention of historical BucketSync resources. When syncHistoryLimit is set on a BucketSyncPolicy, older sync executions are automatically cleaned up, keeping only the most recent syncs up to the specified limit.
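To picture the locking mechanism concretely, here is a sketch of what a lock ConfigMap might look like. The name follows the bucket-sync-lock-<bucket-name>-<hash> pattern referenced in the Troubleshooting section, but the data keys shown are illustrative assumptions, not the controller's actual lock schema:

```yaml
# Illustrative only: data keys are assumptions, not the controller's schema.
apiVersion: v1
kind: ConfigMap
metadata:
  name: bucket-sync-lock-my-bucket-3f9a1c   # hypothetical name/hash
  namespace: bucket-sync-system
data:
  holder: daily-backup-1705316400           # BucketSync holding the lock
  acquiredAt: "2024-01-15T02:00:00Z"        # used for stale-lock cleanup
```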

Installation

Prerequisites

  • Kubernetes cluster with RBAC enabled
  • Secrets containing rclone configuration (credentials for source and destination buckets)
  • Helm 3.x (optional, for chart-based deployment)

Deploy with Helm

Add the repository and install:

helm repo add homelab-helper https://benfiola.github.io/homelab-helper
helm repo update
helm install bucket-sync homelab-helper/bucket-sync \
  --namespace bucket-sync-system \
  --create-namespace

The chart deploys:

  • A Deployment running the controller
  • A ServiceAccount with necessary RBAC permissions
  • ClusterRole and ClusterRoleBinding for BucketSync/BucketSyncPolicy resource access and ConfigMap locking
  • Custom Resource Definitions (BucketSync, BucketSyncPolicy)

Verify Installation

Check the deployment is running:

kubectl get deployment -n bucket-sync-system bucket-sync
kubectl logs -n bucket-sync-system -l app.kubernetes.io/name=bucket-sync

Usage

Configuring Environment Variables

The controller passes environment variables to the rclone sync container. Variables are defined separately for source and destination, and the controller automatically prefixes them with RCLONE_CONFIG_SOURCE_ or RCLONE_CONFIG_DESTINATION_ respectively.

Environment variables can come from three sources:

  • Literal values: Inline in the policy (e.g., provider type, endpoint)
  • Secrets: For sensitive data like credentials (e.g., access keys)
  • ConfigMaps: For non-sensitive configuration data
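A hypothetical sourceEnv fragment combining all three sources (the garage-credentials Secret and garage-config ConfigMap names match the kubectl create examples in this section):

```yaml
sourceEnv:
  # Literal value
  - name: TYPE
    value: "s3"
  # From a Secret (sensitive)
  - name: ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        name: garage-credentials
        key: ACCESS_KEY_ID
  # From a ConfigMap (non-sensitive)
  - name: ENDPOINT
    valueFrom:
      configMapKeyRef:
        name: garage-config
        key: ENDPOINT
```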

Example with mixed sources:

Create a Secret for sensitive credentials:

kubectl create secret generic garage-credentials \
  --from-literal=ACCESS_KEY_ID=<your-access-key> \
  --from-literal=SECRET_ACCESS_KEY=<your-secret-key> \
  -n default

Create a ConfigMap for non-sensitive configuration (optional):

kubectl create configmap garage-config \
  --from-literal=ENDPOINT=https://garage.example.com \
  --from-literal=REGION=us-west-2 \
  -n default

Refer to the rclone documentation for configuration details specific to your storage backend.

Creating a BucketSyncPolicy

Define a scheduled sync operation with source and destination environment variables:

apiVersion: bucket-sync.homelab-helper.benfiola.com/v1
kind: BucketSyncPolicy
metadata:
  name: garage-to-s3-sync
  namespace: default
spec:
  source: "my-bucket"
  destination: "backup-bucket"
  schedule: "@daily"
  syncHistoryLimit: 10
  sourceEnv:
    # Literal values for Garage provider
    - name: TYPE
      value: "s3"
    - name: PROVIDER
      value: "Garage"
    - name: ENDPOINT
      value: "https://garage.example.com"
    - name: REGION
      value: "us-west-2"
    # Credentials from Secret
    - name: ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: garage-credentials
          key: ACCESS_KEY_ID
    - name: SECRET_ACCESS_KEY
      valueFrom:
        secretKeyRef:
          name: garage-credentials
          key: SECRET_ACCESS_KEY
  destinationEnv:
    # Literal values for AWS S3
    - name: TYPE
      value: "s3"
    - name: PROVIDER
      value: "aws"
    # Credentials from Secret
    - name: ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: s3-credentials
          key: ACCESS_KEY_ID
    - name: SECRET_ACCESS_KEY
      valueFrom:
        secretKeyRef:
          name: s3-credentials
          key: SECRET_ACCESS_KEY
  jobLabels:
    app: bucket-sync
    environment: production

Spec Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| source | string | Yes | Source bucket in rclone notation: `<remote-name>:<bucket-path>` |
| destination | string | Yes | Destination bucket in rclone notation: `<remote-name>:<bucket-path>` |
| sourceEnv | []corev1.EnvVar | No | Environment variables for source bucket configuration (literals, Secrets, ConfigMaps, etc.) |
| destinationEnv | []corev1.EnvVar | No | Environment variables for destination bucket configuration (literals, Secrets, ConfigMaps, etc.) |
| schedule | string | Yes | Cron expression (e.g., @daily, 0 2 * * *) defining when syncs should run |
| syncHistoryLimit | integer | No | Optional limit on the number of historical BucketSync resources to retain (minimum 1) |
| jobLabels | map[string]string | No | Optional labels to apply to the rclone Job pods (e.g., for pod affinity/topology) |

Environment Variable Prefixing:

Variable names in sourceEnv are automatically prefixed with RCLONE_CONFIG_SOURCE_ when injected into the rclone container. Similarly, destinationEnv variables are prefixed with RCLONE_CONFIG_DESTINATION_.

For example:

  • sourceEnv name "TYPE" → RCLONE_CONFIG_SOURCE_TYPE
  • sourceEnv name "PROVIDER" → RCLONE_CONFIG_SOURCE_PROVIDER
  • destinationEnv name "ACCESS_KEY_ID" → RCLONE_CONFIG_DESTINATION_ACCESS_KEY_ID
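The rule is plain string concatenation; a throwaway sketch (not controller code):

```shell
# The injected variable name is the spec'd name with the side-specific
# prefix (SOURCE or DESTINATION) prepended.
prefixed() { echo "RCLONE_CONFIG_${1}_${2}"; }

prefixed SOURCE TYPE                 # prints RCLONE_CONFIG_SOURCE_TYPE
prefixed DESTINATION ACCESS_KEY_ID   # prints RCLONE_CONFIG_DESTINATION_ACCESS_KEY_ID
```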

EnvVar Reference:

Both sourceEnv and destinationEnv use the standard Kubernetes corev1.EnvVar type, which supports:

| Field | Type | Description |
| --- | --- | --- |
| name | string | Environment variable name (required) |
| value | string | Literal value for the variable |
| valueFrom | EnvVarSource | Reference to a value in a Secret, ConfigMap, or field |
| valueFrom.secretKeyRef | SecretKeySelector | Get value from a Secret key |
| valueFrom.configMapKeyRef | ConfigMapKeySelector | Get value from a ConfigMap key |

See the Kubernetes documentation for complete EnvVar reference.

Monitoring Sync Status

Check scheduled policies:

kubectl get bucketsyncpolicies

Detailed status:

kubectl describe bucketsyncpolicy garage-to-s3-sync

Example status:

status:
  lastSyncTime: "2024-01-15T02:00:00Z"
  nextSyncTime: "2024-01-16T02:00:00Z"
  error: null
  lastReconciledTime: "2024-01-15T02:01:30Z"
  observedGeneration: 1

Check individual sync executions:

kubectl get bucketsyncs

Detailed sync status:

kubectl describe bucketsync garage-to-s3-sync-1705316400

Example status:

status:
  phase: Sync
  job: garage-to-s3-sync-1705316400
  startTime: "2024-01-15T02:00:00Z"
  finishTime: null
  error: null
  lastReconciledTime: "2024-01-15T02:00:30Z"
  observedGeneration: 1

Manual Sync Trigger

Trigger a sync on-demand by annotating the BucketSyncPolicy:

kubectl annotate bucketsyncpolicy garage-to-s3-sync \
  bucket-sync.homelab-helper.benfiola.com/sync-now="" --overwrite

The controller immediately creates a BucketSync resource. The annotation is removed after the sync completes, allowing the next scheduled sync to run normally.

Configuration

CLI Flags and Environment Variables

The controller is invoked as:

homelab-helper bucket-sync [flags]

Available flags and their environment variable equivalents:

| Flag | Environment Variable | Default | Description |
| --- | --- | --- | --- |
| --health-address | HEALTH_ADDRESS | :8081 | Address for health/readiness probes (/healthz, /readyz) |
| --metrics-address | METRICS_ADDRESS | :8080 | Address for Prometheus metrics endpoint (/metrics) |
| --leader-election | LEADER_ELECTION | false | Enable leader election for HA deployments |
| --namespace | NAMESPACE | "" | Namespace where bucket locks (ConfigMaps) are created (required) |
| --kubeconfig | KUBECONFIG | "" | Path to kubeconfig; uses in-cluster config if empty |

Helm Chart Values

deployment:
  image:
    # Override image tag (defaults to chart version)
    tag: ""

  # Number of controller replicas (use >1 with --leader-election for HA)
  replicas: 1

  # Resource limits/requests (optional)
  resources: null
  # Example:
  # resources:
  #   limits:
  #     cpu: 200m
  #     memory: 256Mi
  #   requests:
  #     cpu: 100m
  #     memory: 128Mi

# Namespace where bucket locks (ConfigMaps) will be created (required)
namespace: default
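For example, a minimal override file for a two-replica install might look like this (a sketch assuming the value paths shown above; pairing replicas > 1 with the controller's --leader-election flag may require additional chart configuration not documented here):

```yaml
# values-ha.yaml -- hypothetical override file
deployment:
  replicas: 2
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 200m
      memory: 256Mi
namespace: bucket-sync-system
```

Pass it to the install with helm install ... -f values-ha.yaml.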

Resource Reference

BucketSyncPolicy

Defines a scheduled or manually-triggered sync operation between buckets.

Spec

apiVersion: bucket-sync.homelab-helper.benfiola.com/v1
kind: BucketSyncPolicy
metadata:
  name: daily-backup
  namespace: default
spec:
  source: "data/backups"
  destination: "archive/backups"
  schedule: "@daily"
  syncHistoryLimit: 7
  sourceEnv:
    - name: TYPE
      value: "s3"
    - name: PROVIDER
      value: "custom"
    - name: ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: backup-source-creds
          key: access-key-id
    - name: SECRET_ACCESS_KEY
      valueFrom:
        secretKeyRef:
          name: backup-source-creds
          key: secret-access-key
  destinationEnv:
    - name: TYPE
      value: "s3"
    - name: PROVIDER
      value: "aws"
    - name: ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: backup-dest-creds
          key: access-key-id
    - name: SECRET_ACCESS_KEY
      valueFrom:
        secretKeyRef:
          name: backup-dest-creds
          key: secret-access-key
  jobLabels:
    workload-type: backup

Spec Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| source | string | Yes | Source bucket path in rclone notation |
| destination | string | Yes | Destination bucket path in rclone notation |
| sourceEnv | []corev1.EnvVar | No | Environment variables for source bucket configuration (auto-prefixed with RCLONE_CONFIG_SOURCE_) |
| destinationEnv | []corev1.EnvVar | No | Environment variables for destination bucket configuration (auto-prefixed with RCLONE_CONFIG_DESTINATION_) |
| schedule | string | Yes | Cron expression defining when syncs should run |
| syncHistoryLimit | integer | No | Optional limit on retained historical BucketSync resources (minimum 1); no cleanup if unset |
| jobLabels | map[string]string | No | Optional labels applied to rclone Job pods |

Status

Communicates policy execution status and scheduling information.

status:
  lastSyncTime: "2024-01-15T02:00:00Z"
  nextSyncTime: "2024-01-16T02:00:00Z"
  error: null
  lastReconciledTime: "2024-01-15T02:01:30Z"
  observedGeneration: 1

Status Fields:

| Field | Type | Description |
| --- | --- | --- |
| lastSyncTime | RFC3339 string | Timestamp of the most recent sync execution |
| nextSyncTime | RFC3339 string | Calculated timestamp of the next scheduled sync |
| error | string or null | Error message if the policy failed, otherwise null |
| lastReconciledTime | RFC3339 string | Timestamp of the last successful reconciliation |
| observedGeneration | integer | Tracks which spec generation was last processed |

BucketSync

Represents a single sync execution. Created by BucketSyncPolicy or manually.

Spec

apiVersion: bucket-sync.homelab-helper.benfiola.com/v1
kind: BucketSync
metadata:
  name: daily-backup-1705316400
  namespace: default
spec:
  source: "data/backups"
  destination: "archive/backups"
  sourceEnv:
    - name: TYPE
      value: "s3"
    - name: PROVIDER
      value: "custom"
    - name: ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: backup-source-creds
          key: access-key-id
  destinationEnv:
    - name: TYPE
      value: "s3"
    - name: ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: backup-dest-creds
          key: access-key-id
  jobLabels:
    workload-type: backup

Spec Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| source | string | Yes | Source bucket path in rclone notation |
| destination | string | Yes | Destination bucket path in rclone notation |
| sourceEnv | []corev1.EnvVar | No | Environment variables for source (auto-prefixed with RCLONE_CONFIG_SOURCE_) |
| destinationEnv | []corev1.EnvVar | No | Environment variables for destination (auto-prefixed with RCLONE_CONFIG_DESTINATION_) |
| jobLabels | map[string]string | No | Optional labels applied to rclone Job pods |

Status

Tracks sync execution progress and state.

status:
  phase: Sync
  job: daily-backup-1705316400
  startTime: "2024-01-15T02:00:00Z"
  finishTime: null
  error: null
  lastReconciledTime: "2024-01-15T02:00:30Z"
  observedGeneration: 1

Status Fields:

| Field | Type | Description |
| --- | --- | --- |
| phase | string | Current phase: Initialize, Sync, Finalize, or Finished |
| job | string or null | Name of the Kubernetes Job executing the sync |
| startTime | RFC3339 string | When the sync began |
| finishTime | RFC3339 string | When the sync completed (all phases) |
| error | string or null | Error message if the sync failed, otherwise null |
| lastReconciledTime | RFC3339 string | Timestamp of the last successful reconciliation |
| observedGeneration | integer | Tracks which spec generation was last processed |

BucketSync Lifecycle

Once created, a BucketSync transitions through phases:

  1. Initialize: Acquires exclusive locks on both source and destination buckets to prevent concurrent syncs.
  2. Sync: Creates a Kubernetes Job running rclone sync source destination and waits for completion.
  3. Finalize: Deletes the Job and releases locks on both buckets.
  4. Finished: Terminal state. The sync is complete.

If any phase fails, the status moves directly to Finalize (to clean up and release locks), then to Finished.
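The transitions above, including the failure path, can be sketched as a tiny transition function (illustrative pseudologic, not controller code):

```shell
# next_phase PHASE OK -- prints the next phase given the current phase and
# whether it succeeded ("true"/"false"). Any failure routes to Finalize so
# locks are always released; Finalize always ends in Finished.
next_phase() {
  local phase="$1" ok="$2"
  case "$phase" in
    Initialize) if [ "$ok" = "true" ]; then echo Sync; else echo Finalize; fi ;;
    Sync)       echo Finalize ;;   # success and failure both proceed to cleanup
    Finalize)   echo Finished ;;
  esac
}

next_phase Initialize true   # prints Sync
next_phase Initialize false  # prints Finalize (cleanup path)
next_phase Finalize true     # prints Finished
```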

Troubleshooting

Initial Diagnostics

When a sync is stuck or failing, start with these steps:

Check the BucketSyncPolicy status:

kubectl describe bucketsyncpolicy daily-backup

Shows next scheduled sync time and any errors.

Check the BucketSync status:

kubectl describe bucketsync daily-backup-1705316400

Shows which phase the sync is in and any error message.

Check controller logs:

kubectl logs -n bucket-sync-system -l app.kubernetes.io/name=bucket-sync

Check the rclone Job:

kubectl get jobs | grep bucket
kubectl describe job daily-backup-1705316400
kubectl logs job/daily-backup-1705316400

Sync Not Starting

Symptom: BucketSync resource exists but remains in Initialize phase indefinitely.

Check lock contention:

kubectl get configmaps | grep bucket-sync-lock

If a lock ConfigMap exists for your bucket, another sync may be in progress or deadlocked. Check for running BucketSync resources:

kubectl get bucketsyncs | grep <bucket-name>

If you see stale locks and no corresponding BucketSync, the lock cleanup reconciler should clean them up (runs periodically). To manually clean up:

kubectl delete configmap bucket-sync-lock-<bucket-name>-<hash>

Sync Stuck in Sync Phase

Symptom: BucketSync has been in Sync phase for an extended time, or timed out.

Follow the Initial Diagnostics steps above. Common causes:

  • Large dataset (sync is slow due to size)
  • Network issues between Kubernetes cluster and object storage
  • rclone Job is pending (waiting for resources)
  • Secret credentials are invalid (Job fails silently)

Check the rclone Job logs:

kubectl logs job/daily-backup-1705316400

If you see authentication errors or other credential issues, the controller will catch the Job failure and report it in the BucketSync.status.error field. Verify that your Secret contains the correct rclone configuration by consulting the rclone documentation.

Enable Debug Logging

Run controller with debug logging:

kubectl set env -n bucket-sync-system deployment/bucket-sync LOG_LEVEL=debug

Then tail the logs:

kubectl logs -n bucket-sync-system -l app.kubernetes.io/name=bucket-sync -f

Limitations

  • One sync at a time per bucket: Concurrent syncs on the same bucket are serialized via locking. A second sync on the same bucket will wait until the first completes.
  • rclone only: Syncs use rclone; other tools are not supported.

See Also