Home k3s lab
The on-prem path. A small k3s cluster (one Apple Silicon Mac Mini control plane, two Raspberry Pi 4 workers, a UNAS Pro NAS) that serves preview.home.austinrose.xyz and acts as the staging gate before AWS and GCP. Same GitOps discipline as the cloud paths, smaller everything.
Run the same GitOps discipline at home that the cloud paths are patterned on, against ARM hardware and a NAS, on my own dime.
- Single-node control plane (the Mini); no HA control-plane today
- NFS v3 limitation on UNAS: cross-volume hardlinks fail (atomic-move workaround in the *arr stack)
- Secret sprawl across ~22 ExternalSecrets without per-app audit
- One bad image promotion silently rolling to preview
- Push-to-live under 10 minutes from `git push origin preview` to preview.home.austinrose.xyz
- Zero secrets in git (SOPS for bootstrap, 1Password Connect via External Secrets Operator for runtime)
- CNPG WAL backed up to R2 within 15 minutes of write
- Full cluster rebuild from `ansible/site.yaml` in under an hour
- The MariaDB service `<cluster>-primary` exists only when replicas > 1; with a single replica the name returns NXDOMAIN and the app silently falls back to SQLite. birdnet-go cost an evening of debugging on this.
- Per-app restic repos (one S3 subpath each) prevent one app's `forget` from pruning another's snapshots.
- Flux native image automation beats Renovate for this workload; the contract is `highest run-N` and the controller writes its own commit.
- UNAS Pro exports NFS v3 only; StorageClass `nfs-unas` must use `nfsvers=3` (no `nconnect`).
- Three-layer architecture in ../home-ops (tofu / ansible / kubernetes-flux), each with its own lifecycle and blast radius
- Image-promotion contract: this repo emits `:run-N`; home-ops Flux watches and rolls
- Hybrid SOPS + 1Password secret discipline (6 bootstrap secrets in SOPS, ~22 runtime secrets via External Secrets Operator)
- Two-tier R2 + UNAS backup strategy with per-app restic isolation
- k3s 1.33, Flux 2.8 (flux-operator + FluxInstance)
- bjw-s/app-template v4 for ~38 HelmReleases
- CloudNativePG (two clusters), MariaDB Operator, Dragonfly
- Volsync + restic + Cloudflare R2; UNAS Pro (NFS v3)
- Traefik, kube-vip, Tailscale Operator, cert-manager, External Secrets Operator, 1Password Connect
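The MariaDB lesson above (a service name that silently fails to resolve) suggests a startup preflight. The following is a minimal sketch of one, not what birdnet-go or the MariaDB Operator actually ships. The function names, the injectable `resolver` parameter, and the example hostname in the usage note are all hypothetical.

```python
import socket
from typing import Callable, Optional


def db_host_problem(host: str, port: int = 3306,
                    resolver: Callable = socket.getaddrinfo) -> Optional[str]:
    """Return None when `host` resolves, otherwise a short reason string."""
    try:
        results = resolver(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # NXDOMAIN and similar lookup failures land here.
        return f"{host} does not resolve: {exc}"
    if not results:
        return f"{host} resolved to no addresses"
    return None


def require_db_host(host: str, port: int = 3306) -> None:
    """Abort startup rather than let the app pick a fallback store."""
    problem = db_host_problem(host, port)
    if problem is not None:
        raise SystemExit(f"refusing to start: {problem}")
```

Calling `require_db_host("mariadb-cluster-primary.databases.svc.cluster.local")` (an illustrative name) before the app opens its database would turn the silent SQLite fallback into a visible crash loop.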
tofu/ · External APIs that live outside the cluster lifecycle.
- Cloudflare (zone settings, DNS, R2 state bucket, Zero Trust tunnel)
- GitHub (repo settings, branch protection, Actions secrets)
- Tailscale (ACL HuJSON, split-DNS, OAuth clients)
- UniFi (static-DNS records for kube-vip ingress)
- 1Password (item lookups; outputs piped back post-apply)
- State backend: Cloudflare R2 with native locking
ansible/ · Cluster bootstrap from bare hardware to a green k3s cluster.
- 25 roles; ordered playbooks 00 prerequisites to 99 validate
- Lima VM provisioning on the Mac Mini control plane
- Pi prep: cgroups in cmdline.txt, zram, log2ram, swap off
- k3s server + agent install via the k3s-io/k3s_ansible role
- Platform stack in dependency order: cert-manager, External Secrets Operator, 1Password Connect, kube-vip, Tailscale Operator, flux-operator
- Smoke-test pass before handing off to Flux
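The Pi prep step above edits `cmdline.txt`, which must stay a single line. As a rough illustration of that edit (not the Ansible role itself), the sketch below appends missing flags idempotently. It assumes the flags are `cgroup_memory=1 cgroup_enable=memory`, the pair the k3s documentation lists for Raspberry Pi; the page does not name them. `ensure_cgroup_flags` is a hypothetical helper.

```python
from typing import Sequence

# Assumed flag set (from the k3s Raspberry Pi notes), not confirmed by this page.
REQUIRED_FLAGS = ("cgroup_memory=1", "cgroup_enable=memory")


def ensure_cgroup_flags(cmdline: str,
                        required: Sequence[str] = REQUIRED_FLAGS) -> str:
    """Return the kernel command line with any missing flags appended.

    The result is always one line with single spaces, because the Pi
    bootloader reads only the first line of cmdline.txt.
    """
    tokens = cmdline.split()
    for flag in required:
        if flag not in tokens:
            tokens.append(flag)
    return " ".join(tokens)
```

Running it twice yields the same string, which is the property an Ansible task needs in order to report "ok" rather than "changed" on a second pass.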
kubernetes/ + Flux 2.8 · GitOps runtime that owns everything from operators to apps.
- FluxInstance CRD via flux-operator (not bare `flux bootstrap`)
- GitRepository points at auzroz/home-ops main; Kustomization syncs kubernetes/apps
- Reconcile interval 10 minutes; cluster-settings ConfigMap supplies substitutions
- ~38 HelmReleases across 12 namespaces
- ~22 ExternalSecrets via 1Password Connect through ESO `external-secrets.io/v1`
- Image Automation watches GHCR for the austinrose-me preview image
Three layers, each with its own lifecycle and blast radius. Source-of-truth lives in the sibling ../home-ops repo; this page is a curated view.
- 38 HelmReleases
- 12 namespaces
- 22 ExternalSecrets
- 25 Ansible roles
- 5 tofu providers
- 2 storage classes
- Mac Mini (Apple Silicon, 16GB) ×1
control plane · Debian 13 Trixie arm64 in a Lima VM
9GB allocated to the VM; Tailscale subnet router runs on the Lima VM, not the Pis.
- Raspberry Pi 4 (8GB) ×2
workers · Pi OS Lite 64-bit Trixie
USB3 SSD/stick boot; cgroups added to cmdline.txt; log2ram + zram for SD-card protection.
- UniFi UNAS Pro (40TB usable) ×1
NAS · vendor firmware
NFS v3 only (vendor limitation); offsite replication of select shares to Google Drive.
local-path-mac-mini · Local SSD on the Mac Mini, pinned by nodePath · Databases, SQLite, app config, small caches
Default class for stateful workloads with strict consistency requirements.
nfs-unas · NFS v3 to the UNAS Pro · App data, media libraries (mounted as-is), Volsync caches
mountOptions: `vers=3,proto=tcp,hard,noatime,rsize=1048576,wsize=1048576,timeo=600,retrans=2` (no nconnect; UNAS Pro v3 limitation).
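The two hard constraints on the `nfs-unas` class (NFS v3, no `nconnect`) are easy to regress when copying mount options between clusters. Here is a small sketch of a lint that checks a mount-option string for both. The function names are hypothetical, and it accepts `vers` and `nfsvers` interchangeably since both spellings appear on this page.

```python
from typing import Dict, List, Optional


def parse_mount_options(raw: str) -> Dict[str, Optional[str]]:
    """Split 'a=1,b,c=2' into {'a': '1', 'b': None, 'c': '2'}."""
    opts: Dict[str, Optional[str]] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        opts[key] = value if sep else None
    return opts


def nfs_unas_problems(raw: str) -> List[str]:
    """Return a list of violations of the UNAS constraints (empty if none)."""
    opts = parse_mount_options(raw)
    problems: List[str] = []
    version = opts.get("vers", opts.get("nfsvers"))
    if version != "3":
        problems.append(f"expected NFS v3, got {version!r}")
    if "nconnect" in opts:
        problems.append("nconnect is set; this page lists it as unsupported on the UNAS Pro")
    return problems
```

Feeding it the option string above returns an empty list; a string such as `nfsvers=4.1,nconnect=4` returns two problems.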
<app>.austinrose.xyz · traefik · Public via Cloudflared Zero Trust tunnel
Cert: cert-manager (letsencrypt-prod, DNS-01 via Cloudflare)
<app>.home.austinrose.xyz · traefik · Private LAN plus tailnet
Cert: cert-manager (shared wildcard via DNS-01)
<app>.<tailnet>.ts.net · tailscale · Tailnet only (legacy, marked for cleanup)
Cert: Tailscale auto-provisioned
automation (2)
- n8n
Workflow automation; Postgres-backed (shared-postgres); pinned to Mini; public webhooks via Cloudflared tunnel.
- actions-runner-controller
Self-hosted GitHub Actions runners; the runner scale set picks up custom builds for the homepage repo.
backups (4)
- volsync
Per-app restic-backed ReplicationSources to R2; copyMethod Direct, pruneIntervalDays 7.
- restic-rest-server
In-cluster restic REST endpoint backed by NFS for bulk Volsync repos.
- reflector
Cross-namespace replication of base secrets and configmaps so per-app namespaces stay self-contained.
- reloader
Watches ConfigMap and Secret changes; restarts pods on update so config rotation is uneventful.
databases (7)
- cloudnative-pg
CNPG operator and Cluster CRDs; the relational substrate for the cluster.
- shared-postgres
PostgreSQL 16 cluster (single replica, pinned to Mini) for Paperless, Mealie, n8n; ScheduledBackup to R2.
- immich-postgres
PostgreSQL 16 cluster with VectorChord extension for Immich face/CLIP search; ScheduledBackup to R2.
- mariadb-operator
MariaDB operator + CR; single-instance cluster pinned to Mini for birdnet-go.
- mariadb-cluster
Single-instance MariaDB; ScheduledBackup to NFS nightly.
- dragonfly
Redis-API drop-in cache for Immich; pinned to Mini; small footprint.
- phpmyadmin
Web admin UI for the MariaDB cluster.
documents (3)
- paperless
Document scan + OCR + tagging; index on local-path, media on NFS; Volsync daily.
- mealie
Recipe manager backed by shared-postgres; Volsync covers app state.
- homebox
Household inventory; SQLite on NFS; Volsync every 6 hours.
media (6)
- plex
Media server; metadata on local-path, libraries mounted as-is from NFS shares; Pi-ineligible.
- tautulli
Plex monitoring and notifications.
- sonarr
TV automation; hardlinks disabled (NFS volume isolation), atomic-move used instead.
- radarr
Movie automation; same hardlink-free posture as Sonarr.
- sabnzbd
Usenet downloader; downloads to NFS, no seeding semantics.
- prowlarr
Indexer aggregator for the *arr suite.
monitoring (1)
- birdnet-go
ML bird-call classifier from a backyard mic; persists to MariaDB.
network (2)
- traefik-overrides
Overrides only; k3s default Traefik kept (not replaced with nginx).
- cloudflared
Zero Trust tunnel for public ingress; rules in tofu/cloudflare/tunnel.tf.
observability (6)
- loki
Log aggregation; index on local-path, chunks via Volsync.
- grafana
Visualization; backed by shared-postgres for state.
- alloy
DaemonSet metrics + log forwarder; ships to Loki.
- gatus
Endpoint health checker; lightweight status board.
- headlamp
Browser-based Kubernetes admin UI for cluster inspection.
- homepage
Dashboard linking to the cluster's other surfaces; not the same homepage as this site.
photos (2)
- immich
Self-hosted photo backup; immich-postgres for vector search, Dragonfly for caching, NFS RWX library.
- immich-public-proxy
Public-share proxy for Immich albums via Cloudflared tunnel.
portfolio (1)
- austinrose-me
This site, on the home cluster path. Static nginx; image promoted by Flux Image Automation from GHCR.
registry (1)
- zot
OCI-compliant container registry on NFS; reduces external pulls for in-cluster images.
tools (2)
- rustdesk
Self-hosted remote desktop server; keypair on local-path, exposed via TailVIP.
- scrypted
HomeKit bridge; hostNetwork for Bonjour/mDNS; LevelDB on NFS, single-replica Recreate rollout.
- Alloy
DaemonSet metrics collector and log forwarder; ships to Loki and Prometheus when configured
- Loki
Log aggregation; queried via Grafana
- Grafana
Visualization; admin UI on the private hostname tier
- Gatus
Endpoint health checker (datasource not yet wired)
- Headlamp
Kubernetes admin UI for direct API access
- Homepage
Dashboard with links to the rest of the cluster's surfaces
- Tier 0
- 6 bootstrap secrets in ansible/inventory/group_vars/all.sops.yaml (SOPS + age)
- Tier 1
- ~22 runtime secrets from 1Password Connect, synced by External Secrets Operator (`external-secrets.io/v1`)
- Encryption
- SOPS + age for tier 0 (public key in .sops.yaml, private key off-cluster); ESO ClusterSecretStore for tier 1
No sealed-secrets, no external KMS. Two layers handle two different problems: SOPS bootstraps the cluster (including the 1P Connect credentials themselves), ESO owns runtime secrets for the apps. Fewer moving parts and a clean separation between bootstrap and steady state.
- Critical
- Cloudflare R2 · CNPG WAL plus base (continuous), tofu state, rustdesk keypair, Scrypted HomeKit pairing, n8n credentials. Total ~1GB.
- Bulk
- UNAS via in-cluster restic-rest-server · Plex config, Loki chunks, *arr configs, app data, Volsync repos. Total ~10GB.
- Offsite
- UNAS replicates select shares to Google Drive on a manual cadence.
- Isolation
- Per-app restic repositories (one S3 subpath each) prevent one app's `forget` from pruning another's snapshots.
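To make the isolation rule above concrete, here is a minimal sketch of deriving one restic repository per app under a shared bucket. It uses restic's `s3:<endpoint>/<bucket>/<path>` repository form; the helper names, the endpoint, and the bucket name are placeholders, not values from home-ops.

```python
import re
from typing import Dict, Iterable

# Kubernetes-style app names only, so a name can never smuggle in a "/".
_APP_NAME = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")


def restic_repository(endpoint: str, bucket: str, app: str) -> str:
    """Build the restic repository string for a single app's subpath."""
    if not _APP_NAME.fullmatch(app):
        raise ValueError(f"unsafe app name: {app!r}")
    return f"s3:{endpoint.rstrip('/')}/{bucket}/{app}"


def repositories_for(endpoint: str, bucket: str,
                     apps: Iterable[str]) -> Dict[str, str]:
    """Map each app to its own repository; reject duplicate app names."""
    repos: Dict[str, str] = {}
    for app in apps:
        if app in repos:
            raise ValueError(f"duplicate app: {app!r}")
        repos[app] = restic_repository(endpoint, bucket, app)
    return repos
```

Because `restic forget` operates on a single repository, giving each app its own subpath means a retention policy applied to one app cannot see another app's snapshots.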
- 01
`git push origin preview` to auzroz/austinrose-me.
- 02
deploy-preview.yml builds the Next.js static export on the runner; Puppeteer renders the dual-mode CV PDFs in the same job.
- 03
Multi-arch (amd64 + arm64) container build; push to private ghcr.io/auzroz/austinrose-me with `:latest`, `:run-N`, `:<sha>` tags.
- 04
Flux ImageRepository in home-ops polls GHCR every 5 minutes (read PAT from 1Password).
- 05
ImagePolicy selects the highest `run-N` (numeric ascending).
- 06
ImageUpdateAutomation commits the tag bump to auzroz/home-ops main using a fine-grained push PAT from 1Password.
- 07
Flux GitRepository reconciles the new commit; HelmRelease in the portfolio namespace rolls.
- 08
Traefik on kube-vip (LAN plus tailnet) serves the new pod; cert-manager wildcard cert handles TLS.
- 09
Container entrypoint stamps /origin.json with `{cloud: k3s, region: lab}`; the footer reflects it.
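Step 05 is the heart of the promotion contract, and it only works because `run-N` is compared numerically. The sketch below is a plain-Python model of that selection rule, not Flux's implementation. The regular expression and the function name are assumptions chosen to mirror a filter-then-numeric-sort policy.

```python
import re
from typing import Iterable, Optional

# Only tags of the exact form run-<digits> take part in promotion;
# `latest` and commit-sha tags are ignored.
RUN_TAG = re.compile(r"^run-(?P<n>[0-9]+)$")


def highest_run_tag(tags: Iterable[str]) -> Optional[str]:
    """Return the run-N tag with the largest N, or None if there is none."""
    best: Optional[str] = None
    best_n = -1
    for tag in tags:
        match = RUN_TAG.match(tag)
        if match is None:
            continue
        n = int(match.group("n"))
        if n > best_n:
            best_n, best = n, tag
    return best
```

Given `["latest", "run-9", "run-10", "3f2a9c1"]`, the function picks `run-10`. A lexical sort would pick `run-9`, which is why the policy has to extract the number before comparing.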
Cross-link: /lab/platform documents the workflows on the source side of this contract.