Managing a growing home lab manually leads to drift and undocumented state. You SSH in, kubectl apply a few manifests, tweak something, and within a few weeks you can’t remember what’s running or how it got there. GitOps fixes this — your Git repos become the single source of truth, and a controller on the cluster continuously reconciles the actual state to match.

This post covers how I use Flux CD and k3s to manage my home lab with GitOps. For the HTTPS/certificate side of things, see Automated Wildcard HTTPS Behind NAT with Let’s Encrypt.

The end result

Every service I run is declared in a single kustomization.yaml in my fleet-services repo:

yaml
resources:
  - home-assistant
  - plex
  - immich
  - grafana
  - prometheus

Adding a new service = adding a directory with manifests + one line in this file. Push to main, and Flux deploys it within a minute. No SSH, no kubectl apply, no remembering what you did.
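Concretely, the repo ends up looking like this (the files inside each directory are illustrative — each service carries whatever manifests it needs):

```
fleet-services/
├── kustomization.yaml        # the root list above
├── home-assistant/
│   ├── kustomization.yaml
│   ├── service.yaml
│   ├── endpoints.yaml
│   └── ingress.yaml
├── immich/
│   ├── kustomization.yaml
│   ├── deployment.yaml
│   ├── service.yaml
│   └── ingress.yaml
└── ...
```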

Architecture overview

mermaid
graph TD
    repos["GitHub Repos<br>(fleet-infra, fleet-services)"] -->|"git pull (1 min)"| flux

    subgraph cluster["k3s Cluster (192.168.0.251)"]
        flux["Flux CD"] --> infra["Infrastructure<br>(Traefik, cert-manager)"]
        flux --> config["Config<br>(certificates, issuers)"]
        flux --> apps["Apps<br>(fleet-services repo)"]
        apps --> pods["immich · plex · HA · ..."]
    end

GitHub holds the desired state. Flux pulls it and applies it to the cluster. That’s the whole idea.

Why two repos

I split my configuration across two repositories:

  • fleet-infra — Platform infrastructure: Flux itself, Traefik, cert-manager, certificates, issuers. Changes rarely.
  • fleet-services — Application services: deployments, services, ingresses for each app. Changes often.

This separation matters because:

  1. Different change velocity — I add or tweak services weekly, but the platform changes maybe once a month.
  2. Blast radius — A bad commit in fleet-services breaks one app. A bad commit in fleet-infra could break everything.
  3. Access control — You could give different people (or AI agents) access to each repo. An AI coding assistant can safely commit to fleet-services to add or update apps without having access to fleet-infra where a mistake could break the platform.

The dependency chain

Flux doesn’t just blindly apply everything at once. It uses Kustomization resources (not to be confused with kustomize’s kustomization.yaml) to define what to deploy and in what order.

In fleet-infra, three Kustomizations form a dependency chain:

clusters/my-cluster/infrastructure.yaml — Deploys platform components:

yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure
  prune: true
  wait: true

clusters/my-cluster/config.yaml — Deploys configuration that depends on infrastructure:

yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: config
  namespace: flux-system
spec:
  dependsOn:
    - name: infrastructure
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./config
  prune: true

clusters/my-cluster/apps-kustomization.yaml — Deploys services from the fleet-services repo:

yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  dependsOn:
    - name: config
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: fleet-services
  path: .
  prune: true

The ordering matters: infrastructure installs Traefik and cert-manager first, then config creates certificates and issuers, and only then does Flux deploy your apps. Without dependsOn, Flux might try to create an Ingress before Traefik exists.

Notice that the apps Kustomization points to a different sourceRef — the fleet-services repo. This is how Flux watches the second repository.
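Ordering isn't the only gate, either: wait: true on the infrastructure Kustomization makes Flux wait until everything it applied is actually ready before dependents run, and you can also declare explicit health checks. A fragment showing the idea (the Deployment name and namespace here are examples):

```yaml
spec:
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: traefik
      namespace: traefik
```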

k3s setup

Install k3s without the built-in Traefik (we’ll manage it via Flux instead):

bash
curl -sfL https://get.k3s.io | sh -s - --disable traefik

The --disable traefik flag is important. We want Flux to manage Traefik’s lifecycle so its configuration lives in Git alongside everything else.
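If you prefer, the same setting can live in k3s's config file rather than the install command, so it survives re-running the installer:

```yaml
# /etc/rancher/k3s/config.yaml
disable:
  - traefik
```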

Bootstrapping Flux

bash
flux bootstrap github \
  --owner=stjohnb \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/my-cluster \
  --personal

This creates deploy keys, installs the Flux controllers, and sets up the GitOps sync for fleet-infra. (The command authenticates with a GitHub personal access token supplied via the GITHUB_TOKEN environment variable.)

To watch the second repo, add a GitRepository source in fleet-infra:

clusters/my-cluster/fleet-services-source.yaml:

yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: fleet-services
  namespace: flux-system
spec:
  interval: 1m
  url: ssh://git@github.com/stjohnb/fleet-services
  ref:
    branch: main
  secretRef:
    name: fleet-services-deploy-key

You’ll also need to create a deploy key for the fleet-services repo and store it as a Kubernetes secret. Once this is in place, Flux watches both repos.
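flux create secret git can generate the keypair and the Kubernetes secret in one step, printing the public key for you to register as a deploy key on GitHub. For reference, the secret Flux expects has this shape — key material elided, and never committed to Git:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: fleet-services-deploy-key
  namespace: flux-system
stringData:
  identity: |
    # ...private SSH key...
  identity.pub: |
    # ...public key (the GitHub deploy key)...
  known_hosts: |
    # ...github.com host key entries...
```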

Deploying services: two patterns

In-cluster services

Most services run as pods in the cluster. Here’s immich as an example — you need three manifests plus a kustomization:

immich/deployment.yaml — Pod specification (image, ports, volumes, etc.)
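A minimal sketch of that deployment, for reference (the image tag is illustrative, and the real manifest also mounts the photo library and points immich at its database):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: immich
spec:
  replicas: 1
  selector:
    matchLabels:
      app: immich
  template:
    metadata:
      labels:
        app: immich
    spec:
      containers:
        - name: immich
          image: ghcr.io/immich-app/immich-server:release  # pin a specific tag in practice
          ports:
            - containerPort: 2283
```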

immich/service.yaml — ClusterIP service:

yaml
apiVersion: v1
kind: Service
metadata:
  name: immich
spec:
  selector:
    app: immich
  ports:
    - port: 2283
      targetPort: 2283

immich/ingress.yaml — Makes it accessible via HTTPS:

yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: immich
spec:
  ingressClassName: traefik-traefik
  tls:
    - hosts:
        - immich.lab.bstjohn.net
      secretName: lab-bstjohn-wildcard-tls
  rules:
    - host: immich.lab.bstjohn.net
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: immich
                port:
                  number: 2283

immich/kustomization.yaml:

yaml
resources:
  - deployment.yaml
  - service.yaml
  - ingress.yaml

The ingress references a wildcard TLS certificate — see the HTTPS post for how that’s set up.

External service proxy

Some services run on other machines but still benefit from the cluster’s ingress and certificates. Home Assistant runs on a separate server at 192.168.0.73:8123. To proxy through the cluster, use a Service without selectors and explicit Endpoints:

home-assistant/service.yaml:

yaml
apiVersion: v1
kind: Service
metadata:
  name: home-assistant
spec:
  ports:
    - port: 8123
      targetPort: 8123

home-assistant/endpoints.yaml:

yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: home-assistant
subsets:
  - addresses:
      - ip: 192.168.0.73
    ports:
      - port: 8123
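Endpoints still works but is the legacy API; newer Kubernetes prefers EndpointSlice. The equivalent slice looks like this — the kubernetes.io/service-name label is what ties it to the Service:

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: home-assistant-1
  labels:
    kubernetes.io/service-name: home-assistant
addressType: IPv4
endpoints:
  - addresses:
      - "192.168.0.73"
ports:
  - port: 8123
```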

home-assistant/ingress.yaml:

yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: home-assistant
spec:
  ingressClassName: traefik-traefik
  tls:
    - hosts:
        - home-assistant.lab.bstjohn.net
      secretName: lab-bstjohn-wildcard-tls
  rules:
    - host: home-assistant.lab.bstjohn.net
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: home-assistant
                port:
                  number: 8123

Now https://home-assistant.lab.bstjohn.net terminates SSL at Traefik and proxies to the external server. The service running on the other machine doesn’t need to know anything about certificates.

Adding a new service

The workflow is simple:

bash
cd fleet-services

# Create service directory with manifests
mkdir my-service
# ... create deployment.yaml, service.yaml, ingress.yaml, kustomization.yaml

# Add to root kustomization (assumes the resources list is the last thing in the file)
echo "  - my-service" >> kustomization.yaml

# Deploy
git add . && git commit -m "Add my-service" && git push

# Flux deploys automatically within ~1 minute

That’s it. No SSH, no kubectl, no manual certificate setup.
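The edit to the root kustomization is mechanical enough to sanity-check offline; here's a self-contained dry run in a scratch directory (the service name and file contents are placeholders):

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"

# Root kustomization with one existing service
printf 'resources:\n  - home-assistant\n' > kustomization.yaml

# New service directory with its own kustomization
mkdir my-service
printf 'resources:\n  - deployment.yaml\n  - service.yaml\n  - ingress.yaml\n' \
  > my-service/kustomization.yaml

# Register the new service in the root list
echo '  - my-service' >> kustomization.yaml

cat kustomization.yaml
# resources:
#   - home-assistant
#   - my-service
```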

Disaster recovery

This is the real selling point of GitOps for a home lab. If the cluster dies, rebuilding is straightforward:

  1. Install k3s with --disable traefik
  2. Bootstrap Flux pointing at fleet-infra
  3. Recreate the handful of secrets that aren’t stored in Git (API keys, deploy keys)

Flux reads both repos and rebuilds everything. The cluster converges to the declared state. All your services, their configuration, the infrastructure — it’s all in Git. You don’t need to remember what you deployed or how you configured it.


For the HTTPS and certificate management side of this setup, see Automated Wildcard HTTPS Behind NAT with Let’s Encrypt.