nixos-generators' "proxmox" format doesn't include virtio_pci / virtio_blk / virtio_scsi in the initrd's available modules. The VMs booted yesterday because the running kernel had loaded them after stage 1, but after the 2026-06-11 power outage every cluster VM came back up unable to find /dev/vda and hung in stage 1 emergency mode (the root account is locked, so this was unrecoverable from the console). Declaring boot.initrd.availableKernelModules here ensures the modules are included in every future template build and in any nixos-rebuild that touches proxmox-base.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
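A minimal sketch of the declaration described in the note above, assuming it sits in `template/modules/proxmox-base.nix` (the real file is not shown here and may list other modules or options alongside these):

```nix
# Sketch only: illustrates the initrd fix, not the full proxmox-base.nix.
{ ... }:
{
  # Make the virtio drivers available to the stage-1 initrd so the root
  # disk (/dev/vda) can be found on a cold boot.
  boot.initrd.availableKernelModules = [
    "virtio_pci"
    "virtio_blk"
    "virtio_scsi"
  ];
}
```

`availableKernelModules` puts the modules into the initrd and loads them on demand. `boot.initrd.kernelModules` would instead load them unconditionally during stage 1.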
basha_infra
Hypervisor + VM layer for the home cluster. Owns machines; service repos
own workloads. See ~/src/CLAUDE.md for the broader picture.
Layout
basha_infra/
├── inventory.yaml # source of truth: Proxmox hosts, VMs, roles, host_volumes
├── template/ # NixOS Proxmox template builder
│ ├── flake.nix
│ ├── modules/
│ │ ├── proxmox-base.nix # bootloader, networking, ssh — used by every VM
│ │ └── cluster-client.nix # consul + nomad + docker + tailscale (client mode)
│ ├── hosts/
│ │ └── apps.nix # per-VM nix config (hostname, meta.role)
│ └── Makefile # `make template` builds + uploads + registers
└── terraform/ # VM lifecycle
    ├── main.tf # bpg/proxmox: clones template per inventory.yaml
    ├── variables.tf
    └── outputs.tf
The control plane (Consul server, Nomad server, Vault) lives on the home
NixOS box (~/src/nixos/), not here. VMs only run the client halves.
How it fits together
template/  → qmrestore → Proxmox template VMID 9001
               │
               │ qm clone
               ▼
terraform/ → cloned VMs (apps, postgres, …) per inventory.yaml
               │
               │ nixos-rebuild --target-host
               ▼
template/hosts/<name>.nix lays the per-host config on top of the template
The template carries the cluster-client modules baked in, so a freshly
cloned VM joins Consul + Nomad on first boot. Per-host overrides
(hostname, meta.role, future host_volume mounts) are applied later via
nixos-rebuild --target-host.
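As an illustration of what such a per-host override might contain, here is a hedged sketch of `template/hosts/apps.nix`. The real file is not shown in this README. Where `meta.role` is set (Nomad client metadata via the NixOS `services.nomad.settings` option) and its value are both assumptions.

```nix
# template/hosts/apps.nix (illustrative sketch, not the actual file)
{ ... }:
{
  # Per-host identity layered on top of the shared template modules.
  networking.hostName = "apps";

  # Assumption: meta.role is exposed as Nomad client metadata, set through
  # the nomad module's freeform settings. The value "apps" is a guess.
  services.nomad.settings.client.meta.role = "apps";
}
```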
Bootstrap (first-time on a fresh Proxmox)
Prerequisites (one-off on Proxmox, see pveum/pvesm):
- `terraform@pve` user with an API token + Administrator on `/`
- `local` datastore has `import` and `snippets` content types enabled
- Your laptop's SSH key in `root@<proxmox>:~/.ssh/authorized_keys`
# 1. Build + register the template (VMID 9001)
cd template && make template
# 2. Fill in terraform.tfvars
cd ../terraform
cp terraform.tfvars.example terraform.tfvars
$EDITOR terraform.tfvars
# 3. Apply — clones a VM per inventory.yaml entry
terraform init
terraform apply
# 4. After first boot, SSH in via LAN IP (guest agent reports it),
# run `tailscale up --authkey=...` once. Then from the laptop:
nixos-rebuild switch --flake ../template#apps --target-host root@apps
Add a new VM
- Add an entry under `vms:` in `inventory.yaml`.
- Add `template/hosts/<name>.nix` and a `nixosConfigurations.<name>` entry in `template/flake.nix`.
- `cd terraform && terraform apply` clones the VM.
- `nixos-rebuild switch --flake template#<name> --target-host root@<lan-ip>`.
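The `flake.nix` side of these steps could look roughly like the sketch below. The `mkHost` helper, the nixpkgs branch, the `x86_64-linux` system, and the assumption that every host imports both shared modules are illustrative and not taken from the actual `template/flake.nix`. `postgres` stands in for the new VM's name.

```nix
# template/flake.nix (sketch; only the part relevant to adding a host)
{
  # Assumption: the real flake may pin a different nixpkgs branch.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs, ... }:
    let
      # Hypothetical helper: one NixOS system per host, built from the
      # shared modules plus hosts/<name>.nix.
      mkHost = name: nixpkgs.lib.nixosSystem {
        system = "x86_64-linux";
        modules = [
          ./modules/proxmox-base.nix
          ./modules/cluster-client.nix
          (./hosts + "/${name}.nix")
        ];
      };
    in
    {
      nixosConfigurations = {
        apps = mkHost "apps";
        postgres = mkHost "postgres"; # entry for the new VM
      };
    };
}
```

With an entry like this in place, `--flake template#postgres` resolves to `nixosConfigurations.postgres`.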
Update an existing VM
Edit template/modules/cluster-client.nix or template/hosts/<name>.nix,
then redeploy without recreating the VM:
nixos-rebuild switch --flake ./template#<name> --target-host root@<name>
If the change should also affect future clones, rebuild the template too:
`cd template && make template`. (Existing VMs are not auto-recreated.)
Remove a VM
- `nomad node drain -enable -force <node-id>` evacuates jobs.
- Delete the entry from `inventory.yaml`.
- `cd terraform && terraform apply` destroys the VM.
TODOs
- Move terraform state to Garage S3 backend (see `~/src/CLAUDE.md`).
- Add `postgres` / `garage` / `vault` / `forgejo` hosts as stateful services migrate off the home box.
- Thread tailscale auth via `nixos-rebuild` so the one-time manual `tailscale up` after clone isn't needed.