After months of planning, wiring, airflow tuning, and too many late nights, my home lab GPU cluster is finally up and running.
This setup is built mainly for:
• AI / LLM inference & training
• Image & video generation pipelines
• Kubernetes + GPU scheduling
• Self-hosted APIs & experiments
🔧 Hardware Overview
• Total GPUs: 12 × RTX 5090
• Layout: 6 machines × 2 GPUs each
• Total VRAM: 384 GB (32 GB per GPU)
• CPU: 88 cores / 176 threads per server
• System RAM: 256 GB per machine (1.5 TB+ across the cluster)
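For my own sanity I keep the cluster totals in a tiny script so the numbers always add up; a minimal sketch using the figures above (assuming 32 GB of VRAM per RTX 5090, which is the card's spec):

```python
# Hypothetical sanity check of the cluster totals listed above.
MACHINES = 6
GPUS_PER_MACHINE = 2
VRAM_PER_GPU_GB = 32      # RTX 5090 spec
RAM_PER_MACHINE_GB = 256
CORES_PER_MACHINE = 88

total_gpus = MACHINES * GPUS_PER_MACHINE        # 12
total_vram_gb = total_gpus * VRAM_PER_GPU_GB    # 384
total_ram_gb = MACHINES * RAM_PER_MACHINE_GB    # 1536 (~1.5 TB)
total_cores = MACHINES * CORES_PER_MACHINE      # 528

print(f"{total_gpus} GPUs, {total_vram_gb} GB VRAM, "
      f"{total_ram_gb} GB RAM, {total_cores} cores")
```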
🖥️ Infrastructure
• Dedicated rack with managed switches
• Clean airflow-focused cases (no open mining frames)
• GPU nodes exposed via Kubernetes
• Separate workstation + monitoring setup
• Everything self-hosted (no cloud dependency)
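Exposing the GPUs to Kubernetes is the standard NVIDIA device plugin flow; once the plugin is running, a workload on this setup just claims GPUs as a resource. A minimal sketch (pod name and image are placeholders, not my actual manifests):

```yaml
# Hypothetical pod spec; name and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: llm-inference
spec:
  containers:
    - name: worker
      image: my-registry/llm-worker:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 2   # claim both GPUs on one node
```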
🌡️ Cooling & Power
• Tuned fan curves + optimized case airflow
• Stable thermals even under sustained load
• Power isolation per node (learned this the hard way 😅)
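Keeping an eye on thermals across six nodes mostly comes down to scraping `nvidia-smi`. A rough sketch of the parsing side (`parse_gpu_stats`, `too_hot`, and the 83 °C limit are my own placeholder names and numbers; the input format matches `nvidia-smi --query-gpu=index,temperature.gpu,power.draw --format=csv,noheader,nounits`):

```python
# Hypothetical parser for nvidia-smi CSV output, e.g. from:
#   nvidia-smi --query-gpu=index,temperature.gpu,power.draw \
#              --format=csv,noheader,nounits

def parse_gpu_stats(csv_text):
    """Turn 'index, temp, power' CSV lines into a list of dicts."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, temp, power = (field.strip() for field in line.split(","))
        stats.append({"index": int(index),
                      "temp_c": float(temp),
                      "power_w": float(power)})
    return stats

def too_hot(stats, limit_c=83.0):
    """Return indices of GPUs above a (placeholder) thermal limit."""
    return [g["index"] for g in stats if g["temp_c"] > limit_c]

sample = "0, 64, 410.5\n1, 86, 540.0"
print(too_hot(parse_gpu_stats(sample)))  # -> [1]
```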
🚀 What I'm Running
• Kubernetes with GPU-aware scheduling
• Multiple AI workloads (LLMs, diffusion, video)
• Custom API layer for routing GPU jobs
• NAS-backed storage + backups
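The routing layer is the fun part. The core idea is just least-loaded placement; a minimal sketch (function and node names are made up for illustration — the real version asks Kubernetes for allocatable GPU counts):

```python
# Hypothetical least-loaded router: send a job to the node with the
# most free GPUs that can still fit the request.

def pick_node(free_gpus, needed):
    """free_gpus: {node_name: free GPU count}. Returns a node or None."""
    candidates = {n: f for n, f in free_gpus.items() if f >= needed}
    if not candidates:
        return None  # nothing fits; queue the job instead
    # Most free GPUs first; node name breaks ties deterministically.
    return max(candidates, key=lambda n: (candidates[n], n))

cluster = {"node-a": 0, "node-b": 2, "node-c": 1}
print(pick_node(cluster, needed=2))  # -> node-b
print(pick_node(cluster, needed=3))  # -> None
```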
This is 100% a learning + building lab, not a mining rig.