After months of planning, wiring, airflow tuning, and too many late nights, this is my home lab GPU cluster, finally up and running.
This setup is built mainly for:
• AI / LLM inference & training
• Image & video generation pipelines
• Kubernetes + GPU scheduling
• Self-hosted APIs & experiments
🔧 Hardware Overview
• Total GPUs: 12 × RTX 5090
• Layout: 6 machines × 2 GPUs each
• GPU memory: 64 GB per machine (2 × 32 GB)
• Total VRAM: 384 GB
• CPU: 88 cores / 176 threads per server
• System RAM: 256 GB per machine
🖥️ Infrastructure
• Dedicated rack with managed switches
• Clean airflow-focused cases (no open mining frames)
• GPU nodes exposed via Kubernetes
• Separate workstation + monitoring setup
• Everything self-hosted (no cloud dependency)
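For anyone curious what "GPU nodes exposed via Kubernetes" looks like in practice, here's a minimal sketch of a pod claiming both GPUs on one node. It assumes the NVIDIA device plugin is installed; the pod name, image, and node label are illustrative, not my actual config:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: llm-inference          # illustrative name
spec:
  containers:
  - name: worker
    image: registry.local/llm-worker:latest   # hypothetical image
    resources:
      limits:
        nvidia.com/gpu: 2      # claim both GPUs on a node
  nodeSelector:
    gpu: "rtx-5090"            # assumes GPU nodes carry this label
```

The `nvidia.com/gpu` resource is what the device plugin advertises; the scheduler then only places the pod on a node with two free cards.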
🌡️ Cooling & Power
• Tuned fan curves + optimized case airflow
• Stable thermals even under sustained load
• Power isolation per node (learned this the hard way 😅)
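To verify "stable thermals under sustained load" you need numbers, not vibes. A rough sketch of how temps and power draw can be polled via `nvidia-smi` (the query flags are real `nvidia-smi` options; the script itself is an illustrative sketch, not my monitoring stack):

```python
import subprocess


def parse_gpu_stats(csv_text):
    """Parse nvidia-smi CSV output, one GPU per line: "65, 420.50"."""
    stats = []
    for line in csv_text.strip().splitlines():
        temp, power = (field.strip() for field in line.split(","))
        stats.append({"temp_c": int(temp), "power_w": float(power)})
    return stats


def read_gpu_stats():
    # Real nvidia-smi flags; requires an NVIDIA driver on the host.
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=temperature.gpu,power.draw",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_stats(out)


if __name__ == "__main__":
    for i, gpu in enumerate(read_gpu_stats()):
        print(f"GPU {i}: {gpu['temp_c']} C, {gpu['power_w']} W")
```

Piping something like this into a time-series DB is how you catch a clogged filter before it becomes a thermal throttle.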
🚀 What I’m Running
• Kubernetes with GPU-aware scheduling
• Multiple AI workloads (LLMs, diffusion, video)
• Custom API layer for routing GPU jobs
• NAS-backed storage + backups
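The core idea behind routing GPU jobs across nodes can be sketched in a few lines: find the nodes with enough free cards, prefer the least-loaded one. This is an illustrative toy (the node data structure and names are made up), not my actual API layer:

```python
def pick_node(nodes, gpus_needed):
    """Return the node with the most free GPUs that fits the job, else None.

    `nodes` maps node name -> {"total": int, "busy": int} (illustrative shape).
    """
    candidates = [
        (name, info["total"] - info["busy"])
        for name, info in nodes.items()
        if info["total"] - info["busy"] >= gpus_needed
    ]
    if not candidates:
        return None  # no capacity: queue the job until a node frees up
    # Least-loaded-first keeps headroom spread across the cluster.
    return max(candidates, key=lambda c: c[1])[0]


cluster = {
    "node-1": {"total": 2, "busy": 2},
    "node-2": {"total": 2, "busy": 0},
    "node-3": {"total": 2, "busy": 1},
}
print(pick_node(cluster, 2))  # node-2 is the only node with 2 free GPUs
```

In the real setup the equivalent decision is delegated to Kubernetes' scheduler via `nvidia.com/gpu` resource requests; the sketch just shows the routing logic in isolation.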
This is 100% a learning + building lab, not a mining rig.