In the era of foundation models, multimodal AI, LLMs, and ever-larger datasets, access to raw compute is still one of the biggest bottlenecks for researchers, founders, developers, and engineers. While the cloud offers scalability, building a personal AI Workstation delivers complete control over your environment, lower latency, custom configurations, and the privacy of running all workloads locally.
This post covers our version of a four-GPU workstation powered by the new NVIDIA RTX 6000 Pro Blackwell Max-Q GPUs. The build pushes the limits of desktop AI computing with 384GB of VRAM (96GB per GPU), all in a chassis that fits under your desk.
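As a quick sanity check once the OS is installed, a few lines of PyTorch confirm that all four GPUs and the combined 384GB of VRAM are visible (a minimal sketch; any recent CUDA build of PyTorch should work):

```python
import torch

# Enumerate every visible CUDA device and sum its memory.
total_gb = 0.0
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    gb = props.total_memory / 1024**3
    total_gb += gb
    print(f"GPU {i}: {props.name}, {gb:.0f} GB")

print(f"Total VRAM: {total_gb:.0f} GB")  # expect ~384 GB on this build
```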
Training, fine-tuning, and running inference on modern AI models demand massive VRAM capacity and bandwidth, high CPU throughput, and ultra-fast storage. Running these workloads in the cloud can introduce latency, setup overhead, slower data transfers, and privacy tradeoffs.
By building a workstation around enterprise-grade GPUs with full PCIe 5.0 x16 connectivity, we get all of the above: complete control over the environment, lower latency, custom configuration, and full local data privacy, with no compromise on GPU bandwidth.
We are planning to test and build a limited number of these custom a16z Founders Edition AI Workstations.
Let’s break down the hardware:
Each GPU is connected via its own dedicated PCIe 5.0 x16 link, ensuring maximum data-transfer rates between CPU and GPU. Unlike multi-GPU setups that rely on bifurcated lanes, multiplexers, or external bridges, this build makes no compromise on lane allocation and never falls back to a lower PCIe generation.
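To verify the negotiated link on each card, NVML exposes the current PCIe generation and width per GPU. A minimal sketch using the pynvml bindings (installable as nvidia-ml-py); every card should report Gen 5 at x16:

```python
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    gen = pynvml.nvmlDeviceGetCurrPcieLinkGeneration(handle)
    width = pynvml.nvmlDeviceGetCurrPcieLinkWidth(handle)
    # Anything below Gen 5 x16 points to a bifurcated slot or a
    # link that negotiated down.
    print(f"GPU {i}: PCIe Gen {gen} x{width}")
pynvml.nvmlShutdown()
```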
The four PCIe 5.0 NVMe SSDs provide theoretical read speeds of up to ~14.9 GB/s each, scaling to ~59 GB/s theoretical in RAID 0. While we are still testing full NVIDIA GPUDirect Storage (GDS) compatibility, GDS would let the GPUs read data from the NVMe drives via direct memory access (DMA), bypassing a CPU bounce buffer entirely.
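If GDS pans out, a library like RAPIDS KvikIO offers a simple Python path to cuFile-style reads straight into GPU memory. A sketch under that assumption; the file path is hypothetical, and KvikIO transparently falls back to regular POSIX I/O on systems without GDS:

```python
import cupy as cp
import kvikio

# Destination buffer allocated directly in GPU memory.
buf = cp.empty(1 << 30, dtype=cp.uint8)  # 1 GiB

# With GDS enabled, the read is DMA'd from NVMe into the GPU buffer,
# skipping the CPU bounce buffer; otherwise KvikIO falls back to POSIX I/O.
with kvikio.CuFile("/raid0/shard.bin", "r") as f:  # hypothetical path
    n = f.read(buf)

print(f"read {n / 1e9:.2f} GB into GPU memory")
```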
The overall system draws 1650W at peak and fits comfortably into a home or office environment without requiring dedicated circuits or 220V wiring. Built-in wheels make it easy to move between locations.
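The peak figure is easy to sanity-check with rough arithmetic. NVIDIA's published board power for the Max-Q edition is 300W; the CPU and miscellaneous numbers below are our assumptions, not measurements:

```python
# Rough power budget (assumed figures, not measurements).
watts = {
    "4x RTX 6000 Pro Max-Q (300 W board power each)": 4 * 300,
    "CPU (assumed workstation-class TDP)": 350,
    "NVMe, RAM, fans, misc (estimate)": 100,
}
total = sum(watts.values())
print(f"estimated peak draw: {total} W")            # ~1650 W
print(f"headroom on a 15 A / 120 V circuit: {15 * 120 - total} W")
```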
The motherboard integrates an ASPEED AST2600 Baseboard Management Controller (BMC), a dedicated processor for remote out-of-band management that operates independently of the host CPU and OS to handle critical monitoring and control tasks.
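Because the AST2600 speaks the standard Redfish REST API, you can poll chassis health even when the host is powered off or hung. A hedged sketch; the address and credentials are placeholders, and exact resource paths vary by board vendor:

```python
import requests

BMC_HOST = "https://bmc.local"   # placeholder BMC address
AUTH = ("admin", "password")     # placeholder credentials

# The Redfish service root is standardized; chassis paths vary by vendor.
resp = requests.get(
    f"{BMC_HOST}/redfish/v1/Chassis",
    auth=AUTH,
    verify=False,  # many BMCs ship with self-signed certificates
)
resp.raise_for_status()
for member in resp.json()["Members"]:
    print(member["@odata.id"])  # e.g. /redfish/v1/Chassis/1
```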
With libraries like vLLM, DeepSpeed, and SGLang, this machine serves as a foundation for training and serving custom LLMs, RL training pipelines, multimodal models, and autonomous agents, all without cloud dependency and with a fully custom setup and environment.
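As one example, vLLM can shard a single large model across all four cards with one argument (a sketch; the model choice is illustrative):

```python
from vllm import LLM, SamplingParams

# tensor_parallel_size=4 splits the model's weights across the four GPUs,
# pooling their combined 384 GB of VRAM behind a single endpoint.
llm = LLM(
    model="meta-llama/Llama-3.1-70B-Instruct",  # illustrative model
    tensor_parallel_size=4,
)

outputs = llm.generate(
    ["Explain why PCIe bandwidth matters for multi-GPU inference."],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```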
This RTX 6000 Pro Blackwell workstation hits a sweet spot between datacenter power and desktop accessibility, all while staying within the footprint and power draw of a high-end desktop.
Whether you’re a researcher exploring new architectures, a startup prototyping private LLM deployments, or simply an enthusiast, this build demonstrates an efficient AI Workstation that fits under your desk.
Some temperature tests:
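To log numbers like these during a sustained run, NVML exposes per-GPU temperature sensors (a minimal sketch using pynvml that samples once per second):

```python
import time
import pynvml

pynvml.nvmlInit()
handles = [pynvml.nvmlDeviceGetHandleByIndex(i)
           for i in range(pynvml.nvmlDeviceGetCount())]

# Sample each GPU's core temperature once per second for ten seconds.
for _ in range(10):
    temps = [pynvml.nvmlDeviceGetTemperature(h, pynvml.NVML_TEMPERATURE_GPU)
             for h in handles]
    print(" | ".join(f"GPU{i}: {t}C" for i, t in enumerate(temps)))
    time.sleep(1)

pynvml.nvmlShutdown()
```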