Many of today’s multimodal workloads require a powerful mix of GPU acceleration, large GPU memory, and professional graphics to achieve the performance and throughput they need. Today, we announced the general availability of the G4 VM, powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. The addition of the G4 expands our comprehensive NVIDIA GPU portfolio, complementing the specialized scale of A-series VMs and the cost-efficiency of G2 VMs. The G4 VM is available now, bringing GPU availability to more Google Cloud regions than ever before for applications that are latency-sensitive or have specific regulatory requirements.
We also announced the general availability of NVIDIA Omniverse as a virtual machine image (VMI) on Google Cloud Marketplace. Running Omniverse on G4 makes it easier than ever to develop and deploy industrial digital twin and physical AI simulation applications that leverage NVIDIA Omniverse libraries. G4 VMs provide the necessary infrastructure — up to 768 GB of GDDR7 memory, NVIDIA Tensor Cores, and fourth-generation Ray Tracing (RT) cores — to run the demanding real-time rendering and physically accurate simulations required for enterprise digital twins. Together, they provide a scalable cloud environment to build, deploy, and interact with applications for industrial digital twins and robotics simulation.
The G4 VM offers a profound leap in performance, with up to 9x the throughput of G2 instances, enabling a step-change in results across a wide spectrum of workloads, from multi-modal AI inference and photorealistic design and visualization to robotics simulation using applications developed on NVIDIA Omniverse. The G4 currently comes in configurations of 1, 2, 4, and 8 NVIDIA RTX PRO 6000 Blackwell GPUs, with fractional GPU options coming soon.
Here are some of the ways you can use G4 to innovate and accelerate your business:
AI training, fine-tuning, and inference
NVIDIA Omniverse and simulation
AI-driven rendering, graphics and virtual workstations
Modern generative AI models often exceed the VRAM of a single GPU, requiring multi-GPU configurations to serve these workloads. While this approach is common, performance can be bottlenecked by the communication speed between GPUs. We significantly boosted multi-GPU performance on G4 VMs by implementing an enhanced PCIe-based peer-to-peer (P2P) data path that optimizes critical collective operations like All-Reduce, which is essential for splitting models across GPUs. Thanks to the G4’s enhanced peer-to-peer capabilities, you can expect up to 168% higher throughput and 41% lower inter-token latency when using tensor parallelism for model serving, compared to standard non-P2P offerings.
For your generative AI applications, this technical differentiation translates into:
Faster user experience: Lower latency means quicker responses from your AI services, enabling more interactive and real-time applications.
Higher scalability: Increased throughput allows you to serve more concurrent users from a single virtual machine, significantly improving the price-performance and scalability of your service.
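To make the tensor-parallelism pattern above concrete, here is a minimal NumPy sketch that simulates it on the CPU: each “GPU” holds a row slice of a weight matrix and computes a partial matrix product, and an All-Reduce (here, a plain sum across shards) reconstructs the full output. On real hardware, that sum is the collective that traverses the P2P data path. The shapes and shard count are illustrative, not representative of production models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: input dim, output dim, number of simulated GPUs.
d_in, d_out, n_gpus = 8, 4, 2
W = rng.standard_normal((d_in, d_out))
x = rng.standard_normal((1, d_in))

# Row-parallel split: each simulated GPU holds a slice of W's input rows
# and the matching slice of the input activation.
W_shards = np.split(W, n_gpus, axis=0)
x_shards = np.split(x, n_gpus, axis=1)

# Each "GPU" computes its partial product locally...
partials = [xs @ ws for xs, ws in zip(x_shards, W_shards)]

# ...and an All-Reduce (an element-wise sum across GPUs) produces the
# full layer output. This is the communication step that P2P accelerates.
y_parallel = np.sum(partials, axis=0)

# Sanity check: the sharded result matches the single-device computation.
y_single = x @ W
assert np.allclose(y_parallel, y_single)
```

The faster each All-Reduce completes, the lower the per-token latency in serving, which is why optimizing this one collective yields the throughput and latency gains described above.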
G4 VMs are fully integrated with several Google Cloud services, accelerating your AI workloads from day one.
Google Kubernetes Engine (GKE): G4 GPUs are generally available through GKE. Since GKE recently extended Autopilot to all qualifying clusters, including GKE Standard clusters, you can benefit from GKE’s container-optimized compute platform to rapidly scale your G4 GPUs, enabling you to optimize costs. By adding the GKE Inference Gateway, you can stretch the benefits of G4 even further to achieve lower AI serving latency and higher throughput.
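As a sketch of what requesting a G4 GPU from GKE looks like, the Python dict below builds a Pod manifest (of the kind you would serialize to YAML and apply to a cluster) that asks for one GPU via the standard `nvidia.com/gpu` resource and a GKE accelerator node selector. The accelerator label value and container image here are illustrative assumptions, not verified names; consult the GKE documentation for the exact G4 accelerator type.

```python
import json

# Minimal sketch of a GKE Pod spec requesting a single G4 GPU.
# The accelerator label value ("nvidia-rtx-pro-6000") and the container
# image are placeholder assumptions for illustration only.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "g4-inference"},
    "spec": {
        "nodeSelector": {
            # Steers the Pod onto a node pool with the matching accelerator.
            "cloud.google.com/gke-accelerator": "nvidia-rtx-pro-6000",
        },
        "containers": [
            {
                "name": "server",
                "image": "us-docker.pkg.dev/my-project/serving/llm:latest",
                "resources": {
                    # The GPU request; GKE handles driver installation
                    # on GPU node pools.
                    "limits": {"nvidia.com/gpu": "1"},
                },
            }
        ],
    },
}

print(json.dumps(pod_manifest, indent=2))
```

The same manifest pattern extends to the 2-, 4-, and 8-GPU G4 shapes by raising the `nvidia.com/gpu` limit, with GKE Autopilot provisioning matching capacity on demand.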
Vertex AI: Both inference and training benefit significantly from G4’s large GPU memory (96 GB per GPU, 768 GB total), native FP4 precision support, and global presence.
Dataproc: G4 VMs are fully supported on the Dataproc managed analytics platform, letting you accelerate large-scale Spark and Hadoop workloads. This enables data scientists and data engineers to significantly boost performance for machine learning and large-scale data processing workloads.
Cloud Run: We’ve extended our serverless platform’s AI infrastructure options to include the NVIDIA RTX PRO 6000, so you can perform real-time AI inference with your preferred LLMs or media rendering using fully managed, simple, pay-per-use GPUs.
Hyperdisk ML, Managed Lustre, and Cloud Storage: When you need to expand beyond local storage for your HPC and large-scale AI/ML workloads, you can connect G4 to a variety of Google Cloud storage services. For low latency and up to 500K IOPS per instance, Hyperdisk ML is a great option. For high-performance file storage in the same zone, Managed Lustre offers a parallel file system ideal for persistent storage, with up to 1 TB/s of throughput. Finally, if you need nearly unlimited global capacity, with powerful capabilities like Anywhere Cache for use cases like inference, choose Cloud Storage as your primary, highly available, and globally scalable storage platform for training datasets, model artifacts, and feature stores.
Here’s how customers are using G4 to innovate and accelerate within their businesses:
“The combination of NVIDIA Omniverse on Google Cloud G4 VMs is the true engine for our creative transformation. It empowers our teams to compress weeks of traditional production into hours, allowing us to instantly generate photorealistic 3D advertising environments at a global scale while ensuring pixel-perfect brand compliance—a capability that redefines speed and personalization in digital marketing.” – Perry Nightingale, SVP Creative AI, WPP
“We’re excited to bring the power of Google Cloud G4 VMs into Altair One, so you can run your most demanding simulation and fluid dynamics workloads with the speed, scale, and visual fidelity needed to push innovation further.” – Yeshwant Mummaneni, Chief Engineer – Analytics, HPC, IoT & Digital Twin, Altair
Choosing Google Cloud means selecting a platform engineered for tangible results. The new G4 VM is a prime example, with our custom P2P interconnect unlocking up to 168% more throughput from the underlying NVIDIA RTX PRO 6000 Blackwell GPUs. This focus on optimized performance extends across our comprehensive portfolio; the G4 perfectly complements our existing A-Series and G2 GPUs, ensuring you have the ideal infrastructure for any workload. Beyond raw performance, we deliver turnkey solutions to accelerate your time to value. With NVIDIA Omniverse now available on the Google Cloud Marketplace, you can immediately deploy enterprise-grade digital twin and simulation applications on a fully managed and scalable platform.
G4 capacity is immediately available. To get started, simply select G4 VMs from the Google Cloud console. NVIDIA Omniverse and Isaac Sim are qualified Google Cloud Marketplace solutions that can draw down on your Google Cloud commitments; for more information, please contact your Google Cloud sales team or reseller.