NVIDIA to Present Innovations at Hot Chips That Boost Data Center Performance and Energy Efficiency

A deep technology conference for processor and system architects from industry and academia has become a key forum for the trillion-dollar data center computing market.

At Hot Chips 2024 next week, senior NVIDIA engineers will present the latest advancements powering the NVIDIA Blackwell platform, plus research on liquid cooling for data centers and AI agents for chip design.

They’ll share how:

  • NVIDIA Blackwell brings together multiple chips, systems and NVIDIA CUDA software to power the next generation of AI across use cases, industries and countries.
  • NVIDIA GB200 NVL72 — a multi-node, liquid-cooled, rack-scale solution that connects 72 Blackwell GPUs and 36 Grace CPUs — raises the bar for AI system design.
  • NVLink interconnect technology provides all-to-all GPU communication, enabling record high throughput and low-latency inference for generative AI.
  • The NVIDIA Quasar Quantization System pushes the limits of physics to accelerate AI computing.
  • NVIDIA researchers are building AI models that help build processors for AI.

An NVIDIA Blackwell talk, taking place Monday, Aug. 26, will also spotlight new architectural details and examples of generative AI models running on Blackwell silicon.

It’s preceded by three tutorials on Sunday, Aug. 25, that will cover how hybrid liquid-cooling solutions can help data centers transition to more energy-efficient infrastructure and how AI models, including large language model (LLM)-powered agents, can help engineers design the next generation of processors.

Together, these presentations showcase the ways NVIDIA engineers are innovating across every area of data center computing and design to deliver unprecedented performance, efficiency and optimization.

Be Ready for Blackwell

NVIDIA Blackwell is the ultimate full-stack computing challenge. It comprises multiple NVIDIA chips, including the Blackwell GPU, Grace CPU, BlueField data processing unit, ConnectX network interface card, NVLink Switch, Spectrum Ethernet switch and Quantum InfiniBand switch.

Ajay Tirumala and Raymond Wong, directors of architecture at NVIDIA, will provide a first look at the platform and explain how these technologies work together to deliver a new standard for AI and accelerated computing performance while advancing energy efficiency.

The multi-node NVIDIA GB200 NVL72 solution is a perfect example. LLM inference requires low-latency, high-throughput token generation. GB200 NVL72 acts as a unified system to deliver up to 30x faster inference for LLM workloads, unlocking the ability to run trillion-parameter models in real time.

Tirumala and Wong will also discuss how the NVIDIA Quasar Quantization System — which brings together algorithmic innovations, NVIDIA software libraries and tools, and Blackwell’s second-generation Transformer Engine — supports high accuracy on low-precision models, highlighting examples using LLMs and visual generative AI.
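To give a sense of why low-precision models need care to stay accurate, here is a minimal sketch of block-scaled integer quantization, a generic textbook technique. It is not the Quasar Quantization System, whose internals this article does not describe; the function names, the 4-bit width and the block size are all illustrative assumptions.

```python
from typing import List, Tuple

Block = Tuple[float, List[int]]  # (scale, integer codes) for one block


def quantize_block(values: List[float], bits: int = 4) -> Block:
    """Symmetric quantization of one block to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1            # 7 for 4-bit codes
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak > 0 else 1.0
    # Round to the nearest code and clamp into the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return scale, codes


def quantize(values: List[float], block_size: int = 4, bits: int = 4) -> List[Block]:
    """Split values into blocks; each block gets its own scale."""
    return [
        quantize_block(values[i:i + block_size], bits)
        for i in range(0, len(values), block_size)
    ]


def dequantize(blocks: List[Block]) -> List[float]:
    """Recover approximate values by multiplying codes by the block scale."""
    restored: List[float] = []
    for scale, codes in blocks:
        restored.extend(c * scale for c in codes)
    return restored


# Hypothetical weights: one block of small values, one block of large ones.
weights = [0.02, -0.01, 0.03, 0.015, 5.0, -3.0, 1.0, 0.5]
blocks = quantize(weights, block_size=4, bits=4)
restored = dequantize(blocks)
```

With one scale per block, the small values in the first block keep nonzero codes. Quantizing all eight values with a single shared scale (`block_size=8`) would round the first weight to zero, because the large values would dictate the step size.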

Keeping Data Centers Cool

The traditional hum of air-cooled data centers may become a relic of the past as researchers develop more efficient and sustainable solutions that use hybrid cooling, a combination of air and liquid cooling.

Liquid-cooling techniques move heat away from systems more efficiently than air, making it easier for computing systems to stay cool even while processing large workloads. The equipment for liquid cooling also takes up less space and consumes less power than air-cooling systems, allowing data centers to add more server racks — and therefore more compute power — in their facilities.
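The efficiency gap follows from basic heat transfer: the heat a coolant carries away equals its density times its volumetric flow times its specific heat times its temperature rise. The short calculation below is a back-of-the-envelope sketch using rounded textbook properties for water and air. The 100 kW heat load and 10 K coolant temperature rise are illustrative assumptions, not figures from the talk.

```python
# Rounded room-temperature fluid properties (textbook approximations).
WATER = {"density_kg_m3": 1000.0, "specific_heat_j_kg_k": 4186.0}
AIR = {"density_kg_m3": 1.2, "specific_heat_j_kg_k": 1005.0}


def required_flow_m3_s(heat_w: float, delta_t_k: float, fluid: dict) -> float:
    """Volumetric flow needed to remove heat_w watts.

    Rearranges Q = density * flow * specific_heat * delta_T for flow.
    """
    return heat_w / (
        fluid["density_kg_m3"] * fluid["specific_heat_j_kg_k"] * delta_t_k
    )


# Hypothetical rack: 100 kW of heat, coolant warms by 10 K passing through.
HEAT_LOAD_W = 100_000.0
DELTA_T_K = 10.0

water_flow = required_flow_m3_s(HEAT_LOAD_W, DELTA_T_K, WATER)
air_flow = required_flow_m3_s(HEAT_LOAD_W, DELTA_T_K, AIR)

print(f"water: {water_flow * 1000:.2f} L/s")
print(f"air:   {air_flow:.2f} m^3/s")
print(f"air needs about {air_flow / water_flow:.0f}x the volume of water")
```

Under these assumptions, water needs a few liters per second where air needs several cubic meters per second, which is why liquid loops can serve denser racks with smaller equipment.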

Ali Heydari, director of data center cooling and infrastructure at NVIDIA, will present several designs for hybrid-cooled data centers.

Some designs retrofit existing air-cooled data centers with liquid-cooling units, offering a quick and easy way to add liquid-cooling capabilities to existing racks. Other designs require either installing piping for direct-to-chip liquid cooling using cooling distribution units, or entirely submerging servers in immersion cooling tanks. Although these options demand a larger upfront investment, they lead to substantial savings in both energy consumption and operational costs.

Heydari will also share his team’s work as part of COOLERCHIPS, a U.S. Department of Energy program to develop advanced data center cooling technologies. As part of the project, the team is using the NVIDIA Omniverse platform to create physics-informed digital twins that will help them model energy consumption and cooling efficiency to optimize their data center designs.

AI Agents Chip In for Processor Design

Semiconductor design is a mammoth challenge at microscopic scale. Engineers developing cutting-edge processors work to fit as much computing power as they can onto a piece of silicon a few inches across, testing the limits of what’s physically possible.

AI models are supporting their work by improving design quality and productivity, boosting the efficiency of manual processes and automating some time-consuming tasks. The models include prediction and optimization tools to help engineers rapidly analyze and improve designs, as well as LLMs that can assist engineers with answering questions, generating code, debugging design problems and more.

Mark Ren, director of design automation research at NVIDIA, will provide an overview of these models and their uses in a tutorial. In a second session, he’ll focus on agent-based AI systems for chip design.

AI agents powered by LLMs can be directed to complete tasks autonomously, unlocking broad applications across industries. In microprocessor design, NVIDIA researchers are developing agent-based systems that can reason and take action using customized circuit design tools, interact with experienced designers, and learn from a database of human and agent experiences.

NVIDIA experts aren’t just building this technology — they’re using it. Ren will share examples of how engineers can use AI agents for timing report analysis, cell cluster optimization processes and code generation. The cell cluster optimization work recently won best paper at the first IEEE International Workshop on LLM-Aided Design.

Register for Hot Chips, taking place Aug. 25-27 at Stanford University and online.
