Categories: FAANG

Introducing three new NVIDIA GPU-based Amazon EC2 instances

Amazon Elastic Compute Cloud (Amazon EC2) accelerated computing portfolio offers the broadest choice of accelerators to power your artificial intelligence (AI), machine learning (ML), graphics, and high performance computing (HPC) workloads. We are excited to announce the expansion of this portfolio with three new instances featuring the latest NVIDIA GPUs: Amazon EC2 P5e instances powered by NVIDIA H200 GPUs, Amazon EC2 G6 instances featuring NVIDIA L4 GPUs, and Amazon EC2 G6e instances powered by NVIDIA L40S GPUs. All three instances will be available in 2024, and we look forward to seeing what you can do with them.

AWS and NVIDIA have collaborated for over 13 years and have pioneered large-scale, highly performant, and cost-effective GPU-based solutions for developers and enterprise across the spectrum. We have combined NVIDIA’s powerful GPUs with differentiated AWS technologies such as AWS Nitro System, 3,200 Gbps of Elastic Fabric Adapter (EFA) v2 networking, hundreds of GB/s of data throughput with Amazon FSx for Lustre, and exascale computing with Amazon EC2 UltraClusters to deliver the most performant infrastructure for AI/ML, graphics, and HPC. Coupled with other managed services such as Amazon Bedrock, Amazon SageMaker, and Amazon Elastic Kubernetes Service (Amazon EKS), these instances provide developers with the industry’s best platform for building and deploying generative AI, HPC, and graphics applications.

High-performance and cost-effective GPU-based instances for AI, HPC, and graphics workloads

To power the development, training, and inference of the largest large language models (LLMs), EC2 P5e instances will feature NVIDIA’s latest H200 GPUs, which offer 141 GBs of HBM3e GPU memory, which is 1.7 times larger and 1.4 times faster than H100 GPUs. This boost in GPU memory along with up to 3200 Gbps of EFA networking enabled by AWS Nitro System will enable you to continue to build, train, and deploy your cutting-edge models on AWS.

EC2 G6e instances, featuring NVIDIA L40S GPUs, are built to provide developers with a broadly available option for training and inference of publicly available LLMs, as well as support the increasing adoption of Small Language Models (SLM). They are also optimal for digital twin applications that use NVIDIA Omniverse for describing and simulating across 3D tools and applications, and for creating virtual worlds and advanced workflows for industrial digitalization.

EC2 G6 instances, featuring NVIDIA L4 GPUs, will deliver a lower-cost, energy-efficient solution for deploying ML models for natural language processing, language translation, video and image analysis, speech recognition, and personalization as well as graphics workloads, such as creating and rendering real-time, cinematic-quality graphics and game streaming.


About the Author

Chetan Kapoor is the Director of Product Management for the Amazon EC2 Accelerated Computing Portfolio.

AI Generated Robotic Content

Recent Posts

We may have a new SOTA open-source model: ERNIE-Image Comparisons

Base model is definitely SOTA, can even easily compete with closed-source ones in terms of…

46 mins ago

Navigating the generative AI journey: The Path-to-Value framework from AWS

Generative AI is reshaping how organizations approach productivity, customer experiences, and operational capabilities. Across industries,…

46 mins ago

The Surprising MacBook Neo Competitor You’ve Never Heard Of

In many ways, the HP OmniBook 5 is a better budget laptop than the MacBook…

2 hours ago

Tiny cameras in earbuds let users talk with AI about what they see

University of Washington researchers developed the first system that incorporates tiny cameras in off-the-shelf wireless…

2 hours ago

Update: Distilled v1.1 is live

We've pushed an LTX-2.3 update today. The Distilled model has been retrained (now v1.1) with…

1 day ago

How to Implement Tool Calling with Gemma 4 and Python

The open-weights model ecosystem shifted recently with the release of the

1 day ago