Categories: FAANG

Amazon Bedrock Marketplace now includes NVIDIA models: Introducing NVIDIA Nemotron-4 NIM microservices

ML 17896 image001

This post is co-written with Abhishek Sawarkar, Eliuth Triana, Jiahong Liu and Kshitiz Gupta from NVIDIA.

At AWS re:Invent 2024, we are excited to introduce Amazon Bedrock Marketplace. This a revolutionary new capability within Amazon Bedrock that serves as a centralized hub for discovering, testing, and implementing foundation models (FMs). It provides developers and organizations access to an extensive catalog of over 100 popular, emerging, and specialized FMs, complementing the existing selection of industry-leading models in Amazon Bedrock. Bedrock Marketplace enables model subscription and deployment through managed endpoints, all while maintaining the simplicity of the Amazon Bedrock unified APIs.

The NVIDIA Nemotron family, available as NVIDIA NIM microservices, offers a cutting-edge suite of language models now available through Amazon Bedrock Marketplace, marking a significant milestone in AI model accessibility and deployment.

In this post, we discuss the advantages and capabilities of the Bedrock Marketplace and Nemotron models, and how to get started.

About Amazon Bedrock Marketplace

Bedrock Marketplace plays a pivotal role in democratizing access to advanced AI capabilities through several key advantages:

Comprehensive model selection – Bedrock Marketplace offers an exceptional range of models, from proprietary to publicly available options, allowing organizations to find the perfect fit for their specific use cases.
Unified and secure experience – By providing a single access point for all models through the Amazon Bedrock APIs, Bedrock Marketplace significantly simplifies the integration process. Organizations can use these models securely, and for models that are compatible with the Amazon Bedrock Converse API, you can use the robust toolkit of Amazon Bedrock, including Amazon Bedrock Agents, Amazon Bedrock Knowledge Bases, Amazon Bedrock Guardrails, and Amazon Bedrock Flows.
Scalable infrastructure – Bedrock Marketplace offers configurable scalability through managed endpoints, allowing organizations to select their desired number of instances, choose appropriate instance types, define custom auto scaling policies that dynamically adjust to workload demands, and optimize costs while maintaining performance.

About the NVIDIA Nemotron model family

At the forefront of the NVIDIA Nemotron model family is Nemotron-4, as stated by NVIDIA, it is a powerful multilingual large language model (LLM) trained on an impressive 8 trillion text tokens, specifically optimized for English, multilingual, and coding tasks. Key capabilities include:

Synthetic data generation – Able to create high-quality, domain-specific training data at scale
Multilingual support – Trained on extensive text corpora, supporting multiple languages and tasks
High-performance inference – Optimized for efficient deployment on GPU-accelerated infrastructure
Versatile model sizes – Includes variants like the Nemotron-4 15B with 15 billion parameters
Open license – Offers a uniquely permissive open model license that gives enterprises a scalable way to generate and own synthetic data that can help build powerful LLMs

The Nemotron models offer transformative potential for AI developers by addressing critical challenges in AI development:

Data augmentation – Solve data scarcity problems by generating synthetic, high-quality training datasets
Cost-efficiency – Reduce manual data annotation costs and time-consuming data collection processes
Model training enhancement – Improve AI model performance through high-quality synthetic data generation
Flexible integration – Support seamless integration with existing AWS services and workflows, enabling developers to build sophisticated AI solutions more rapidly

These capabilities make Nemotron models particularly well-suited for organizations looking to accelerate their AI initiatives while maintaining high standards of performance and security.

Getting started with Bedrock Marketplace and Nemotron

To get started with Amazon Bedrock Marketplace, open the Amazon Bedrock console. From there, you can explore Bedrock Marketplace interface, which offers a comprehensive catalog of FMs from various providers. You can browse through the available options to discover different AI capabilities and specializations. This exploration will lead you to find NVIDIA’s model offerings, including Nemotron-4.

We walk you through these steps in the following sections.

Open Amazon Bedrock Marketplace

Navigating to Amazon Bedrock Marketplace is straightforward:

On the Amazon Bedrock console, choose Model catalog in the navigation pane.
Under Filters, select Bedrock Marketplace.

Upon entering Bedrock Marketplace, you’ll find a well-organized interface with various categories and filters to help you find the right model for your needs. You can browse by providers and modality.

Use the search function to quickly locate specific providers, and explore models cataloged in Bedrock Marketplace.

Deploy NVIDIA Nemotron models

After you’ve located NVIDIA’s model offerings in Bedrock Marketplace, you can narrow down to the Nemotron model. To subscribe to and deploy Nemotron-4, complete the following steps:

Filter by Nemotron under Providers or search by model name.
Choose from the available models, such as Nemotron-4 15B.

On the model details page, you can examine its specifications, capabilities, and pricing details. The Nemotron-4 model offers impressive multilingual and coding capabilities.

Choose View subscription options to subscribe to the model.
Review the available options and choose Subscribe.
Choose Deploy and follow the prompts to configure your deployment options, including instance types and scaling policies.

The process is user-friendly, allowing you to quickly integrate these powerful AI capabilities into your projects using the Amazon Bedrock APIs.

Conclusion

The launch of NVIDIA Nemotron models on Amazon Bedrock Marketplace marks a significant milestone in making advanced AI capabilities more accessible to developers and organizations. Nemotron-4 15B, with its impressive 15-billion-parameter architecture trained on 8 trillion text tokens, brings powerful multilingual and coding capabilities to the Amazon Bedrock.

Through Bedrock Marketplace, organizations can use Nemotron’s advanced capabilities while benefiting from the scalable infrastructure of AWS and NVIDIA’s robust technologies. We encourage you to start exploring the capabilities of NVIDIA Nemotron models today through Amazon Bedrock Marketplace, and experience firsthand how this powerful language model can transform your AI applications.

About the authors

James Park is a Solutions Architect at Amazon Web Services. He works with Amazon.com to design, build, and deploy technology solutions on AWS, and has a particular interest in AI and machine learning. In h is spare time he enjoys seeking out new cultures, new experiences, and staying up to date with the latest technology trends. You can find him on LinkedIn.

Saurabh Trikande is a Senior Product Manager for Amazon Bedrock and SageMaker Inference. He is passionate about working with customers and partners, motivated by the goal of democratizing AI. He focuses on core challenges related to deploying complex AI applications, inference with multi-tenant models, cost optimizations, and making the deployment of Generative AI models more accessible. In his spare time, Saurabh enjoys hiking, learning about innovative technologies, following TechCrunch, and spending time with his family.

Melanie Li, PhD, is a Senior Generative AI Specialist Solutions Architect at AWS based in Sydney, Australia, where her focus is on working with customers to build solutions leveraging state-of-the-art AI and machine learning tools. She has been actively involved in multiple Generative AI initiatives across APJ, harnessing the power of Large Language Models (LLMs). Prior to joining AWS, Dr. Li held data science roles in the financial and retail industries.

Marc Karp is an ML Architect with the Amazon SageMaker Service team. He focuses on helping customers design, deploy, and manage ML workloads at scale. In his spare time, he enjoys traveling and exploring new places.

Abhishek Sawarkar is a product manager in the NVIDIA AI Enterprise team working on integrating NVIDIA AI Software in Cloud MLOps platforms. He focuses on integrating the NVIDIA AI end-to-end stack within Cloud platforms & enhancing user experience on accelerated computing.

Eliuth Triana is a Developer Relations Manager at NVIDIA empowering Amazon’s AI MLOps, DevOps, Scientists and AWS technical experts to master the NVIDIA computing stack for accelerating and optimizing Generative AI Foundation models spanning from data curation, GPU training, model inference and production deployment on AWS GPU instances. In addition, Eliuth is a passionate mountain biker, skier, tennis and poker player.

Jiahong Liu is a Solutions Architect on the Cloud Service Provider team at NVIDIA. He assists clients in adopting machine learning and AI solutions that leverage NVIDIA-accelerated computing to address their training and inference challenges. In his leisure time, he enjoys origami, DIY projects, and playing basketball.

Kshitiz Gupta is a Solutions Architect at NVIDIA. He enjoys educating cloud customers about the GPU AI technologies NVIDIA has to offer and assisting them with accelerating their machine learning and deep learning applications. Outside of work, he enjoys running, hiking, and wildlife watching.

Speed up your AI inference workloads with new NVIDIA-powered capabilities in Amazon SageMaker

December 3, 2024

In "FAANG"

Mercury foundation models from Inception Labs are now available in Amazon Bedrock Marketplace and Amazon SageMaker JumpStart

August 28, 2025

In "FAANG"

Optimize price-performance of LLM inference on NVIDIA GPUs using the Amazon SageMaker integration with NVIDIA NIM Microservices

NVIDIA NIM microservices now integrate with Amazon SageMaker, allowing you to deploy industry-leading large language models (LLMs) and optimize model performance and cost. You can deploy state-of-the-art LLMs in minutes instead of days using technologies such as NVIDIA TensorRT, NVIDIA TensorRT-LLM, and NVIDIA Triton Inference Server on NVIDIA accelerated instances…

March 19, 2024

In "FAANG"

AI Generated Robotic Content