Run your AI inference applications on Cloud Run with NVIDIA GPUs

Developers love Cloud Run for its simplicity, fast autoscaling, scale-to-zero capabilities, and pay-per-use pricing. Those same benefits come into play for real-time inference apps serving open gen AI models. That’s why today, we’re adding support for NVIDIA L4 GPUs to Cloud Run, in preview. This opens the door to many new use cases to Cloud …

Lightweight Champ: NVIDIA Releases Small Language Model With State-of-the-Art Accuracy

Developers of generative AI typically face a tradeoff between model size and accuracy. But a new language model released by NVIDIA delivers the best of both, providing state-of-the-art accuracy in a compact form factor. Mistral-NeMo-Minitron 8B — a miniaturized version of the open Mistral NeMo 12B model released by Mistral AI and NVIDIA last month …

Migrate Amazon SageMaker Data Wrangler flows to Amazon SageMaker Canvas for faster data preparation

Amazon SageMaker Data Wrangler provides a visual interface to streamline and accelerate data preparation for machine learning (ML), which is often the most time-consuming and tedious task in ML projects. Amazon SageMaker Canvas is a low-code no-code visual interface to build and deploy ML models without the need to write code. Based on customers’ feedback, …

Can You Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?

Self-supervised features are typically used in place of filter-bank features in speaker verification models. However, these models were originally designed to ingest filter-banks as inputs, and thus, training them on self-supervised features assumes that both feature types require the same amount of learning for the task. In this work, we observe that pre-trained self-supervised speech …

Cohere Rerank 3 Nimble now generally available on Amazon SageMaker JumpStart

The Cohere Rerank 3 Nimble foundation model (FM) is now generally available in Amazon SageMaker JumpStart. This model is the newest FM in Cohere’s Rerank model series, built to enhance enterprise search and Retrieval Augmented Generation (RAG) systems. In this post, we discuss the benefits and capabilities of this new model with some examples. Overview …

RepCNN: Micro-Sized, Mighty Models for Wakeword Detection

Always-on machine learning models require a very low memory and compute footprint. Their restricted parameter count limits the model’s capacity to learn, and the effectiveness of the usual training algorithms to find the best parameters. Here we show that a small convolutional model can be better trained by first refactoring its computation into a larger …

Perform generative AI-powered data prep and no-code ML over any size of data using Amazon SageMaker Canvas

Amazon SageMaker Canvas now empowers enterprises to harness the full potential of their data by enabling support of petabyte-scale datasets. Starting today, you can interactively prepare large datasets, create end-to-end data flows, and invoke automated machine learning (AutoML) experiments on petabytes of data—a substantial leap from the previous 5 GB limit. With over 50 connectors, …

Save up to $400 on Your Chatbot Conference Tickets!

For a limited time, you can save up to $400 on tickets to the Chatbot Conference 2024. Whether you’re a returning attendee or new to our community, this is the perfect chance to experience the future of AI and chatbot technology at a discounted rate. Here’s what you can expect: Insightful Keynotes: Hear from pioneers and …

Delight your customers with great conversational experiences via QnABot, a generative AI chatbot

QnABot on AWS (an AWS Solution) now provides access to Amazon Bedrock foundational models (FMs) and Knowledge Bases for Amazon Bedrock, a fully managed end-to-end Retrieval Augmented Generation (RAG) workflow. You can now provide contextual information from your private data sources that can be used to create rich, contextual, conversational experiences. The advent of generative …