faang

PEFT fine tuning of Llama 3 on SageMaker HyperPod with AWS Trainium

Training large language models (LLMs) has become a significant expense for businesses. For many use cases, companies are looking…

1 year ago

Using transcription confidence scores to improve slot filling in Amazon Lex

When building voice-enabled chatbots with Amazon Lex, one of the biggest challenges is accurately capturing user speech input for slot…

1 year ago

Can “Safe AI” Companies Survive in an Unrestrained AI Landscape?

TL;DR A conversation with 4o about the potential demise of companies like Anthropic. As artificial intelligence (AI) continues to advance,…

1 year ago

AI Systems Governance through the Palantir Platform

Editor’s note: This is the second post in a series that explores a range of topics about upcoming AI regulation,…

1 year ago

Introducing Configurable Metaflow

David J. Berg*, David Casler^, Romain Cledat*, Qian Huang*, Rui Lin*, Nissan Pow*, Nurcan Sonmez*, Shashank Srikanth*, Chaoying Wang*, Regina…

1 year ago

Add a generative AI experience to your website or web application with Amazon Q embedded

Generative AI offers many benefits for both you, as a software provider, and your end-users. AI assistants can help users…

1 year ago

Find sensitive data faster (but safely) with Google Distributed Cloud’s gen AI search solution

Today, generative AI is giving organizations new ways to process and analyze data, discover hidden insights, increase productivity and build…

1 year ago

Accelerating LLM Inference on NVIDIA GPUs with ReDrafter

Accelerating LLM inference is an important ML research problem, as auto-regressive token generation is computationally expensive and relatively slow, and…

1 year ago

How Fastweb fine-tuned the Mistral model using Amazon SageMaker HyperPod as a first step to build an Italian large language model

This post is co-written with Marta Cavalleri and Giovanni Germani from Fastweb, and Claudia Sacco and Andrea Policarpi from BIP…

1 year ago

Optimizing RAG retrieval: Test, tune, succeed

Retrieval-augmented generation (RAG) supercharges large language models (LLMs) by connecting them to real-time, proprietary, and specialized data. This helps LLMs…

1 year ago