Improve AI assistant response accuracy using Knowledge Bases for Amazon Bedrock and a reranking model

AI chatbots and virtual assistants have become increasingly popular in recent years thanks to the breakthroughs of large language models (LLMs). Trained on a large volume of datasets, these models incorporate memory components in their architectural design, allowing them to understand and comprehend textual context. Most common use cases for chatbot assistants focus on a few …

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

Large language models are trained on massive scrapes of the web, which are often unstructured, noisy, and poorly phrased. Current scaling laws show that learning from such data requires an abundance of both compute and data, which grows with the size of the model being trained. This is infeasible both because of the large compute …

Build custom generative AI applications powered by Amazon Bedrock

With last month’s blog, I started a series of posts that highlight the key factors that are driving customers to choose Amazon Bedrock. I explored how Bedrock enables customers to build a secure, compliant foundation for generative AI applications. Now I’d like to turn to a slightly more technical, but equally important differentiator for Bedrock—the …

Announcing LangChain on Vertex AI for AlloyDB and Cloud SQL for PostgreSQL

Among application developers, LangChain is one of the most popular open-source LLM orchestration frameworks. To help developers use LangChain to create context-aware gen AI applications with Google Cloud databases, in March we open-sourced LangChain integrations for all of our Google Cloud databases including Vector stores, Document loaders, and Chat message history. And now, we’re excited …

Can Chatbots Reduce Business Costs Dramatically? Expert Opinion and Insights

The clock strikes 3 AM. A potential buyer is browsing your website, eager to make a purchase, but they have a question. Your support team is fast asleep, and the sale slips away. Sound familiar? Today’s consumers expect instant gratification, and companies are feeling the pressure — both on their resources and their bottom line. Meeting these …

BISCUIT: Scaffolding LLM-Generated Code with Ephemeral UIs in Computational Notebooks

This paper was accepted at IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) 2024. Programmers frequently engage with machine learning tutorials in computational notebooks and have been adopting code generation technologies based on large language models (LLMs). However, they encounter difficulties in understanding and working with code produced by LLMs. To mitigate these challenges, …

Investigation of a Cross-regional Network Performance Issue

By Hechao Li, Roger Cruz. Cloud Networking Topology: Netflix operates a highly efficient cloud computing infrastructure that supports a wide array of applications essential for our SVOD (Subscription Video on Demand), live streaming and gaming services. Utilizing Amazon AWS, our infrastructure is hosted across multiple geographic regions worldwide. This global distribution allows our applications to deliver content …

Build an end-to-end RAG solution using Knowledge Bases for Amazon Bedrock and AWS CloudFormation

Retrieval Augmented Generation (RAG) is a state-of-the-art approach to building question answering systems that combines the strengths of retrieval and foundation models (FMs). RAG models first retrieve relevant information from a large corpus of text and then use an FM to synthesize an answer based on the retrieved information. An end-to-end RAG solution involves several …