Building RAG Systems with Transformers

This post is divided into five parts:
• Understanding the RAG architecture
• Building the Document Indexing System
• Implementing the Retrieval System
• Implementing the Generator
• Building the Complete RAG System

A RAG system consists of two main components:
• Retriever: Responsible for finding relevant documents or passages from a knowledge base given …

Build an AI-powered document processing platform with open source NER model and LLM on Amazon SageMaker

Archival data in research institutions and national laboratories represents a vast repository of historical knowledge, yet much of it remains inaccessible due to factors like limited metadata and inconsistent labeling. Traditional keyword-based search mechanisms are often insufficient for locating relevant documents efficiently, requiring extensive manual review to extract meaningful insights. To address these challenges, a …

Going from requirements to prototype with Gemini Code Assist

Imagine this common scenario: you have a detailed product requirements document for your next project. Instead of reading the whole document and manually starting to code (or defining test cases or API specifications) to implement the required functions, you want to see how AI can shorten your path from the requirements document to a working …

Engineering a robot that can jump 10 feet high — without legs

Inspired by the movements of a tiny parasitic worm, engineers have created a 5-inch soft robot that can jump as high as a basketball hoop. Their device, a silicone rod with a carbon-fiber spine, can leap 10 feet high even though it doesn’t have legs. The researchers made it after watching high-speed video of nematodes …

Touch meets tech: AI brings tactile textures to 3D-printed objects

Essential for many industries ranging from Hollywood computer-generated imagery to product design, 3D modeling tools often use text or image prompts to dictate different aspects of visual appearance, like color and form. As much as this makes sense as a first point of contact, these systems are still limited in their realism due to their …

Supercharge your LLM performance with Amazon SageMaker Large Model Inference container v15

Today, we’re excited to announce the launch of Amazon SageMaker Large Model Inference (LMI) container v15, powered by vLLM 0.8.4 with support for the vLLM V1 engine. This version now supports the latest open-source models, such as Meta’s Llama 4 models Scout and Maverick, Google’s Gemma 3, Alibaba’s Qwen, Mistral AI, DeepSeek-R, and many more. …

Google Cloud Database and LangChain integrations now support Go, Java, and JavaScript

Last year, Google Cloud and LangChain announced integrations that give generative AI developers access to a suite of LangChain Python packages. This allowed application developers to leverage Google Cloud’s database portfolio in their gen AI applications to drive the most value from their private data. Today, we are expanding language support for our integrations to …