Configure Amazon Q Business with AWS IAM Identity Center trusted identity propagation

Amazon Q Business is a fully managed, permission-aware generative artificial intelligence (AI)-powered assistant built with enterprise-grade security and privacy features. Amazon Q Business can be configured to answer questions, provide summaries, generate content, and securely complete tasks based on your enterprise data. The native data source connectors provided by Amazon Q Business can …

Designing Generative AI Solutions: Key Lessons Learned

Generative AI is transforming the way we interact with technology. At Google Cloud, our Applied AI Engineering team has shaped the design and development of generative AI solutions for Generative AI for Marketing, Customer Experience Modernization, and Open Data QnA (NL2SQL), among others. Throughout this process, we’ve gained valuable insights that we believe can help …

Apple Intelligence Foundation Language Models

We present foundation language models developed to power Apple Intelligence features, including a ∼3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute. These models are designed to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architecture, …

Java 21 Virtual Threads – Dude, Where’s My Lock?

Getting real with virtual threads. By Vadim Filanovsky, Mike Huang, Danny Thomas and Martin Chalupa. Netflix has an extensive history of using Java as our primary programming language across our vast fleet of microservices. As we pick up newer versions of Java, our JVM Ecosystem team seeks out new language features that can improve the ergonomics …

Build generative AI–powered Salesforce applications with Amazon Bedrock

This post is co-authored by Daryl Martis and Darvish Shadravan from Salesforce. This is the fourth post in a series discussing the integration of Salesforce Data Cloud and Amazon SageMaker. In Part 1 and Part 2, we show how Salesforce Data Cloud and Einstein Studio integration with SageMaker allows businesses to access their Salesforce data …

Hugging Face Offers Developers Inference-as-a-Service Powered by NVIDIA NIM

One of the world’s largest AI communities — comprising 4 million developers on the Hugging Face platform — is gaining easy access to NVIDIA-accelerated inference on some of the most popular AI models. New inference-as-a-service capabilities will enable developers to rapidly deploy leading large language models such as the Llama 3 family and Mistral AI …

Hex-LLM: High-efficiency large language model serving on TPUs in Vertex AI Model Garden

With Vertex AI Model Garden, Google Cloud strives to deliver highly efficient and cost-optimized ML workflow recipes. Currently, it offers a selection of more than 150 first-party, open, and third-party foundation models. Last year, we introduced the popular open source LLM serving stack vLLM on GPUs in Vertex Model Garden. Since then, we have witnessed …

Amazon SageMaker inference launches faster auto scaling for generative AI models

Today, we are excited to announce a new capability in Amazon SageMaker inference that can help you reduce the time it takes for your generative artificial intelligence (AI) models to scale automatically. You can now use sub-minute metrics and significantly reduce overall scaling latency for generative AI models. With this enhancement, you can improve the …

Leverage enterprise data with Denodo and Vertex AI for generative AI applications

Leveraging enterprise data for generative AI and large language models (LLMs) presents significant challenges related to data silos, quality inconsistencies, privacy and security concerns, compliance with data regulations, capturing domain-specific knowledge, and mitigating inherent biases. Organizations must navigate the complexities of consolidating fragmented data sources, ensuring data integrity, and addressing ethical considerations. Techniques like retrieval …