Apple Intelligence Foundation Language Models

We present foundation language models developed to power Apple Intelligence features, including a ∼3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute. These models are designed to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architecture, …

Java 21 Virtual Threads – Dude, Where’s My Lock?

Getting real with virtual threads. By Vadim Filanovsky, Mike Huang, Danny Thomas and Martin Chalupa. Intro: Netflix has an extensive history of using Java as our primary programming language across our vast fleet of microservices. As we pick up newer versions of Java, our JVM Ecosystem team seeks out new language features that can improve the ergonomics …

Build generative AI–powered Salesforce applications with Amazon Bedrock

This post is co-authored by Daryl Martis and Darvish Shadravan from Salesforce. This is the fourth post in a series discussing the integration of Salesforce Data Cloud and Amazon SageMaker. In Part 1 and Part 2, we show how Salesforce Data Cloud and Einstein Studio integration with SageMaker allows businesses to access their Salesforce data …

Hugging Face Offers Developers Inference-as-a-Service Powered by NVIDIA NIM

One of the world’s largest AI communities — comprising 4 million developers on the Hugging Face platform — is gaining easy access to NVIDIA-accelerated inference on some of the most popular AI models. New inference-as-a-service capabilities will enable developers to rapidly deploy leading large language models such as the Llama 3 family and Mistral AI …

Hex-LLM: High-efficiency large language model serving on TPUs in Vertex AI Model Garden

With Vertex AI Model Garden, Google Cloud strives to deliver highly efficient and cost-optimized ML workflow recipes. Currently, it offers a selection of more than 150 first-party, open and third-party foundation models. Last year, we introduced the popular open source LLM serving stack vLLM on GPUs, in Vertex Model Garden. Since then, we have witnessed …

Amazon SageMaker inference launches faster auto scaling for generative AI models

Today, we are excited to announce a new capability in Amazon SageMaker inference that can help you reduce the time it takes for your generative artificial intelligence (AI) models to scale automatically. You can now use sub-minute metrics and significantly reduce overall scaling latency for generative AI models. With this enhancement, you can improve the …

Leverage enterprise data with Denodo and Vertex AI for generative AI applications

Leveraging enterprise data for generative AI and large language models (LLMs) presents significant challenges related to data silos, quality inconsistencies, privacy and security concerns, compliance with data regulations, capturing domain-specific knowledge, and mitigating inherent biases. Organizations must navigate the complexities of consolidating fragmented data sources, ensuring data integrity, and addressing ethical considerations. Techniques like retrieval …

Federated Learning With Differential Privacy for End-to-End Speech Recognition

While federated learning (FL) has recently emerged as a promising approach to train machine learning models, it is limited to only preliminary explorations in the domain of automatic speech recognition (ASR). Moreover, FL does not inherently guarantee user privacy and requires the use of differential privacy (DP) for robust privacy guarantees. However, we …

Mistral Large 2 is now available in Amazon Bedrock

Mistral AI’s Mistral Large 2 (24.07) foundation model (FM) is now generally available in Amazon Bedrock. Mistral Large 2 is the newest version of Mistral Large and, according to Mistral AI, offers significant improvements across multilingual capabilities, math, reasoning, coding, and much more. In this post, we discuss the benefits and capabilities of this new …