Divide-or-Conquer? Which Part Should You Distill Your LLM?

Recent methods have demonstrated that Large Language Models (LLMs) can solve reasoning tasks better when they are encouraged to solve subtasks of the main task first. In this paper we devise a similar strategy that breaks down reasoning tasks into a problem decomposition phase and a problem solving phase and show that the strategy is …

How Planview built a scalable AI Assistant for portfolio and project management using Amazon Bedrock

This post is co-written with Lee Rehwinkel from Planview. Businesses today face numerous challenges in managing intricate projects and programs, deriving valuable insights from massive data volumes, and making timely decisions. These hurdles frequently lead to productivity bottlenecks for program managers and executives, hindering their ability to drive organizational success efficiently. Planview, a leading provider …

AI Hypercomputer software updates: Faster training and inference, a new resource hub, and more

The potential of AI has never been greater, and infrastructure plays a foundational role in driving it forward. AI Hypercomputer is our supercomputing architecture based on performance-optimized hardware, open software, and flexible consumption models. Together, these offer exceptional performance and efficiency, resiliency at scale, and give you the flexibility to choose offerings at each layer …

Combining Machine Learning and Homomorphic Encryption in the Apple Ecosystem

At Apple, we believe privacy is a fundamental human right. Our work to protect user privacy is informed by a set of privacy principles, and one of those principles is to prioritize using on-device processing. By performing computations locally on a user’s device, we help minimize the amount of data that is shared with Apple …

Super charge your LLMs with RAG at scale using AWS Glue for Apache Spark

Large language models (LLMs) are very large deep-learning models that are pre-trained on vast amounts of data. LLMs are incredibly flexible. One model can perform completely different tasks such as answering questions, summarizing documents, translating languages, and completing sentences. LLMs have the potential to revolutionize content creation and the way people use search engines and …

Adapting model risk management for financial institutions in the generative AI era

Generative AI (gen AI) promises to usher in an era of transformation for quality, accessibility, efficiency, and compliance in the financial services industry. As with any new technology, it also introduces new complexities and risks. Striking a balance between harnessing its potential and mitigating its risks will be crucial for the adoption of gen AI …

CtrlSynth: Controllable Image-Text Synthesis for Data-Efficient Multimodal Learning

Pretraining robust vision or multimodal foundation models (e.g., CLIP) relies on large-scale datasets that may be noisy, potentially misaligned, and have long-tail distributions. Previous works have shown promising results in augmenting datasets by generating synthetic samples. However, they only support domain-specific ad hoc use cases (e.g., either image or text only, but not both), and …

Unlocking generative AI for enterprises: How SnapLogic powers their low-code Agent Creator using Amazon Bedrock

This post is cowritten with Greg Benson, Aaron Kesler and David Dellsperger from SnapLogic. The landscape of enterprise application development is undergoing a seismic shift with the advent of generative AI. SnapLogic, a leader in generative integration and automation, has introduced the industry’s first low-code generative AI development platform, Agent Creator, designed to democratize AI …

Save on GPUs: Smarter autoscaling for your GKE inferencing workloads

While LLM models deliver immense value for an increasing number of use cases, running LLM inference workloads can be costly. If you’re taking advantage of the latest open models and infrastructure, autoscaling can help you optimize your costs — ensuring you’re meeting customer demand while only paying for the AI accelerators you need. As a …