Announcing Claude 3.7 Sonnet, Anthropic’s first hybrid reasoning model, is available on Vertex AI

Today, we’re announcing Claude 3.7 Sonnet, Anthropic’s most intelligent model to date and the first hybrid reasoning model on the market, is available in preview on Vertex AI Model Garden. Claude 3.7 Sonnet can produce quick responses or extended, step-by-step thinking that is made visible to the user. Claude 3.7 Sonnet includes improvements in coding, …

Everyone Has Given Up on AI Safety, Now What?

The End of the AI Safety Debate For years, a passionate contingent of researchers, ethicists, and policymakers warned about the potential dangers of unchecked artificial intelligence development. They argued about p(doom) probabilities, AI alignment strategies, and regulations that could prevent catastrophe. But as of now, that conversation has all but collapsed. The frontier AI companies—OpenAI, …

RKT LegacyFlow

How Rocket Companies modernized their data science solution on AWS

This post was written with Dian Xu and Joel Hawkins of Rocket Companies. Rocket Companies is a Detroit-based FinTech company with a mission to “Help Everyone Home”. With the current housing shortage and affordability concerns, Rocket simplifies the homeownership process through an intuitive and AI-driven experience. This comprehensive framework streamlines every step of the homeownership …

1 GcYLFfO.max 1000x1000 1

Optimizing image generation pipelines on Google Cloud: A practical guide

Generative AI diffusion models such as Stable Diffusion and Flux produce stunning visuals, empowering creators across various verticals with impressive image generation capabilities. However, generating high-quality images through sophisticated pipelines can be computationally demanding, even with powerful hardware like GPUs and TPUs, impacting both costs and time-to-result. The key challenge lies in optimizing the entire …

Evaluating Sample Utility for Data Selection by Mimicking Model Weights

Foundation models are trained on large-scale web-crawled datasets, which often contain noise, biases, and irrelevant information. This motivates the use of data selection techniques, which can be divided into model-free variants — relying on heuristic rules and downstream datasets — and model-based, e.g., using influence functions. The former can be expensive to design and risk …

Image1

Generate synthetic counterparty (CR) risk data with generative AI using Amazon Bedrock LLMs and RAG

Data is the lifeblood of modern applications, driving everything from application testing to machine learning (ML) model training and evaluation. As data demands continue to surge, the emergence of generative AI models presents an innovative solution. These large language models (LLMs), trained on expansive data corpora, possess the remarkable capability to generate new content across …

An SRE’s guide to optimizing ML systems with MLOps pipelines

Picture this: you’re an Site Reliability Engineer (SRE) responsible for the systems that power your company’s machine learning (ML) services. What do you do to ensure you have a reliable ML service, how do you know you’re doing it well, and how can you build strong systems to support these services?  As artificial intelligence (AI) …

signing still 960x509 1

It’s a Sign: AI Platform for Teaching American Sign Language Aims to Bridge Communication Gaps

American Sign Language is the third most prevalent language in the United States — but there are vastly fewer AI tools developed with ASL data than data representing the country’s most common languages, English and Spanish. NVIDIA, the American Society for Deaf Children and creative agency Hello Monday are helping close this gap with Signs, …

KV Prediction for Improved Time to First Token

Inference with transformer-based language models begins with a prompt processing step. In this step, the model generates the first output token and stores the KV cache needed for future generation steps. This prompt processing step can be computationally expensive, taking 10s of seconds or more for billion-parameter models on edge devices when prompt lengths or …