An SRE’s guide to optimizing ML systems with MLOps pipelines

Picture this: you’re a Site Reliability Engineer (SRE) responsible for the systems that power your company’s machine learning (ML) services. What do you do to ensure you have a reliable ML service, how do you know you’re doing it well, and how can you build strong systems to support these services? As artificial intelligence (AI) …

It’s a Sign: AI Platform for Teaching American Sign Language Aims to Bridge Communication Gaps

American Sign Language is the third most prevalent language in the United States — but there are vastly fewer AI tools developed with ASL data than data representing the country’s most common languages, English and Spanish. NVIDIA, the American Society for Deaf Children and creative agency Hello Monday are helping close this gap with Signs, …

KV Prediction for Improved Time to First Token

Inference with transformer-based language models begins with a prompt processing step. In this step, the model generates the first output token and stores the KV cache needed for future generation steps. This prompt processing step can be computationally expensive, taking tens of seconds or more for billion-parameter models on edge devices when prompt lengths or …

Build verifiable explainability into financial services workflows with Automated Reasoning checks for Amazon Bedrock Guardrails

Foundational models (FMs) and generative AI are transforming how financial service institutions (FSIs) operate their core business functions. AWS FSI customers, including NASDAQ, State Bank of India, and Bridgewater, have used FMs to reimagine their business operations and deliver improved outcomes. FMs are probabilistic in nature and produce a range of outcomes. Though these models …

Rethinking 5G: The cloud imperative

The telecommunications industry is at a critical juncture. The demands of 5G, the explosion of connected devices, and the ever-increasing complexity of network architectures require a fundamental shift in how networks are managed and operated.  The future is autonomous — autonomous networks driving efficiency and innovation  The future isn’t just about scale and performance; it’s …

Massive Foundation Model for Biomolecular Sciences Now Available via NVIDIA BioNeMo

Scientists everywhere can now access Evo 2, a powerful new foundation model that understands the genetic code for all domains of life. Unveiled today as the largest publicly available AI model for genomic data, it was built on the NVIDIA DGX Cloud platform in a collaboration led by nonprofit biomedical research organization Arc Institute and …

How Formula 1® uses generative AI to accelerate race-day issue resolution

Formula 1® (F1) races are high-stakes affairs where operational efficiency is paramount. During these live events, F1 IT engineers must triage critical issues across its services, such as network degradation to one of its APIs. This impacts downstream services that consume data from the API, including products such as F1 TV, which offer live and …

How to use gen AI for better data schema handling, data quality, and data generation

In the realm of data engineering, generative AI models are quietly revolutionizing how we handle, process, and ultimately utilize data. For example, large language models (LLMs) can help with data schema handling, data quality, and even data generation. Building upon the recently released Gemini in BigQuery Data preparation capabilities, this blog showcases areas where gen …

Using Amazon Rekognition to improve bicycle safety

Cycling is a fun way to stay fit, enjoy nature, and connect with friends and acquaintances. However, riding is becoming increasingly dangerous, especially in situations where cyclists and cars share the road. According to the NHTSA, in the United States an average of 883 people on bicycles are killed in traffic crashes, with an average …

Transfer Learning in Scalable Graph Neural Network for Improved Physical Simulation

In recent years, graph neural network (GNN) based models have shown promising results in simulating complex physical systems. However, training a dedicated graph network simulator can be costly, as most models are confined to fully supervised training. Extensive data generated from traditional simulators is required to train the model. It remained unexplored how transfer learning could be …