SO-Bench: A Structural Output Evaluation of Multimodal LLMs

Multimodal large language models (MLLMs) are increasingly deployed in real-world, agentic settings where outputs must not only be correct, but also conform to predefined data schemas. Despite recent progress in structured generation in the textual domain, there is still no benchmark that systematically evaluates schema-grounded information extraction and reasoning over visual inputs. In this work, we …

Using MCP with Web3: How to secure agents making blockchain transactions

At Google Cloud, we sit at a unique intersection of two transformative technologies: AI and Web3. The rise of AI agents capable of interacting with blockchains opens up a world of automated financial strategies, fast payments, and more advanced scenarios such as executing complex DeFi operations and bridging assets across multiple chains. However, the practical viability …

AI denial is becoming an enterprise risk: Why dismissing “slop” obscures real capability gains

Three years ago, ChatGPT was born. It amazed the world and ignited unprecedented investment and excitement in AI. Today, ChatGPT is still a toddler, but public sentiment around the AI boom has turned sharply negative. The shift began when OpenAI released GPT-5 this summer to mixed reviews, mostly from casual users who, unsurprisingly, judged the …

Today I made a Realtime Lora Trainer for Z-image/Wan/Flux Dev

Basically you pass it images with a load image node and it trains a LoRA on the fly, using your local install of AI-Toolkit, and then proceeds with the image generation. You just paste in the folder location for AI-Toolkit (Windows or Linux), and it saves the setting. This train took about 5 mins on …

Building Trust at Scale

The Next Generation of Audit Logging at Palantir. Every day, organizations entrust Palantir platforms with their most sensitive data and critical operations. From government agencies coordinating national security missions to healthcare providers safeguarding patient information to financial institutions detecting fraud, our customers depend on us to help them make decisions that matter. This trust isn’t given …

AV1 — Now Powering 30% of Netflix Streaming

By Liwei Guo, Zhi Li, Sheldon Radford, Jeff Watts. Streaming video has become an integral part of our daily lives. At Netflix, our top priority is delivering the best possible entertainment experience to our members, regardless of their devices or network conditions. One of the key technologies enabling this is AV1, …

Accelerate model downloads on GKE with NVIDIA Run:ai Model Streamer

As large language models (LLMs) continue to grow in size and complexity, the time it takes to load them from storage to accelerator memory for inference can become a significant bottleneck. This “cold start” problem isn’t just a minor delay — it’s a critical barrier to building resilient, scalable, and cost-effective AI services. Every minute …