This AI mines the numbers buried in scientific papers and turns them into usable data fast

Numbers are the language of science, yet in research articles they are often buried in the text and difficult to analyze. Researchers at Jülich have developed an AI system that automatically identifies these numbers, categorizes them, and converts them into structured data. The Quinex framework thus eliminates the need for time-consuming manual extraction.

MixAtlas: Uncertainty-aware Data Mixture Optimization for Multimodal LLM Midtraining

This paper was accepted at the Workshop on Navigating and Addressing Data Problems for Foundation Models (NADPFM) at ICLR 2026. Principled domain reweighting can substantially improve sample efficiency and downstream generalization; however, data-mixture optimization for multimodal pretraining remains underexplored. Current multimodal training recipes tune mixtures from only a single perspective such as data format or …

Cost-efficient custom text-to-SQL using Amazon Nova Micro and Amazon Bedrock on-demand inference

Text-to-SQL generation remains a persistent challenge in enterprise AI applications, particularly when working with custom SQL dialects or domain-specific database schemas. While foundation models (FMs) demonstrate strong performance on standard SQL, achieving production-grade accuracy for specialized dialects requires fine-tuning. However, fine-tuning introduces an operational trade-off: hosting custom models on persistent infrastructure incurs continuous costs, even during …

How WPP accelerates humanoid robot training 10x with G4 VMs

Editor’s note: Today we hear from Perry Nightingale, SVP of Creative AI at WPP, about the workflow that cuts training time for humanoid robots from days to minutes, plus access to the open-source code to do it yourself. Robots are pushing the boundaries of what content creators and directors can capture. These technologies have …