Categories: FAANG

Language Models Improve When Pretraining Data Matches Target Tasks

Every data selection method inherently has a target. In practice, these targets often emerge implicitly through benchmark-driven iteration: researchers develop selection strategies, train models, measure benchmark performance, then refine accordingly. This raises a natural question: what happens when we make this optimization explicit? To explore this, we propose benchmark-targeted ranking (BETR), a simple method that selects pretraining documents based on similarity to benchmark training examples. BETR embeds benchmark examples and a sample of pretraining documents in a shared space, scores…

TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining

This paper was accepted to the ACL 2025 main conference as an oral presentation. This paper was accepted at the Scalable Continual Learning for Lifelong Foundation Models (SCLLFM) Workshop at NeurIPS 2024. Large Language Models (LLMs) trained on historical web data inevitably become outdated. We investigate evaluation strategies and update…

June 26, 2025

In "FAANG"

Promoting Cross-Modal Representations to Improve Multimodal Foundation Models for Physiological Signals

Many healthcare applications are inherently multimodal, involving several physiological signals. As sensors for these signals become more common, improving machine learning methods for multimodal healthcare data is crucial. Pretraining foundation models is a promising avenue for success. However, methods for developing foundation models in healthcare are still in early exploration…

October 29, 2024

In "FAANG"

As AI Grows More Complex, Model Builders Rely on NVIDIA

Unveiling what it describes as the most capable model series yet for professional knowledge work, OpenAI launched GPT-5.2 today. The model was trained and deployed on NVIDIA infrastructure, including NVIDIA Hopper and GB200 NVL72 systems. It’s the latest example of how leading AI builders train and deploy at scale on…

December 13, 2025

In "FAANG"

AI Generated Robotic Content

Next From Linear Regression to XGBoost: A Side-by-Side Performance Comparison »

Previous « Build real-time travel recommendations using AI agents on Amazon Bedrock

Published by

AI Generated Robotic Content

Tags: ai/mlfaang

11 months ago

Nava – A 6.3B audio-video model .

Page: https://ernie-research.github.io/NAVA/ Model: https://huggingface.co/ernie-research/NAVA Github: https://github.com/ernie-research/NAVA NAVA is a 6.3 B-parameter joint audio-video generator that…

3 hours ago