
AI scaling laws: Universal guide estimates how LLMs will perform based on smaller models in same family

When researchers build large language models (LLMs), they aim to maximize performance under a particular computational and financial budget. Since training a model can cost millions of dollars, developers need to be judicious about cost-impacting decisions, such as the choice of model architecture, optimizers, and training datasets, before committing to a model.

