Alibaba’s ZeroSearch method uses simulated search results to slash LLM training costs
A team of AI researchers at the Alibaba Group’s Tongyi Lab, has debuted a new approach to training LLMs; one that costs much less than those now currently in use. Their paper is posted on the arXiv preprint server.
New research shows that AI doesn’t need endless training data to start acting more like a human brain. When researchers redesigned AI systems to better resemble biological brains, some models produced brain-like activity without any training at all. This challenges today’s data-hungry approach to AI development. The work suggests smarter…
Mixture-of-Experts (MoE) models are crucial for scaling model capacity while controlling inference costs. While integrating MoE into multimodal models like CLIP improves performance, training these models is notoriously challenging and expensive. We propose CLIP-Upcycling (CLIP-UP), an efficient alternative training strategy that converts a pre-trained dense CLIP model into a sparse…
This paper has been accepted at the Data Problems for Foundation Models workshop at ICLR 2024. Large language models are trained on massive scrapes of the web, which are often unstructured, noisy, and poorly phrased. Current scaling laws show that learning from such data requires an abundance of both compute…