Categories: FAANG

Projected Language Models: A Large Model Pre-Segmented Into Smaller Ones

This paper has been accepted at the Foundation Models in the Wild workshop at ICML 2024.
Large language models are versatile tools but are not suitable for small inference budgets. Small models offer more efficient inference, but their lower capacity means they perform well only when their scope is limited to a specialized domain. This paper explores how to obtain a small language model with good specialized accuracy, even when the specialization data is unknown during pretraining. We propose a novel architecture, projected networks (PN). PN is a high-capacity network whose parameters…

Speculative Streaming: Fast LLM Inference Without Auxiliary Models

This paper was accepted at the Efficient Natural Language and Speech Processing (ENLSP) workshop at NeurIPS 2024. Speculative decoding is a prominent technique to speed up the inference of a large target language model based on predictions of an auxiliary draft model. While effective, in application-specific settings, it often involves…

October 30, 2024


Regularized Training of Nearest Neighbor Language Models

Including memory banks in a natural language processing architecture increases model capacity by equipping it with additional data at inference time. In this paper, we build upon kNN-LM, which uses a pre-trained language model together with an exhaustive kNN search through the training data (memory bank) to achieve state-of-the-art results.…

September 3, 2022


Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization

This paper was accepted at the Efficient Natural Language and Speech Processing (ENLSP) Workshop at NeurIPS 2024. The pre-training phase of language models often begins with randomly initialized parameters. With the current trends in scaling models, training their large number of parameters can be extremely slow and costly. In contrast,…

November 13, 2024



Published by

AI Generated Robotic Content

Tags: ai/ml, faang

2 years ago
