Categories: AI/ML News

With encouragement, large language models devise more efficient prompts

The prompt is one of the principal factors in how efficiently a large language model (LLM) performs a task.

