AI/ML Techniques

Effective KV Compression with TurboQuant

TurboQuant has recently been launched by Google as a novel algorithmic suite and library for applying advanced quantization and compression…

3 days ago

Building AI Agents with Local Small Language Models

The idea of building your own AI agent used to feel like something only big tech companies could pull off.

1 week ago

Train, Serve, and Deploy a Scikit-learn Model with FastAPI

FastAPI has become one of the most popular ways to serve machine learning models because it is lightweight, fast, and…

2 weeks ago

AI Agent Memory Explained in 3 Levels of Difficulty

A stateless AI agent has no memory of previous calls.

2 weeks ago

Getting Started with Zero-Shot Text Classification

Zero-shot text classification is a way to label text without first training a classifier on your own task-specific dataset.

2 weeks ago

Gradient-based Planning for World Models at Longer Horizons

GRASP is a new gradient-based planner for learned dynamics (a “world model”) that makes long-horizon planning practical by (1) lifting…

2 weeks ago

The Complete Guide to Inference Caching in LLMs

Calling a large language model API at scale is expensive and slow.

2 weeks ago