research

Training a Model with Limited Memory using Mixed Precision and Gradient Checkpointing

This article is divided into three parts; they are: • Floating-point Numbers • Automatic Mixed Precision Training • Gradient Checkpointing…

1 month ago

Practical Agentic Coding with Google Jules

If you have an interest in agentic coding, there's a pretty good chance you've heard of…

1 month ago

Evaluating Perplexity on Language Models

This article is divided into two parts; they are: • What Is Perplexity and How to Compute It • Evaluate…

1 month ago

3 Smart Ways to Encode Categorical Features for Machine Learning

If you spend any time working with real-world data, you quickly realize that not everything comes in neat, clean numbers.

2 months ago

Pretraining a Llama Model on Your Local GPU

This article is divided into three parts; they are: • Training a Tokenizer with Special Tokens • Preparing the Training…

2 months ago

Rotary Position Embeddings for Long Context Length

This article is divided into two parts; they are: • Simple RoPE • RoPE for Long Context Length Compared to…

2 months ago

5 Agentic Coding Tips & Tricks

Agentic coding only feels "smart" when it ships correct diffs, passes tests, and leaves a paper trail you can trust.

2 months ago

How to Fine-Tune a Local Mistral or Llama 3 Model on Your Own Dataset

Large language models (LLMs) like Mistral 7B and Llama 3 8B have shaken the AI field, but their broad nature…

2 months ago

Top 5 Vector Databases for High-Performance LLM Applications

Building AI applications often requires searching through millions of documents, finding similar items in massive catalogs, or retrieving relevant context…

2 months ago

Transformer vs LSTM for Time Series: Which Works Better?

From daily weather measurements and traffic sensor readings to stock prices, time series data are present nearly everywhere.

2 months ago