A Gentle Introduction to SHAP for Tree-Based Models
Machine learning models have become increasingly sophisticated, but this complexity often comes at the cost of interpretability.
Machine learning models have become increasingly sophisticated, but this complexity often comes at the cost of interpretability.
Quantization is a frequently used strategy applied to production machine learning models, particularly large and complex ones, to make them lightweight by reducing the numerical precision of the model’s parameters (weights) — usually from 32-bit floating-point to lower representations like 8-bit integers.
This post is divided into five parts; they are: • Naive Tokenization • Stemming and Lemmatization • Byte-Pair Encoding (BPE) • WordPiece • SentencePiece and Unigram The simplest form of tokenization splits text into tokens based on whitespace.
Machine learning model development often feels like navigating a maze, exciting but filled with twists, dead ends, and time sinks.
In machine learning model development, feature engineering plays a crucial role since real-world data often comes with noise, missing values, skewed distributions, and even inconsistent formats.
Learning machine learning can be challenging.
This article is divided into three parts; they are: • Full Transformer Models: Encoder-Decoder Architecture • Encoder-Only Models • Decoder-Only Models The original transformer architecture, introduced in “Attention is All You Need,” combines an encoder and decoder specifically designed for sequence-to-sequence (seq2seq) tasks like machine translation.
“I’m feeling blue today” versus “I painted the fence blue.
We have seen a new era of agentic IDEs like Windsurf and Cursor AI.
If you’ve been into machine learning for a while, you’ve probably noticed that the same books get recommended over and over again.