Retrieval-augmented generation (RAG) has reshaped how language models are built and used by combining the best of two worlds:
This post covers three main areas: • Why Mixture of Experts is Needed in Transformers • How Mixture of Experts…
Interested in leveraging a large language model (LLM) API locally on your machine using Python and not-too-overwhelming tools and frameworks? In…
This post is divided into three parts; they are: • Why Linear Layers and Activations are Needed in Transformers •…
This post is divided into five parts; they are: • Why Normalization is Needed in Transformers • LayerNorm and Its…
Machine learning practitioners spend countless hours on repetitive tasks: monitoring model performance, retraining pipelines, data quality checks, and experiment tracking.
This post is divided into four parts; they are: • Why Attention Masking is Needed • Implementation of Attention Masks…
Artificial intelligence (AI) is an umbrella discipline within computer science focused on building software systems capable of mimicking human or animal…
The intersection of traditional machine learning and modern representation learning is opening up new possibilities.
This post is divided into three parts; they are: • Low-Rank Approximation of Matrices • Multi-head Latent Attention (MLA) •…