Uncertainty in Machine Learning: Probability & Noise
Editor’s note: This article is a part of our series on visualizing the foundations of machine learning.
When I first started reading machine learning research papers, I honestly thought something was wrong with me.
Embeddings, vector-based numerical representations of typically unstructured data such as text, were popularized primarily in the field of natural language processing (NLP).
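As a concrete illustration (a sketch of my own, not taken from the article), the snippet below turns two sentences into dense vectors and compares them with cosine similarity. The sentence-transformers library and the all-MiniLM-L6-v2 model are assumptions here, chosen only because they are a common, lightweight starting point.

```python
# Minimal sketch: turning raw text into dense embedding vectors.
# sentence-transformers and the all-MiniLM-L6-v2 model are illustrative choices.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "Embeddings map text to vectors.",
    "Vector representations enable similarity search.",
]
vectors = model.encode(sentences)  # shape (2, 384) for this model

# Cosine similarity between the two sentence vectors.
a, b = vectors
print(float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))))
```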
Large language models like LLaMA, Mistral, and Qwen have billions of parameters that demand a lot of memory and compute power.
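To put "a lot of memory" in concrete terms, here is a quick back-of-envelope calculation (my own illustration, not a figure from the article): storing the weights of a 7-billion-parameter model in 16-bit floats takes roughly 14 GB, before counting activations, optimizer state, or the KV cache.

```python
# Back-of-envelope estimate of weight memory (assumes 1 GB = 1e9 bytes).
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    return num_params * bytes_per_param / 1e9

for n_billion in (7, 13, 70):
    params = n_billion * 1e9
    print(f"{n_billion}B params: "
          f"fp16 ~ {weight_memory_gb(params, 2):.0f} GB, "
          f"int4 ~ {weight_memory_gb(params, 0.5):.1f} GB")
```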
Most ChatGPT users don’t know this, but when the model searches the web for current information or runs Python code to analyze data, it’s using tool calling.
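For readers who have not seen the mechanism directly, below is a minimal sketch of tool calling through the OpenAI Chat Completions API. The tool name get_weather, its schema, and the model name are hypothetical placeholders; the point is only that the model returns a structured call (a function name plus JSON arguments) rather than plain text when it decides a tool is needed.

```python
# Minimal sketch of tool calling with the OpenAI Chat Completions API.
# The tool name, its schema, and the model name are hypothetical placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the model chose to call the tool, it returns the call instead of text;
# your code executes the tool and sends the result back in a follow-up message.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, call.function.arguments)
```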
This article is divided into four parts; they are:
• The Reason for Fine-tuning a Model
• Dataset for Fine-tuning
• Fine-tuning Procedure
• Other Fine-Tuning Techniques
Once you train your decoder-only transformer model, you have a text generator.
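To illustrate what "you have a text generator" means in practice, here is a minimal autoregressive generation sketch using Hugging Face Transformers. GPT-2 stands in for the article's own decoder-only model, so treat this as an assumed example rather than the article's setup.

```python
# Minimal sketch of using a trained decoder-only model as a text generator.
# GPT-2 stands in for the article's own model; greedy decoding for simplicity.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Fine-tuning a language model"
inputs = tokenizer(prompt, return_tensors="pt")

# Autoregressive generation: the model repeatedly predicts the next token
# and appends it to the context until max_new_tokens is reached.
output_ids = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```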
The agentic AI field is moving from experimental prototypes to production-ready autonomous systems.
This article is divided into five parts; they are:
• An Example of Tensor Parallelism
• Setting Up Tensor Parallelism
• Preparing Model for Tensor Parallelism
• Train a Model with Tensor Parallelism
• Combining Tensor Parallelism with FSDP
Tensor parallelism originated from the Megatron-LM paper.
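As a standalone illustration of the core idea (my own single-machine sketch, not the distributed setup used in the article), the snippet below splits a linear layer's weight matrix column-wise across two shards, standing in for two GPUs, and shows that concatenating the partial outputs reproduces the full matrix multiply. This is the trick Megatron-LM applies inside transformer layers; real deployments distribute the shards with torch.distributed.

```python
# Illustrative sketch of column-wise tensor parallelism on a single machine.
# Two weight shards stand in for two GPUs; real setups use torch.distributed.
import torch

torch.manual_seed(0)
x = torch.randn(4, 8)        # batch of 4 inputs, hidden size 8
W = torch.randn(8, 16)       # full weight matrix of a linear layer

# Split the weight column-wise into two shards, one per "device".
W0, W1 = W.chunk(2, dim=1)   # each shard is 8 x 8

# Each device computes its partial output with its own shard...
y0 = x @ W0
y1 = x @ W1

# ...and concatenating the partial outputs recovers the full result.
y_parallel = torch.cat([y0, y1], dim=1)
y_full = x @ W
print(torch.allclose(y_parallel, y_full, atol=1e-6))  # True
```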