How to Fine-Tune a Local Mistral or Llama 3 Model on Your Own Dataset
Large language models (LLMs) like Mistral 7B and Llama 3 8B have reshaped the AI field, but their general-purpose training limits how well they perform in specialized domains. Fine-tuning a locally hosted copy on your own dataset lets you adapt one of these models to your domain's vocabulary and tasks without sending your data to a third-party service.
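Before any fine-tuning run, you need your data in a format the training tools can consume. Most local fine-tuning workflows accept JSONL, with one chat-style record per line. The sketch below is a minimal, hypothetical example of that preparation step: the example records, the `instruction`/`response` field names, and the `train.jsonl` filename are all assumptions you would replace with your own data and schema.

```python
import json

# Hypothetical example records; in practice, load your own dataset here.
examples = [
    {"instruction": "Summarize the ticket.",
     "response": "Customer reports a login failure."},
    {"instruction": "Classify the sentiment of: 'This is great!'",
     "response": "positive"},
]

def to_chat_record(example):
    """Convert one instruction/response pair into the chat-message
    layout that most fine-tuning tools accept."""
    return {
        "messages": [
            {"role": "user", "content": example["instruction"]},
            {"role": "assistant", "content": example["response"]},
        ]
    }

def write_jsonl(records, path):
    """Write one JSON object per line (the JSONL convention)."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(to_chat_record(rec), ensure_ascii=False) + "\n")

write_jsonl(examples, "train.jsonl")
```

Keeping this conversion in its own function makes it easy to swap in a different template later, since tools differ in the exact record layout they expect.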
Let’s get started.