A Unifying Theory of Distance from Calibration

We study the fundamental question of how to define and measure the distance from calibration for probabilistic predictors. While the notion of perfect calibration is well-understood, there is no consensus on how to quantify the distance from perfect calibration. Numerous calibration measures have been proposed in the literature, but it is unclear how they compare to each other, and many popular measures such as Expected Calibration Error (ECE) fail to satisfy basic properties like continuity.
We present a rigorous framework for analyzing calibration measures, inspired by the literature on…
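The continuity failure mentioned above can be seen on a two-point toy example. The sketch below is an illustration only, not code from the paper: it assumes ECE is defined as the average of |E[y | f] - f| over a finite, uniformly weighted sample, grouping points by their exact predicted value (no binning). The function name `ece` and the toy data are hypothetical.

```python
from collections import defaultdict


def ece(predictions, labels):
    """ECE on a finite, uniformly weighted sample.

    Assumption: points are grouped by their exact predicted value,
    so each group's label mean plays the role of E[y | f].
    """
    groups = defaultdict(list)
    for p, y in zip(predictions, labels):
        groups[p].append(y)

    n = len(predictions)
    total = 0.0
    for p, ys in groups.items():
        mean_y = sum(ys) / len(ys)              # empirical E[y | f = p]
        total += (len(ys) / n) * abs(mean_y - p)
    return total


# Two points, one with label 0 and one with label 1.
labels = [0, 1]

# Predicting 1/2 on both points is perfectly calibrated.
calibrated = [0.5, 0.5]

# Moving each prediction by eps separates the two points into
# their own groups, so each group now looks badly miscalibrated.
eps = 1e-3
perturbed = [0.5 - eps, 0.5 + eps]

print(ece(calibrated, labels))  # 0.0
print(ece(perturbed, labels))   # close to 0.5 - eps
```

Under these assumptions, an arbitrarily small change in the predictions moves ECE from 0 to nearly 1/2, which is the kind of discontinuity the abstract refers to.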

The Calibration Generalization Gap

This paper was accepted at the Workshop on Distribution-Free Uncertainty Quantification at ICML 2022. Calibration is a fundamental property of a good predictive model: it requires that the model predicts correctly in proportion to its confidence. Modern neural networks, however, provide no strong guarantees on their calibration, and can be…

October 19, 2022

When Does Optimizing a Proper Loss Yield Calibration?

Optimizing proper loss functions is popularly believed to yield predictors with good calibration properties; the intuition being that for such losses, the global optimum is to predict the ground-truth probabilities, which is indeed calibrated. However, typical machine learning models are trained to approximately minimize loss over restricted families of predictors,…

October 3, 2023

Trained on Tokens, Calibrated on Concepts: The Emergence of Semantic Calibration in LLMs

Large Language Models (LLMs) often lack meaningful confidence estimates for their outputs. While base LLMs are known to exhibit next-token calibration, it remains unclear whether they can assess confidence in the actual meaning of their responses beyond the token level. We find that, when using a certain sampling-based notion of…

March 24, 2026
