Mean Estimation with User-level Privacy under Data Heterogeneity

A key challenge in many modern data analysis tasks is that user data is heterogeneous. Different users may possess vastly different numbers of data points. More importantly, it cannot be assumed that all users sample from the same underlying distribution. This is true, for example, in language data, where different speech styles result in data heterogeneity. In this work we propose a simple model of heterogeneous user data that differs in both distribution and quantity of data, and we provide a method for estimating the population-level mean while preserving user-level differential privacy. We…
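
To make the setting concrete, here is a minimal sketch of a generic textbook baseline for user-level private mean estimation. It is not the method proposed in the work above, which the excerpt does not describe. The function name `private_mean_of_user_means`, the clipping range `[lo, hi]`, and the choice of the Laplace mechanism are all assumptions made for illustration. Each user's data is collapsed to a single clipped mean, so replacing one user's entire dataset moves the average by at most `(hi - lo) / n`, and Laplace noise calibrated to that sensitivity gives epsilon-differential privacy at the user level (treating the number of users `n` as public).

```python
import random


def clip(x, lo, hi):
    """Clamp x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def private_mean_of_user_means(user_data, lo, hi, epsilon, rng=random):
    """Hypothetical baseline: noisy average of clipped per-user means.

    user_data: list of non-empty lists of floats, one inner list per user.
    lo, hi:    assumed public bounds used to clip each user's mean.
    epsilon:   privacy parameter (user-level, replace-one-user neighbors).
    rng:       the random module or a random.Random instance.
    """
    if not user_data or any(len(xs) == 0 for xs in user_data):
        raise ValueError("need at least one user, each with at least one point")
    if epsilon <= 0 or hi <= lo:
        raise ValueError("require epsilon > 0 and hi > lo")

    n = len(user_data)
    # One number per user, regardless of how many points the user holds.
    user_means = [clip(sum(xs) / len(xs), lo, hi) for xs in user_data]
    avg = sum(user_means) / n

    # Swapping one user's whole dataset changes avg by at most (hi - lo) / n.
    sensitivity = (hi - lo) / n
    scale = sensitivity / epsilon

    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with that scale.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return avg + noise
```

Note that this baseline weights every user equally, whether they hold one data point or thousands, and it targets the average of user means rather than any particular population-level quantity. How to weight users with different amounts of data and different distributions is exactly the kind of question that heterogeneity makes nontrivial.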
AI Generated Robotic Content
