Categories: FAANG

Designing Data: Proactive Data Collection and Iteration for Machine Learning

Lack of diversity in data collection has caused significant failures in machine learning (ML) applications. While ML developers perform post-collection interventions, these are time intensive and rarely comprehensive. Thus, new methods to track and manage data collection, iteration, and model training are necessary for evaluating whether datasets reflect real world variability. We present designing data, an iterative, bias mitigating approach to data collection connecting HCI concepts with ML techniques. Our process includes (1) Pre-Collection Planning, to reflexively prompt and document…
AI Generated Robotic Content

Recent Posts

Flux Krea Dev is hands down the best model on the planet right now

I started with trying to recreate SD3 style glitches but ended up discovering this is…

14 hours ago

Building a Transformer Model for Language Translation

This post is divided into six parts; they are: • Why Transformer is Better than…

14 hours ago

Peacock Feathers Are Stunning. They Can Also Emit Laser Beams

Scientists hope their plumage project could someday lead to biocompatible lasers that could safely be…

15 hours ago

Pirate VFX Breakdown | Made almost exclusively with SDXL and Wan!

In the past weeks, I've been tweaking Wan to get really good at video inpainting.…

2 days ago

Try Deep Think in the Gemini app

Deep Think utilizes extended, parallel thinking and novel reinforcement learning techniques for significantly improved problem-solving.

2 days ago

Introducing Amazon Bedrock AgentCore Browser Tool

At AWS Summit New York City 2025, Amazon Web Services (AWS) announced the preview of…

2 days ago