ProText: A Benchmark Dataset for Measuring (Mis)gendering in Long-Form Texts

We introduce ProText, a dataset for measuring gendering and misgendering in stylistically diverse long-form English texts. ProText spans three dimensions: Theme nouns (names, occupations, titles, kinship terms), Theme category (stereotypically male, stereotypically female, gender-neutral/non-gendered), and Pronoun category (masculine, feminine, gender-neutral, none). The dataset is designed to probe (mis)gendering in text transformations such as summarization and rewrites using state-of-the-art Large Language Models, extending beyond traditional pronoun resolution benchmarks and beyond the…