The language models behind ChatGPT and other generative AI tools are trained on written words culled from libraries, scraped from websites and social media, and pulled from news reports and speech transcripts from around the world. GPT-3.5, the model that powers ChatGPT, was trained on 250 billion such words, for instance, and its successor, GPT-4, has already arrived.