The large language models behind popular generative AI platforms like ChatGPT gave different answers to the same reasoning test and didn’t improve when given additional context, finds a new study by researchers at University College London.