Reasoning and planning are the bedrock of intelligent AI systems, enabling them to interact, adapt, and ultimately operate independently. At Apple, understanding and advancing reasoning capabilities in AI systems has long been an area of active research, resulting in numerous publications that both explore new techniques to advance the frontier of reasoning and further the field’s understanding of the capabilities (and limitations) of current approaches. Last year, Apple hosted the Workshop on Reasoning and Planning, bringing together Apple researchers and members of the…
While server-side Large Language Models (LLMs) demonstrate proficiency in tool integration and complex reasoning, deploying Small Language Models (SLMs) directly on devices offers latency and privacy benefits but also introduces unique challenges for accuracy and memory. We introduce CAMPHOR, an innovative on-device SLM multi-agent framework designed to handle…
Recent advancements in Large Language Models (LLMs) have sparked interest in their formal reasoning capabilities, particularly in mathematics. The GSM8K benchmark is widely used to assess the mathematical reasoning of models on grade-school-level questions. While the performance of LLMs on GSM8K has significantly improved in recent years, it remains unclear…
The new frontier in large language models is the ability to “reason” their way through problems. New research from Apple suggests that ability is not all it's cracked up to be.