Reinforced Agent: Inference-Time Feedback for Tool-Calling Agents
This paper was accepted at the Fifth Workshop on Natural Language Generation, Evaluation, and Metrics at ACL 2026.

Tool-calling agents are evaluated on tool selection, parameter accuracy, and scope recognition, yet LLM trajectory assessments remain inherently post-hoc. Disconnected from the active execution loop, such assessments identify errors that are usually addressed through prompt-tuning or retraining, …