This post is co-written with Lee Rehwinkel from Planview.
Businesses today face numerous challenges in managing intricate projects and programs, deriving valuable insights from massive data volumes, and making timely decisions. These hurdles frequently lead to productivity bottlenecks for program managers and executives, hindering their ability to drive organizational success efficiently.
Planview, a leading provider of connected work management solutions, embarked on an ambitious plan in 2023 to revolutionize how 3 million global users interact with their project management applications. To realize this vision, Planview developed an AI assistant called Planview Copilot, using a multi-agent system powered by Amazon Bedrock.
Developing this multi-agent system posed several challenges:
To overcome these challenges, Planview developed a multi-agent architecture built using Amazon Bedrock. Amazon Bedrock is a fully managed service that provides API access to foundation models (FMs) from Amazon and other leading AI companies, allowing developers to choose the FM best suited for their use case. This approach is both architecturally and organizationally scalable, enabling Planview to rapidly develop and deploy new AI skills to meet the evolving needs of their customers.
This post focuses primarily on the first challenge: routing tasks and managing multiple agents in a generative AI architecture. We explore Planview’s approach to this challenge during the development of Planview Copilot, sharing insights into the design decisions that provide efficient and reliable task routing.
We describe customized home-grown agents in this post because this project was implemented before Amazon Bedrock Agents was generally available. However, Amazon Bedrock Agents is now the recommended solution for organizations looking to use AI-powered agents in their operations. Amazon Bedrock Agents can retain memory across interactions, offering more personalized and seamless user experiences. You benefit from improved recommendations and recall of prior context where required, for a more cohesive and efficient interaction with the agent. We share the learnings from our solution to help you understand how to use AWS technology to build solutions that meet your goals.
Planview’s multi-agent architecture consists of multiple generative AI components collaborating as a single system. At its core, an orchestrator is responsible for routing questions to various agents, collecting the learned information, and providing users with a synthesized response. The orchestrator is managed by a central development team, and the agents are managed by each application team.
The orchestrator comprises two main components, the router and the responder, each powered by a large language model (LLM). The router intelligently routes user questions to application agents with specialized capabilities. The agents can be categorized into three main types:
After the agents have processed the questions and provided their responses, the responder, also powered by an LLM, synthesizes the learned information and formulates a coherent response for the user. This architecture allows for seamless collaboration between the centralized orchestrator and the specialized agents, providing users with accurate and comprehensive answers to their questions. The following diagram illustrates the end-to-end workflow.
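To make the pattern concrete, the following is a minimal Python sketch of that workflow. It is not Planview's production code; the function names, the agent registry, and the stubbed LLM calls are illustrative assumptions that show only the route, fan out, and synthesize pattern described above.

```python
# Minimal sketch of the orchestrator flow: route, fan out to agents, synthesize.
# All names here (route_question, AGENTS, synthesize_response) are illustrative
# assumptions, not Planview's actual implementation.

def route_question(question: str, chat_history: list[str]) -> list[str]:
    """Ask the router LLM which agents (tools) should handle the question."""
    # In practice this calls an LLM with a prompt listing the available tools.
    return ["dataQuery"]  # e.g., the router selects the dataQuery tool

def synthesize_response(question: str, agent_answers: dict[str, str]) -> str:
    """Ask the responder LLM to merge the agent answers into one reply."""
    return " ".join(agent_answers.values())  # placeholder for an LLM call

AGENTS = {
    "dataQuery": lambda q: "You have 5 tasks assigned to you.",  # stub agent
}

def handle_user_question(question: str, chat_history: list[str]) -> str:
    selected = route_question(question, chat_history)
    answers = {name: AGENTS[name](question) for name in selected}
    return synthesize_response(question, answers)

print(handle_user_question("What tasks are assigned to me?", []))
```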
Planview used key AWS services to build its multi-agent architecture. The central Copilot service, powered by Amazon Elastic Kubernetes Service (Amazon EKS), is responsible for coordinating activities among the various services. Its responsibilities include:
The router and responder are AWS Lambda functions that interact with Amazon Bedrock. The router considers the user’s question and chat history from the central Copilot service, and the responder considers the user’s question, chat history, and responses from each agent.
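As a rough illustration of this pattern, a router Lambda function calling Amazon Bedrock might look like the following sketch. The event fields and prompt construction are assumptions rather than Planview's actual contract; the Bedrock Runtime call uses the standard boto3 invoke_model API with an Anthropic Claude model ID.

```python
import json
import boto3

# Bedrock Runtime client; the Lambda execution role needs bedrock:InvokeModel.
bedrock = boto3.client("bedrock-runtime")

MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"

def lambda_handler(event, context):
    # The event fields ("question", "chat_history") are assumed to be supplied
    # by the central Copilot service; the real contract may differ.
    prompt = (
        "Chat history:\n" + "\n".join(event.get("chat_history", [])) +
        "\n\nUser question:\n" + event["question"]
    )
    response = bedrock.invoke_model(
        modelId=MODEL_ID,
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
        }),
    )
    payload = json.loads(response["body"].read())
    # Claude message responses return a list of content blocks.
    return {"routerOutput": payload["content"][0]["text"]}
```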
Application teams manage their agents using Lambda functions that interact with Amazon Bedrock. For improved visibility, evaluation, and monitoring, Planview has adopted a centralized prompt repository service to store LLM prompts.
Agents can interact with applications using various methods depending on the use case and data availability:
The following diagram illustrates the generative AI assistant architecture on AWS.
The router and responder components work together to process user queries and generate appropriate responses. The following prompts provide illustrative router and responder prompt templates. Additional prompt engineering would be required to improve reliability for a production implementation.
First, the available tools are described, including their purpose and sample questions that can be asked of each tool. The example questions help guide the natural language interactions between the orchestrator and the available agents, as represented by tools.
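The original tool descriptions are not reproduced here; a description of this general shape, using the dataQuery tool referenced later in this post, might look like the following (the wording and sample questions are illustrative assumptions):

```
Tool: dataQuery
Purpose: Query and analyze work management data on behalf of the user.
Sample questions:
  - What tasks are assigned to me?
  - How many tasks are assigned to each user on my project?
```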
Next, the router prompt outlines the guidelines for the agent to either respond directly to user queries or request information through specific tools before formulating a response:
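Planview's actual router guidelines are likewise not reproduced here; an illustrative approximation of such a prompt might read as follows:

```
You are a router for a work management assistant. Given the user's question
and the chat history, decide whether to answer directly or to request
information from one or more of the tools described above. If tools are
needed, respond with a JSON object listing the tool invocations and the
input for each; otherwise, respond directly to the user.
```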
The following is a sample response from the router component that initiates the dataQuery tool to retrieve and analyze task assignments for each user:
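The original payload did not survive here; a router response of this general shape might look like the following (the field names are illustrative assumptions; only the dataQuery tool and its purpose come from the post):

```json
{
  "response_type": "tool_use",
  "tools": [
    {
      "name": "dataQuery",
      "input": "Retrieve task assignments per user and summarize the counts"
    }
  ]
}
```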
The following is a sample response from the responder component that uses the dataQuery tool to fetch information about the user’s assigned tasks. It reports that the user has five tasks assigned to them.
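An illustrative reconstruction of such a synthesized reply (the exact wording is an assumption):

```
You currently have 5 tasks assigned to you. Let me know if you would like
a breakdown by project or by due date.
```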
Evaluating and monitoring generative AI model performance is crucial in any AI system. Planview’s multi-agent architecture enables assessment at various component levels, providing comprehensive quality control despite the system’s complexity. Planview evaluates components at three levels:
The following figure illustrates the evaluation framework for prompts and scoring.
To conduct these evaluations, Planview uses a set of carefully crafted test questions that cover typical user queries and edge cases. These evaluations are performed during the development phase and continue in production to track the quality of responses over time. Currently, human evaluators play a crucial role in scoring responses. To aid in the evaluation, Planview has developed an internal evaluation tool to store the library of questions and track the responses over time.
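As a simple sketch of what such a harness could look like (every name here is an assumption; Planview's internal evaluation tool is not public):

```python
import json
import time
from pathlib import Path

# Hypothetical local store for the question library and scored responses.
STORE = Path("copilot_evals.jsonl")

TEST_QUESTIONS = [
    "What tasks are assigned to me?",      # typical user query
    "Assign all overdue tasks to nobody",  # edge case
]

def record_evaluation(question: str, response: str, score: int) -> None:
    """Append a human-assigned score (e.g., 1-5) for a response."""
    entry = {
        "timestamp": time.time(),
        "question": question,
        "response": response,
        "score": score,
    }
    with STORE.open("a") as f:
        f.write(json.dumps(entry) + "\n")

def average_score() -> float:
    """Track response quality over time by averaging stored scores."""
    entries = [json.loads(line) for line in STORE.open()]
    return sum(e["score"] for e in entries) / len(entries) if entries else 0.0
```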
To assess each component and determine the most suitable Amazon Bedrock model for a given task, Planview established the following prioritized evaluation criteria:
Based on these criteria and the current use case, Planview selected Anthropic’s Claude 3 Sonnet on Amazon Bedrock for the router and responder components.
Over the past year, Planview Copilot’s performance has significantly improved through the implementation of a multi-agent architecture, development of a robust evaluation framework, and adoption of the latest FMs available through Amazon Bedrock. Planview saw the following results between the first generation of Planview Copilot developed mid-2023 and the latest version:
In this post, we explored how Planview was able to develop a generative AI assistant to address complex work management processes by adopting the following strategies:
Planview is migrating to Amazon Bedrock Agents, which enables the integration of intelligent autonomous agents within their application ecosystem. Amazon Bedrock Agents automate processes by orchestrating interactions between foundation models, data sources, applications, and user conversations.
As next steps, you can explore Planview’s AI assistant feature built on Amazon Bedrock and stay updated with new Amazon Bedrock features and releases to advance your AI journey on AWS.