Thinking Outside the (Black) Box: Building More Transparent and Explainable AI Systems in AIP
(Engineering Responsible AI, #2)
Advanced LLMs display incredible capabilities for processing and generating natural language. As discussed in the first blog post in this series, this can be a double-edged sword: LLMs are prone to “hallucinating” nonsensical or fictitious outputs that nonetheless seem very convincing. Regardless of whether they are hallucinating, one related challenge is making sense of how and why an LLM returned an output in response to a particular input. It is for this reason that LLMs are often considered “black box” models. This characterization presents a significant challenge to the responsible development and deployment of LLM-based AI systems: without the capability to explain an LLM’s output, it becomes difficult to trust its outputs and make necessary adjustments when faced with unexpected or undesirable outcomes.
A recent strategy for improving the robustness and reliability of LLM-based AI systems is Chain-of-Thought (CoT) prompting. In this blog post, we explore CoT’s effectiveness in advancing more transparent and explainable AI systems. We also highlight Palantir AIP’s native support for this LLM prompting style and other features designed to enhance LLM system transparency and explainability.
“Explainability” Explained
Explainable AI is a subfield of artificial intelligence that focuses on building more interpretable AI models and better understanding the cause of their predictions [1]. One way to improve explainability is by using inherently interpretable models, such as low-dimensional linear regression models or decision trees [2]. With these kinds of models, it is easier to understand how a model arrived at its final decision by, for example, following the path along the decision tree or examining the weights assigned to different factors in a regression model.
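As a toy illustration (independent of AIP), here is a minimal sketch of what that kind of inspection looks like for a shallow decision tree; the dataset and feature names are invented for the example.

# A minimal sketch of an inherently interpretable model: a shallow decision
# tree whose full decision logic can be printed and read as a few rules.
# The toy data and feature names below are invented for illustration.
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy training data: [order_size, days_until_due] -> expedite shipment (1) or not (0)
X = [[10, 1], [200, 7], [50, 2], [500, 14], [30, 1], [400, 10]]
y = [1, 0, 1, 0, 1, 0]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Every prediction can be traced along an explicit path through these rules.
print(export_text(tree, feature_names=["order_size", "days_until_due"]))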
However, this approach isn’t always viable, especially in the face of the complexity of modern AI models. For instance, the intricate structure of deep neural networks makes it challenging to easily understand how they produce specific outcomes. Explainability is especially difficult for Generative AI models, including Large Language Models, due to their billions of parameters and the complex dependencies between them. Moreover, the data used to train these models often comes from sources with unknown or opaque provenance. Since it’s difficult to inspect the inner workings of these models, a common strategy for achieving explainability is to retroactively analyze how a model reached a decision using tools such as counterfactual analysis, ancillary models, and statistical methods. Despite these efforts, complete explainability remains an unsolved challenge — one for which we might never find a wholly satisfying solution [3].
Though explainability for modern AI models poses many challenges, we can take steps today to build more explainable AI systems in practice. Even if we can’t fully understand why AI models produce certain outcomes, we can still build trust by understanding the AI system’s constituent parts and their interactions. When models are chained together into an AI system, we need to understand how those components interact, rather than focusing solely on the explainability of any individual subcomponent. As such, explainability goes hand-in-hand with transparency.
Building transparent and explainable AI systems is essential for Responsible AI in practice so that users, developers, and impacted parties can better understand how an AI system arrives at specific outcomes. This also offers the potential to improve the AI system over time. In this blog post, we focus on just this: how to design more transparent and explainable Generative AI systems within Palantir AIP.
Beyond the Black Box? Discussing Chain-of-Thought Prompting
Chain-of-Thought (CoT) prompting is the practice of prompting LLMs to generate intermediate steps or “explanations” before returning their final output. This technique has emerged as a promising way to improve the performance of LLMs on certain tasks. On its surface, asking an LLM to provide an explanation of its predictions may seem like a silver bullet for achieving explainability. However, there are some important caveats to consider.
Let’s take a look at how Chain-of-Thought prompting works through an example in Palantir AIP. By default, AIP Logic functions are configured to use CoT prompting.
Under the hood, using CoT prompting within our AIP Logic function involves generating prompts in a very specific manner. The AIP Logic function prompts the LLM to respond iteratively based on a structured plan, like this:
We’re going to think about this step-by-step. You will now receive a statement from the user and should generate a series of steps that you will use to identify and resolve any issues encountered while solving the problem. You should output a bulleted list of steps in plain English that you’ll need to perform.
In particular, the LLM is prompted to respond in the following format by appending the following text to each user-entered prompt:
Thought: <thought process on how to respond to the prompt>
Final Answer: <response to the prompt>
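As a rough sketch of the mechanics (not the exact template AIP Logic uses), appending such formatting instructions to a user prompt might look like this:

# A simplified, hypothetical sketch of appending Chain-of-Thought formatting
# instructions to a user's prompt. This approximates the pattern described
# above; it is not AIP Logic's actual prompt template.
COT_SUFFIX = (
    "We're going to think about this step-by-step.\n"
    "Respond in the following format:\n\n"
    "Thought: <thought process on how to respond to the prompt>\n"
    "Final Answer: <response to the prompt>\n"
)

def build_cot_prompt(user_prompt: str) -> str:
    """Append Chain-of-Thought formatting instructions to a user prompt."""
    return user_prompt.strip() + "\n\n" + COT_SUFFIX

print(build_cot_prompt("Where are Titan Industries' distribution centers located?"))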
Let’s examine how CoT prompting affects the LLM’s response to a query about the distribution center locations of Titan Industries — our notional medical supply manufacturing company. Below is the exact output from the model when prompted with our question, as well as the instructions for CoT prompting:
Here, we see the “thought” — the intermediate plan that we asked the LLM to generate before producing the final answer. It’s tempting to read the text labeled as a “thought” as a genuine thought, reason, plan, or explanation. But this is anything but the case.
As described in our previous blog post, LLMs are designed to predict the statistically most likely next token, one token at a time. The model is not reasoning; it’s emulating reasoning, because these models are trained to mimic human dialogue. Given a prompt instructing it to generate a “thought process,” the LLM produces text that structurally and semantically mimics human reasoning based on its training data. We should therefore be wary of hallucinations in this synthetic “reasoning.” Worse, even when the generated “reasoning” appears plausible and is free of hallucinations, it may still have nothing to do with the true reason for the model’s prediction. When analyzing the output of an LLM prompted with CoT, it is critical to refrain from interpreting the generated explanation or “thought” as a faithful reflection of how or why the model arrived at a particular final output.
However, all hope is not lost for understanding how an LLM-based system works. Using CoT prompting is still a helpful technique for transparency and explainability — but perhaps not because of the model’s purported “explanations.” Rather, CoT prompting enables us to build explainable AI systems in AIP by enabling transparent handoffs to more explainable tools.
Explainable Tools in AIP
Using the CoT prompting strategy in AIP Logic enables the use of tools that allow LLMs to access your Ontology — a trusted representation of the data, logic, and actions within your enterprise. This includes tools for various logical computations relevant to your organization. AIP Logic provides the LLM with these tools through CoT by augmenting the user’s prompt with descriptions of the configured tools and examples of their use, then handing off to a tool whenever the LLM generates a structured request for a tool invocation. Tools greatly expand the universe of what’s functionally possible with LLMs, which are not trained or built to carry out many kinds of complex logic.
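To make the handoff pattern concrete, here is a heavily simplified sketch of that loop; the function names, stubbed data, and JSON request format are hypothetical illustrations, not AIP’s internals.

# A heavily simplified sketch of the handoff pattern: the LLM emits a
# structured tool request, the orchestrator runs the deterministic tool, and
# the result is returned for the model's next step. All names and the JSON
# request format here are hypothetical, not AIP internals.
import json
from typing import Callable

def query_distribution_centers() -> list[dict]:
    # A deterministic "tool" backed by trusted data (stubbed for illustration).
    return [{"name": "DC-East", "lat": 40.71, "lon": -74.01}]

TOOLS: dict[str, Callable[..., object]] = {
    "query_distribution_centers": query_distribution_centers,
}

def handle_model_output(model_output: str) -> str:
    # If the model asked for a tool, run it and return its result as text;
    # otherwise treat the output as a plain final answer.
    try:
        request = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output
    tool = TOOLS[request["tool"]]
    result = tool(**request.get("arguments", {}))
    return json.dumps(result)

# Example: the LLM requests a tool invocation as structured text.
print(handle_model_output('{"tool": "query_distribution_centers"}'))

The key point is that the lookup itself is ordinary, inspectable code rather than something the LLM computes inside its “black box.”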
Using tools can make our AI systems in AIP more explainable and transparent. Specifically, our systems become more explainable when they are composed of explainable components. By incorporating tools, we can delegate core pieces of functionality to more explainable pieces of logic — such as interpretable models or business logic functions — and allow the LLM to orchestrate their usage. This approach lets us rely on trusted, more interpretable logic instead of having the LLM process the entire task within its “black box.” While this does not provide insight into the internal workings of the LLM itself, it is an important step in improving the transparency and explainability of the AI system. This, in turn, allows us to better understand why and how our AI systems return their outputs.
To learn more about building workflows with LLMs and logic tools, read our blog post: Building with Palantir AIP: Logic Tools for RAG/OAG.
Transparent Handoff in the AIP Logic Debugger
Using tools to delegate tasks to more interpretable pieces of logic is one piece of the puzzle, but it’s equally important to ensure that an LLM’s usage of such tools is transparent and easy to follow. To help users understand an LLM’s CoT response and tool usage, AIP Logic comes equipped with a comprehensive LLM Debugger.
For software engineers, a debugger is an indispensable tool for building reliable software: it allows programmers to monitor the execution of a program and identify bugs in the code. Similarly, in the AIP Logic Debugger, you can input data into your LLM-driven function and observe a detailed, easily navigable execution log as the LLM runs — including every instance in which it hands off to a tool. This is crucial for building transparency into an AI system. To see why, let’s take a look at an example of the AIP Logic Debugger in action.
As in our previous blog post, let’s assume the role of an employee at Titan Industries, a fictional medical supply manufacturer. In this scenario, one of our delivery trucks has gotten stranded, and we want to use AIP to help us determine which distribution center is closest to its location. We’ve provided the Logic function with a tool to query the Ontology, giving the model access to the locations of our distribution centers. We’ve also provided a Function Call tool that runs custom business logic to compute the distance between one of our distribution centers and a given coordinate.
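To give a flavor of what such business logic might look like, here is a hedged sketch of a great-circle (haversine) distance computation; the function name and signature are hypothetical, but the formula itself is standard and fully auditable.

# A sketch of the kind of deterministic, auditable business logic a Function
# Call tool could delegate to: the haversine (great-circle) distance between
# a distribution center and a reported coordinate. The function name and
# signature are hypothetical; the formula is standard.
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    # Great-circle distance in kilometers between two (lat, lon) points.
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# e.g., distance from a notional distribution center to a stranded truck
print(round(distance_km(40.71, -74.01, 39.95, -75.17), 1), "km")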
As we can see, Chain-of-Thought prompting enables the LLM to generate a decomposition of the task at hand into subtasks. It begins by orchestrating an invocation of the Query objects tool:
The Debugger provides visibility into each tool invocation, capturing the inputs and outputs of the Query tool. Subsequently, the LLM uses our custom distance-computation logic to calculate the distance between each distribution center and the provided coordinates:
Eventually, the LLM arrives at a final output:
As you can see, the Debugger provides AIP users with valuable insight into the inner workings of an AIP Logic function. In this example, we can see the intermediate steps and tool invocations generated through CoT prompting. The Debugger offers us visibility into the tool orchestration performed by the LLM, revealing the how of our AIP Logic functions.
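The underlying idea of such an execution log can be sketched in a few lines (illustrative only, not the Debugger’s implementation): wrap each tool so that every invocation’s inputs and outputs are recorded as the run proceeds.

# A minimal sketch of recording an execution trace: each tool is wrapped so
# its name, inputs, and output are logged on every invocation. This is an
# illustration of the idea, not the AIP Logic Debugger's implementation.
import functools

execution_log: list[dict] = []

def traced(tool_fn):
    # Record every invocation of a tool: name, inputs, and output.
    @functools.wraps(tool_fn)
    def wrapper(*args, **kwargs):
        output = tool_fn(*args, **kwargs)
        execution_log.append({
            "tool": tool_fn.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": output,
        })
        return output
    return wrapper

@traced
def lookup_distribution_centers(region: str) -> list[str]:
    return ["DC-East", "DC-West"]  # stubbed data for illustration

lookup_distribution_centers("northeast")
print(execution_log)  # every handoff is visible, step by step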
Furthermore, we can inspect the underlying logic of our tools to better understand the why behind our system’s behavior. If our AI system begins to exhibit unexpected behaviors due to the use of a tool, we can examine the specific logic and steps of those tools to identify where the logic fails. If we want to dispute an outcome or an output generated by our AI system, we can better understand and evaluate its execution on the path to a final answer, particularly in instances where it has delegated to a tool we provided.
Conclusion
Building more explainable and transparent AI systems is critical for the effective and responsible use of Generative AI. Palantir AIP empowers users to gain insights into the operations of advanced LLM-backed systems by adopting techniques such as Chain-of-Thought prompting and delegating specific logical tasks to more interpretable tools. By utilizing the LLM Debugger in AIP Logic, users can track LLM orchestration and tool handoffs, fostering a deeper understanding and trust in their AI systems. These methods enable organizations to confidently apply LLMs to their most complex enterprise challenges, ensuring that their AI solutions are not only powerful but also transparent and trustworthy.
[1] See Tim Miller’s “Explanation in artificial intelligence: Insights from the social sciences” for a comprehensive overview of Explainable AI from a social sciences perspective. Note that in this blog post, we treat “explainability” and “interpretability” as synonymous, though some AI scholars and practitioners do distinguish the two terms.
[2] See Christoph Molnar’s Interpretable Machine Learning for an overview of common Explainable AI approaches in practice.
[3] For more on why Explainable AI is a hard-to-achieve goal, see Zachary Lipton’s “The Mythos of Model Interpretability.”