Ethical AI in Defense Decision Support Systems (Defense AI Ethics, #2)
Editor's Note: In a previous post in our series on ethical defense and AI, we examined the ethical implications of technology providers in the defense domain. In this post, we delve into Palantir’s work on Decision Support Systems and how it reinforces critical military ethics and Law of War considerations in tactical and operational military settings. In future posts, we will explore the role of technology and military ethics for strategic deterrence.
Introduction
As a software company, we believe that the development of software and Decision Support Systems (DSS) must foremost be informed by and responsive to operational realities. What is required is an approach that compels the builders of digital infrastructure to confront the realistic contextual constraints and limitations — technological, procedural, moral, and normative — faced by the institutions that use these systems. Only in the field can we credibly evaluate and refine the integrity of the software, as well as our understanding of and ability to address the relevant ethical considerations. This is especially true in the case of software applied in the most consequential settings, including for military applications and warfighting.
Autonomy and AI are already changing both the makeup of defense forces and the nature of combat. As we have previously argued, the most critical ethical and efficacy questions come not from abstract hypothetical problems but are informed by the technology deployed today. While there is still value in having societal debates about the extent to which future decision-making should be delegated to automated systems, our warfighters are currently facing very real and urgent challenges about the appropriateness and utility of these tools.
This piece presents some initial thoughts on the role of AI Decision Support Systems (AI-DSS) in the military context and further elaborates on workflows presented in our earlier AIP for Defense video. We contend that focusing on the abstraction of “AI in warfighting” can obscure more than it illuminates. Instead, by examining each interaction between AI tools and human decision-makers, we can better understand how these tools can be leveraged appropriately and how society can best navigate the ethical implications of deploying this technology on the battlefield.
AI Ethics and Decision Support Systems
A Decision Support System (DSS) is, simply put, an information system that is leveraged to support decision-making within an institution. For our military partners, AI-DSS are tools that bring together relevant information from across the enterprise, alongside AI tools, to help analysts, operators, and other key stakeholders process large amounts of information and help the warfighter make critical choices about everything from troop movements to targeting decisions.
The ethical stakes of properly functioning AI-DSS are high. They relate not only to the ethics of novel technologies in high-stakes environments, but also to the longstanding obligation of the warfighter to act according to the Law of War (LOW).[1] Georgia Hinds, a legal advisor with the International Committee of the Red Cross, has written that assessing the impact of an AI Decision Support System is not only a matter of evaluating whether it is suitable for a given task — though that must come first — but also whether such systems might “assist parties to an armed conflict in satisfying their IHL obligations.”
Our contention is that such assistance comes not only from purpose-built applications designed to assist with IHL obligations, but also from a properly leveraged AI-DSS, which can, through its core functioning, improve the decision-making of warfighters in a way that enables them to be more effective with respect to both military and ethical objectives.
At a high level, one of the foremost goals of AI tools on the battlefield is to shorten the total time required to work through the necessary process surrounding a critical decision. This has two critical effects:
- It can increase the amount of time and attention that the warfighter is able to dedicate to the most critical parts of that decision.
- It can increase the total number of decisions that the operator is able to make when such choices need to be made at scale.
Examining the battlefield use of AI is not an academic exercise. Decisions about the use of force are critical to acting and adapting fast enough to deter and, where necessary, defeat a formidable, determined adversary when a military’s immediate advantages may be limited to the courage of its soldiers and the edge afforded by its available technologies. Responsibly, reliably, and effectively leveraging AI and AI-enabled tools in this context requires examining each specific decision point that warfighters traditionally must work through and how those decisions might be aided or augmented by an AI tool. We also must evaluate the trade-offs between human-driven and machine-assisted workflows, and determine when and how final human judgement should be applied — whether to mitigate potential sources of error or to reinforce critical procedural or moral considerations, especially for non-DIL (disconnected/intermittent/limited) environments.
In evaluating both a system’s suitability and its potential to assist a warfighter with their obligations under LOW or broader policy guidance, such as DoD’s Civilian Harm Mitigation Response and Action Plan [2], it is insufficient to discuss the abstraction of “Artificial Intelligence” in warfighting generally. Rather, we must focus on each interaction between AI tools and human decision-makers within the context of the entire process of making a specific decision. This decomposition of the decision-making cycle is necessary because — contrary to the common perception of a straightforward process through which militaries carry out consequential actions, such as targeting — these activities are conducted as a complex, interdependent series of decisions and actions with multiple decision-makers. Ultimately, this decision-making is an extensive process, not a single discrete point-in-time action.
The following sections unpack examples across three specific parts of any defense decision-making or command and control function: 1) Sensing and Integrating critical data; 2) Making Sense of that data to understand the holistic picture of the operational environment; and 3) Deciding and Acting, using that information and analysis to make a final decision and act on it.[3] In our examination of each phase, we focus on how an AI-DSS can help with the targeting lifecycle (nomination, development, and prosecution). Although this process description culminates in one of the gravest consequences — the decision to use force — it is certainly not the only type of decision an AI-DSS can support. However, it is often the most starkly impactful and therefore understandably draws a great deal of both commentary and scrutiny for its operational utility and moral implications.
Each of these phases can help capture a sub-domain of tasks that drive core determinations around targeting and the use of force: who or what should be targeted and what kind of force should be employed against that target. AI-DSS operate upstream of the weapons systems that ultimately deploy force against a target, but the work they do to support a warfighter’s decision-making is just as critical as the operation of the weapons systems themselves. By unpacking a small portion of the work necessary to reach those conclusions, we hope to advance a more holistic understanding of the potential applications of AI in the defense space and ways to manage those workflows effectively and responsibly.
Sense + Integrate
We can think of the objective of sensing and integrating as detecting relevant signals in a sea of data that could inform a time-sensitive and critical decision. A sensor might, for instance, be an overhead satellite imaging an area of interest and using an AI tool to look for particular indicators of an adversary’s activity, such as the movement of tanks across a contested area. This is a domain in which AI tools and automated processes can, and already do, provide meaningful lift.
To be effective in an environment in which the sheer volume of data is overwhelming, operators need some way to filter it down and detect the relevant signal of interest. The core human decision (i.e., “what should my tools be looking for, and where should they be looking?”) remains critical for focusing effort on valuable and appropriate collection.
Though we cannot cover the entire universe of possibilities here, we can sketch what an appropriate use of AI might start to look like. For example, consider models loaded directly onto a satellite or other sensor (i.e., “at the edge”) that can detect objects of interest in a given area, such as tanks in a region of active contestation. Immediately upon detection at a human-specified threshold, operators or automated processes can then task follow-up sensors to confirm the initial hypothesis regarding the nature of the detected object. Based on the additional collection, the results could then be packaged and sent downstream to human analysts for further review and work product development. By doing the work to process billions of pixels and identify relevant objects of interest for the human analysts to verify and act on, the AI-DSS can assist in prioritizing the information relevant for the analyst and prevent them from being overwhelmed by an otherwise unmanageable deluge of data.
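To make the shape of such a pipeline concrete, here is a minimal sketch, assuming hypothetical names, thresholds, and data structures of our own invention (not a description of any fielded system), of how a human-specified confidence threshold might gate automated follow-up tasking and route results to an analyst queue:

```python
from dataclasses import dataclass

# Illustrative sketch only: the names, thresholds, and data structures below are
# hypothetical assumptions, not a description of any fielded system.

@dataclass
class Detection:
    object_type: str       # e.g., "tank"
    confidence: float      # model confidence score in [0, 1]
    location: tuple        # (lat, lon)

# The core human decision: what to look for, and at what confidence to escalate.
CUE_THRESHOLD = 0.85       # set by an operator, not learned by the model

def triage(detections, task_followup_sensor, queue_for_analyst):
    """Gate automated follow-up tasking on a human-specified threshold."""
    for det in detections:
        if det.confidence >= CUE_THRESHOLD:
            # Task a second sensor to confirm the initial hypothesis.
            confirmation = task_followup_sensor(det.location, det.object_type)
            # Package both looks for downstream human review; nothing is acted on here.
            queue_for_analyst({"initial": det, "confirmation": confirmation})
```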
While the opportunities for AI and automation in this example are numerous, and may not immediately translate into high-risk activities, they raise relevant ethical questions. These questions focus on ensuring that the AI tools employed within the DSS are truly fit for purpose, thoughtfully evaluated, and appropriately deployed. Our colleagues discussed the tooling and processes necessary to enable responsible AI in depth in a previous blog post. The importance of these considerations in model development becomes even more pronounced when deploying AI/ML tools in military environments, where dire consequences can follow from a wide array of risk vectors, including adversarial attacks, false positives, and deceptive inferences.
But even with these risks in mind, there is real potential for sensing tasks to be mostly or entirely machine-driven. A robust suite of hardware, data integration, models, and orchestration can provide relevant tips and cues to decision-makers in the field. This, of course, presumes a baseline focus on critical tasks such as effective testing, validation, and evaluation throughout model deployment, as well as long-term model management and maintenance. Only by trusting the models and understanding the limits of their effectiveness can we rely on them to help sift through the sea of noise and derive meaningful signals for downstream human decision-makers.
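As one hedged illustration of what that testing and validation discipline might look like in code, the sketch below gates model promotion on held-out evaluation metrics. The metric names and thresholds are assumptions for illustration, not a description of any particular evaluation suite.

```python
# Hypothetical promotion gate: a model version is only eligible for deployment
# if its held-out evaluation metrics clear minimum floors agreed with end users.

MIN_PRECISION = 0.90   # illustrative thresholds; real values are mission-specific
MIN_RECALL = 0.80

def eligible_for_deployment(metrics: dict) -> bool:
    """Return True only if evaluation metrics meet the agreed floor."""
    return (
        metrics.get("precision", 0.0) >= MIN_PRECISION
        and metrics.get("recall", 0.0) >= MIN_RECALL
    )

# Example: a model that over-detects (too many false positives) is held back.
print(eligible_for_deployment({"precision": 0.72, "recall": 0.91}))  # False
```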
In some imagery exploitation workflows, human-machine pairing can unlock even more value than automation alone. The reality is that models, no matter how thoroughly tested and validated during development, often fall short of user requirements. Enabling users to collaborate with and enhance the output of a model is at the core of what is most needed by warfighters. This is why Palantir has invested in User Centered Machine Learning (UCML) — moving beyond static computer vision models to a user-driven experience that identifies objects of interest at scale while exploiting imagery to drive better decisions. As is often the case, effective and ethical uses of AI here go hand in hand. The analyst not only contributes to the ultimate speed and efficacy of the computer vision tool, but also applies human judgement to crucial questions about what they need to search for and what the models may be missing. And better data from sensors and the initial exploitation of imagery or other inputs make the task of making sense of that information not only faster but more comprehensive, aiding in both the pursuit of critical military objectives and the prevention of civilian harm.
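A minimal sketch of such a pairing, assuming a deliberately simplified confirm/reject interaction and hypothetical function names of our own choosing, might look like the following; real UCML workflows are considerably richer.

```python
# Hypothetical human-in-the-loop review step: model proposals are confirmed or
# rejected by an analyst, and every decision is retained so it can later inform
# model refinement. Function and field names are illustrative assumptions.

def review_proposals(proposals, analyst_decision):
    """analyst_decision(p) returns 'confirm' or 'reject' for each model proposal."""
    confirmed, feedback = [], []
    for proposal in proposals:
        decision = analyst_decision(proposal)
        if decision == "confirm":
            confirmed.append(proposal)
        feedback.append({"proposal": proposal, "decision": decision})
    return confirmed, feedback   # feedback can seed future labeled training data
```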
Make Sense + Analyze
UCML is a workflow that sits on the border of Sense and Make Sense — taking the raw sensor outputs and using a combination of human judgement and AI tools to extract the relevant entities necessary for the user to understand their environment. As we delve deeper into the decision-making process, particularly in the most impactful of these processes — a targeting cycle — the warfighter must determine if action needs to be taken and, if so, what form that action will take. These determinations will be based on assessments of, among other considerations, whether there is a threat to friendly forces, the opportunity to gain military advantage, whether the target in question is an appropriate military objective, and which priority status should be granted to the designated target given the current operational environment.
Though drawing conclusions based on disparate data sources in a fast-moving environment is an area where human judgement remains particularly important, targeted use of AI tools can augment that judgement and ensure that human operators are making the best decisions as quickly as possible with as much relevant context as possible. Intelligent sensor fusion can lead the way here, ensuring that an entity of interest that may have been initially sourced by our UCML user through analysis of overhead imagery is cross-referenced with other sources of intelligence, such as electronic intelligence (ELINT), to help confirm or deny the status of a potential adversary.
We can think of this automation as a tool that helps the warfighter navigate the real-world operational and ethical challenges of accurately identifying potential targets. For instance, consider an analyst using a computer vision tool to identify ships in an area of interest. This analyst might also have access to other sources providing geo-located indicators of enemy electronic emissions and open-source datasets representing commercial shipping assets. By automatically integrating these disparate data sources, the analyst can more efficiently untangle a complex maritime environment. The Decision Support System might even automatically merge these disparate data sources and suggest potential overlaps or cross-validations between these discrete inputs for the human trying to understand the situation. Ultimately, this task involves entity resolution, where multiple discrete pieces of intelligence are analyzed and determined to be associated with the same real-world object.
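As a purely illustrative sketch of what "suggesting potential overlaps" could mean, the code below proposes (and never asserts) that an imagery detection and a nearby ELINT or commercial shipping report may describe the same object, based only on spatial and temporal proximity. The data structures, thresholds, and crude distance approximation are our own assumptions; real tradecraft would handle geodesy, uncertainty, and source reliability far more carefully.

```python
import math
from dataclasses import dataclass

# Illustrative only: a naive spatial-temporal cross-reference that suggests
# candidate associations for a human to review, with provenance retained.

@dataclass
class Report:
    source: str           # e.g., "imagery", "elint", "commercial_shipping"
    lat: float
    lon: float
    timestamp: float      # epoch seconds

def _approx_km(a: Report, b: Report) -> float:
    # Equirectangular approximation; adequate for short distances in a sketch.
    x = math.radians(b.lon - a.lon) * math.cos(math.radians((a.lat + b.lat) / 2))
    y = math.radians(b.lat - a.lat)
    return 6371.0 * math.hypot(x, y)

def suggest_associations(imagery_hits, other_reports, max_km=2.0, max_dt=1800):
    """Propose candidate matches for a human to confirm, with sources attached."""
    suggestions = []
    for hit in imagery_hits:
        for rpt in other_reports:
            if _approx_km(hit, rpt) <= max_km and abs(hit.timestamp - rpt.timestamp) <= max_dt:
                suggestions.append({"imagery": hit, "corroboration": rpt,
                                    "sources": [hit.source, rpt.source]})
    return suggestions  # suggestions only; resolution remains a human, tradecraft call
```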
When implemented at scale, automating entity resolution suggestions can provide the end user with more comprehensive context than piecemeal analysis, while maintaining transparency into the underlying sources of intelligence (and the potential risks of relying on those sources). But entity resolution is no trivial task, and any degree of its automation must be approached with care and precision. Intelligence data, which can vary in its authoritativeness, requires rigorous data cleaning to ensure valid and robust analysis. It is crucial that subject matter experts retain the ability to configure settings for how datasets are fused (e.g., specifying the required data density in a given area over a specified time range) to ensure that results are consistent with tradecraft best practices. Additionally, the choice between adopting more deterministic (or rules-based, e.g., PARM) approaches to entity resolution or more fluid, non-deterministic (or AI/ML, e.g., clustering and classification) approaches may depend on the assessed reliability and reproducibility of the different techniques.
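Continuing the sketch above, the following shows one way such fusion parameters might be exposed as configuration owned by subject matter experts rather than hard-coded behavior; the field names and defaults are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical SME-owned fusion settings: experts, not the system, decide how
# aggressively sources may be merged and which technique is applied.

@dataclass
class FusionConfig:
    max_distance_km: float = 2.0        # spatial gate for proposing an association
    time_window_s: int = 1800           # temporal gate
    min_corroborating_reports: int = 3  # required data density before any merge
    method: str = "deterministic"       # "deterministic" rules vs. "ml" clustering

def merge_permitted(corroborating_reports: int, config: FusionConfig) -> bool:
    """A merge is only proposed once the SME-specified density floor is met."""
    return corroborating_reports >= config.min_corroborating_reports
```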
Given the real-world harm that can result from inaccurate or inadequate information, it is critical to provide warfighters with as much relevant information as possible without overwhelming them with noise or undermining their ability to leverage their specialized training and subject matter expertise. Part of this automation can focus specifically on preventing civilian harm and ensuring warfighters uphold their obligations to the principles of distinction and proportionality.[4]
AI Decision Support Systems can help users understand the likely impact of any action across various categories. For instance, are there protected sites nearby or unexpected clustering of non-combatants? Are there alternative courses of action that would achieve comparably valuable military objectives but with reduced risk of civilian harm? Would a different munition pairing yield the intended results but with less likelihood of collateral damage? These and similar considerations demonstrate how DSS can be most useful in determining the optimal next decision to take out of many possible options (i.e., beyond the threshold go/no-go decision).
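As a purely illustrative sketch of this kind of option screening (the field names, scores, and ordering logic are assumptions on our part, and no such screen substitutes for doctrinal collateral damage estimation or the commander's judgement), consider:

```python
from dataclasses import dataclass

# Hypothetical screening sketch: surface comparisons across candidate options
# for the human decision-maker; it decides nothing on its own.

@dataclass
class CourseOfAction:
    name: str
    military_value: float        # analyst-assessed, 0-1
    est_collateral_risk: float   # output of a separate estimation process, 0-1
    protected_sites_nearby: int  # count within a defined standoff distance

def flag_and_rank(options):
    """Flag options near protected sites; order the rest by risk, then value."""
    flagged = [o for o in options if o.protected_sites_nearby > 0]
    cleared = [o for o in options if o.protected_sites_nearby == 0]
    cleared.sort(key=lambda o: (o.est_collateral_risk, -o.military_value))
    return cleared, flagged      # both lists go to the human decision-maker
```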
As the warfighter tees information up before ultimately deciding on a course of action, sensor fusion and AI-informed suggestions can help minimize errors and propose paths for human decision-making that might not have been immediately apparent.
Decide + Act
Finally, the warfighter has to make a decision and act on that decision — determining how to take action against a target while incorporating operational constraints, rules of engagement, the disposition of partner forces, and the myriad other considerations that feed into an eventual command decision.
As indicated earlier, each final decision is just the end result of a longer process. Integrating the component parts of that process is critical to ensuring that the final decision is the best possible one given the circumstances. For instance, when deciding between different courses of action (COAs) in a complex operational environment, it is essential to incorporate the inputs of various teams. Many of these inputs may directly concern the potential impact of an action on non-combatants or the deconfliction of any action with friendly forces. This process should include members from different units, different services, and even different militaries. Relevant users and stakeholders will encompass not only intelligence and operational counterparts, but also military lawyers and commanding officers.
Process-related artifacts from this collaboration are critical for downstream decision-making. For instance, these could include the results of a Collateral Damage Estimation (CDE) process, the signoff from the relevant Judge Advocate General (JAG) or Commanding Officer (CO), and any other doctrinally required checks.
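One hedged illustration of how such artifacts might be enforced in software is a simple completeness check on a decision package; the artifact names below are hypothetical stand-ins for whatever a given doctrine actually requires.

```python
# Hypothetical completeness check: a decision package does not advance until
# every doctrinally required artifact is attached. Artifact names are illustrative.

REQUIRED_ARTIFACTS = {"cde_result", "jag_review", "co_approval"}

def ready_for_decision(package: dict) -> tuple[bool, set]:
    """Return whether the package is complete and which artifacts are missing."""
    missing = REQUIRED_ARTIFACTS - set(package.get("artifacts", {}))
    return len(missing) == 0, missing

ok, missing = ready_for_decision({"artifacts": {"cde_result": "...", "jag_review": "..."}})
print(ok, missing)  # False {'co_approval'}
```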
Aligning the information available to and from these components of any decision, while respecting controls on sensitive data and enabling maximum interoperability, is a core function of any software platform deployed in this context. Integrating AI tools in this process must be done thoughtfully, but also carries great potential to augment and enhance warfighter capabilities. Immediately valuable applications include translating critical information on the fly for a partner nation and flagging late-appearing objects of interest that change the calculus for potential civilian effects. Similarly, extracting information from relevant free-text reports represents an area where Large Language Models (LLMs) have great potential. However, they must remain grounded in and referenced to their source material to enable an end user to readily validate any summarization done by a model and thereby avoid any risks associated with model hallucinations.
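A minimal sketch of what "grounded in and referenced to their source material" could mean in practice is shown below: each extracted claim must cite a snippet that actually appears in the source report, or it is filtered out before reaching the user. The function and field names are illustrative assumptions, and no particular model API is implied.

```python
# Hypothetical grounding check: every model-extracted statement must carry a
# pointer back to its source text, and that text must actually appear in the
# report, or the statement is dropped before it reaches the user.

def grounded_extractions(report_text: str, extractions: list) -> list:
    """Keep only extractions whose cited snippet appears verbatim in the source."""
    kept = []
    for item in extractions:        # e.g., {"claim": ..., "source_snippet": ...}
        snippet = item.get("source_snippet", "")
        if snippet and snippet in report_text:
            kept.append(item)       # the user can click through to the cited snippet
    return kept                     # unsupported claims never reach the decision-maker
```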
Ultimately, it is in fusing together the outputs of AI tools alongside process-related artifacts that software systems generally, and AI Decision Support Systems specifically, demonstrate their greatest potential. Each output of an AI tool, taken in isolation, may not only have limited utility, but may also be fatally undermined by the challenges and constraints of AI/ML brittleness. However, when situated in the broader context of a warfighter’s information environment, legal and policy obligations, existing trusted data stores, and collaboration with their colleagues, the outputs of AI tools can become critically important components for making the correct final decision.
In this way, we arrive at a clearer understanding of the value of AI-enabled automation. Automation in this context should be understood to serve one specific end: freeing up more space for the truly creative and critical human decision-making of the warfighter. There certainly will be process-related efficiencies that thoughtfully developed software can enable, but by situating those efficiencies in a particular context, we orient around the specific human decisions and human context that we aim to maximize, rather than merely the seconds we can save.
Conclusion
Making considered decisions quickly in future conflicts will increasingly require the support of AI tools across the entire decision support chain. We believe that technology can help manage the presumed tension between fighting wars effectively and fighting wars ethically and in a manner that helps to safeguard fundamental rights. Ultimately, our customers understand that these two considerations must — and do — go hand in hand. DoD’s Ethical AI Principles, DoD’s Responsible Artificial Intelligence Strategy and Implementation Pathway, DoD Instruction 3000.17, CDAO’s Responsible AI (RAI) Toolkit, the State Department’s Political Declaration on Responsible Military Use of Artificial Intelligence and Autonomy, [5] and the broader conversation across partner militaries [6] grappling with the role of AI and automation in defense workflows all point to this reality.
However, as much as we believe that effectiveness requires thoughtful approaches to applied ethics, we also believe that ethics without effectiveness will render these tools irrelevant to our warfighters. If our tools enable our military partners to make good decisions — in both the qualitative and normative senses — we enhance our chances of winning the fight and of maintaining the moral high ground that makes it worth fighting for. Thoughtfully applied AI and automation across the military decision-making lifecycle can help us raise the bar for civilian harm mitigation while enabling the United States and its allies to continue to move faster than our adversaries.
Authors
Peter Austin, Privacy and Civil Liberties Government and Military Ethics Lead, Palantir Technologies
Courtney Bowman, Global Director of Privacy and Civil Liberties Engineering, Palantir Technologies
Footnotes
[1] Sometimes called “Law of Armed Conflict” or “International Humanitarian Law,” the Law of War is specifically intended to address the circumstances of armed conflict, and comprises both customary international law and treaties. The International Committee of the Red Cross (ICRC) maintains introductory material at https://www.icrc.org/en/war-and-law for readers interested in the underlying treaties and customs, whose concepts are then implemented by state actors. See, for example, the United States Department of Defense’s Law of War Manual at https://dod.defense.gov/Portals/1/Documents/pubs/DoD%20Law%20of%20War%20Manual%20-%20June%202015%20Updated%20Dec%202016.pdf?ver=2016-12-13-172036-190.
[2] The DoD released its Civilian Harm Mitigation and Response Action Plan (CHMR-AP) on August 25, 2022. The document provides a comprehensive eleven-point outline of the major actions that DoD is obligated to implement in order to mitigate and respond to civilian harm. It is the work product of an earlier (January 27, 2022) memo issued by Defense Secretary Austin, carrying forward a congressionally mandated effort earmarked in the 2019 National Defense Authorization Act (NDAA FY 19) while also building on a series of probing inquiries into the impact of collateral damage on civilians throughout the decades-long Afghanistan and Iraq conflicts.
[3] This framework is inspired by the Joint All-Domain Command and Control (JADC2) strategy promulgated in March 2022 by the Department of Defense, which outlines three critical command and control functions: Sense, Make Sense, and Act. Our discussion ties closely to that framework but does not adhere to it in every specific instance.
[4] Distinction and Proportionality (along with Military Necessity and Minimizing Unnecessary Suffering) are principles at the core of the jus in bello (moral conduct of war) branch of traditional Just War theory. For a canonical treatment, see Walzer, Michael (1977). Just and Unjust Wars. Basic Books.
[5] Details available at https://www.defense.gov/News/Releases/Release/Article/2091996/, https://www.ai.mil/docs/RAI_Strategy_and_Implementation_Pathway_6-21-22.pdf, https://rai.tradewindai.com/, and https://www.state.gov/political-declaration-on-responsible-military-use-of-artificial-intelligence-and-autonomy/, respectively.
[6] See, for example, REAIM 2024’s “Blueprint for Action” at https://thereadable.co/full-script-reaim-blueprint-for-action/.