Advanced artificial intelligence (AI) systems are poised to transform military strategy, offering insights that could be decisive in future conflicts. However, these AI-generated plans may appear opaque and counterintuitive to human commanders. This creates a significant dilemma: how can military leaders develop justified trust in strategies they do not fully understand? Resolving this challenge is critical if the United States military is to maintain a competitive advantage in an increasingly complex and fast-paced operational environment.
Key Takeaways
- Advanced AI will propose highly creative, effective, but often incomprehensible military strategies.
- The more creative an AI's strategy, the less intuitively understandable it becomes to humans.
- Traditional explainable AI methods may not fully convey the rationale for these complex decisions.
- A new oversight model, using 'calibration by consensus' and 'calibration by disagreement' from multiple independent AIs, can build justified trust.
- This approach aims to verify AI outputs through cross-checks rather than attempting full human comprehension of AI logic.
The AI-Command Dilemma in Modern Warfare
Modern warfare is evolving rapidly. Robotic systems, cyber operations, and electronic warfare are increasing the speed and complexity of military actions. To remain competitive, the U.S. military must integrate advanced AI systems into strategic and operational decision-making. This integration, however, introduces a novel challenge: relying on AI recommendations that may defy human understanding.
Historically, military leaders have evaluated strategies based on training, experience, and established doctrine. Consider Major General Ulysses S. Grant's Vicksburg campaign in 1863. His plan to cut supply lines and attack from the rear was seen as reckless by Major General William Tecumseh Sherman, who predicted disaster. Sherman noted an enemy "would be glad to manoeuvre a year… to get [Grant] in." Grant's unconventional approach defied the military orthodoxy of the time, which emphasized secure supply lines and concentrated forces. Despite initial skepticism, Sherman followed orders, and Grant's plan succeeded, leading to the fall of Vicksburg.
"The tension embodied by Sherman’s skepticism at Vicksburg — where a sound, conventional assessment clashes with an unconventional, risky, but brilliant approach — may find a strange echo in modern warfare."
Future AI systems are expected to present similar dilemmas. Their strategic proposals might seem alien, counterintuitive, and opaque. The core question becomes: how do commanders build confidence in these "alien oracles" when their reasoning cannot be fully grasped?
Historical Parallel
In 1863, Major General Ulysses S. Grant's Vicksburg campaign defied conventional military wisdom. His strategy of cutting off supply lines and attacking from the rear was considered highly risky by his subordinates, including Major General William Tecumseh Sherman. Despite initial doubts, the plan succeeded, leading to a major Union victory.
Strategic Acuity of Advanced AI Systems
The first principle governing advanced AI in warfare is its high level of strategic acuity. These systems can formulate judgments of exceptional creativity and effectiveness, often surpassing human capabilities in complex strategic domains. This superior judgment comes from technological advantages that allow AI to process information, learn, and strategize in ways that differ fundamentally from human cognition.
Two key attributes contribute to AI's strategic acuity. First, modern AI, especially deep learning models, excels at mass-scale pattern recognition and computational depth. These systems can analyze vast datasets, such as millions of past wargames, sensor feeds, or historical scenarios. This enables them to identify intricate, subtle patterns and long-term implications that humans might miss.
One analysis of AI-enabled warfare suggests that AI can detect signals "far below the noise level for human observation." For example, an AI can evaluate billions of potential moves in complex games like Go or chess. It calculates probabilities and outcomes with unmatched depth and speed, foreseeing strategic advantages many turns ahead.
Second, modern AIs can engage in self-supervised learning and unconstrained exploration. Techniques like reinforcement learning and self-play allow AIs to learn and refine strategies without human instruction or even human-generated data. By playing against themselves in simulations, they explore problem spaces, discover novel solutions, and optimize strategies. This process is unconstrained by human biases or historical precedents.
AI Learning Methods
Reinforcement Learning: An AI learns by performing actions in an environment to achieve a goal, receiving rewards or penalties. It iteratively refines its strategy to maximize rewards.
Self-Play: An AI plays against itself, often in simulated environments, to generate vast amounts of training data and discover new strategies without human input.
AlphaZero, a notable example, achieved superhuman performance in chess, shogi, and Go within hours. It developed creative and unorthodox strategies that redefined optimal play in these games.
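To make the self-play idea concrete, the toy sketch below trains a single tabular Q-learning agent on the game of Nim purely by playing against itself, with no human examples. The game, the parameters, and every name in the code are illustrative assumptions chosen for this article; it is a minimal sketch of the technique, not a description of AlphaZero or of any military system.

```python
# Minimal self-play reinforcement learning sketch: tabular Q-learning on Nim
# (one pile of sticks, take 1-3 per turn, whoever takes the last stick wins).
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.5, 0.95, 0.1   # learning rate, discount, exploration rate
Q = defaultdict(float)                    # Q[(pile, take)] -> value for the player to move

def legal(pile):
    return [m for m in (1, 2, 3) if m <= pile]

def choose(pile):
    """Epsilon-greedy move selection: mostly exploit, occasionally explore."""
    if random.random() < EPSILON:
        return random.choice(legal(pile))
    return max(legal(pile), key=lambda m: Q[(pile, m)])

def self_play_episode(start=15):
    """One game in which the same agent plays both sides and learns from
    every move it makes (a negamax-style value update)."""
    pile = start
    while pile > 0:
        move = choose(pile)
        nxt = pile - move
        if nxt == 0:
            target = 1.0                                            # taking the last stick wins
        else:
            target = -GAMMA * max(Q[(nxt, m)] for m in legal(nxt))  # opponent moves next
        Q[(pile, move)] += ALPHA * (target - Q[(pile, move)])
        pile = nxt

for _ in range(20_000):
    self_play_episode()

# With no human-generated data, the agent typically rediscovers the classic
# winning strategy: leave the opponent a multiple of four (from 15, take 3).
print({pile: max(legal(pile), key=lambda m: Q[(pile, m)]) for pile in range(1, 16)})
```

The point of the sketch is the learning loop itself: the agent's policy is shaped entirely by rewards from its own play, unconstrained by human examples, which is what allows unorthodox strategies to emerge at scale.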
The Inverse Relationship: Creativity and Comprehensibility
The strategic acuity of AI leads to a second principle: there is an inverse relationship between an AI's creativity and its immediate comprehensibility to human commanders. The mechanisms that generate truly novel and superior strategic outcomes often make those solutions opaque to human understanding.
Unlike humans, who rely on learned heuristics and innate biases, advanced AI systems can operate on emergent, trans-human heuristics. These are optimized purely for performance, not for human interpretability. A classic example is AlphaGo's Move 37 against Lee Sedol. Human Go masters initially dismissed this move as a mistake, as it violated conventional wisdom. However, it proved to be a strategically pivotal and ultimately correct play.
This capacity for counterintuitive optimization extends to more complex strategic domains. DeepMind's AlphaStar mastered StarCraft II using strategies described as "unimaginably unusual" by top human players. Military commanders, who are accustomed to explanations based on familiar axioms, historical analogies, and clear causal links, will find these highly optimized AI-generated solutions difficult to intuitively grasp or trust.
The inherent disconnect between AI's alien logic and human intuition means that as AI becomes more strategically astute and genuinely innovative, the cognitive burden on human commanders to understand why a decision is optimal increases. This burden can exceed the limits of intuitive human comprehension.
Limits of Explainable AI
Given their strategic acuity and the inverse relationship between creativity and human comprehensibility, advanced AI systems will be inherently unexplainable: no explanation can fully convey their rationale to human users during real-time decision-making. Demanding a fully human-comprehensible account of AI decisions, especially for the most creative insights, runs into practical limits.
The sophisticated calculations driving an AI's decisions can be incompatible with human-interpretable logic. An AI might provide a post hoc rationalization that seems plausible and comforting. However, this explanation may bear little resemblance to the AI's actual computational path. This is similar to a parent explaining gifts with a story about Santa Claus. The explanation is comforting but fundamentally disconnected from the underlying reality. This can create an illusion of understanding and misplaced confidence.
Even well-intentioned explainable AI frameworks struggle with this tension. While providing meaningful explanations is vital, ensuring explanation accuracy is the core challenge. The explanation must genuinely reflect the AI's complex internal processes. For advanced, opaque models, verifying such accuracy is incredibly difficult, often impossible. Even if accurate, humans may struggle to distinguish a valid explanation from a fabricated one. Explanations can also be manipulated or oversimplified to achieve understandability at the cost of fidelity.
In wartime, extreme time pressures and cognitive load make exhaustive analysis of complex AI explanations unrealistic. Training human users to be better skeptics of AI is not enough. Such skepticism could create deep tension with the military's need for speed in decision-making.
AI and Decision Speed
A 2024 Carnegie Endowment study simulated a Taiwan crisis. It found that leaders hesitated to act on AI-generated recommendations. This hesitation slowed decisions as they questioned the system's logic. This highlights the tension between trust, comprehension, and the need for rapid action in military contexts.
One vision suggests that a "new coup d’œil" will be a form of intuition about when to trust AI and when to question it. But what if AI's strategic leaps are so far beyond human understanding that such intuition becomes impossible? Commanders would then face a tough choice: accept incomprehensible recommendations or reject them and risk being outmaneuvered by an AI-enabled adversary.
As Erik Lin-Greenberg notes, "militaries that effectively integrate AI technology will be better positioned to counter threats, while those that allow AI to stymie decision-making and operations may find themselves disadvantaged on the battlefield."
Building Justified Trust Without Full Explainability
Future AI systems must be designed not just for technical excellence, but with human psychology and decision-making under pressure in mind. This means designing AI that manages risk, communicates confidence levels, and supports commanders in making informed choices when faced with perplexing recommendations. Navigating this AI-command dilemma will be central to future military success.
The central challenge is how to produce justified trust in advanced AI without full explainability. If AI-enabled warfare is as fast as some predict, human-on-the-loop oversight will struggle to keep pace. Explanations for an AI system's most creative recommendations will be difficult, if not impossible, to verify at speed.
A proposed oversight mechanism adapts time-tested military principles for managing complex, high-risk systems. An example is field artillery's demand for independent verification of firing data. Applied to AI, justified trust can be generated not by explaining decisions, but by verifying consistent outputs from multiple, independently developed AIs.
This approach suggests bounding human oversight with two machine-speed gates: calibration by consensus and calibration by disagreement. Only AI outputs that survive independent cross-checks would reach a commander. Structured divergence would trigger human intervention.
Without these calibration gates, on-the-loop oversight risks becoming either a rubber stamp for AI outputs or a brake that slows them to human speed. Either outcome defeats the purpose of AI-enabled warfare.
Calibration by Consensus
This method, an example of ensemble learning, uses multiple independent AI agents. These agents might have different algorithms or training data. They analyze the same problem. Just as artillery missions proceed only when independently calculated firing solutions match within tolerance, an AI solution gains justified trust when diverse AI agents converge on congruent outputs. This convergence indicates a higher likelihood of accuracy and reliability.
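As a rough illustration of how such a gate might be wired, the sketch below releases a recommendation only when a quorum of independently developed agents converges on the same course of action within a confidence tolerance, echoing the artillery requirement that independently computed firing solutions match before a mission proceeds. The names, data shapes, quorum, and tolerance values are hypothetical assumptions for illustration, not part of any fielded system.

```python
# Sketch of a "calibration by consensus" gate over independently developed agents.
from dataclasses import dataclass
from statistics import median
from typing import Optional

@dataclass
class Recommendation:
    course_of_action: str      # e.g. "strike-axis-north" (illustrative label)
    confidence: float          # the agent's own score for its proposal

def consensus_gate(agents, situation, quorum=3, tolerance=0.05) -> Optional[Recommendation]:
    """Query each agent on the same situation; release a recommendation only
    when at least `quorum` agents back one course of action with confidences
    that agree within `tolerance` (the 'within tolerance' check)."""
    recs = [agent(situation) for agent in agents]
    by_coa = {}
    for r in recs:
        by_coa.setdefault(r.course_of_action, []).append(r)

    # Largest bloc of agents backing the same course of action.
    coa, bloc = max(by_coa.items(), key=lambda kv: len(kv[1]))
    confidences = [r.confidence for r in bloc]
    if len(bloc) >= quorum and max(confidences) - min(confidences) <= tolerance:
        return Recommendation(coa, median(confidences))
    return None   # no consensus: withhold from the commander and escalate instead
```

The evidential weight of such a gate comes from the agents' independence: because they use different algorithms or training data, convergence is unlikely to result from a shared blind spot.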
Calibration by Disagreement
Calibration by disagreement mirrors the artillery's adjustment of fire process. Initial shots are expected to miss, and their divergence from the target provides essential information for correction. In gunnery, accuracy is achieved not by assuming the first round is perfect. Instead, it comes from observing errors, diagnosing their causes, and iteratively refining aim until confidence is high enough to fire for effect.
Similarly, when multiple AI agents generate conflicting recommendations, the disagreement itself becomes a diagnostic signal. It reveals hidden biases, data anomalies, or unpredictable model behaviors that warrant human scrutiny. Trust emerges not from assuming transparency into the AI's "mind." Rather, it comes from the observable and verifiable process of convergence. Divergence is deliberately surfaced, interrogated, and used to correct course before decisions are executed.
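A companion sketch, reusing the hypothetical Recommendation shape above, shows how that divergence might be packaged as a diagnostic report for human scrutiny rather than silently discarded. Again, the field names and thresholds are assumptions chosen for illustration.

```python
# Sketch of a "calibration by disagreement" step: divergence becomes a diagnostic signal.
from dataclasses import dataclass, field

@dataclass
class DisagreementReport:
    votes: dict          # course of action -> number of agents backing it
    spread: float        # gap between the highest and lowest confidence
    flags: list = field(default_factory=list)

def diagnose_disagreement(recs, spread_threshold=0.25):
    """Summarize how a set of agent recommendations diverge, so a human
    reviewer can interrogate the split before any decision is executed."""
    votes = {}
    for r in recs:
        votes[r.course_of_action] = votes.get(r.course_of_action, 0) + 1
    confidences = [r.confidence for r in recs]
    spread = max(confidences) - min(confidences)

    report = DisagreementReport(votes, spread)
    if len(votes) > 1:
        report.flags.append("agents split across courses of action: re-examine inputs and assumptions")
    if spread > spread_threshold:
        report.flags.append("confidence spread exceeds threshold: possible data anomaly or model bias")
    return report
```

In practice, a gate like consensus_gate above would invoke this whenever it withholds a recommendation, routing the report to the commander so the split is surfaced and interrogated before any course of action is executed.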
In both calibration methods, the inner workings of individual AIs are less important than the observable and reliable effects of the multi-agent system. Hidden errors and biases are surfaced through divergence. Only recommendations that withstand this adversarial scrutiny are trusted for action.
The military has long understood that trust is earned through results. Grant's audacious Vicksburg campaign seemed reckless to Sherman, but Sherman knew and trusted his superior. The most impactful AI strategies will frequently defy human logic. The key to cultivating justified trust in these opaque systems is rigorous calibration and confidence built on experience, not explainability. A decisive advantage in tomorrow's complex battlespaces requires that the U.S. military develop calibration methods. These methods must enable commanders to confidently and swiftly execute AI-generated plans, even when their underlying genius remains a mystery.
This analysis was contributed by Andrew A. Hill, DBA, General Brehon Burke Somervell chair of management at the U.S. Army War College, and Dustin Blair, an Army officer serving as chief of fires at U.S. Army Cyber Command. Their views are their own and do not represent any official government or military positions.