2024-08-08

Blog post: AI Doesn’t Solve Everything – But It Can Solve A Lot If Done Correctly

AI in the military is advancing, of that there is no doubt, but it is crucial to know what works for each solution area so that the customer doesn’t assume that what is done in the civilian market is directly translatable to the military. Mikael Grev, Avioniq's CEO, delves into some of the challenges of incorporating AI into the military space, and the novel approach Avioniq is taking to harness its potential for improving air combat operations.

AI is a technology that’s on fire across most civilian tech sectors, and it’s here to stay. It’s an excellent tool in the toolbox for solving certain types of problems, though perhaps not the universal solution for all technical challenges that it’s sometimes marketed as, often by those selling the service. Comprehensive AI works well when the results can be a bit fuzzy, and when accuracy, repeatability, and understanding of underlying motives are not primary requirements.

At Avioniq, we have been using a variant of AI for the complex and sensitive military environment for many years, where humans will need to remain “in the loop” for the foreseeable future. It's crucial for them to understand what's happening, as it often involves human lives and tough decisions between various alternatives. But let’s first clear up why AI can be problematic and why civilian AI solutions can't simply be copied over to military applications.

The type of AI that has garnered the most interest and admiration recently is generative AI. ChatGPT and the image and video creation services that enchant us with one fantastic result after another are of this type. Generative AI essentially involves feeding in data, such as a text query, an audio file, an image, or missile parameters, and receiving data in response, such as text, audio, an image, or a series of maneuvers in air combat.

A characteristic of generative AI is that the number of distinct combinations of inputs and outputs is virtually infinite. When you generate an image from text, it will be different every time, even though it is always a response to the text. This makes it challenging to verify that the result is correct, or even good, since these AI models rarely have a reference point for what the correct answer actually is. To know whether the answer from generative AI is good, a domain expert must evaluate it, which is very resource-intensive. In the case of ChatGPT, a large number of people (thousands), acting as domain experts for everyday text, assess the quality of the answers. A separate AI is even trained solely to evaluate the quality of responses from the main AI, using these human assessments.

In fields like air combat, where it is unclear what constitutes a good response and the answer can vary each time, this becomes a problem: pilots are both in short supply and rarely prioritized as domain experts for AI training.

Companies specializing in AI often market it as if AI is the solution to the entire problem, hence the term "comprehensive". Generative AI answers questions where the entire domain problem is contained within the question itself, as is the case for ChatGPT and for image and audio generation. In these instances, it works excellently to have AI experts solve the problem solely by evaluating whether the answer is more or less correct in relation to the question. But it becomes trickier when the solution is part of a larger context: when the question carries implicit information that is not included as input data, when the boundaries of what the AI should deliver are harder to define, and when some information and conditions live in other systems.

For instance, it is relatively simple to create AI that is incredibly efficient and almost unbeatable in air combat if the system can do as it pleases and there are only known, clear rules. But it quickly becomes very complex, and requires significant domain knowledge from pilots, if you want to create AI that conducts air combat in collaboration with an operator who may have conditions the AI is not aware of. A simpler example: it is relatively easy to create AI that plays chess incredibly well, but challenging to create AI that plays equally well in cooperation with a human. The root of both problems lies in the boundary where humans and machines must understand each other to work together. Today's AI cannot, and except in specific cases probably never will be able to, effectively explain why it acts as it does, or let the operator adjust priorities and conditions that were never programmed in.

AI companies' solution to the above is to make more information available for AI training, simply because that is the only thing that can be done without the challenging boundary definitions, which require domain experts who are often not available. In some domains this works, such as sensor and image analysis, which have a natural boundary against humans and other systems. For other areas, like air combat, where the process is a continuous trade-off between different solutions, it becomes significantly more difficult.

To replace a fighter pilot, even during certain combat sequences, the AI would need to be fed virtually the entire pilot training (as has been suggested...), risk assessments (difficult to quantify), and all documents involved in the operation (constantly changing) in order to generate a solution that doesn’t require human understanding. Therefore, for the foreseeable future, the operator in the loop will need to understand, and continuously be able to correct, what is happening, using their experience and all the “soft” information that changes during a mission.

Above are two of the many characteristics of AI that become problematic in certain military contexts. The consequences of ChatGPT generating an incorrect answer are manageable, but for military products we naturally place higher demands. This is especially true in the air arena, with its rapid processes that do not allow for manual verification of the AI’s answers, and where mistakes often have significant consequences. This does not mean we cannot use AI for the military air arena; it just means meeting higher demands and managing these problems early in development. It is easy to be seduced by the possibilities of AI and extrapolate it too far.

At Avioniq, we have been using AI for air combat since 2016, but from the start we have been aware of its limitations in air combat and have avoided building ourselves into the problems described above. Let’s take them one at a time, as the solutions are of entirely different natures.

To address the problem of AI-generated answers being difficult to verify, given the many possibilities of what is correct and the complex solution space, we have chosen to use something we call verifiable AI for these solution areas. This means we create smaller, simpler AI where the input and output data are well-defined and directly verifiable with a simulation of the entire process. These AI models still require enormous computing power to create, with billions of simulations per AI, but the structure is optimized for clarity and verifiability.

Instead of creating AI that provides the operator with a heading or flight path to win the battle, which is a poorly defined problem as it involves so many parts of “win the battle,” we ask simpler questions with clearer and well-defined answers. For example: “How many g's (how strongly) must the aircraft turn to avoid the incoming missile if it is of type AA 10-C?” The difference is that here it’s possible to simulate the exact answer, provided there’s a model of the missile, and verify that AI delivers the correct answer, which is a simple number between one and nine. Technically, one can conduct an exhaustive statistical verification of the results for all combinations of input parameters using deterministic simulations.
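As a rough illustration of the idea, and not Avioniq’s actual implementation, the sketch below checks a small “required g-load” model against a deterministic evasion simulation over a grid of input combinations. Every function, parameter range, and number in it is hypothetical.

```python
# Hypothetical sketch: verifying a small "required g-load" model against a
# deterministic simulation. Functions, parameter ranges, and numbers are
# illustrative stand-ins, not real missile data or Avioniq's models.
import itertools
import numpy as np

def simulate_required_g(range_km, closing_speed_ms, altitude_m, aspect_deg):
    """Stand-in for a deterministic simulation that returns the minimum
    g-load needed to defeat the incoming missile (clamped to 1-9 g)."""
    # Placeholder physics: the real answer would come from a full simulation.
    raw = 1.0 + 8.0 * np.exp(-range_km / 20.0) * (closing_speed_ms / 1000.0)
    return float(np.clip(raw, 1.0, 9.0))

def model_required_g(range_km, closing_speed_ms, altitude_m, aspect_deg):
    """Stand-in for the trained model being verified (e.g. a small network)."""
    raw = 1.0 + 8.0 * np.exp(-range_km / 20.0) * (closing_speed_ms / 1000.0)
    return float(np.clip(raw + 0.02, 1.0, 9.0))  # small deliberate model error

# Grid over the well-defined input space; every combination is checked.
grid = itertools.product(
    np.linspace(5, 80, 16),      # range to missile [km]
    np.linspace(200, 1400, 13),  # closing speed [m/s]
    np.linspace(500, 12000, 8),  # own altitude [m]
    np.linspace(0, 180, 7),      # target aspect [deg]
)

errors = [abs(model_required_g(*x) - simulate_required_g(*x)) for x in grid]
print(f"cases: {len(errors)}, max error: {max(errors):.3f} g, "
      f"mean error: {sum(errors) / len(errors):.3f} g")
```

The point of the structure is that, because the question is narrow and the simulator is deterministic, verification can be made exhaustive (or statistically exhaustive) rather than relying on expert judgment of each individual answer.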

Verifiable AI doesn’t deliver the entire answer the way a comprehensive AI does, which is the point, since it also solves the other problem: the boundary between operator and AI. Instead of the AI delivering the entire answer to the problem, which the operator finds difficult to understand and thus to modify or quickly approve, verifiable AI delivers answers to the exact questions the operator has. Knowing the minimum g-load the aircraft needs to turn with is an important parameter in air combat, as the pilot does not want to turn too little (getting hit) or too much (becoming passive and losing speed).

To give the operator the same help that comprehensive AI can deliver, we produce several verifiable AI models that are visualized for the operator in different ways depending on the platform and customer. For example: “When can I fire at the earliest and still hit, even if the opponent turns away, but before the opponent can drop a bomb on my protected target?” These AI models can then be combined and integrated into combat aircraft, air defense systems, and C2 systems without replacing the human with a machine. This allows iterative development over time, without the big-bang problem that comprehensive AI entails.
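To make that composition concrete, here is a minimal, purely illustrative sketch of how several narrow, separately verified queries could be bundled into cues for a platform-specific display layer. All function names and values are hypothetical, not Avioniq’s products.

```python
# Purely illustrative sketch: several narrow, separately verified query models
# combined into one set of operator cues. All names and numbers are hypothetical.
from dataclasses import dataclass

@dataclass
class OperatorCues:
    required_g_to_evade: float         # turn needed to defeat the incoming missile
    earliest_effective_fire_km: float  # can fire and still hit even if the target turns away
    time_to_defend_s: float            # time left before a defensive maneuver is required

# Stand-ins for small models that would each be verified against simulation.
def query_evasion_g(situation: dict) -> float:
    return 6.5

def query_launch_range_km(situation: dict) -> float:
    return 42.0

def query_defend_timer_s(situation: dict) -> float:
    return 18.0

def build_cues(situation: dict) -> OperatorCues:
    """Combine the individual answers into cues a display layer can visualize."""
    return OperatorCues(
        required_g_to_evade=query_evasion_g(situation),
        earliest_effective_fire_km=query_launch_range_km(situation),
        time_to_defend_s=query_defend_timer_s(situation),
    )

print(build_cues({"threat": "example"}))
```

Each query answers one well-defined question, so each can be verified on its own while the operator keeps the overall decision.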

At Avioniq, we say that we augment, not replace, the operator by providing answers to the relevant questions that form the basis for decision making. The operator’s decisions move to a higher abstraction level, where humans still excel, and they are relieved of the cognitively demanding parts of air combat that a machine performs better and faster. Which verifiable AI models solve the operator’s problem is something only a domain expert, such as a fighter pilot or air battle manager with a technical background, can figure out. That’s why we believe companies that combine AI engineers with domain experts are the optimal, and perhaps only, way forward for complex military domains that want to harness AI.

AI in the military is advancing, of that there is no doubt, but it is crucial to know what works for each solution area so that the customer doesn’t assume that what is done in the civilian market is directly translatable to the military. The future for technically inclined domain experts is bright; they will be the bottleneck for developing new AI-based systems.

AI is not magic where generalists solve the problems, but a tool for solving specific problems with the help of domain experts. Together, engineers and domain experts can build systems that far exceed the operational effectiveness of today’s systems. Our tests, and independent government tests, show a several-hundred-percent increase in air combat effectiveness simply from introducing the right verifiable AI. When one aircraft becomes worth three, that translates into billions in increased customer value per air force, so there is no doubt that software has a bright future in military systems.

As always, when there is a technological leap, those who act late or not at all will fall hopelessly behind, while those who act and act correctly can dominate the market for a long time to come. You all know which companies are used as examples of not acting.

