
The Friendly Hackers team from Thales, a world leader in data protection and cybersecurity, has won the CAID challenge organised by the French Ministry of Defence during the fifth edition of European Cyber Week in France (November 21 - 23, 2023).
The challenge, the first of its kind to be organised by the French Ministry of Defence, was designed to evaluate the extent to which teams of hackers could exploit certain intrinsic vulnerabilities of AI models.
Thales's work on AI security and trust is aligned with the requirements of both the defence community and civilian organisations such as critical infrastructure providers, which all face the same challenges of protecting their training datasets and intellectual property, and guaranteeing that AI-generated results can be trusted for critical decision-making.
Rodolphe Lampe, Senior Data Scientist in the Thales team, with Alice Héliou, Vincent Thouvenot, Cong-Bang Huynh and Baptiste Morisse

The French Ministry of Defence's AI security challenge
Participants in the CAID challenge had to perform two tasks:
1. In a given set of images, determine which images were used to train the AI algorithm and which were used for testing.
An AI-based image recognition application learns from large numbers of training images. By studying the inner workings of the AI model, Thales's Friendly Hackers team successfully determined some of the images that had been used to create the application, gaining valuable information about the training methods used and the quality of the model.
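This first task is a form of membership inference attack. The sketch below is illustrative only, not Thales's actual technique: it exploits the fact that models are often more confident on images they were trained on, flagging samples whose true-class confidence exceeds a chosen threshold as likely training members.

```python
import numpy as np

def membership_scores(model_probs, labels):
    # Confidence the model assigns to each sample's true class
    return model_probs[np.arange(len(labels)), labels]

def infer_membership(model_probs, labels, threshold=0.9):
    """Flag samples as likely training members when the model's
    confidence on the true label exceeds the threshold."""
    return membership_scores(model_probs, labels) >= threshold

# Toy predictions from a hypothetical 3-class model:
probs = np.array([
    [0.98, 0.01, 0.01],   # overconfident -> probably seen in training
    [0.40, 0.35, 0.25],   # uncertain -> probably an unseen test image
])
labels = np.array([0, 0])
print(infer_membership(probs, labels))  # [ True False]
```

Real attacks of this kind calibrate the threshold with shadow models trained on similar data, but the underlying signal is the same confidence gap.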
2. Find the images of aircraft used by an AI algorithm that had been protected using unlearning techniques.
An unlearning technique deletes data that was used to train a model, such as images, in order to preserve its confidentiality. This technique can be used, for example, to protect the sovereignty of an algorithm in the event of its export, theft or loss. A drone equipped with AI, for instance, must be able to recognise any enemy aircraft as a potential threat. Aircraft from its own forces, on the other hand, would first have to be learned so they could be identified as friendly, and then erased using an unlearning technique. In this way, even if the drone were stolen or lost, the sensitive aircraft data contained in the AI model could not be extracted for malicious purposes. However, the Friendly Hackers team from Thales managed to re-identify the data that was supposed to have been erased from the model, thereby defeating the unlearning process.
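One simple way to probe whether unlearning actually removed a sample's influence, sketched below with illustrative numbers rather than any data from the challenge, is to compare the model's loss on the supposedly erased samples with its loss on comparable genuinely unseen samples: if the "forgotten" samples still get markedly lower loss, traces of them remain in the model.

```python
import numpy as np

def residual_knowledge_score(loss_on_erased, loss_on_fresh):
    """Gap between mean loss on unseen samples and mean loss on
    'erased' samples. A large positive gap suggests the model still
    remembers the erased data, i.e. unlearning was incomplete."""
    return float(np.mean(loss_on_fresh) - np.mean(loss_on_erased))

# Hypothetical per-image losses from a supposedly unlearned model:
erased = np.array([0.3, 0.4, 0.2])   # images that should be forgotten
fresh  = np.array([1.2, 1.0, 1.4])   # genuinely unseen images
score = residual_knowledge_score(erased, fresh)
print(score > 0.5)  # large gap -> erased images are still recognisable
```

A perfectly unlearned model would treat erased and fresh samples indistinguishably, driving this gap towards zero.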
Exercises like this help to assess the vulnerability of training data and trained models, which are valuable tools and can deliver outstanding performance but also represent new attack vectors for the armed forces. An attack on training data or trained models could have significant consequences in a military context, where this type of information could give an adversary the upper hand. Risks include model theft, theft of the data used to recognise military hardware or other features in a theatre of operations, and backdoors to impair the operation of the system using the AI. While AI in general, and generative AI in particular, offers significant operational benefits and provides military personnel with intensively trained decision support tools to reduce their cognitive burden, the national defence community needs to address new threats to this technology as a matter of priority.
The Thales BattleBox approach to tackle AI vulnerabilities
The protection of training data and trained models is critical in the defence sector. AI cybersecurity is becoming increasingly crucial, and must evolve autonomously to thwart the many new attack opportunities that the world of AI is opening up to malicious actors. To respond to the risks and threats involved in the use of artificial intelligence, Thales has developed a set of countermeasures called the BattleBox to provide enhanced protection against potential breaches.
BattleBox Training provides protection from training-data poisoning, preventing hackers from introducing a backdoor.
BattleBox IP digitally watermarks the AI model to guarantee authenticity and reliability.
BattleBox Evade aims to protect models from prompt injection attacks, which can manipulate prompts to bypass the safety measures of chatbots using Large Language Models (LLMs), and to counter adversarial attacks on images, such as adding a patch to deceive the detection process in a classification model.
BattleBox Privacy provides a framework for training machine learning algorithms, using advanced cryptography and secure secret-sharing protocols to guarantee high levels of confidentiality.
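The kind of secret-sharing protocol that confidential training frameworks rely on can be illustrated, in a deliberately simplified form that is not Thales's implementation, with additive secret sharing: a value is split into random-looking shares so that no subset of parties learns anything, yet the parties can still sum shared values without revealing them.

```python
import secrets

PRIME = 2**61 - 1  # all arithmetic happens modulo a public prime

def share(value, n_parties):
    """Split `value` into n additive shares mod PRIME.
    Any n-1 shares together reveal nothing about the value."""
    parts = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    parts.append((value - sum(parts)) % PRIME)
    return parts

def reconstruct(parts):
    """Only the sum of *all* shares recovers the secret."""
    return sum(parts) % PRIME

print(reconstruct(share(42, 3)))  # -> 42

# Additive homomorphism: parties sum their shares locally, so an
# aggregate (e.g. a model update) is computed without anyone ever
# seeing the individual contributions.
a, b = share(10, 3), share(32, 3)
summed = [(x + y) % PRIME for x, y in zip(a, b)]
print(reconstruct(summed))  # -> 42
```

Production systems combine such protocols with further cryptographic machinery (e.g. for multiplication and verification), but the local-aggregation idea shown here is the core of privacy-preserving training.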
For attacks like those featured in the CAID challenge tasks, countermeasures such as encryption of the AI model are among the solutions that could be implemented.
"AI provides considerable operational benefits, but it requires high levels of security and cybersecurity protection to prevent data breaches and misuse. Thales implements a large range of AI-based solutions for all types of civil and military use cases. Intended to be explainable, embeddable and integrated within robust critical systems, they are also designed to be sovereign, frugal and reliable thanks to advanced methods and tools used for qualification and validation. Thales has the dual AI and line-of-business expertise needed to incorporate these solutions into its systems to significantly improve their operational capabilities," said David Sadek, Thales VP Research, Technology & Innovation in charge of Artificial Intelligence.
Thales and AI
As the Group's defence and security businesses address critical requirements, often with safety-of-life implications, Thales has developed an ethical and scientific framework for the development of trusted AI based on the four strategic pillars of validity, security, explainability and responsibility.