In 2019, two Israeli cybersecurity researchers demonstrated the fragility of hospital IT systems. Using malware, the hackers accessed almost 70 lung scans and “injected” tumors, causing medical staff to diagnose cancers incorrectly.
This attack highlights the general need to develop digital forensic units, and more specifically, to strengthen hospital network cybersecurity. It is vital to be able to protect health data, detect when images (especially medical ones) have been falsified, and secure AI decision support tools used by hospital staff.
This is why the Cybaile industrial chair was created, with the involvement of IMT Atlantique. The project focuses on federated machine learning and aims to develop robust and secure algorithms for processing sensitive medical data, distributed across different sites. It also aims to address intellectual property protection for AI models and uses various experimental platforms to validate the solutions put forward.
Combining medical expertise and cybersecurity
It all began in 2022, when the “Cyber Health” team was created at the Laboratory of Medical Information Processing, LaTIM Inserm UMR1101, shared by IMT Atlantique, Université de Bretagne Occidentale (UBO) and Inserm. To support the team’s activities, the Cybaile chair was launched in 2023, dedicated to the cybersecurity of medical technologies. Its aim is to develop robust, trustworthy AI for healthcare, relying on tools that protect data once it has left its original system.
“We know that medical data is going to be shared and taken out of the original system. Our aim is to think about how we can continue to protect this data when it is no longer ‘within the walls’ of the IT system,” says Gouenou Coatrieux, cybersecurity researcher at IMT Atlantique and head of the Cyber Health team and the Cybaile Chair.
For its research, the chair benefits from solid grounding in the medical field thanks to LaTIM’s long-standing partners, the Brest University Hospital and Sophia Genetics, which offers an AI-powered medical data analysis platform, among other things. The project is supported by the Brittany region, by a partnership with AiiNTENS, a start-up specializing in clinical decision support in neuroscience and intensive care, and by Thales, which is contributing its expertise in cybersecurity research.
Federated learning to minimize data transfer
To develop intelligent healthcare decision support models, we need to feed the neural networks used for such models. To identify diseases from images, for example, neural networks need to learn how to detect these diseases and recognize their signs. This process requires very large volumes of data. In a medical context, this data is particularly sensitive, and it is essential to secure it in order to prevent it from being illegally shared.
To remedy this problem, scientists at the Cybaile Chair are using federated learning, a technique that enables several entities to collaboratively train a model, without sharing their data. Each entity uses its own data to improve the neural network model, and only updates are centralized. “This makes it possible to use large volumes of distributed medical data, while respecting people’s right to privacy,” says Gouenou Coatrieux. “However, the disadvantage is that we don’t know what exactly has gone into the model.”
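To make the idea concrete, here is a minimal sketch of federated averaging in plain Python. It is an illustration under stated assumptions, not the chair's implementation: the model is a single weight fitted to toy data, and the names `local_update`, `federated_round` and `sites` are hypothetical. Each site trains on its own samples, and only the resulting weight is sent to the server.

```python
# Minimal federated averaging sketch (all names and data are hypothetical).
# Each site fits y = w * x on its private data; only w leaves the site.

def local_update(w, data, lr=0.01, epochs=5):
    """Run a few gradient-descent steps on one site's private data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_round(global_w, sites):
    """Server step: average the sites' weights, weighted by sample count."""
    total = sum(len(data) for data in sites)
    return sum(local_update(global_w, data) * len(data) for data in sites) / total

# Three sites holding private toy samples drawn from y = 3x.
sites = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
]

w = 0.0
for _ in range(50):
    w = federated_round(w, sites)

print(round(w, 3))  # the shared weight approaches 3 without pooling any samples
```

The server never sees the `(x, y)` pairs, only the weights returned by each site, which is also why it cannot tell what exactly went into the model.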
Addressing model poisoning
In federated learning, a central server averages the models trained by the entities connected to it. It collects every update sent to it, and some of these may be malicious. Such updates can corrupt the training process or make it easier to install a backdoor, a secret way to influence the system’s behavior. The consequences can range from diagnostic errors to outright blockage of the model’s learning process.
The chair’s aim is therefore to implement defensive solutions to exclude suspicious participating entities and prevent such attacks. Scientists are developing tools to enable the central server to identify malicious updates. But, once developed, a model can still be replicated using its parameters or inputs and outputs.
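One common family of server-side defenses compares each update with the others and sets aside those that stand out. The sketch below is an assumption for illustration, not the chair's method: it flags updates that lie far from the coordinate-wise median and averages only the rest. The function names and the `tolerance` parameter are hypothetical.

```python
from statistics import median

def coordinate_median(updates):
    """Median of each coordinate across all submitted updates."""
    return [median(column) for column in zip(*updates)]

def distance(a, b):
    """Euclidean distance between two update vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def filter_updates(updates, tolerance=3.0):
    """Split update indices into (kept, flagged) by distance to the median."""
    center = coordinate_median(updates)
    dists = [distance(u, center) for u in updates]
    # Updates further than `tolerance` times the typical distance are flagged.
    limit = tolerance * max(median(dists), 1e-12)
    kept = [i for i, d in enumerate(dists) if d <= limit]
    flagged = [i for i, d in enumerate(dists) if d > limit]
    return kept, flagged

def aggregate(updates):
    """Average only the updates that were not flagged."""
    kept, flagged = filter_updates(updates)
    size = len(updates[0])
    avg = [sum(updates[i][k] for i in kept) / len(kept) for k in range(size)]
    return avg, flagged

updates = [
    [0.10, -0.20, 0.05],
    [0.12, -0.18, 0.04],
    [0.09, -0.22, 0.06],
    [0.11, -0.19, 0.05],
    [5.00, 4.00, -6.00],  # toy stand-in for a poisoned update
]
avg, flagged = aggregate(updates)
print(flagged)
```

A filter of this kind only helps while honest participants are in the majority and the malicious update is visibly different; an attacker who stays close to the honest updates would not be caught by it.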
Protecting the intellectual property of AI systems is another major challenge for the Cybaile Chair. “As we’ve seen, developing such systems is a very long and complex process: it requires expertise in data science and computing power, to which must be added issues of data access, specialized tools, medical expertise for certain pathologies, etc.,” says Gouenou Coatrieux. “These models are very expensive, so we have to protect them.” To prevent models from being copied, the chair’s scientists propose watermarking them.
Protecting data with watermarks
The technique of data watermarking has been studied for several years now by Gouenou Coatrieux at IMT Atlantique. It typically involves modifying a few pixels in an image to encode a message, such as an identifier, and is used by video on demand (VOD) services to protect videos available for subscribers only. “This technique doesn’t prevent data from being leaked, but it does make it possible to identify who retrieved and resold the data if it’s found ‘in the wild’,” says the researcher. The same principle applies to AI models: if a model is stolen, the message watermarked on it makes it possible to prove to whom it belongs.
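As a rough illustration of encoding an identifier in pixel values, the sketch below hides a 16-bit number in the least significant bit of the first 16 pixels of a grayscale image. This is a textbook simplification chosen for brevity, not the scheme studied at IMT Atlantique; the `embed` and `extract` functions and the sample values are hypothetical.

```python
def embed(pixels, identifier, n_bits=16):
    """Write the identifier's bits into the lowest bit of the first pixels."""
    if len(pixels) < n_bits:
        raise ValueError("image too small for the identifier")
    marked = list(pixels)
    for i in range(n_bits):
        bit = (identifier >> i) & 1
        marked[i] = (marked[i] & ~1) | bit  # clear the lowest bit, then set it
    return marked

def extract(pixels, n_bits=16):
    """Read the identifier back from the lowest bit of the first pixels."""
    return sum((pixels[i] & 1) << i for i in range(n_bits))

# Toy grayscale image, flattened to a list of 8-bit values.
image = [200, 13, 77, 255, 0, 128, 64, 31, 90, 91, 92, 93, 94, 95, 96, 97, 180, 181]

marked = embed(image, identifier=0xBEEF)
recovered = extract(marked)
print(hex(recovered))
```

Each pixel changes by at most one gray level, so the mark is invisible to the eye. A mark this simple would not survive compression or resizing, and anyone who knows the layout can erase it; practical schemes typically choose positions with a secret key and add redundancy.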
In concrete terms, watermarking an AI model involves either slightly changing the value of its parameters or, in a more original approach, using the model’s behavior. “A neural network provides a response for a given input, so we could imagine creating a tag that consists of a response for a particular input,” says Gouenou Coatrieux.
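The behavior-based idea in the quote can be sketched as follows. This is a toy illustration under strong assumptions: a real watermark would be trained into the network's weights, whereas here a lookup table stands in for that training step. The names `make_watermarked_model`, `verify_ownership`, `trigger_set` and the `threshold` value are all hypothetical.

```python
def base_model(x):
    """Toy stand-in for a trained classifier."""
    return "healthy" if sum(x) < 10 else "abnormal"

def make_watermarked_model(model, trigger_set):
    """Return a model that gives the owner's chosen answers on secret inputs.

    Assumption: the lookup table replaces the training step that would
    normally teach these responses to the network itself.
    """
    lookup = dict(trigger_set)

    def watermarked(x):
        return lookup.get(x, model(x))

    return watermarked

def verify_ownership(model, trigger_set, threshold=0.9):
    """Claim ownership if enough secret inputs produce the expected answers."""
    hits = sum(1 for x, expected in trigger_set if model(x) == expected)
    return hits / len(trigger_set) >= threshold

# Secret inputs paired with responses the unmarked model would not give.
trigger_set = [
    ((1, 1, 1), "abnormal"),
    ((9, 9, 9), "healthy"),
    ((0, 0, 0), "abnormal"),
]

marked_model = make_watermarked_model(base_model, trigger_set)
print(verify_ownership(marked_model, trigger_set))  # suspect copy of the marked model
print(verify_ownership(base_model, trigger_set))    # unrelated model
```

The check only needs the model's answers, not its parameters, so it can be run against a suspected copy that is exposed solely through its inputs and outputs.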
Scientists are also studying the use of encryption to protect models, which would mean that processing would take place in the cloud without any knowledge of either the input data or the output result. The chair’s various tools are being gradually tested on various platforms offered by partners, including clinical data centers at Brest University Hospital and the Inserm Cloud. Federated AI models have already proved their worth in image classification, paving the way for the development of secure, robust models capable of segmenting 3D medical images.