Dive Brief:
- Companies designing AI systems should protect training data from tampering and strictly limit access to the systems’ underlying infrastructure, the U.S. and three allies said in a joint guidance document published on Thursday.
- The AI security guidance addresses multiple topics, including protecting data throughout an AI system’s life cycle, securing supply chains and mitigating possible attacks on large data sets.
- The multilateral warning reflects concerns in the U.S. and allied nations that vulnerabilities in powerful AI models could ripple across critical infrastructure.
Dive Insight:
The FBI, the Cybersecurity and Infrastructure Security Agency and the National Security Agency collaborated with cybersecurity agencies from Australia, New Zealand and the U.K. to produce a best-practices document for secure AI development.
“The principles outlined in this information sheet provide a robust foundation for securing AI data and ensuring the reliability and accuracy of AI-driven outcomes,” the countries said.
The advisory arrives at a key moment, as companies rush to integrate AI into their operations, sometimes with little forethought or oversight. Western governments have grown increasingly concerned that Russia, China and other adversaries will find unforeseen ways to exploit AI vulnerabilities.
These risks have only increased as critical infrastructure operators build AI into operational technology that controls essential elements of daily life, from power to water to health care.
With data constituting the foundational element of AI systems, the document addresses ways to protect information at different stages of the AI life cycle, including planning, data collection, model development, deployment and operations. It encourages the use of digital signatures that authenticate modifications, trusted infrastructure that prevents unauthorized access and ongoing risk assessments that can identify emerging concerns.
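To illustrate the signing approach the document describes, here is a minimal Python sketch using the third-party cryptography package; the file name and key handling are hypothetical, and the guidance itself does not prescribe a specific implementation:

```python
# Minimal sketch of signing a data set so consumers can detect tampering.
# Assumes the third-party "cryptography" package; the file name and key
# storage are illustrative, not part of the joint guidance.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

with open("training_data.csv", "rb") as f:
    data = f.read()
signature = private_key.sign(data)  # distribute alongside the data set

# Before training, verify the data set against the published signature.
try:
    public_key.verify(signature, data)
    print("data set signature verified")
except InvalidSignature:
    print("data set was modified after signing")
```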
The guidance focuses on ways to prevent data quality issues, whether accidental or deliberate, from undermining the safety and reliability of AI models. Cryptographic hashes can ensure that raw data is not modified after being incorporated into a model, according to the document, and regular curation can weed out problems with data sets found on the web. The guidance also recommends the use of anomaly detection algorithms that can “remove malicious or suspicious data points before training.”
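A minimal Python sketch of both ideas, with a SHA-256 check standing in for the hashing step and a simple z-score filter standing in for the anomaly detection the guidance mentions (the file names and threshold are assumptions, and production pipelines would use more robust detectors):

```python
import hashlib
import numpy as np

# Integrity: record a hash when data is ingested, recompute before training.
def file_sha256(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

recorded = file_sha256("training_data.csv")   # stored at ingestion time
assert file_sha256("training_data.csv") == recorded, "data set changed"

# Curation: drop points whose z-score exceeds a threshold on any feature.
# A stand-in for real anomaly detection; it only shows the shape of the step.
def drop_outliers(x: np.ndarray, threshold: float = 4.0) -> np.ndarray:
    z = np.abs((x - x.mean(axis=0)) / x.std(axis=0))
    return x[(z < threshold).all(axis=1)]

features = np.loadtxt("features.csv", delimiter=",")
clean_features = drop_outliers(features)
```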
The joint guidance also addresses concerns such as statistical bias, inaccurate information, duplicate records and “data drift,” the gradual shift in the statistical properties of input data over time.
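Drift is typically caught by comparing live inputs against a training-time baseline. Below is a sketch using SciPy’s two-sample Kolmogorov-Smirnov test, with synthetic arrays and an illustrative alert threshold standing in for real pipeline data:

```python
import numpy as np
from scipy.stats import ks_2samp

# Compare one feature's training-time distribution with recent live inputs.
# Synthetic data here; in practice both samples come from the pipeline.
rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 10_000)   # stand-in for training data
live = rng.normal(0.3, 1.0, 10_000)       # stand-in for production inputs

stat, p_value = ks_2samp(baseline, live)
if p_value < 0.01:  # illustrative alert threshold
    print(f"possible data drift (KS statistic={stat:.3f}, p={p_value:.2g})")
```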