An international academic team proposes a unified directory of more than 700 risks associated with AI, particularly in business environments. This database aims to provide an overview and a common language to technical, regulatory and industrial actors confronted with these complex issues.
As production-grade AI solutions penetrate the corporate world, a detailed understanding of the risks they carry is a major governance imperative. And yet, until now, there has been no structured framework for systematically aggregating and analyzing all the risks identified in the literature. To fill this gap, a multidisciplinary team of researchers from MIT, the University of Queensland, the Université Catholique de Louvain and the Future of Life Institute has published a unified directory of more than 700 risks related to AI.
The objective is to provide a common language for all stakeholders - engineers, auditors, policy makers, legal experts - involved in the design, evaluation or regulation of AI systems. The repository is accessible online [1] and can be downloaded in Google Sheets or OneDrive format in order to facilitate its operational adoption.
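As a first step toward that operational adoption, a locally exported copy of the database can be loaded in a few lines. The minimal sketch below assumes a spreadsheet export; the file name, sheet name and expected row count are assumptions on our part, not part of the official distribution.

```python
# Minimal sketch: loading a local export of the risk database with pandas.
# The file name and sheet name below are assumptions; adjust them to match
# the Google Sheets or OneDrive copy you download and export.
import pandas as pd

risks = pd.read_excel("ai_risk_repository.xlsx", sheet_name="AI Risk Database")

print(f"{len(risks)} risks loaded")   # expected: a bit over 700 rows
print(risks.columns.tolist())         # inspect the available fields
```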
The development of this database is based on a systematic review of the literature. The researchers analyzed 43 classifications extracted from more than 17,000 documents and reports on the topic. They extracted 777 unique risks, which they then synthesized within an analytical framework structured around two main taxonomies.
The first is a causal taxonomy. It categorizes risks according to how they arise: at what point in the life cycle of an AI system do they occur? Is the root cause human or technical? Was the action that led to the risk intentional or unintentional? This approach makes it possible to model the mechanisms behind incidents and harms, whether they materialize or are narrowly avoided.
The second is a thematic taxonomy, which groups the risks into seven domains: discrimination and toxicity; privacy and security; misinformation; malicious actors and misuse; human-computer interaction; socioeconomic and environmental harms; and AI system safety, failures and limitations.
Each of these domains is then subdivided into a total of 23 subdomains, thus providing a sufficient level of granularity to identify the threats specific to particular contexts of application (health, justice, cybersecurity, etc.).
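To make the structure concrete, here is one possible way to represent a single risk entry classified along both taxonomies. This is an illustration, not the authors' own schema: the enum values and field names are assumptions inferred from the descriptions above.

```python
# Illustrative sketch of a risk entry classified along the two taxonomies.
# The enum values and field names are assumptions inferred from the article,
# not the official schema of the repository.
from dataclasses import dataclass
from enum import Enum

class Entity(Enum):          # root cause: human or technical
    HUMAN = "human"
    AI = "ai"
    OTHER = "other"

class Intent(Enum):          # was the triggering action deliberate?
    INTENTIONAL = "intentional"
    UNINTENTIONAL = "unintentional"
    OTHER = "other"

class Timing(Enum):          # where in the life cycle the risk occurs
    PRE_DEPLOYMENT = "pre-deployment"
    POST_DEPLOYMENT = "post-deployment"
    OTHER = "other"

@dataclass
class RiskEntry:
    description: str
    entity: Entity           # causal taxonomy
    intent: Intent
    timing: Timing
    domain: str              # one of the seven thematic domains
    subdomain: str           # one of the 23 subdomains

example = RiskEntry(
    description="Model produces discriminatory outputs in a hiring tool",
    entity=Entity.AI,
    intent=Intent.UNINTENTIONAL,
    timing=Timing.POST_DEPLOYMENT,
    domain="Discrimination & toxicity",
    subdomain="Unfair discrimination and misrepresentation",
)
```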
Although their database is the most detailed to date, the authors emphasize that it has limitations inherent to any classification attempt. Its coverage remains partial relative to the full body of existing work. In addition, the framework tends to standardize risks that are sometimes quite heterogeneous, at the risk of not fully reflecting the complexity of certain specialized use cases.
Another limitation is that this classification reflects the state of knowledge at the time of its design. It does not take into account emerging risks, nor those that remain confidential or not publicly documented. As the analyzed corpus is mainly anglophone and drawn from academic and institutional publications, the team acknowledges that certain methodological biases are also possible. To mitigate them, the researchers recommend supplementing their classification with other existing databases, such as MITRE ATT&CK [2], a knowledge base of adversary tactics and techniques, or the AI Vulnerability Database [3], an open-source database dedicated to known AI vulnerabilities.
By analyzing the data collected, the researchers identified several significant trends, which they published in a preprint article [4].
First observation: 51% of the risks are caused directly by the AI systems themselves, compared to 34% of human origin. This figure highlights the need for in-depth technical audits in addition to the identification of human errors or intentional user behavior.
Second observation: the majority of risks (65%) appear after the system has been deployed, often in real-life interaction contexts, while only 10% are revealed during the design or training phases. This asymmetry calls for stronger post-deployment monitoring and production response capabilities. Hence, incidentally, the importance of application dashboards such as those that GenerIA delivers with each of its solutions.
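For readers who want to check these proportions on their own copy of the database, a sketch along the following lines should work. The file name, sheet name and column names ("Entity", "Timing") are assumptions; verify them against the headers of the copy you actually download.

```python
# Sketch: recomputing the causal-taxonomy distributions on a local export
# of the repository. File, sheet and column names are assumptions; adjust
# them to match the downloaded copy.
import pandas as pd

risks = pd.read_excel("ai_risk_repository.xlsx", sheet_name="AI Risk Database")

# Share of risks attributed to the AI system vs. humans (reported: ~51% vs ~34%)
print(risks["Entity"].value_counts(normalize=True).round(2))

# Share of risks arising before vs. after deployment (reported: ~65% post, ~10% pre)
print(risks["Timing"].value_counts(normalize=True).round(2))
```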
Thematically, three domains concentrate most of the attention in the literature: security flaws in AI systems, socioeconomic and environmental impacts, and biases related to discrimination or model toxicity. These areas are addressed in more than 70% of the publications analyzed.
Conversely, the risks associated with human-machine interaction or misinformation are studied less systematically (fewer than 50% of the documents). Some issues, such as the emergence of objectives not aligned with a solution's primary mission or the limited robustness of models, are particularly well documented. Others, such as the degradation of the information ecosystem, the possible rights of intelligent agents or the competitive dynamics between AI producers and device manufacturers, remain marginal in the analyses.
This directory is a major step forward in structuring the debate on the governance of AI systems. By providing a unified and interoperable mapping of the risks identified to date, it gives decision-makers, auditors and developers an actionable framework for building a documented risk approach into their development, regulation or deployment policies.
More than a simple inventory, this work paves the way for a systemic approach to risk in AI, where interactions between technologies, uses and socio-technical environments can be better anticipated, analyzed and regulated. In other words: a must-read!
References
[1] What are the risks from Artificial Intelligence?
[2] MITRE ATT&CK
[3] AI Vulnerability Database (AVID)
[4] P. Slattery, A.K. Saeri, E.A.C. Grundy, J. Graham, M. Noetel, R. Uuk, J. Dao, S. Pour, S. Casper, N. Thompson, The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks From Artificial Intelligence