Industry attributes
Artificial Intelligence (AI) Safety is a research field focused on ensuring that AI is deployed in ways that do not harm humanity. The field, which includes AI alignment, has grown in importance as advances in AI and machine learning have produced more sophisticated models that are used across a widening range of applications, including areas where safety and security are critical. AI safety aims to mitigate the risks of deploying AI models by researching "safe" AI and machine learning techniques that identify the causes of unintended behavior, and by developing tools to reduce the likelihood of such behavior occurring. This side of the field concentrates on the technical work needed to ensure that AI systems operate correctly and reliably. AI safety also includes AI governance, policy, and strategy, which focus on developing legislation and agreements to prevent the malevolent use of AI.
The risks posed by AI can be grouped into several categories. The Center for AI Safety uses four main ones:
- Malicious use—People intentionally harnessing powerful AIs to cause widespread harm.
- AI race—Competition could push nations and corporations to rush AI development, relinquishing control to these systems.
- Organizational risks—There are risks that organizations developing advanced AI cause catastrophic accidents, particularly if they prioritize profits over safety.
- Rogue AIs—Losing control over AIs as they become more capable.
The rapid deployment of AI models has led to calls to regulate their use. On October 30, 2023, the US government issued the "Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence." The executive order sets out new standards for AI safety and security in the US, including the creation of the US Artificial Intelligence Safety Institute (USAISI) and a related consortium. USAISI will be established by the National Institute of Standards and Technology (NIST), which is inviting organizations to submit letters of interest describing technical expertise, products, data, and models that can support and enable safe AI systems. In November 2023, twenty-eight countries (including the US, China, and the UK) and the European Union signed the Bletchley Declaration, the first international agreement recognizing the need for AI regulation. A number of businesses are actively researching AI safety, and industry organizations have been set up to help improve the safety of AI models; these include the AI Incident Database, which collects reports of incidents involving AI systems.