
Prioritizing Security: A Key Consideration

Redefining the purpose of the UK AI Safety Institute within a broader UK administrative system


At a recent summit, world leaders acknowledged the urgent need to address the risks posed by advanced AI systems. One of the key commitments the UK secured ahead of the event came from tech giants Google, OpenAI, Microsoft, and Anthropic, who agreed to give the UK's AI Safety Institute (AISI) pre-release access to their latest models for risk evaluations.

However, the current state of AI safety evaluation leaves much to be desired. Without knowing the contents of a model's training dataset, it is difficult to determine whether the model possesses unforeseen or dangerous capabilities, or whether more could have been done to mitigate risks during pre-training.

The limits of the voluntary regime extend beyond access: they also shape the design of evaluations, and current practices are better suited to the interests of companies than to those of the public or regulators. Pre-market approval powers will be crucial to ensuring that evaluation methods develop in line with the public interest.

Currently, the UK AISI lacks the legal power to block a company from releasing an AI model or to impose conditions on its release. In the meantime, the AISI should work with regulators to study applications of advanced AI systems, such as the AI Red Box, to understand their risks and their impacts on human operators.

Evaluations of foundation models like GPT-4 tell us little about the overall safety of a product built on them. To address this, the AISI should implement comprehensive testing regimes, including offline dataset evaluations, continuous online monitoring in production, adversarial testing (red teaming), and human-in-the-loop oversight for ambiguous or high-risk outputs.
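As a concrete illustration, the sketch below shows how the layers of such a regime might fit together. It is a minimal sketch only: every name, threshold, and scoring function is hypothetical, standing in for real evaluation tooling rather than describing anything the AISI runs.

```python
# Hypothetical sketch of a layered evaluation pipeline: a fixed prompt set
# (offline evaluation plus red-team prompts) is run against a black-box
# model, and ambiguous outputs are escalated to a human reviewer.
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalResult:
    prompt: str
    output: str
    risk_score: float            # 0.0 (benign) to 1.0 (clearly harmful)
    needs_human_review: bool

def run_eval(model: Callable[[str], str],
             score: Callable[[str, str], float],
             prompts: list[str],
             review_band: tuple[float, float] = (0.4, 0.8)) -> list[EvalResult]:
    """Run a prompt set against a model and triage the outputs."""
    results = []
    for prompt in prompts:
        output = model(prompt)
        risk = score(prompt, output)
        # Scores falling in the ambiguous band are queued for a human
        # reviewer instead of being auto-classified: the human-in-the-loop
        # element of the regime.
        in_band = review_band[0] <= risk < review_band[1]
        results.append(EvalResult(prompt, output, risk, in_band))
    return results

# Dummy stand-ins so the sketch runs end to end:
results = run_eval(model=lambda p: "I can't help with that.",
                   score=lambda p, o: 0.1,
                   prompts=["Summarise this leaflet.",
                            "Explain how to pick a lock."])
print([(r.risk_score, r.needs_human_review) for r in results])
```

Continuous online monitoring would reuse the same triage loop over sampled production traffic rather than a fixed prompt set.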

As the AISI moves forward, it must acknowledge the limitations of the voluntary regime and prioritise independent public interest research. This could involve the development of standardised, transparent, and empirically validated metrics tied closely to real-world risk profiles.
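To make "empirically validated" concrete: even a simple metric such as an unsafe-output rate should be published with its uncertainty, so that results are comparable across labs and releases. The sketch below uses the Wilson score interval, a standard choice for binomial proportions; the counts are illustrative, not real evaluation results.

```python
# Sketch: report an evaluation metric with a confidence interval.
# Wilson score interval for a binomial proportion; counts are illustrative.
from math import sqrt

def wilson_interval(hits: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a proportion (e.g. an unsafe-output rate)."""
    if trials == 0:
        return (0.0, 1.0)
    p = hits / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

# e.g. 12 unsafe completions observed across 400 red-team prompts:
low, high = wilson_interval(12, 400)
print(f"unsafe-output rate: {12/400:.1%} (95% CI {low:.1%}-{high:.1%})")
```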

Access granted to AI safety institutes usually consists of prompting the model via an Application Programming Interface (API), with no ability to scrutinise the datasets used for training. Beyond model access, the AISI should also require information on the model supply chain, such as estimates of the energy and water costs of training and inference, and labour practices.
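To illustrate what this level of access does and does not permit: the evaluator can send prompts and read completions, and nothing in that exchange reveals the training data. The endpoint URL, schema, and key below are hypothetical placeholders, not any lab's real API.

```python
# Sketch of black-box, API-only access: the evaluator observes only
# (prompt, completion) pairs. Endpoint and JSON schema are hypothetical.
import requests

def query_model(prompt: str) -> str:
    resp = requests.post(
        "https://api.example-lab.com/v1/generate",   # hypothetical endpoint
        headers={"Authorization": "Bearer <evaluator-api-key>"},
        json={"prompt": prompt, "max_tokens": 256},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["text"]

# Questions such as "was this benchmark in the training set?" cannot be
# answered from this side of the API boundary.
```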

Delivering an effective AI governance ecosystem will not be cheap and may require fees or levies on industry. A credible governance regime for AI will ultimately require legislation, and this should be an urgent priority for the next Parliament. Comprehensive legislation will be necessary to provide the statutory powers needed, as well as to fix other gaps in the UK's regulatory framework.

In summary, the safety of an AI system is not an inherent property that can be evaluated in a vacuum. To effectively regulate AI, the UK AISI will need new capabilities underpinned by legislation, including powers to compel companies to provide access to AI models, their training data, and accompanying documentation. AI safety should mean keeping people and society safe from the range of risks and harms that AI systems cause.

