Who will guard against the risks of AI? Turing Award winner: Still AI.
Yoshua Bengio, known as one of the "godfathers" of modern artificial intelligence and a Turing Award laureate, is throwing his full support behind a UK government-funded project to embed safety mechanisms into AI systems.
The project, named "Safeguarded AI," aims to build an AI system capable of checking the safety of other AI systems deployed in critical areas. Bengio will join the project as scientific director, providing key guidance and scientific advice. The project will receive 59 million pounds in funding over the next four years from the UK Advanced Research and Invention Agency (ARIA). Established in January last year, the agency aims to invest in potentially transformative scientific research.
The goal of "Safeguarded AI" is to build AI systems that can provide quantitative guarantees, such as risk scores, about their impact on the real world. According to David Dalrymple, who leads the "Safeguarded AI" program at ARIA, the idea is to use mathematical analysis to complement manual testing when evaluating the potential hazards of new systems.
The project hopes to build AI safety mechanisms by combining scientific models, which are essentially simulations of the world, with mathematical proofs. These proofs would include explanations of the AI's work, and humans would be tasked with verifying that the AI model's safety checks are correct.

Bengio said he wants to ensure that future AI systems do not cause serious harm. "We are rushing into a fog, and behind the fog there could be a cliff," he said. "We don't know how far away the cliff is, or even whether there is one. It could be a matter of years or decades, and we don't know how severe it could be... We need to develop tools to clear this fog and make sure we don't rush over the cliff."
"Technology companies currently cannot mathematically ensure that AI systems will operate as intended," he added, "This unreliability could lead to catastrophic consequences."
Dalrymple and Bengio argue that the techniques currently used to mitigate the risks of advanced AI systems, such as red-teaming, in which people probe an AI system for flaws, have serious limitations and cannot be relied upon to ensure that critical systems do not go off course.
Instead, they hope the program will provide new ways to guarantee the safety of AI systems, relying less on human effort and more on mathematical certainty. Their vision is to build a "gatekeeper" AI responsible for understanding and reducing the safety risks of other AI agents. This gatekeeper would ensure that AI agents operating in high-stakes areas, such as transportation or energy systems, behave as intended. Dalrymple said the idea is to work with companies as early as possible to understand how AI safety mechanisms can be applied across different industries.
Bengio believes that the complexity of advanced systems leaves us with no choice but to use AI to ensure the safety of AI. "This is the only way, because at a certain point these AIs become too complex, and even the AI we have now cannot really break an answer down into a series of reasoning steps that humans can follow," he said.

The next step is to actually build models that can inspect other AI systems, which is where "Safeguarded AI" and ARIA hope to change the current state of the AI industry.
ARIA is also offering funding to individuals and organizations in high-risk sectors such as transportation, telecommunications, supply chains, and medical research to help them develop applications that could benefit from AI safety mechanisms. The funding totals 5.4 million pounds in the first year and a further 8.2 million pounds the following year; the application deadline is October 2.
The agency is also looking for people interested in building the "Safeguarded AI" safety mechanisms through a non-profit organization. ARIA is expected to invest up to 18 million pounds to establish such an organization and will accept funding applications early next year.
Dalrymple said the program is seeking proposals to launch a non-profit organization with a diverse board of directors spanning many different industries, so that the work can be carried out in a reliable and trustworthy way. This is similar to OpenAI's original mission before it shifted its strategy toward products and profits.
The organization's board of directors will not only oversee the CEO but will also take part in deciding whether to undertake certain research projects and whether to publish specific papers and APIs, he added.

The "Safeguarded AI" project is part of the United Kingdom's mission to position itself as a "pioneer in AI safety." In November 2023, the country hosted the first AI Safety Summit, bringing together world leaders and technology experts to discuss how to build the technology safely.
Although the funding program favors UK-based applicants, ARIA is looking globally for talent who might be interested in coming to the UK, according to Dalrymple. In addition, ARIA has an intellectual-property mechanism for funding for-profit companies overseas that allows royalties to flow back to the UK.
Bengio led the "International Scientific Report on the Safety of Advanced AI," which involved 30 countries as well as the European Union and the United Nations. He said he was drawn to the project because it promotes international cooperation on AI safety. An active advocate for AI safety, he has also joined an influential lobbying group warning of the existential risks that superintelligent AI could pose.
"We need to expand the discussion on how to deal with AI risks to a broader range of global participants," said Bengio. "This project brings us closer to this goal."