In-depth analysis: Have AI giants like Google and Microsoft fulfilled their prom
A year ago, seven artificial intelligence companies, including Amazon, Microsoft, Google, Meta, OpenAI, Anthropic, and Inflection, reached eight voluntary commitments with the White House on how to develop artificial intelligence in a safe and trustworthy manner.
These commitments include improving testing and transparency of artificial intelligence systems, and sharing information about potential harms and risks.
On the first anniversary of the signing of the voluntary commitment, MIT Technology Review asked the AI companies that signed the commitment for some details of the work done so far. Their responses indicate that the tech industry has made some encouraging progress, but there are also some significant warnings.
Advertisement
These voluntary commitments were made at a time when the fervor for generative AI was "possibly reaching the most bubble," with companies competing to launch their own models and make them bigger and better than their competitors' models. At the same time, we also began to see debates around issues such as copyright and deep fakes. Influential tech figures such as Geoffrey Hinton have also raised concerns that artificial intelligence may pose an existential risk to humanity. Suddenly, everyone started talking about the urgent need to ensure the safety of artificial intelligence, and regulatory agencies everywhere are under pressure to take action.
Until recently, the development of artificial intelligence has been like the "Wild West." Traditionally, the United States has been reluctant to regulate its tech giants, instead relying on self-regulation. Voluntary commitments are a good example: this is some of the prescriptive rules the United States has for the field of artificial intelligence, but they are still voluntary and cannot be enforced. The White House then issued an executive order that expanded these commitments and applied to other technology companies and government departments."A year has passed, and we have seen some companies adopt some good practices for their products, but in terms of good governance or protecting fundamental rights, they are far from the level we need," said Merve Hickok, President and Research Director of the Center for Artificial Intelligence and Digital Policy. She reviewed the companies' responses at the request of MIT Technology Review. "Among them, many companies are still continuing to make unverified claims about their products, such as claiming that they can surpass human intelligence and capabilities," she added.
A trend that emerged in the responses from these technology companies is that companies are taking more measures to seek technical solutions, such as red teaming (humans exploring the flaws in AI models) and adding watermarks to AI-generated content.
Rishi Bommasani, head of the Center for the Study of Foundational Models at Stanford University, said it is unclear what changes these commitments have undergone, and it is also unclear whether these companies will implement these measures. He also reviewed these responses for MIT Technology Review.
For the field of artificial intelligence, a year is a long time. Since signing the voluntary commitment, Mustafa Suleyman, founder of Inflection AI, has left the company and joined Microsoft to lead work related to artificial intelligence. Inflection declined to comment.
White House spokesperson Robyn Patterson said: "We are grateful for the progress made by leading companies in fulfilling voluntary commitments beyond the requirements of the executive order. However, the President continues to call on Congress to pass bipartisan legislation on artificial intelligence."Brandie Nonnecke, the director of the CITRIS Policy Lab at the University of California, Berkeley, stated that without comprehensive federal legislation, what the United States can do now is to demand that companies fulfill these voluntary commitments.
However, it is important to remember that "these companies are essentially studying for the exams they are being given," said Brandie Nonnecke. "So we must carefully observe whether they are truly verifying themselves in a genuinely rigorous manner."
Below is our assessment of the progress these artificial intelligence companies have made in the past year.
Commitment 1: The companies commit to conducting internal and external security testing of their AI systems before their release. This testing, which will be carried out in part by independent experts, guards against some of the most significant sources of AI risks, such as biosecurity and cybersecurity, as well as its broader societal effects.Commitment 1: Conduct internal and external security testing of artificial intelligence systems before their release. A portion of this testing will be carried out by independent experts, aimed at guarding against some of the most significant sources of AI risks, such as biosafety, cybersecurity, and broader societal impacts.
All companies (excluding Inflection, which chose not to comment) stated that they have engaged in red team-blue team exercises, allowing internal and external testers to explore the vulnerabilities and risks of their models. OpenAI has indicated that it has an independent readiness team responsible for testing models for cybersecurity, chemical, biological, radiological, and nuclear threats, as well as complex AI models that can do or persuade a person to do things that could lead to harm. Anthropic and OpenAI also stated that they conduct these tests in collaboration with external experts before launching new models. For instance, for the release of its latest model Claude 3.5, Anthropic conducted pre-deployment tests with experts from the UK-based AI Safety Research Institute, and Anthropic also allowed the non-profit research institution METR to conduct a "preliminary exploration" of Claude 3.5's autonomous driving capabilities. Google stated that it also carried out internal red team-blue team exercises on its model Gemini to test the boundaries of election-related content, social risks, and national security issues. Microsoft stated that it has collaborated with third-party evaluators from NewsGuard, an organization promoting news integrity, to assess and mitigate risks of deepfake abuse in Microsoft's text-to-image tool. Meta stated that, in addition to red team-blue team exercises, it also evaluated its latest model Llama 3 to understand its performance in a range of risk areas including weapons, cyberattacks, and child exploitation.
"In terms of testing, it is not enough to simply report that companies are taking action," said Rishi Bommasani. For example, Amazon and Anthropic stated that they have partnered with the non-profit organization Thorn to address the risks AI poses to children's safety. He hopes to learn more details about how the interventions being implemented by companies are genuinely reducing these risks.
"We should clearly recognize that it is not just that companies are doing things, but that these things are having the desired effect," said Rishi Bommasani.
Outcome: Good. Promoting red team-blue team exercises and testing for various risks is an important task. However, Merve Hickok hopes to see independent researchers have broader access to the models of companies.Commitment 2: The companies commit to sharing information across the industry and with governments, civil society, and academia on managing AI risks. This includes best practices for safety, information on attempts to circumvent safeguards, and technical collaboration.
After signing the voluntary commitment, Google, Microsoft, Anthropic, and OpenAI jointly established the "Frontier Model Forum," a non-profit organization aimed at promoting discussions and actions related to AI safety and responsibility. Later, Amazon and Meta also joined.
Rishi Bommasani stated that collaborating with non-profit organizations funded by AI companies themselves may not align with the spirit of the voluntary commitment. In his view, the Frontier Model Forum could be a way for these companies to cooperate with each other and convey safety information, which is usually difficult for them as competitors.
"Even if they don't disclose information to the public, you might hope that they can at least collectively find ways to reduce risks," said Rishi Bommasani.All seven signatories are also members of the Artificial Intelligence Safety and Innovation Consortium (AISIC) established by the National Institute of Standards and Technology (NIST), which sets guidelines and standards for AI policy and AI performance assessment. It is a large consortium composed of participants from both the public and private sectors. Google, Microsoft, and OpenAI also have representatives in the United Nations High-Level Advisory Group on Artificial Intelligence.
Many companies also emphasize their research collaborations with the academic community. For example, Google is part of MLCommons, where it conducts cross-industry AI safety benchmark research with scholars. Google also stated that it actively contributes tools and resources such as computing credits to projects like the National Science Foundation's National AI Research Resource Pilot Program, which aims to democratize artificial intelligence research in the United States.
Many companies have also contributed to the "Partnership on AI," another non-profit organization co-founded by Amazon, Google, Microsoft, Facebook, DeepMind, and IBM, responsible for the deployment of foundational models.
Result: More work is still needed. As the industry works together to make AI systems safe and reliable, sharing more information is undoubtedly an important step in the right direction. However, it is not yet clear how much meaningful change the announced efforts will actually lead to, and how much is just window dressing.Commitment 3: The companies commit to investing in cybersecurity and insider threat safeguards to protect proprietary and unreleased model weights. These model weights are the most essential part of an AI system, and the companies agree that it is vital that the model weights be released only when intended and when security risks are considered.
Many companies have implemented new cybersecurity measures in the past year. For example, Microsoft launched the "Secure Future Program" to address the expanding scale of cyber attacks. Microsoft stated that its model weights are encrypted to reduce the potential risk of model theft and apply strong authentication and access control when deploying highly customized models.
Google also introduced an artificial intelligence cyber defense program. In May of this year, OpenAI shared six new measures it is developing to complement its existing cybersecurity practices, such as extending encryption protection to artificial intelligence hardware. It also has a cybersecurity funding program that allows researchers to use its models to build cybersecurity defenses.
Amazon stated that it has also taken specific measures against attacks unique to generative AI, such as "data poisoning" and "prompt injection," which may use prompts to guide language models to ignore previous instructions and security measures.After signing the voluntary commitment, Anthropic released detailed information about its protective measures a few days later, which included common cybersecurity practices such as controlling who has access to the models and model weights, as well as checking and controlling third-party supply chains. The company also worked with independent assessment agencies to evaluate whether the control measures it designed meet cybersecurity needs.
Result: Good. All companies have indicated that they have taken additional measures to protect their models, although there seems to be little consensus on the best methods for protecting AI models.
Commitment 4. The companies commit to facilitating third-party discovery and reporting of vulnerabilities in their AI systems. Some issues may persist even after an AI system is released, and a robust reporting mechanism enables them to be found and fixed quickly.
For fulfilling this commitment, one of the most popular methods is to implement a "vulnerability bounty" program, which rewards individuals who discover defects in AI systems. Companies including Google, Microsoft, Meta, Anthropic, and OpenAI have all launched such programs for AI systems. Amazon and Anthropic also stated that they have established forms on their websites where security researchers can submit vulnerability reports.In Brandie Nonnecke's view, it may take several years to figure out how to conduct third-party audits effectively. "This is not just a technical challenge, but also a socio-technical challenge. It will take us years not only to understand the technical standards of artificial intelligence but also to figure out the socio-technical standards, which are complex and difficult," she said.
Brandie Nonnecke expressed her concern that the first companies to implement third-party audits might set a bad precedent for how to think about and address the socio-technical risks of artificial intelligence. For example, audits might define, assess, and address certain risks but overlook others.
Result: More work is still needed. Bug bounties are a good approach, but they are not comprehensive enough. New laws, such as the European Union's Artificial Intelligence Act, will require technology companies to conduct audits, and it would be even better if these companies shared successful cases of such audits.
Commitment 5. The companies commit to developing robust technical mechanisms to ensure that users know when content is AI-generated, such as a watermarking system. This action enables creativity with AI to flourish while reducing the risks of fraud and deception.Many companies have already built watermark systems for content generated by artificial intelligence. For example, Google has launched a watermark tool called SynthID for images, audio, text, and videos generated by Gemini. Meta has developed an image watermark tool called "Stable Signature" and a voice watermark tool called "AudioSeal". Amazon now adds "invisible watermarks" to images generated by its Titan image generation model. OpenAI has used watermarks in its custom voice model Voice Engine and has built an image detector classifier for images generated by DALL-E 3. Anthropic is the only company that has not yet built a watermark tool, as watermarks are mainly used for images, and the company's Claude model does not support images.
All companies except Inflection, Anthropic, and Meta are also members of the "Content Source and Authenticity Alliance" (C2PA), which is an industry alliance that embeds information about the time of content creation and whether the content was created or edited by artificial intelligence or humans into image metadata. Microsoft and OpenAI automatically attach C2PA source metadata to images created with DALL-E 3 and videos created with Sora. Although Meta is not a member of the alliance, it has announced that it is using the C2PA standard to identify AI-generated images on its platform.
"The six companies that have signed a voluntary commitment naturally tend to adopt technical methods to address risks, especially watermark systems," said Rishi Bommasani.
"The question is, can 'technical solutions' make meaningful progress and address the underlying social issues that prompt us to wonder whether the content is machine-generated?" he added.
Result: Good. Overall, this is an inspiring result. Although watermark systems are still in the experimental stage and still unreliable, it is still good to see research around it and commitment to the C2PA standard. This is better than nothing, especially in a busy election year.Commitment 6. The companies commit to publicly reporting their AI systems' capabilities, limitations, and areas of appropriate and inappropriate use. This report will cover both security risks and societal risks, such as the effects on fairness and bias.
The White House's commitment leaves a lot of room for interpretation. For example, as long as companies take action in this direction, they can technically meet the requirements for such public disclosure, and the level of transparency can vary greatly.
Here, the most common solution provided by technology companies is the so-called "model cards." Although each company may call them slightly different, essentially they serve as a product description for AI models. They can cover everything from the model's capabilities and limitations (including how to measure fairness and interpretability benchmarks) to authenticity, robustness, governance, privacy, and security. Anthropic stated that it will also test whether the model has potential security issues that may arise in the future.Microsoft has released its annual "Responsible AI Transparency Report," which provides an in-depth look at how the company builds applications using generative artificial intelligence, makes decisions, and oversees the deployment of these applications. Microsoft also stated that it has clearly indicated where and how artificial intelligence is used in its products.
Results: There is still more work to be done. Merve Hickok said that improving the transparency of governance structures and financial relationships between companies will be an area that all companies need to improve on. She also hopes to see companies being more open about data sources, model training processes, security incidents, and energy usage.
Commitment 7. The companies commit to prioritizing research on the societal risks that AI systems can pose, including on avoiding harmful bias and discrimination, and protecting privacy. The track record of AI shows the insidiousness and prevalence of these dangers, and the companies commit to rolling out AI that mitigates them.
Technology companies have been busy with safety research and integrating the results into their products. Amazon has built "guardrails" for "Amazon Bedrock" that can detect hallucinations and also apply safety, privacy, and authenticity protection. Anthropic stated that the company has hired a research team focused on studying societal risks and privacy, and over the past year, the company has released research on deception, jailbreaking, discrimination reduction strategies, and emerging capabilities such as model tampering with its own code or persuasion. OpenAI stated that it has trained its models to avoid generating "hateful content" and refuses to generate hate or extremist content; it has also trained GPT-4V to refuse many requests that require answering based on stereotypes. Google DeepMind has also published research reports on assessing dangerous capabilities and conducted research on the misuse of generative artificial intelligence.All companies have invested heavily in research in this field. For example, Google has invested tens of millions of dollars to create a new artificial intelligence security fund, promoting research in this field through the Frontier Models Forum. Microsoft has stated that it has committed to providing $20 million in funding to study social risks through the National Artificial Intelligence Research Resources, and has launched an artificial intelligence model research accelerator project, namely the "Accelerate Basic Model Research" plan. The company has also hired 24 researchers focusing on artificial intelligence and sociology.
Result: Very good. This is a commitment that is easy to fulfill because the signatories are some of the world's largest and wealthiest corporate artificial intelligence research laboratories. While more research on how to ensure the safety of artificial intelligence systems is a commendable step, critics point out that the focus on safety research will divert attention and resources from artificial intelligence research that focuses on more direct harms, such as discrimination and bias.
Commitment 8. The companies commit to developing and deploying advanced AI systems to help address society's greatest challenges. From cancer prevention to mitigating climate change to so much in between, AI—if properly managed—can contribute enormously to the prosperity, equality, and security of all.
Since making this commitment, technology companies have been addressing a variety of issues. For example, Pfizer uses Claude to assess trends in cancer treatment research after collecting relevant data, while the American biopharmaceutical company Gilead uses Amazon Web Services' generative AI for feasibility assessments of clinical research and data set analysis.Google DeepMind has a strong track record in launching artificial intelligence tools that can assist scientists. For instance, AlphaFold 3 can predict the structure and interactions of nearly all molecules of life. AlphaGeometry's level of problem-solving in geometry is comparable to that of an excellent high school student. GraphCast is an AI model capable of making medium-range weather forecasts. Meanwhile, Microsoft is using satellite imagery and artificial intelligence to improve responses to wildfires on Maui in Hawaii and to map populations vulnerable to climate impacts, which helps researchers uncover risks such as food insecurity, forced migration, and disease.
At the same time, OpenAI has announced collaborations and funding for various research projects, such as projects on how educators and scientists can safely use multimodal AI models in laboratory settings. The company also provides funding to help researchers develop clean energy "hackathons" on their platform.
Result: Very good. Some of the work using artificial intelligence to facilitate scientific discovery or predict weather is indeed exciting. AI companies have not yet used artificial intelligence to prevent cancer, after all, it is quite a high threshold.
Overall, there have been some positive changes in the way artificial intelligence is built, such as red-blue confrontation, watermark systems, and new ways of sharing best practices across the industry. However, these are just some clever technical solutions found to address the chaotic socio-technical problem of AI hazards, and there is still a lot of work to be done. After a year, the promise is still overly focused on a special type of AI safety, which focuses on "hypothetical risks" such as biological weapons, without mentioning consumer protection, deep fakes, data and copyright, and the environmental footprint of AI, which are very strange omissions today.
Comment