Over the last few months, experts and lawmakers have become increasingly concerned that advances in artificial intelligence could help bad actors develop biological threats. But so far there have been no reported examples of biological misuse involving AI or the AI-driven chatbots that have recently filled news headlines. This lack of real-world wrongdoing prevents direct evaluation of the changing threat landscape at the intersection of AI and biology.
Nonetheless, researchers have conducted experiments that aim to evaluate sub-components of biological threats, such as the ability to develop a plan for misuse or to obtain information that could enable it. Two recent efforts, by RAND Corporation and OpenAI, to understand how artificial intelligence could lower barriers to the development of biological weapons concluded that access to a large language model chatbot did not give users an edge in developing plans to misuse biology. But those findings are just one part of the story and should not be considered conclusive.
In any experimental research, study design influences results. Even if technically executed to perfection, all studies have limitations, and both reports dutifully acknowledge theirs. But given the extent of the limitations in the two recent experiments, the reports on them should be seen less as definitive insights and more as opportunities to shape future research, so policymakers and regulators can apply it to help identify and reduce potential risks of AI-driven misuse of biology.
The limitations of recent studies. In the RAND Corporation report, researchers detailed the use of red teaming to understand the impact of chatbots on the ability to develop a plan of biological misuse. The RAND researchers recruited 15 groups of three people to act as red team bad guys. Each of these groups was asked to come up with a plan to achieve one of four nefarious outcomes (vignettes) using biology. All groups were allowed to access the internet. For each of the four vignettes, one red team was given access to an unspecified chatbot and another red team was given access to a different, also unspecified chatbot. When the authors published their final report and accompanying press release in January, they concluded that large language models do not increase the risk of a biological weapons attack by a non-state actor.
This conclusion may be an overstatement of their results, as their focus was specifically on the ability to generate a plan for biological misuse.
The other report was posted by the developers of ChatGPT, OpenAI. Instead of using small groups, OpenAI researchers had participants work individually to identify key pieces of information needed to carry out a specific defined scenario of biological misuse. The OpenAI team reached a conclusion similar to the RAND team's: GPT-4 provides at most a mild uplift in biological threat creation accuracy. As with the RAND report, this may also be an overstatement of the results, as the experiment evaluated the ability to access information, not the ability to actually create a biological threat.
The OpenAI report was met with mixed reactions, including skepticism and public critique regarding the statistical analysis performed. The core objection was the appropriateness of the use of a correction during analysis that re-defined what constituted a statistically significant result. Without the correction, the results would have been statistically significant; that is to say, the use of the chatbot would have been judged to be a potential aid to those interested in creating biological threats.
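To make the objection concrete, here is a minimal sketch of how a multiple-comparison correction can change a verdict. The article does not name the correction used; a Bonferroni correction is assumed here only as a common example, and the p-values and function names are hypothetical, not taken from the OpenAI report.

```python
def bonferroni_threshold(alpha, num_comparisons):
    """Per-test significance threshold under a Bonferroni correction (assumed method)."""
    return alpha / num_comparisons


def significant_results(p_values, alpha=0.05, correct=True):
    """Flag which p-values count as significant, with or without the correction."""
    threshold = bonferroni_threshold(alpha, len(p_values)) if correct else alpha
    return [p <= threshold for p in p_values]


# Hypothetical p-values for five outcome measures; not data from either study.
hypothetical_p_values = [0.03, 0.20, 0.45, 0.60, 0.81]

print("uncorrected:", significant_results(hypothetical_p_values, correct=False))
print("corrected:  ", significant_results(hypothetical_p_values, correct=True))
```

In this illustration the first measure clears the conventional 0.05 threshold but not the corrected threshold of 0.01, so the same data support different headlines depending on the analysis choice.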
Regardless of their limitations, the OpenAI and RAND experiments highlight larger questions which, if addressed head-on, would enable future experiments to provide more valuable and actionable results about AI-related biological threats.
Is there more than statistical significance? In both experiments, third-party evaluators assigned numeric scores to the text-based participant responses. The researchers then evaluated if there was a statistically significant difference between those who had access to chatbots and those who did not. Neither research team found one. But typically, the ability to determine if a statistically significant difference exists largely depends on the number of data points; more data points allow for a smaller difference to be considered statistically significant. Therefore, if the researchers had many more participants, the same differences in score could have been statistically significant.
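The dependence on sample size can be sketched with a simple two-sample z-test. This is an illustration under stated assumptions (equal group sizes, a known and equal standard deviation, and hypothetical score values), not the analysis either research team performed.

```python
import math


def two_sample_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in group means.

    Assumes a z-test with equal group sizes and a known, shared standard deviation.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(z / math.sqrt(2.0))


# Hypothetical numbers: the same 0.5-point score difference at three sample sizes.
for n in (25, 100, 400):
    p = two_sample_p_value(mean_diff=0.5, sd=2.0, n_per_group=n)
    print(f"n per group = {n:4d}  p = {p:.4f}")
```

The score difference is held fixed throughout; only the number of participants changes, and the p-value falls below 0.05 once the groups are large enough.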
Reducing text to numbers can bring other challenges as well. In the RAND study, the teams, regardless of access to chatbots, did not generate any plans that were deemed likely to succeed. However, there may have been meaningful differences in why the plans were not likely to succeed, and systematically comparing the content of the responses could prove valuable in identifying mitigation measures.
In the OpenAI work, the goal of the participants was to identify a specific series of steps in a plan. However, if a participant were to miss an early step in the plan, all the remaining steps, even if correct, would not count towards their score. This meant that someone who made an error early on but identified all the remaining information correctly would score similarly to someone who did not identify any correct information. Again, researchers may gain insight from identifying patterns in which steps participants failed and why.
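The effect of this scoring rule can be sketched by comparing it with a rule that credits each step independently. Both functions and the example responses are hypothetical simplifications of the rule as described above, not OpenAI's actual rubric.

```python
def sequential_score(step_correct):
    """Count correct steps only up to the first miss (simplified version of the rule described)."""
    score = 0
    for is_correct in step_correct:
        if not is_correct:
            break
        score += 1
    return score


def independent_score(step_correct):
    """Alternative rule: credit every correct step regardless of earlier misses."""
    return sum(1 for is_correct in step_correct if is_correct)


# Hypothetical responses to a five-step task.
missed_first_only = [False, True, True, True, True]
missed_everything = [False, False, False, False, False]

for name, response in [("missed first only", missed_first_only),
                       ("missed everything", missed_everything)]:
    print(name, sequential_score(response), independent_score(response))
```

Under the sequential rule the two participants are indistinguishable, while the independent rule separates them, which is the kind of pattern a content-level comparison of responses could surface.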
Are the results generalizable? To inform an understanding of the threat landscape, conclusions must be generalizable across scenarios and chatbots. Future evaluators should be clear on which large language models are used (the RAND researchers were not). It would be helpful to understand if researchers achieve a similar answer with different models or different answers with the same model. Knowing the specifics would also enable comparisons of results based on the characteristics of the chatbot used, enabling policymakers to understand if models with certain characteristics have unique capabilities and impact.
The OpenAI experiment used just one threat scenario. There is not much reason to believe that this one scenario is representative of all threat scenarios; the results may or may not generalize. There is a tradeoff in using one specific scenario; it becomes tenable for one or two people to evaluate 100 responses. On the other hand, the RAND work was much more open-ended as participant teams were given flexibility in how they decided to achieve their intended goal. This makes the results more generalizable, but required a more extensive evaluation procedure that involved many experts to sufficiently examine 15 diverse scenarios.
Are the results impacted by something else? Part way through their experiment, the RAND researchers enrolled a black cell, a group with significant experience with large language models. The RAND researchers made this decision because they noticed that some of their study's red teams were struggling to bypass safety features of the chatbots. In the end, the black cell received an average score almost double that of the corresponding red teams. The black cell participants didn't need to rely only on their expertise with large language models; they were also adept at interpreting the academic literature about those models. This provided a valuable insight to the RAND researchers: "[t]he relative outperformance of the black cell illustrates that a greater source of variability appears to be red team composition, as opposed to LLM access." Simply put, it probably matters more who is on the team than whether the team has access to a large language model.
Moving forward. Despite their limitations, red teaming and benchmarking efforts remain valuable tools for understanding the impact of artificial intelligence on the deliberate biological threat landscape. Indeed, the National Institute of Standards and Technology's Artificial Intelligence Safety Institute Consortium, a part of the US Department of Commerce, currently has working groups focused on developing standards and guidelines for this type of research.
Outside of technical design and execution of the experiments, challenges remain. The work comes with meaningful financial costs, including compensation for participants' time (OpenAI pays $100 per hour to experts); for the individuals who recruit participants, design and administer the experiments, and analyze data; and for the biosecurity experts who evaluate the responses. Therefore, it is important to consider who will fund this type of work in the future. Should artificial intelligence companies fund their own studies, a perceived conflict of interest will linger if the results are intended to be used to inform governance or public perception of their models' risks. But at the same time, funding that is directed to nonprofits like RAND Corporation or to academia does not inherently give researchers access to unreleased or modified models, like the version used in the OpenAI experiment. Future work should learn from these two reports, and could benefit from considering the following:
The path toward more useful research on AI and biological threats is hardly free of obstacles. Employees at the National Institute of Standards and Technology have reportedly expressed outrage regarding the recent appointment of Paul Christiano, a former OpenAI researcher who has expressed concerns that AI could pose an existential threat to humanity, to a leadership role at the Artificial Intelligence Safety Institute. Employees are concerned that Christiano's personal beliefs about the catastrophic and existential risk posed by AI broadly will affect his ability to maintain the National Institute of Standards and Technology's commitment to objectivity.
This internal unrest comes on the heels of reporting that the physical buildings that house the institute are falling apart. As Christiano looks to expand his staff, he will also need to compete against the salaries paid by tech companies. OpenAI, for example, is hiring for safety-related roles with the low end of the base salary exceeding the high end of the General Schedule pay scale (federal salaries). It is unlikely that any relief will come from the 2024 federal budget, as lawmakers are expected to decrease the institute's budget from 2023 levels. But if the United States wants to remain a global leader in the development of artificial intelligence, it will need to make financial commitments to ensure that the work required to evaluate artificial intelligence is done right.