DeepMind Asks: How Much Can Humans Teach AI?

In Brief: DeepMind is collaborating with humans so that its AI can learn using human feedback instead of collecting rewards as it explores its environment. This work will help AI systems perform more effectively and safely, and do what we want them to do.

Humans Teaching Robots

Artificial Intelligence (AI) has the potential to advance humanity and civilization more than any technology that came before it. However, AI also carries risks and heavy responsibilities. DeepMind, owned by Alphabet (Google's parent company), and OpenAI, a non-profit AI research company, are working to alleviate some of these concerns. They are collaborating with people (who don't necessarily have any special technical skills themselves) to use human feedback to teach AI. This feedback not only helps AI learn more effectively; the method also provides improved technical safety and control.

Among the first conclusions from the collaboration: AI learns by trial and error, and doesn't need humans to give it an end goal. This is good, because we already know that setting a goal that's even a little off can have disastrous results. In practice, the system used human feedback to learn how to make a simulated robot do backflips.

The system is unusual because it learns by training a neural network called the reward predictor, instead of collecting rewards as it explores an environment. A reinforcement learning agent still explores the environment, but clips of its behavior are periodically sent to a human. That human then chooses the better of two behaviors, judged against whatever the ultimate goal is. It's those human selections that train the reward predictor, which in turn trains the learning agent. The learning agent gradually improves its behavior to maximize its rewards, which it can only do by pleasing the human.
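A minimal sketch of that feedback loop's training signal, written in PyTorch, might look like the code below. The network shape, the pairwise Bradley-Terry-style loss, and every name here are illustrative assumptions rather than the actual DeepMind/OpenAI implementation; the idea is simply that whichever clip the human preferred should earn a higher total predicted reward.

```python
# Sketch of preference-based reward learning as described above.
# All architecture choices and names are illustrative assumptions.
import torch
import torch.nn as nn

class RewardPredictor(nn.Module):
    """Maps an (observation, action) pair to a scalar predicted reward."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def preference_loss(predictor: RewardPredictor,
                    clip_a: tuple, clip_b: tuple,
                    human_prefers_a: bool) -> torch.Tensor:
    """Loss for one human comparison: the clip the human picked should
    earn a higher total predicted reward. Each clip is a pair of tensors
    (observations, actions), one row per timestep in the clip."""
    total_a = predictor(*clip_a).sum()
    total_b = predictor(*clip_b).sum()
    # Model the human's choice as a Bradley-Terry comparison.
    prob_a_preferred = torch.sigmoid(total_a - total_b)
    target = torch.tensor(1.0 if human_prefers_a else 0.0)
    return nn.functional.binary_cross_entropy(prob_a_preferred, target)
```

The reinforcement learning agent is then trained as usual, except that the environment's reward signal is replaced by the predictor's output, so each batch of human comparisons reshapes what the agent is optimizing.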

This approach allows humans to detect and correct any undesirable behaviors, which ensures safety without being too burdensome for human stewards. That matters, because they only need to review about 0.1% of the agent's behavior to teach it. That may not sound like much, but 0.1% of a long training run could still mean thousands of clips to review, a burden the researchers are working to reduce.

Human feedback can also help AI achieve superhuman results, at least in some video games. Researchers are now parsing out why the human feedback system achieves wildly successful results on some tasks but average or even poor results on others. For example, no amount of human feedback could help the system master Breakout or Qbert. They are also working to fix the problem of reward hacking, in which cutting off human feedback too early lets the system game its reward function, scoring well on predicted reward while behaving badly.
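A deliberately toy illustration of that failure mode (invented for exposition, not drawn from the article or the underlying research): once feedback stops, the learned reward model is a frozen, imperfect proxy, and an optimizer can climb it far past anything the human actually approved of.

```python
# Toy reward hacking: the agent hill-climbs a frozen, mis-generalized
# reward proxy after human feedback has stopped. Both reward functions
# below are invented for exposition.
def true_reward(x: float) -> float:
    return -(x - 1.0) ** 2      # what the human actually wants: x near 1

def proxy_reward(x: float) -> float:
    return x                    # mis-fit proxy: "bigger always looked better"

x = 0.0
for _ in range(100):            # naive hill-climbing on the proxy only
    if proxy_reward(x + 0.1) > proxy_reward(x):
        x += 0.1

print(f"x = {x:.1f}  proxy = {proxy_reward(x):.1f}  true = {true_reward(x):.1f}")
# -> x = 10.0  proxy = 10.0  true = -81.0: high predicted reward, bad behavior.
```

The point of the toy is that nothing in the optimization loop knows the proxy has stopped tracking the true goal, which is why continued human oversight, or a better-generalizing predictor, is needed.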

Understanding these problems is essential to building AI systems that behave as we intend, safely and effectively. Other future goals may include reducing the amount of human feedback required, or changing the way it's provided; perhaps eventually facilitating face-to-face exchanges that offer the AI more opportunities to learn from actual human behavior.

Editor's Note: This article has been updated to note the contributions made by OpenAI.
