AI learned how to sway humans by watching a cooperative cooking game – Science News Magazine

If you've ever cooked a complex meal with someone, you know the level of coordination required. Someone dices this, someone sautés that, as you dance around holding knives and hot pans. Meanwhile, you might wordlessly nudge each other, placing ingredients or implements within the other's reach when you'd like something done.

How might a robot handle this type of interaction?

Research presented in late 2023 at the Neural Information Processing Systems, or NeurIPS, conference, in New Orleans, offers some clues. It found that in a simple virtual kitchen, AI can learn how to influence a human collaborator just by watching humans work together.

In the future, humans will increasingly collaborate with artificial intelligence, both online and in the physical world. And sometimes we'll want an AI to silently guide our choices and strategies, like a good teammate who knows our weaknesses. "The paper addresses a crucial and pertinent problem, how AI can learn to influence people," says Stefanos Nikolaidis, who directs the Interactive and Collaborative Autonomous Robotic Systems (ICAROS) lab at the University of Southern California in Los Angeles and was not involved in the work.

The new work introduces a way for AI to learn to collaborate with humans, without even practicing with us. It could help us improve human-AI interactions, Nikolaidis says, and detect when AI might take advantage of us, whether humans have programmed it to do so or, someday, it decides to do so on its own.

There are a few ways researchers have already trained AI to influence people. Many approaches involve what's called reinforcement learning (RL), in which an AI interacts with an environment (which can include other AIs or humans) and is rewarded for making sequences of decisions that lead to desired outcomes. DeepMind's program AlphaGo, for example, learned the board game Go using RL.

But training a clueless AI from scratch to interact with people through sheer trial and error can waste a lot of human hours, and can even present risks if there are, say, knives involved (as there might be in a real kitchen). Another option is to train one AI to model human behavior, then use that as a tireless human substitute for another AI to learn to interact with. Researchers have used this method in, for example, a simple game that involved entrusting a partner with monetary units. But realistically replicating human behavior in more complex scenarios, such as a kitchen, can be difficult.

The new research, from a group at the University of California, Berkeley, used what's called offline reinforcement learning. Offline RL is a method for developing strategies by analyzing previously documented behavior rather than through real-time interaction. Previously, offline RL had been used mostly to help virtual robots move or to help AIs solve mazes, but here it was applied to the tricky problem of influencing human collaborators. Instead of learning by interacting with people, this AI learned by watching human interactions.

Humans already have a modicum of competence at collaboration. So the amount of data needed to demonstrate competent collaboration when two people are working together is not as much as would be needed if one person were interacting with an AI that had never interacted with anyone before.

In the study, the UC Berkeley researchers used a video game called Overcooked, where two chefs divvy up tasks to prepare and serve meals, in this case soup, which earns them points. It's a 2-D world, seen from above, filled with onions, tomatoes, dishes and a stove with pots. At each time step, each virtual chef can stand still, interact with whatever is in front of it, or move up, down, left or right.
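The six choices each chef has per time step can be written down as a tiny action set. The sketch below is illustrative only: the names `MOVES`, `ACTIONS` and `step_position` are invented here, and it assumes a bare rectangular grid, ignoring the counters, pots and second chef that constrain movement in the actual game.

```python
# Hypothetical names; the article only lists the six choices themselves.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
ACTIONS = ["stay", "interact", *MOVES]  # six options per time step


def step_position(pos, action, width, height):
    """Return the chef's (x, y) after one time step on an empty width x height grid.

    "stay" and "interact" leave the position unchanged. Moves that would leave
    the grid are clamped to the edge (an assumption made for this sketch).
    """
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    dx, dy = MOVES.get(action, (0, 0))
    x = min(max(pos[0] + dx, 0), width - 1)
    y = min(max(pos[1] + dy, 0), height - 1)
    return (x, y)
```

With only six options per chef per step, the difficulty lies less in the size of the action set than in choosing actions that work well alongside a partner.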

The researchers first collected data from pairs of people playing the game. Then they trained AIs using offline RL or one of three other methods for comparison. (In all methods, the AIs were built on a neural network, a software architecture intended to roughly mimic how the brain works.) In one method, the AI just imitated the humans. In another, it imitated the best human performances. The third method ignored the human data and had AIs practice with each other. And the fourth was the offline RL, in which AI does more than just imitate; it pieces together the best bits of what it sees, allowing it to perform better than the behavior it observes. It uses a kind of counterfactual reasoning, where it predicts what score it would have gotten if it had followed different paths in certain situations, then adapts.
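To see how learning from a fixed log can "piece together the best bits," consider a toy version. The sketch below is plain tabular Q-learning run over a handful of made-up logged transitions. It is not the study's neural-network method, and every state name, action name and reward value in it is an assumption chosen for illustration. No single logged game shows a dish being set down and then scored on, yet the learned values still favor setting it down.

```python
from collections import defaultdict

# Made-up log of (state, action, reward, next_state, done) from three separate games.
LOG = [
    # Game A: a player sets a dish on the counter, then walks away; nothing is scored.
    ("holding_dish", "place_dish", 0.0, "dish_on_counter", False),
    ("dish_on_counter", "walk_away", 0.0, "end", True),
    # Game B: a dish is already on the counter; the partner takes it and delivers.
    ("dish_on_counter", "wait", 2.0, "end", True),
    # Game C: the player delivers the soup alone for the ordinary score.
    ("holding_dish", "deliver_self", 1.0, "end", True),
]


def offline_q_learning(transitions, gamma=0.9, lr=0.5, sweeps=50):
    """Estimate action values from a fixed log, with no new interaction."""
    q = defaultdict(float)
    seen = defaultdict(set)  # actions that actually appear in the log, per state
    for s, a, _, _, _ in transitions:
        seen[s].add(a)
    for _ in range(sweeps):
        for s, a, r, s2, done in transitions:
            # Counterfactual step: value the best logged action at the next state,
            # even if a different game supplied that action.
            if done or not seen[s2]:
                future = 0.0
            else:
                future = max(q[(s2, b)] for b in seen[s2])
            q[(s, a)] += lr * (r + gamma * future - q[(s, a)])
    return q, seen


def greedy(q, seen, state):
    """Pick the logged action with the highest estimated value in `state`."""
    return max(seen[state], key=lambda a: q[(state, a)])
```

Calling `offline_q_learning(LOG)` and then `greedy(q, seen, "holding_dish")` selects `"place_dish"`, because the value of Game B's ending flows backward through Game A's first step.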

The AIs played two versions of the game. In the human-deliver version, the team earned double points if the soup was delivered by the human partner. In the tomato-bonus version, soup with tomato and no onion earned double points. After the training, the chefbots played with real people. The scoring system was different during training and evaluation than when the initial human data were collected, so the AIs had to extract general principles to score higher. Crucially, during evaluation, humans didn't know these rules, like "no onion," so the AIs had to nudge them.
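The two bonus rules are simple enough to state as a scoring function. This is a sketch under assumptions: the function name `soup_points`, its arguments and the default `base_points` value are placeholders invented here, not figures or code from the study.

```python
def soup_points(version, ingredients, delivered_by, base_points=20):
    """Score one delivered soup under the two rule sets described above.

    base_points is a placeholder value, not a number taken from the study.
    """
    if version == "human-deliver":
        # Double points when the human partner makes the delivery.
        bonus = delivered_by == "human"
    elif version == "tomato-bonus":
        # Double points for soup with tomato and no onion.
        bonus = "tomato" in ingredients and "onion" not in ingredients
    else:
        raise ValueError(f"unknown version: {version}")
    return base_points * 2 if bonus else base_points
```

Under rules like these, the AI partner cannot collect the bonus alone in the human-deliver version, and in the tomato-bonus version it collects it only if no onion ends up in the pot. That is why steering the human's choices matters.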

On the human-deliver game, training using offline RL led to an average score of 220, about 50 percent more points than the best comparison methods. On the tomato-bonus game, it led to an average score of 165, or about double the points. To support the hypothesis that the AI had learned to influence people, the paper described how when the bot wanted the human to deliver the soup, it would place a dish on the counter near the human. In the human-human data, the researchers found no instances of one person passing a plate to another in this fashion. But there were events where someone put down a dish and ones where someone picked up a dish, and the AI could have seen value in stitching these acts together.

The researchers also developed a method for the AI to infer and then influence humans' underlying strategies in cooking steps, not just their immediate actions. In real life, if you know that your cooking partner is slow to peel carrots, you might jump on that role each time until your partner stops going for the carrots. The researchers modified the neural network to consider not only the current game state but also a history of the partner's actions, which offers a clue as to the partner's current strategy.
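One simple way to picture "state plus history" as an input is a sliding window over the partner's recent actions. The sketch below is hypothetical throughout: the class `PartnerHistory`, its window length and the action labels are assumptions made for illustration, and the study's network learns what to do with the history rather than relying on a hand-written summary like `share_of`.

```python
from collections import deque


class PartnerHistory:
    """Keeps the partner's most recent actions alongside the current game state."""

    def __init__(self, window=5):
        # window is an assumed length; older actions drop off automatically.
        self.recent = deque(maxlen=window)

    def observe(self, partner_action):
        """Record one action taken by the partner."""
        self.recent.append(partner_action)

    def policy_input(self, game_state):
        """Bundle the current state with the remembered partner actions."""
        return {"state": game_state, "partner_history": tuple(self.recent)}

    def share_of(self, action):
        """Fraction of remembered actions equal to `action`, a crude strategy hint."""
        if not self.recent:
            return 0.0
        return sum(a == action for a in self.recent) / len(self.recent)
```

A partner whose recent window is mostly onion grabs is probably following an onion-soup plan, which is the kind of signal a history-aware policy could act on.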

Again, the team collected human-human data. Then they trained AIs using this offline RL network architecture or the previous offline RL one. When tested with human partners, inferring the partner's strategy improved scores by roughly 50 percent on average. In the tomato-bonus game, for example, the bot learned to repeatedly block the onions until people eventually left them alone. That the AI worked so well with humans was surprising, says study coauthor Joey Hong, a computer scientist at UC Berkeley.

"Avoiding the use of a human model is great," says Rohan Paleja, a computer scientist at MIT Lincoln Laboratory in Lexington, Mass., who was not involved in the work. "It makes this approach applicable to a lot of real-world problems that do not currently have accurate simulated humans." He also said the system is data-efficient; it achieved its abilities after watching only 20 human-human games (each 1,200 steps long).

Nikolaidis sees potential for the method to enhance AI-human collaboration. But he wishes that the authors had better documented the observed behaviors in the training data and exactly how the new method changed people's behaviors to improve scores.

In the future, we may be working with AI partners in kitchens, warehouses, operating rooms, battlefields and purely digital domains like writing, research and travel planning. (We already use AI tools for some of these tasks.) "This type of approach could be helpful in supporting people to reach their goals when they don't know the best way to do this," says Emma Brunskill, a computer scientist at Stanford University who was not involved in the work. She proposes that an AI could observe data from fitness apps and learn to better nudge people to meet New Year's exercise resolutions through notifications (SN: 3/8/17). The method might also learn to get people to increase charitable donations, Hong says.

On the other hand, AI influence has a darker side. "Online recommender systems can, for example, try to have us buy more, or watch more TV," Brunskill says, "not just for this moment, but also to shape us into being people who buy more or watch more."

Previous work, which was not about human-AI collaboration, has shown how RL can help recommender systems manipulate users' preferences so that those preferences would be more predictable and satisfiable, even if people didn't want their preferences shifted. And even if AI means to help, it may do so in ways we don't like, according to Micah Carroll, a computer scientist at UC Berkeley who works with one of the paper authors. For instance, the strategy of blocking a co-chef's path could be seen as a form of coercion. "We, as a field, have yet to integrate ways for a person to communicate to a system what types of influence they are OK with," he says. "For example, I'm OK with an AI trying to argue for a specific strategy, but not forcing me to do it if I don't want to."

Hong is currently looking to use his approach to improve chatbots (SN: 2/1/24). The large language models behind interfaces such as ChatGPT typically aren't trained to carry out multi-turn conversations. "A lot of times when you ask a GPT to do something, it gives you a best guess of what it thinks you want," he says. "It won't ask for clarification to understand your true intent and make its answers more personalized."

Learning to influence and help people in a conversation seems like a realistic near-term application. Overcooked, he says, with its two dimensions and limited menu, is not really going to help us make better chefs.
