August 19, 2025

One smart tip: How AI can help workers learn faster on the job 

Featured Researcher

Park Sinchaisri

Assistant Professor, OITM

By

Laura Counts

Image: Snvv for Adobe Stock

Mastering a new job not only takes time but often involves a good deal of trial and error, especially for people who work remotely or on their own.

This pervasive challenge inspired researchers at UC Berkeley Haas and the University of Pennsylvania to develop an artificial intelligence system that helped people learn faster and make better decisions at work. Notably, it outperformed advice generated by humans, even though the human advice seemed more intuitive.

“Think about gig workers or physicians in rural areas who don’t have a chance to learn from their peers every day,” says Park Sinchaisri, an assistant professor of operations and IT management at UC Berkeley Haas. “Of course performance suffers during this learning period.”

Sinchaisri co-authored the study, published in the journal Management Science, with Hamsa Bastani and Osbert Bastani of the University of Pennsylvania.

“What makes this algorithm so good is that it identifies the things that are hard for humans to get to on their own, sometimes because the strategy is counterintuitive,” Sinchaisri says. “What was especially interesting is that people in our study didn’t blindly follow the tips but combined them with their own experience to learn parts of the optimal strategy that weren’t even mentioned.”

This suggests a model for human-AI collaboration where machines don’t replace human judgment but provide targeted guidance to help people learn faster and develop the best strategies themselves, Sinchaisri says. “The idea is to help organizations identify the largest gap that they can close.”

The challenge of multistep decisions

While one-off decisions with immediate consequences are relatively easy to optimize, it’s much tougher to parse the results of decisions in a sequence. Each choice can affect future options and long-term results, and workers struggle to learn what drives the best outcome.
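The difficulty can be seen in a toy two-step example (the choices and rewards below are purely illustrative, not from the study): the option with the best immediate payoff can lock a worker out of better options later.

```python
# Hypothetical two-step decision: the greedy first choice is not the best plan.
# All option names and reward numbers are made up for illustration.

step1 = {"A": 5, "B": 2}  # immediate reward of each first-step option
# Second-step rewards depend on what was chosen first
step2 = {"A": {"C": 1, "D": 2}, "B": {"C": 6, "D": 7}}

def total(first, second):
    """Total reward of a complete two-step plan."""
    return step1[first] + step2[first][second]

# A greedy worker optimizes the first step in isolation
greedy_first = max(step1, key=step1.get)  # "A" (5 beats 2)

# Planning over the whole sequence finds a different answer
best_plan = max(
    ((f, s) for f in step1 for s in step2[f]),
    key=lambda fs: total(*fs),
)

print(greedy_first)                   # A  -> best greedy total is 5 + 2 = 7
print(best_plan, total(*best_plan))   # ('B', 'D') 9
```

Here the greedy choice "A" caps the total at 7, while starting with the worse-looking "B" unlocks a total of 9 — the kind of counterintuitive structure that makes sequential decisions hard to learn from experience alone.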

But with organizations now collecting vast troves of “trace data” on their workers—from the movements of gig drivers to the actions that physicians log in medical records—the researchers wondered if there was a way to automatically extract best practices and distill them into a simple rule that would make the most difference.

Learning to manage a kitchen

To test their idea, Sinchaisri and his colleagues built a virtual kitchen management game that involved scheduling challenges and multiple subtasks. Players had to assign tasks like chopping, cooking, and plating to virtual workers with different skill levels, with the goal of getting food to customers as quickly as possible. In some sessions, a disruption removed a key worker partway through. 

In early rounds, the researchers used a reinforcement learning algorithm to identify the optimal policies, find the most consequential performance gap, and turn it into a simple rule. They tested it on 2,300 participants who were randomly assigned to receive no advice, a tip from peers, a tip from a simpler computer program, or a tip from the AI algorithm.
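The core idea — compare what players typically do against the optimal policy and target the single state where switching would help most — can be sketched roughly as follows. The state names, action values, and visit frequencies below are invented for illustration; the paper's actual algorithm is more sophisticated.

```python
# A minimal sketch of tip selection, assuming we already have (from RL training
# and player logs):
#   q_opt[state][action]  - estimated value of each action under the optimal policy
#   human_policy[state]   - the action players most often take in that state
#   visits[state]         - how often players reach that state
# All names and numbers are hypothetical.

q_opt = {
    "rush_hour":   {"assign_fast_cook": 10, "assign_slow_cook": 4},
    "idle_plater": {"reassign_to_chop": 6,  "wait": 5},
}
human_policy = {"rush_hour": "assign_slow_cook", "idle_plater": "wait"}
visits = {"rush_hour": 0.7, "idle_plater": 0.3}

def gap(state):
    """Expected gain if players switched to the optimal action in this state."""
    best = max(q_opt[state].values())
    return visits[state] * (best - q_opt[state][human_policy[state]])

# The tip targets the state with the largest human-vs-optimal gap,
# phrased as "in <state>, do <optimal action>".
tip_state = max(q_opt, key=gap)
tip_action = max(q_opt[tip_state], key=q_opt[tip_state].get)

print(tip_state, tip_action)  # rush_hour assign_fast_cook
```

Because the gap is weighted by how often the state is visited, the selected rule is the one whose adoption would close the largest share of the overall performance shortfall — the “largest gap” Sinchaisri describes.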

Across all scenarios, players who got the AI tip completed their orders significantly faster than the other groups. And in the most complicated scenarios that involved a disruption, 19% of participants receiving the AI tip achieved optimal performance—completing the task in 34 steps—compared to less than 1% in the other groups.

Human fallibility and counterintuitive advice

Some of the advice backfired: Those who followed the tip from the baseline computer program, which was derived from the optimal policy but missed the critical bottleneck, actually performed worse than those in other groups because they failed to adjust their strategy. And the tips suggested by humans were often too general or wrong.

“The algorithm captured the discrepancy between the existing human action and the optimal policy, which helped identify the best performance-enhancing tip,” Sinchaisri says. “This opens up exciting possibilities for using the wealth of workplace data that companies already collect to automatically identify and share best practices.”

Even though players weren’t told whether they were getting advice from a human or an AI, they were more likely to follow the human-suggested tip, likely because it “better matches human intuition,” the researchers wrote. Meanwhile, the AI-generated tip was counterintuitive, so players were more likely to ignore it in their first rounds. They only adopted it as they gained experience and began to understand its importance, and then to combine it with other strategies.  

Beyond following instructions

The research also showed that making advice more intuitive or increasing compliance didn’t necessarily improve performance. Rather than just mimicking the highest performers, the AI-generated tip was counterintuitive but highly consequential, while human-suggested tips were easier to follow but less impactful.

The findings could have broad applications across industries where workers make sequential decisions, often amid uncertainty. The next step, Sinchaisri says, is to apply the learnings to even more complex scenarios.

“This suggests that AI can guide human learning in ways that go beyond simple instruction-following,” Sinchaisri said. “In most settings, there are nuances where it’s just not feasible to automate everything. But if you can find that one simple and effective piece of advice, people can start to figure out best practices on their own.”

Read the full paper: 

Improving Human Sequential Decision Making with Reinforcement Learning

By Hamsa Bastani, Osbert Bastani, and Wichinpong Park Sinchaisri 

Management Science (Published online May 2025)