In big data course, MBA students learn how to second-guess an AI

MBA students listen intently during a session of the new “Big Data, Better Decisions” course. | Copyright Noah Berger, 2018

“Classified” is a series spotlighting some of the more powerful lessons faculty are teaching in Haas classrooms.

On a recent spring morning, the future of business education was on full display in Connie & Kevin Chou Hall on the Haas School of Business campus. Standing before a 5th-floor classroom packed with aspiring MBAs, Assoc. Prof. Jonathan Kolstad was talking advanced data science as part of his new course, “Big Data and Better Decisions.”

Kolstad asked students to pick between two common models for designing a predictive algorithm. “Tell me a situation where you would want to use a regression tree over a linear regression,” said Kolstad. A student spoke up: “When you’re using Facebook likes to analyze Metallica fans versus people who like to eat hamburgers.”

“Nice,” said Kolstad. “You can reach a subtler conclusion by using a tree that puts hamburger ‘likes’ and Metallica ‘likes’ together—you can see when a combination of these factors matters, rather than looking at each one alone.”  
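
A minimal sketch of the point Kolstad makes here, written in Python rather than the R used in the course, and built on made-up data in which a click happens only when a user likes both Metallica and hamburgers. The names `rows`, `additive_prediction`, and `tree_prediction` are hypothetical, and the two-level lookup stands in for the leaves of a small regression tree rather than a full tree-fitting algorithm.

```python
from itertools import product
from statistics import mean

# Hypothetical data: (likes_metallica, likes_hamburgers, clicked).
# A click occurs only when BOTH likes are present (a pure interaction).
rows = [(m, h, 1.0 if (m and h) else 0.0) for m, h in product([0, 1], repeat=2)]

def main_effect(col):
    """Average outcome with the like minus average outcome without it."""
    with_like = mean(r[2] for r in rows if r[col] == 1)
    without_like = mean(r[2] for r in rows if r[col] == 0)
    return with_like - without_like

# In a balanced design like this one, the least-squares additive fit
# (intercept + metallica + hamburgers) reduces to these main effects.
b_m = main_effect(0)
b_h = main_effect(1)
b_0 = (mean(r[2] for r in rows)
       - b_m * mean(r[0] for r in rows)
       - b_h * mean(r[1] for r in rows))

def additive_prediction(m, h):
    """Linear model that looks at each like on its own."""
    return b_0 + b_m * m + b_h * h

def tree_prediction(m, h):
    """Split on Metallica, then on hamburgers; each leaf predicts its own mean."""
    return mean(r[2] for r in rows if r[0] == m and r[1] == h)

for m, h, clicked in rows:
    print(m, h, clicked, additive_prediction(m, h), tree_prediction(m, h))
```

Under these assumptions the additive model predicts 0.75 for users with both likes and 0.25 for users with only one, while the tree-style lookup predicts 1 and 0, matching the data exactly.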

Assoc. Prof. Jonathan Kolstad immerses himself in his teaching. | Copyright Noah Berger, 2018

If ever there was a moment that captured how far business education has come from its 19th-century roots, this was it. With data now pervasive across company operations, from marketing and sales to human resources and finance, there’s a demand for business managers and leaders who know a lot more than the basics. Companies are embracing data-driven machine learning to forecast everything from consumer behavior to worker productivity. They’re also relying more on randomized trials, a process long associated with medicine, to test sales promotions or advertising campaigns before they launch.

“There’s a growing need within companies for MBAs trained in data analytics,” says Prof. Paul Gertler, an economist who is co-teaching the course. “This class is designed to prepare students to be part of the modern labor force and leaders of industry.”

New breed of data-driven MBAs

Kolstad and Gertler aren’t looking to train data scientists. Their goal is to enable MBAs to bring a question-the-status-quo mindset to data-driven initiatives within their organizations—whether they are themselves part of a data team, managing one, or tasked as a C-suite leader with setting strategy based on what the numbers suggest. Students learn to blend business strategy with new tools from machine learning.

Yet equipping students with the right data skills means one thing: delving into the nitty-gritty of data analytics, even if, as Gertler says, “they never touch another piece of data in their lives.”

“This isn’t a theory course,” said Gertler, who is the Li Ka Shing Professor at Haas with a joint appointment at the UC Berkeley School of Public Health. “This is a get-your-hands-dirty-up-to-your-elbows-in-data course.”

This meant, among other things, requiring students to learn a programming language called R, which is popular for data mining. They also had to complete a prerequisite, “Data and Decisions,” and a core statistics course.

To be sure, this hands-on approach made the class a stretch for students without experience in big data analytics—but also made it particularly rewarding for those who pushed through the technical challenges.

Laura Andersen, MBA 19, appreciated that the course was the real deal. She says data is changing the game in her chosen field of social services, but non-profits and government agencies often lack the resources to take full advantage of it.

“I need to know the vocabulary and to be able to pull some of the best practices around data from the private sector,” says Andersen, who plans to learn more by tapping into university-wide resources like UC Berkeley’s D-Lab, which offers classes and other services that support data-intensive research in social sciences. “This class has been a great investment, and experiencing it on a campus with some of the best academic resources in the world on these topics has been a great way to get started.”

Causation + prediction

Prof. Paul Gertler, center, joins students in listening to a course session. | Copyright Noah Berger, 2018

Led by Gertler, the first months of the class were devoted to the growing use of randomized trials in business and different tests for finding causation, a critical step in data analysis. “Businesses are no longer saying, ‘Let’s try this and see whether we feel like it worked or not,’” says Gertler, who is also the scientific director at the UC Berkeley Center for Effective Global Action. “They want hard evidence, not only that a strategy worked but also by how much.”

To that end, students learned how one randomized trial showed that buyer discounts drove car sales faster than boosting incentives for sales teams. They studied how eBay ran a series of random experiments to test its Google advertising and found zero upside to running ads alongside brand-specific searches. In a third case study, generic drug makers called off a planned $5 million marketing campaign after a randomized trial showed it would have no effect on its target audience: physicians prescribing brand-name drugs.
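
The “by how much” question that such trials answer comes down to comparing average outcomes between randomly assigned groups. The following Python sketch uses simulated data only; the customer counts, spending figures, and `TRUE_LIFT` are assumptions chosen for illustration and are not drawn from any of the case studies above.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(42)

# Randomly assign half of 1,000 hypothetical customers to receive a promotion.
customers = list(range(1000))
rng.shuffle(customers)
treated = set(customers[:500])

TRUE_LIFT = 5.0  # assumed effect of the promotion, in dollars

def simulated_spend(customer_id):
    """Made-up spending: noisy baseline plus the lift for treated customers."""
    baseline = rng.gauss(100, 20)
    return baseline + (TRUE_LIFT if customer_id in treated else 0.0)

spend = {cid: simulated_spend(cid) for cid in customers}
treat = [spend[cid] for cid in customers if cid in treated]
control = [spend[cid] for cid in customers if cid not in treated]

# Because assignment was random, the difference in means estimates the
# causal effect; the standard error says how precisely it is measured.
effect = mean(treat) - mean(control)
se = sqrt(stdev(treat) ** 2 / len(treat) + stdev(control) ** 2 / len(control))
low, high = effect - 1.96 * se, effect + 1.96 * se

print(f"estimated lift: {effect:.2f} (approx. 95% interval {low:.2f} to {high:.2f})")
```

An interval that comfortably includes zero is the kind of result that would argue against launching a campaign, as in the generic drug example.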

“Randomized trials are as much about finding what doesn’t work—the counterfactual—as it is about finding what does,” Gertler told the class. “That’s the dirty little secret about data analysis.”

How to second-guess an AI

For Kolstad, who taught the machine learning portion of the class, a key theme was the “black box.” As artificial intelligence has grown, so, too, have instances where an algorithm makes accurate predictions yet nobody really understands why. For example, a program could predict that a certain consumer will click on an ad, but because the model is a “black box,” no one can tell whether that’s because the online user was young and female or liked Metallica and hamburgers.

“Analytics consultants often come to the C-suite and say, ‘We have these fancy models that predict really well,’” said Kolstad, who also co-directs the health initiative at the UC Berkeley Opportunity Lab and is co-founder of Philadelphia startup Picwell, which provides personalized insurance recommendations.

“Now that you understand what’s really behind these models, you know the right questions to ask,” Kolstad tells students. “You can say, ‘I don’t care just about predictive performance. I want to know my customers better.’ And you can ask, ‘How did you avoid ‘overfitting’ the data? Did you leave data out? Can I provide data of my own for you to test?’”
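
The questions about overfitting and left-out data can be made concrete with a small Python sketch, again on simulated data. It assumes a simple true relationship (`y = 2x` plus noise) and compares a model that memorizes its training data with a plain straight-line fit; `fit_memorizer`, `fit_line`, and `mse` are hypothetical helper names.

```python
import random
from statistics import mean

rng = random.Random(0)

def make_data(n):
    """Simulated points from an assumed truth: y = 2x plus random noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        data.append((x, 2 * x + rng.gauss(0, 1)))
    return data

def fit_memorizer(points):
    """Overfit on purpose: predict the outcome of the closest training point."""
    def predict(x):
        return min(points, key=lambda p: abs(p[0] - x))[1]
    return predict

def fit_line(points):
    """Ordinary least-squares straight line."""
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in points)
             / sum((x - x_bar) ** 2 for x, _ in points))
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept

def mse(model, points):
    """Mean squared prediction error of a model on a set of points."""
    return mean((model(x) - y) ** 2 for x, y in points)

train = make_data(500)    # data the models are fit to
holdout = make_data(500)  # data deliberately left out of fitting

memorizer = fit_memorizer(train)
line = fit_line(train)

for name, model in [("memorizer", memorizer), ("line", line)]:
    print(name, "train:", mse(model, train), "held out:", mse(model, holdout))
```

The memorizing model scores perfectly on the data it was fit to, which is why the held-out comparison is the one worth asking a consultant about.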

Students like KC Loder, MBA/MPH 18, signed up for the class because they, too, recognize that advanced data analytics is a must-have skill for today’s business leaders. Before coming to Haas, Loder worked in health care and saw firsthand how hospitals and other providers leverage data to keep costs down and improve patient care.

“Whether you’re working in health care or education or big tech, if you are the strategic thinker and you can’t understand what good data science is and what the data scientists on your team are doing, then you’re going to be obsolete,” says Loder.

Prof. Paul Gertler speaks with a student after class. | Copyright Noah Berger, 2018