A team of Master of Financial Engineering (MFE) students took first place in “The Data Open” this month, besting 25 teams who competed across UC Berkeley in the regional data science competition.
The winning team—Li Cao, Pierre Foret, Teddy Legros, and Hosang Yoon, all MFE 19—won $20,000 in the competition semi-final, sponsored by Citadel LLC, a global asset management firm, and operated by data scientist recruiting firm Correlation One.
The Berkeley Haas semi-finalists will travel to the New York Stock Exchange later this fall to compete against top universities for a $100,000 grand prize.
The Sept. 8 competition challenged teams to solve a complex problem by analyzing a collection of data sets.
More than 400 students across UC Berkeley applied to participate, and 100 were accepted. Event organizers grouped the contestants into teams, and many teammates, including the winners, had not worked together before.
Choosing a question
The night before the event, the teams were given a description of the data sets, which covered everything from 311 calls in New York state to restaurant inspections. The teams were challenged to choose a question and find the answer in the data.
“Most data science competitions focus on achieving the best possible score for a predefined metric,” Foret said. “In contrast, we had to come up with our own question, which is a lot closer to the type of problems a data scientist has to tackle in the industry.”
The team outlined two goals: identifying the top factors that influence public health in New York state and providing local government officials with policy recommendations based on their analysis.
Working through the night, they prepared code that allowed them to test their hypotheses quickly, so they were able to spend most of the competition time running experiments, analyzing the data, and choosing the most relevant problem to solve.
Legros said their data analysis tracked factors that affect residents’ health, including both traditional socioeconomic ones and several lifestyle considerations—such as smoking habits, eating habits and living conditions—that evolve over time. “Next, we identified clusters of regions where some of these determining lifestyle factors are more dominant. This can help policymakers develop regional proposals to improve the health of residents.”
“It was one of the most intense 24 hours I have spent,” Yoon said. “While the competition itself was short, I really learned a lot from our preparation.”
The judges, announcing the winner, said they were impressed by the originality of the team’s question and by its strategy of taking a comprehensive view of the factors involved.