Harvard Computer Science Professor Fernanda Viégas Addresses AI Bias in Radcliffe Institute Talk

Fernanda B. Viégas, a Harvard Computer Science professor, gave a talk about generative artificial intelligence at the Radcliffe Institute for Advanced Study Wednesday. By Soumyaa Mazumder

By Xinni (Sunshine) Chen, William C. Mao, and Olivia W. Zheng, Crimson Staff Writers

November 9, 2023

Harvard Computer Science professor Fernanda B. Viégas spoke about bias in generative artificial intelligence at a talk hosted by the Radcliffe Institute for Advanced Study on Wednesday.

During the event, titled “What’s Inside a Generative Artificial-Intelligence Model? And Why Should We Care?,” Viégas spoke about experiments that showed AI models responded differently based on how researchers presented themselves.

In one experiment, Viégas said, she began a conversation with a chatbot in Portuguese — a language which uses gendered pronouns — about what she might wear to a hypothetical dinner.

At first, the chatbot addressed Viégas using masculine pronouns, she said. After she mentioned wearing a dress to the dinner, the chatbot switched to feminine pronouns without acknowledging the sudden shift.

“This got me thinking about the fact that there might be something internally in the system that actually cares about gender,” Viégas said. “Was there an internal model of the user’s gender or not?”

Viégas said AI models can also exhibit sycophancy, which she defined as mirroring the user’s beliefs. In one study, she said, a chatbot gave different answers about the ideal size of government depending on whether the user self-identified as conservative or liberal.

She suggested that AI chatbots may possess a cohesive worldview, rather than merely predicting text based on user input.

“Are they the kinds of systems that all they’re doing is memorizing?” she said. “Or are they doing something that goes beyond just statistics, where they can glimpse something about the structure of the world?”

Viégas proposed developing an AI dashboard which would display assumptions an AI model makes about a user, such as gender, education, and income, as well as the model’s assessment of its own utility.

“If they internalize some notion of our world,” Viégas said, “Wouldn’t it be nice if we at least knew about it so we could do something about it?”

Viégas said she drew the inspiration for the AI dashboard from a visit she took last summer to the National Railway Museum in England. There, she said, she learned about how railway engineers collected data on early models of locomotives, then a new and potentially dangerous technology.

Though the field of generative AI is still new, Viégas said she felt comforted that past engineers have adapted to novel and unfamiliar technologies.

“What they were doing is what we’re doing,” she said.

“It’s building all these gizmos, and all these ways of measuring something that turns out to be quite powerful, and not fully understood,” Viégas added. “So that gave me a lot of hope, like, ‘Okay, we’ve not fully understood things before that turned out to be incredibly important.’”
