Medical student evaluations appear riddled with racial and gender biases
Men are described as ‘scientific,’ while women are ‘fabulous’ and minorities ‘pleasant’
Men are “scientific,” women are “lovely” and underrepresented minorities are “pleasant” and “nice.” If those sound like stereotypes, they are. But they’re also words commonly used to evaluate medical students, a study finds.
Analysis of nearly 88,000 evaluations of third-year medical students written from 2006 to 2015 revealed evidence of implicit bias. White women and underrepresented minority groups were more often described by words about their personalities, while men were evaluated with more words describing their competency.
The results, published online April 16 in the Journal of General Internal Medicine, give “a good idea of what kind of words are being used,” says Carol Isaac, an education researcher at Mercer University in Atlanta not involved in the study.
Evaluations are standard after third-year students spend six to eight weeks in a hospital or clinic clerkship, during which they typically help observe patients, stitch up incisions and deliver babies. Students’ grades for this period are partly informed by performance evaluations from the physicians they study under.
Along with grades and resumes, the evaluations are “one of the main determining factors” for where a student ends up in residency, says study coauthor Urmimala Sarkar, a physician who studies health services at the University of California, San Francisco. Choice quotes from the evaluations might also be included in “dean’s letters” sent to hospitals when students apply for medical residency, the three-plus years of postgraduate training in a chosen medical specialty.
Sarkar had noticed language differences while serving on committees to select medical residents. “The gendered language in many letters bothered me for years,” she says. But she couldn’t tell if the differences amounted to implicit bias.
So Sarkar and colleagues, including UCSF medical student Alexandra Rojek, used a technique called natural language processing to analyze evaluations given to 8,913 students attending the UCSF School of Medicine or Brown University in Providence, R.I. The technique involved pulling different adjectives from the text, and then associating those words with students’ gender, minority status and grades. A computer program then searched the text for signs of implicit bias, Rojek explains.
Out of 1,312 descriptive words, the top 10 most-used words, including “energetic” and “smart,” appeared in equal proportion among white men, women and minorities.
But 37 other words were used differently between men and women. Twenty-three of those words described personal attributes, and of those, 57 percent were used more often for women, implying bias, the researchers say. For example, 0.4 percent of the women were described as “lovely,” while only 0.2 percent of the men were.
For minorities, researchers identified an additional 53 commonly used words that were used differently between minority and nonminority groups. Of those, 30 percent related to demeanor, and 81 percent of those were used more often for minority students, including words such as “pleasant” and “nice.”
“These numbers may seem small, but on the scale of this dataset, their significance is very striking,” Rojek says.
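One way to see why a small gap can still matter in a large dataset is a standard two-proportion z-test, sketched below in Python. This is an illustration only: the article does not say which statistical test the researchers used, and the group sizes here (500 and 40,000 per group) are hypothetical values chosen to show the effect of scale. Only the 0.4 percent and 0.2 percent rates come from the study as reported above.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Two-sided z-test for a difference between two proportions.

    p1, p2 are observed proportions; n1, n2 are the group sizes.
    Returns (z statistic, two-sided p-value).
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # For a standard normal, the two-sided tail probability is erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Rates reported for "lovely": 0.4 percent of women, 0.2 percent of men.
# The group sizes are HYPOTHETICAL, used only to show how scale changes the result.
z_small, p_small = two_proportion_z(0.004, 500, 0.002, 500)
z_large, p_large = two_proportion_z(0.004, 40_000, 0.002, 40_000)

print(f"500 per group:    z = {z_small:.2f}, p = {p_small:.3g}")
print(f"40,000 per group: z = {z_large:.2f}, p = {p_large:.3g}")
```

With the same two percentages, the difference is indistinguishable from chance in the small hypothetical groups but is very unlikely to be chance in the large ones.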
While the researchers revealed bias in evaluations, they did not attempt to tie it to which residencies students received or which specialties they pursued. Still, the analysis demonstrates that “we are not evaluating third-year medical students effectively or fairly,” Sarkar says. “It is time to rethink our long-held practices.”
Apart from career opportunities, the evaluations also play into students’ self-esteem, says Bridget Keenan, a hematologist and oncologist at UCSF not affiliated with the study. When she was in medical school, she says, “I really wanted to see how people felt about me to see if the specialty was a good fit.” An “outstanding” or “scientific” descriptor might give a student confidence toward entering surgery, neurology or pediatrics, while a more critical word choice could have the opposite effect, she says.
Keenan says that, because most everyone has biases even if they don’t know it, she tries to check herself when evaluating students. “If I’m trying to write a letter for a white woman or minority, I ask ‘would I use the same language to describe a white man?’ ”