In one of his sketches, comedian Eddie Izzard talks about how English speakers see bilingualism: “Two languages in one head? No one can live at that speed! Good lord, man. You’re asking the impossible,” he says. This satirical view used to be a serious one. People believed that if children grew up with two languages rattling around their heads, they would become so confused that their “intellectual and spiritual growth would not thereby be doubled, but halved,” wrote one professor in 1890. “The use of a foreign language in the home is one of the chief factors in producing mental retardation,” said another in 1926.

A century on, things are very different. Since the 1960s, several studies have shown that bilingualism leads to many advantages, beyond the obvious social benefits of being able to speak to more people. It also supposedly improves executive function—a catch-all term for advanced mental abilities that allow us to control our thoughts and behavior, such as focusing on a goal, ignoring distractions, switching attention, and planning for the future.

Bilinguals have lots of experience with these skills. “The bilingual mind is in constant conflict,” explains Ellen Bialystok from York University, one of the leading researchers in this field. “For every utterance, a choice is made to focus on the target language, so there is a constant need to select.” She says that this constant experience leaves its mark on the brain, strengthening the regions involved in executive function.

It’s an intuitive claim, but also a profound one. It asserts that the benefits of bilingualism extend well beyond the realm of language, and into skills that we use in every aspect of our lives. This view is now widespread, heralded by a large community of scientists, promoted in books and magazines, and pushed by advocacy organizations. Its proponents point to reams of studies showing benefits of bilingualism on executive function, in everyone from pre-schoolers to elderly adults.

But a growing number of psychologists say that this mountain of evidence is actually a house of cards, built upon flimsy foundations. According to Kenneth Paap, a psychologist at San Francisco State University and the most prominent of the critics, bilingual advantages in executive function “either do not exist or are restricted to very specific and undetermined circumstances.”

Paap started looking into bilingualism in 2009, having spent 30 years studying the psychology of language. He began by trying to replicate some seminal experiments, including a classic 2004 paper by Bialystok involving the Simon task. In that task, volunteers press two keys in response to colored objects on a screen—for example, right key for red objects, left for green. People react faster if the position of the keys and objects match (red object on right half of the screen) than if they don’t (red object on left). But Bialystok found that twenty Tamil-English bilinguals from India were faster and more accurate at these mismatched trials than twenty English-speaking monolinguals from Canada. They were better at suppressing the location of the objects and focusing on their color—a sign of superior executive function.

“It was a really exciting finding and one that I thought would be easy to study with my students,” says Paap. “But we just couldn’t replicate any of the effects.” After years of struggling, he published his results in 2013: three studies, 280 local college students, four tests of mental control including the Simon task, and no sign of a bilingual advantage. “That broke the dam,” he says. “Others started submitting negative results and getting their articles published.”

Jon Andoni Duñabeitia, a cognitive neuroscientist at the Basque Center on Cognition, Brain, and Language, was one of them. In two large studies, involving 360 and 504 children respectively, he found no evidence that Basque kids, raised on Basque and Spanish at home and at school, had better mental control than monolingual Spanish children. “I am a multilingual researcher working in a multilingual society,” says Duñabeitia. “I'd be very happy to see an advantage for bilinguals! But science is what it is. We find no difference and we have replicated it several times, in older adults, kids, and young adults at university.”

Similar controversies have popped up throughout psychology, fueling talk of a “reproducibility crisis” in which scientists struggle to duplicate classic textbook results. In many of these cases, classic psychological phenomena that seem to be backed by years of supportive evidence suddenly become fleeting and phantasmal. The causes are manifold. Journals are more likely to accept positive, attention-grabbing papers than negative, contradictory ones, which pushes scientists towards running small studies or tweaking experiments on the fly, practices that lead to flashy, publishable discoveries that may not actually be true.

In a recent review, which reads like a calm but clinical excoriation, Paap argues that many of these problems apply to research on the bilingualism advantage. “There’s a tendency to conduct multiple, small-sample studies that are underpowered,” he says. “That increases the likelihood of false positives. The problem is compounded by confirmation biases, or motivations to report only the studies that work.” (He stresses that he’s only talking about the purported cognitive advantages, not social or personal ones, of which there are patently many.)

For example, one group of researchers analyzed 104 abstracts on bilingualism that were presented at scientific conferences. They found that 68 percent of abstracts that found an executive-function advantage were eventually published in journals, compared to just 29 percent that found no advantage. This publication bias, a common problem in psychology and science as a whole, means that the evidence for the phenomenon seems stronger than it actually is.

But Paap doesn’t think much of the published evidence either. He found that a bilingual advantage only shows up in one in six tests of executive function, and mostly in small studies involving 30 or fewer volunteers. The largest studies, involving a hundred or more, all found negative results.

There are other problems, too. Many studies compare monolingual and bilingual people who vary in more ways than the number of languages they speak, including their nationality, educational level, socioeconomic background, immigrant status, and cultural traits. Any of these “confounding factors” could explain why bilinguals sometimes perform better in tests of attention or mental control, and very few studies satisfactorily account for them.

For that matter, Paap argues that it’s not clear what those tests are actually measuring. In his 2013 paper, he put his volunteers through four tests that are commonly used to study executive function. The scores on those tests should correlate with each other if they were actually measuring the same cognitive skill—but they didn’t. In some cases, the correlations were near zero.

“It was a hypothesis that many of us wanted to believe in,” he says. But he no longer does.

Paap’s review triggered 21 commentaries from other scientists, 15 of which were supportive. One, by Raymond Klein from Dalhousie University, is notable because he was a co-author on Bialystok’s seminal 2004 paper. “There were always aspects of the results that I was surprised with, mostly how big some of the effects were,” Klein tells me, “but I didn’t question whether those effects were right or not.”

He was writing positively about the bilingualism advantage until 2011, when he encouraged a student, Matt Hilchey, to review the studies that had accumulated in the previous seven years. “I think he was a little embarrassed,” Klein recalls. “He was meant to write a term paper for me, and I hadn’t suspected there’d be so much negative evidence.” Hilchey’s review highlighted the same problems as Paap’s—small studies, weak evidence, confounding factors—and came to the same conclusion.

Among the other responses to Paap’s paper, the leading bilingualism researchers, Bialystok included, are notable by their absence. “Most people in the field thought that it was inappropriate to answer [the review] it was so obviously badly done,” says Thomas Bak from the University of Edinburgh, one of the few who wrote a rejoinder.

Bialystok echoed his sentiments in an interview. For starters, the charge of publication bias is “utter nonsense,” she says. “Not every study coming out of every lab will get published. But is there insidious bias? Or suppression of relevant information? I see absolutely no evidence that there is.”

She also accuses Paap of selectively focusing on studies of young adults, who are least likely to show a bilingual advantage; they’re already at the peak of their cognitive powers and are unlikely to improve significantly further, bilingualism or no. When you look at what she describes as “the actual literature,” you see “an enormous amount of evidence from studies across the lifespan, using a wide variety of research methods, which show that bilingual minds and brains are significantly different from those of monolinguals.” (Paap counters that most of the existing studies were done with young adults. “There’s probably a higher percentage of bilingual advantage in studies with the elderly, but most of the more recent research shows no differences in that population either,” he says.)

“Paap is not a bilingualism researcher and he doesn’t understand the field,” says Bak. “He thinks that if you have an experiment, you should do it everywhere and get the same result. When you have something as complicated as bilingualism interacting with so many variables, you’d expect varying results, depending on the circumstances and populations.”

But Paap agrees that advantages might only turn up in some age groups, specific circumstances, or certain groups of bilinguals. It’s just that those conditions seem to vary from study to study. “I think we’ll eventually triangulate on a way to do these studies and produce consistent advantages, but however that comes out, it won’t yield results that are as exciting as the early ones,” he says. “And it won’t have the same impact on society.”

The one point Bialystok concedes is that existing tests of executive function aren’t up to scratch. “These tests are terrible and that’s not my fault,” she says. “We try not to use them anymore, and we’re trying to find better ways of testing these ideas.” But she argues that these flaws don’t undermine the existence of the bilingual advantage, which has been supported through a “growing, new, and very consistent” line of evidence: brain-scanning studies.

For example, she and others have shown that when bilinguals switch between languages, they activate parts of the brain involved in executive function, in a way that monolinguals do not. That’s evidence that bilingualism reorganizes these parts of the brain.

Sure, says Klein, but so what? A reorganized brain isn’t necessarily a superior one. “Brain data seems to carry so much weight, but we can’t infer an advantage from a difference in brain organization,” he says. To do that, you’d have to show that neural differences align with behavioral improvements and, as he and Paap have argued, the evidence for the latter is weak. “It’s as if the individuals who are keen on this hypothesis move from task to task,” he says. “Maybe these new ones are true, or maybe we’ll see they’re not consistent in showing advantages either.”

The debate has clearly become acrimonious. Paap says that he has been ignored; Bialystok feels she is being personally attacked. Perhaps, wrote Eric-Jan Wagenmakers from the University of Amsterdam, the two sides could form an “adversarial collaboration.” That is, they would work together to test the bilingual advantage once and for all, through a large study, perhaps involving many labs. The teams would pre-register all their experimental plans and agree to publish the results no matter what, so there could be no accusation of questionable research practices or publication bias. Then let the chips fall where they may. This approach has already been used with some success in other areas of psychology.

Paap is up for it; Bialystok is not. “That’s not how science works: You don’t sign a contract to not change anything in your protocol as the research evolves,” she says. “And they’ve set up so many toxins in this area that no good collaboration can result.” Bak feels similarly. “How much would I win by working with someone who doesn’t understand the area and is already biased?”

That makes no sense, says J. Bruce Morton, who studies executive function at the University of Western Ontario. “This isn’t like trying to decipher the entrails of some subatomic collisions that take place over a billionth of a second,” he says, referring to CERN. “We’re talking about psychology for crying out loud! If people were really committed to getting to the bottom of this, we’d get together, pool our resources, study it, and that would be the end of the issue. The fact that people won’t do that suggests to me that there are those who are profiting from either perpetuating the myth or the criticism.”

In light of the controversy, some researchers are giving up entirely. “When I talk to non-scientific friends, they ask me: Why do you keep on doing experiments?” says Duñabeitia. “If there is a difference, you’ll only find it in some places and some populations, so it’s not a global thing. Why don't you go and do something more important for the world? So, we’re stopping with this line of research.”

Perhaps none of this matters. As I’ve said, there are plenty of other advantages to being bilingual, whether you bring in executive function or not. But ultimately, this isn’t about whether it’s better to know more languages or not. It’s about how science is done, and what counts as decent evidence. It’s about the role of outsiders, and whether they’re best-placed to see through the biases that permeate a field, or incapable of judging it on technical grounds. And it’s about how researchers negotiate disagreements of opinion. It is deeply ironic that a topic that’s about shared language, mental control, and communication should have spawned a debate characterized by harsh words, flaring tempers, and a refusal to speak.