Many developmental psychologists buy into an argument that suggests that children are dumber than rats. Should you?
Human cognition is geared towards the central task of predicting the world around it. As you may remember from an earlier post I did on the A-not-B task in infants, children aren't born understanding causal relationships right off the bat -- as a kid, you need to learn that when batter goes into the oven, it comes out as cake; when a dog jumps in water, it comes out wet; and when a shaggy dog runs dripping through the house, mommy gets mad. As an adult, prediction operates in just about everything you do, from how much you drink at a party (who do you really want to be going home with?) to how hard you push down on the brakes (how fast do you need the car to stop?) to what you think I'm going to say next (yep, there's lots of evidence that you're predicting my words in a manner not wholly unlike Google auto-complete).
One thing that matters immensely in all of this is informativity. There are many illusory correlations in the world that you might forge -- how do you establish the causal links that matter and are meaningful?
A simple way to begin answering that question is by asking -- how do other animals do it? The behavior of rats in conditioning experiments proves illustrative. Say you take a rat, and every so often, you play a piano tone and give it a little **zap**! Pretty quickly, the rat will begin to react fearfully whenever it hears the tone, because it's predicting the upcoming shock. (Not too hard to learn that one, eh?) But next, let's say that you give another rat the same number of tone-shock pairings, but this time, you also occasionally throw in the odd note without shocking the rat. This rat won't be as skittish when it hears the tone sound, because the tone doesn't necessarily predict a shock. In line with this, the more you increase the number of tones-without-shocks, the less the rat will fear the tone. This is because you've introduced 'noise' into the signal, making the tone less informative about fur-singeing jolts. (This example comes courtesy of Bob Rescorla and one Prof Plum, who is ever so fond of mentioning it.)
The bottom line is that if a rat is trying to establish when to jump, it's going to want to be tracking both positive evidence (tone and shock together) and negative evidence (tone without shock). And this becomes doubly important if you complicate the learning problem, and introduce various types of tones and shocks, and relationships between the two. The important thing is: rats can do this kind of learning without any trouble at all. What might surprise you is that there's a huge debate in psychology over whether people can.
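The tone-shock logic above can be sketched with the classic Rescorla-Wagner update rule. This is a toy simulation, not data from any experiment -- the learning rate and trial counts are made-up values for illustration:

```python
# Rescorla-Wagner sketch: the associative strength V of the tone is nudged
# toward the outcome on every trial the tone is present -- toward 1 when
# the shock follows, toward 0 when the tone sounds alone.

def rescorla_wagner(trials, alpha=0.2):
    """trials: list of 1 (tone + shock) and 0 (tone alone)."""
    v = 0.0
    for outcome in trials:
        v += alpha * (outcome - v)   # prediction error drives learning
    return v

# Rat 1: 20 tone+shock pairings, nothing else.
reliable = rescorla_wagner([1] * 20)

# Rat 2: the same 20 pairings, interleaved with 20 tone-alone trials.
noisy = rescorla_wagner([1, 0] * 20)

print(round(reliable, 2))  # close to 1.0 -- tone reliably predicts shock
print(round(noisy, 2))     # much lower -- the tone is less informative
```

The tone-alone trials act as negative evidence: each one generates a prediction error that pulls the association back down, which is exactly why rat 2 ends up less skittish.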
To be fair, this debate only arises over language. Learning theoretic models are used widely across many disciplines in psychology and neuroscience, but are conspicuously absent from mainstream research into language acquisition. Why? Well -- there are Chomsky's many arguments that language cannot be learned from the input (that could be part of it). And in addition, there is this delightful argument that Steven Pinker and colleagues like to tout, which is misleadingly called the "logical problem of language acquisition".
I'm likely going to devote a series of posts to that particular 'problem' at some point -- but in short, the argument is that because children early on make grammatical 'mistakes' in their speech (e.g., saying 'mouses' instead of 'mice' or 'go-ed' instead of 'went'), and because they do not receive much in the way of corrective feedback from their parents (apparently no parent ever says "No, Johnny, for the last time it's MICE"), it must therefore be impossible to explain how children ever learn to correct these errors. How -- ask the psychologists -- could little Johnny ever possibly 'unlearn' these mistakes? This supposed puzzle is taken by many in developmental psychology to be one of a suite of arguments that have effectively disproved the idea that language can be learned without an innate grammar.
As Pinker wrote, rather famously:
"The implications of the lack of negative evidence for children’s overgeneralization are central to any discussion of learning, nativist or empiricist.” (Pinker, 2004)
This statement is, quite frankly, ridiculous, and betrays a complete lack of understanding of basic human learning mechanisms. (And you thought 'igon values' was bad...!)
To help you understand why, let's start off by making the (uncontroversial) assumption that -- like other young animals -- little kids are trying to figure out just what in their environment is informative, so they can better grasp (and predict) the workings of the world around them. It's easy to see how this pursuit might readily lend itself to language learning, since the more predictable upcoming speech is, the easier it is to make sense of. Indeed, it seems as though figuring out what things in the world predict which words, and which words predict which other words, would be a pretty fundamental aspect of what learning a language is all about.
In line with this, there's a growing body of evidence suggesting that expectation and prediction operate in both linguistic processing and production. So, if you're listening to someone speak, you are predicting -- probabilistically -- what they're going to say next (your brain is like Google Instant on crack). For example, if I say "hit the nail on the..." you can fill in head, and if I say "I'm coming down with a...", you can predict cold -- flu -- fever -- and so on, with varying degrees of certainty. What's more, the more you hear a word occupy a given context, the more strongly you will predict it in that context in the future (DeLong, Urbach & Kutas, 2005).
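For a crude flavor of how frequency shapes prediction, here's a toy next-word predictor over an invented three-sentence corpus (the corpus and the two-word context window are illustrative assumptions, not a model of the psycholinguistic evidence):

```python
# Count which word follows each two-word context, then turn the counts
# into prediction strengths: the more often a word fills a context, the
# more strongly it is predicted there.

from collections import Counter, defaultdict

corpus = [
    "hit the nail on the head",
    "hit the nail on the head",
    "hit the nail on the thumb",   # the odd exception
]

continuations = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(2, len(words)):
        continuations[tuple(words[i - 2:i])][words[i]] += 1

def predict(context):
    counts = continuations[tuple(context)]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(predict(["on", "the"]))  # 'head' is predicted strongly, 'thumb' weakly
```

Real language models and, on the evidence above, real brains do something far richer than counting bigrams, but the basic principle -- context frequency breeds expectation -- is the same.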
So -- we know rats do it. We know adult humans do it. But how does prediction help kids?
Well, let's say for example that a child is trying to figure out what the heck 'door' means. At first glance, this looks pretty difficult. For one thing, people don't usually go up to doors and point them out, Vanna-White style. "This, my darling, is a door." Nor is it the case that every time a door is in full view, the kid hears the word 'door.' (They're more apt to hear "Hi honey, I'm home!" or "Solicitors? Lord, not again!")
But here's precisely where informativity becomes important. Because even with a noisy signal, a child can use both positive and negative evidence to disambiguate which things in the world best predict the word 'door' (namely, doors). Turns out that 'Honey,' 'home' and 'solicitors' will all get used in contexts where there are no doors to be found. For example, mom might call dad 'Honey' at the movie theater or dad might call a 'solicitor' a dirty word over the phone. If the child had originally anticipated that the word 'door' might be used in any of those contexts, that idea will fast be binned; prediction-error will teach the child otherwise. It's like -- huh -- I was expecting something to happen, but it didn't. Quick, I need to revise those expectations! (Again, it's like the rat failing to be shocked -- when something fails to happen, that counts as evidence too).
Prediction and prediction-error will also be helpful as children learn how to use words. For example, if a child is learning how to talk about plural things (like rats, cats and so on), there will initially be a lot of evidence that groups-of-things take a +s ending. (There are far more regular plural nouns in English than irregular plurals, so the vast majority of the input will suggest this). It's no surprise then, that children initially 'over-regularize' plural words, and end up saying things like 'mouses' and 'gooses.' However, prediction can help children learn that 'mice' and 'geese' are actually preferred. How? Quite simply -- if the child is expecting 'mouses' or 'gooses,' her expectations will be violated every time she hears 'mice' and 'geese' instead. And clearly that will happen a lot. Over time, this will so weaken her expectation of 'mouses' and 'gooses,' that she will stop producing these kinds of words in context.
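As a back-of-the-envelope illustration of that unlearning story -- this is not the lab's actual model, and the starting weights and learning rate are invented -- here's how error-driven updating can drive out 'mouses' without a single explicit correction:

```python
# The child maintains expectations for two competing plural forms of
# 'mouse'. Every time the plural context comes up, she hears 'mice':
# the heard form is boosted, and the unheard competitor is weakened
# by the failed prediction -- implicit negative evidence at work.

def update(expectations, heard, alpha=0.1):
    for form in expectations:
        target = 1.0 if form == heard else 0.0
        expectations[form] += alpha * (target - expectations[form])

# Early on, the flood of regular '+s' evidence has left 'mouses'
# the stronger expectation (invented starting values).
expectations = {"mouses": 0.8, "mice": 0.1}

for _ in range(50):          # fifty encounters with 'mice' in context
    update(expectations, heard="mice")

print(expectations)  # 'mice' now dominates; 'mouses' has withered away
```

Note that nobody ever told the learner "'mouses' is wrong" -- the expectation simply starved to death on a diet of violated predictions.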
I should emphasize that I'm not just saying this is possible 'in theory.' The hive-mind in my lab --led by the buzzing Prof Plum-- has actually modeled learning in these kinds of word-learning scenarios and shown, in some pretty elegant behavioral experiments, that kids behave exactly as these learning models would predict.
But those eastern seaboard psychologists are having none of it. Evidence -- theory -- logic aside, Pinker (and others of his ilk) would claim that negative evidence simply doesn't exist! --and that even if it does, children simply can't learn from it.
This is a, erm, puzzling conclusion to draw. So -- we know that rats use prediction-error to learn -- and, what's more, it's pretty obvious that people do too; it's not as if we somehow fail to notice when something that we expect will happen doesn't.
The joke Prof Plum always tells about this is: imagine you were fixed up on a date and you went to the restaurant and the date didn't show. Would you need the waiter to come tell you that your date hadn't arrived? Or might you not notice yourself that something was amiss?
(You get the point.)
Yet for all this, what learning-denialist developmental psychologists would have you believe is that even though you use predictive learning mechanisms every day of your life -- to successfully navigate a busy sidewalk, to add just the right amount of milk to your coffee, and to understand what your friend is saying over a choppy cellphone signal -- even though all these things are undeniably true -- children, you should know, can use none of these mechanisms in learning language. No. Because when it comes to learning language, children, it would seem, are dumber than rats...
Well, I'm so glad we got that cleared up!
What I'm left wondering is how we somehow evolved a gene that specifies the workings of a complex innate grammar, while simultaneously switching off all our general-purpose learning mechanisms -- for language and children only. Fancy trick, that one.
[Access the follow-up discussion to this post here.]
In Support of...
"Evidently, the language faculty incorporates quite specific principles that lie well beyond any "general learning mechanisms," and there is good reason to suppose that it is only one of a number of such special faculties of mind. It is, in fact, doubtful that "general learning mechanisms," if they exist, play a major part in the growth of our systems of knowledge and belief about the world in which we live--our cognitive systems. ...It is fair to say that in any domain in which we have any understanding about the matter, specific and often highly structured capacities enter into the acquisition and use of belief and knowledge."
--Noam Chomsky in The Managua Lectures
These kinds of arguments have the same creationist flair that early Chomsky arguments did: It's 'impossible,' they say, it could 'only work this way.' Is that science? Or dogma?
 In the past, you may have had the experience of reading a highly technical text or listening to a very dense talk that you had the worst time trying to follow. Even if the words in play were ones that you were relatively familiar with, they may have been used in ways that were completely unfamiliar and unexpected, rendering them virtually incomprehensible. On the flip side, you may have had the experience of listening to something so predictable (and boring) that it put you to sleep. From the perspective of information theory, one of the aims of communication is to effectively manage the amount of 'uncertainty' (or 'entropy') in what's being communicated, such that the message is predictable enough to be understood, but not so predictable as to be boring.
 I phrase this actively, but we think that this kind of learning isn't conscious -- it's implicit.
 You might reasonably think that listening to language and quivering in anticipation of a shock seem like mighty different things -- they certainly are. But there's good neural evidence that our brains' reward systems respond similarly to surprise in our environment, whether it be an unexpected musical phrasing in a concerto, or an amusing word in context, or -- woe be the rat -- a painful and surprising sensation. The learning rate tends to look a lot different, of course...
Attribution where attribution's due: While the opinions discussed in this post are my own, many of the ideas -- about informativity and negative evidence -- are properly attributed to Michael Ramscar (the diligent Prof Plum). I even occasionally steal his jokes, it's true... The idea that so-called implicit negative evidence -- or 'prediction-error' -- might function in this way in language learning has been discussed by, among others, Elman, 1991; Bates & Carnevale, 1993; Rhode & Plaut, 1999; Seidenberg & MacDonald, 1999; Lewis & Elman, 2001; Pullum & Scholz, 2002; Prinz, 2002; Ramscar, 2002; Ramscar & Yarlett, 2007; Cowie, 2003; Johnson, 2004; MacWhinney, 2004; Hahn & Oaksford, 2008. Here's Jeff Elman's wonderful insight:
“If we accept that prediction or anticipation plays a role in language learning, then this provides a partial solution to what has been called Baker's paradox... The paradox is that children apparently do not receive (or ignore, when they do) negative evidence in the process of language learning. Given their frequent tendency initially to over-generalize from positive data, it is not clear how children are able to retract the faulty over-generalizations... However, if we suppose that children make covert predictions about the speech they will hear from others, then failed predictions constitute an indirect source of negative evidence which could be used to refine and retract the scope of generalization” (Elman, 1991).
Of course, you'd never know that this literature existed reading Pinker, Marcus, et al. When it comes to allowing (or addressing) new information, the other side of the debate looks suspiciously like East Germany before the wall fell. As far as they're concerned -- and in terms of how they cite the literature -- none of what I've just written exists, and neither do any of the articles cited above. (If you're going to play the Pinker, 2004 card -- don't. He devotes a single line to prediction error and he gets it wrong).
Remember when Gladwell wrote that article about Pinker being out there all alone on the lonely ice floe of IQ fundamentalism? It's like that with language -- except it's not a lonely ice floe. It's Antarctica. And there's still many a cold and disenchanted developmental psychologist stranded there. [[Help!]] 😉
Ramscar, M., Yarlett, D., Dye, M., Denny, K., & Thorpe, K. (2010). The Effects of Feature-Label-Order and their implications for symbolic learning. Cognitive Science, 34 (6), 909-957.
Ramscar, M., & Yarlett, D. (2007). Linguistic self-correction in the absence of feedback: A new approach to the logical problem of language acquisition. Cognitive Science, 31, 927-960.
Rescorla, R. (1988). Pavlovian conditioning: It's not what you think it is. American Psychologist, 43 (3), 151-160 DOI: 10.1037//0003-066X.43.3.151
Ramscar, M., & Dye, M. (2009). Expectation and error distribution in language learning: the curious absence of mouses in adult speech. (under review)