An argument is often made that similarities between languages (so-called "linguistic universals") provide strong evidence for the existence of an innate, universal grammar (UG) that is shared by all humans, regardless of language spoken. If language were not underpinned by such a grammar, it is argued, there would be endless (and extreme) variation, of the kind that has never been documented. Therefore -- the reasoning goes -- there simply must be design biases that shape how children learn language from the input they receive.
There are several potentially convincing arguments made in favor of innateness in language, but this, I think, is not one of them.
Why? Let me explain by way of an analogy from evolutionary biology:
Both bats and pterodactyls have wings, and both humans and squid have eyes, but neither pair shares a common ancestor that had these traits. Wings and eyes are classic examples of 'convergent' evolution -- traits that arose independently in separate lineages as 'optimal' solutions to similar environmental pressures. Convergent evolution has always struck me as a subversive evolutionary trick, because it demonstrates how external constraints can produce markedly similar adaptations from utterly different genetic stock. Not only that, but it upends our commonsense intuitions about how to classify the world around us, by revealing just how powerfully surface similarities can mask differences in origin.
When we get to language, then, it need not be surprising that many human languages have evolved similar means of efficiently communicating information. From an evolutionary perspective, this would simply suggest that various languages have, over time, 'converged' on many of the same solutions. This is made all the more plausible by the fact that every competent human speaker, regardless of language spoken, shares roughly the same physical and cognitive machinery, which dictates a shared set of drives, instincts, and sensory faculties, and a certain range of temperaments, response-patterns, learning faculties, and so on. In large part, we also share fairly similar environments -- indeed, the languages that linguists have found hardest to document are typically those of societies at the farthest remove from our own (take the Pirahã as a case in point).
The existence of similarities between languages, then, hardly speaks to the question of whether there is a "universal grammar." It could be the case, as UG proponents argue, that there are strong inbuilt linguistic biases that indelibly shape how children learn language. Or it could just as easily be the case that, given our working memory capacity, the slow development of our prefrontal cortex, the general learning mechanisms available to humans, the particular range of vocalizations suited to the human voicebox (and so on and so forth), there is simply a limited range of 'optimal' solutions for human communicative purposes.
To explore that possibility further, I would like to turn to a contemporary argument Steven Pinker and Ray Jackendoff make for universal grammar on the basis of linguistic universals. P&J -- in response to the BBS article "The Myth of Language Universals" -- argue that the sheer diversity of natural languages is not particularly impressive when we "consider the larger design space for conceivable languages." They then describe six hypothetical languages that they claim are not "obviously incompatible with cognitive limitations or communicative limitations," but which, of course, do not exist. That they do not exist, they imply, is evidence that there must be strong innate constraints on language learning.
This argument is weak for several reasons:
First -- the percentage of human languages that linguists have carefully studied is vanishingly small (particularly when we think about this from a historical perspective). An argument about "what might be possible, but has never been found" that rests on such a highly restricted data set is not particularly convincing.
Second -- and this is crucial -- the argument seems to presume that languages develop in isolation. Yet the opposite is true. Most languages that have been studied are fairly easily placed in historical family trees, and share common origins with a large cluster of related others. What's more, as the English language so aptly illustrates, many languages have seen 'incursions' from other languages, which promote change and borrowing. Stealing a page from the biology book, we can safely conclude that most extant human languages share both homologous and analogous features with other languages: that is, they share features derived from common origins (or borrowed from each other), and they share features that likely evolved in parallel. But these broadly defined structural similarities bring us no closer to the question of whether or not there is a universal grammar. Certainly their existence does not necessitate one.
This is because the evolution of languages is, in many ways, clearly divorced from the evolution of the human genome. As the birth of creoles neatly illustrates, substantial and rapid language change can occur in the total absence of genetic change. The same is likely true of every aspect of shared human culture. For example, the accelerating advance of civilization and technology over the last hundred years has come without any attendant advance in human genetics. Or take reading as another example: there are no obvious genetic changes to explain why it evolved or just how we do it. No -- reading is a cultural, not a genetic, innovation, and to engage with it we recruit (and recycle) neural resources that would otherwise be devoted to visual processing.
What we make with our hands and our lips evolves in the hands and lips of others, and those of their descendants, and those of our own. And quite independently -- as if it had taken on a life of its own.
If we take the argument from innate constraints seriously, then we end up presuming something like the following: either (a) every human population independently selected for the gene (or suite of genes) that specified the hypothesized universals; (b) language appeared "full blown" before the exodus out of Africa and has changed remarkably little since; or (c) in every case of human contact, the supposed gene-variants swept through the pre-linguistic population like wildfire. Pick and choose among these, and we can handily devise a just-so story on the basis of genetics.
Or, should we look to the alternative, we might say that, given some basic facts about human cognition, development, and environment, and given the contact between human populations over millennia, various languages evolved similar solutions to efficient communication, and various human communities chose similar ways of divvying up the world in language. Of course, this is a just-so story too.
But really, we should reformulate these stories as hypotheses. Here, there are two: it could be this way or it could be that way (and probably there are many more ways than I've mentioned). Whatever the case, each possibility is empirical. In the first, our focus is on genetics: we need to isolate which genes are language-specific (if such genes exist) and then track how they've changed over time. In the second, our focus is on information and language change: we need to look to both extant and historical languages to tell us how languages have managed information in more or less optimal ways, and how these 'solutions' have changed over time and across languages. Here, our aim must be to examine both whether changes tend toward more optimal solutions, and whether such changes occur "convergently" (i.e., in unrelated languages).
Note, however, that in either case the argument that P&J make from the "larger design space" gets us practically nowhere; it's a just-so story that needs to be refashioned as a testable hypothesis. To better illustrate this, let us look for a moment at four of the six "conceivable" languages they propose, and see how conceivable they really are.
Abba specifies grammatical relations not in terms of agent, patient, theme, location, goal, and so on, but in terms of evolutionarily significant relationships: predator-prey, eater-food, enemy-ally, permissible-impermissible sexual partners, and so on. All other semantic relations are metaphorical extensions of these.
Semantics is one area where we would almost certainly expect to find fundamental similarities between human languages. Why? Because humans are speaking them. The ways in which humans divvy up the world (distinguishing, for example, self from others, time from space, and so on) almost certainly have to do with the workings of human cognition. If dogs were capable of communication, we would expect them to make categorical distinctions of a different kind (e.g., car-rides and walks might fall under the category of "awesome things to do," whereas baths and skunks under "really quite crap.")
Bacca resembles the two-word stage in children’s language development. Speakers productively combine words to express basic semantic relations like recurrence, absence, and agent-patient, but no utterance contains more than two morphemes. Listeners use the context to disambiguate them.
The speakers of Cadda have no productive capacity: they draw on a huge lexicon of one-word holophrases and memorized formulas and idioms. New combinations are occasionally coined by a shaman or borrowed from neighboring groups.
All of these examples would be sub-optimal from an information-theoretic or learning perspective, and would not be selected for as languages changed and evolved. For ease of understanding, we can think of this simply in terms of the rational goals of a communicative system: to transmit information as quickly, efficiently, and effectively as possible.
Bacca does not meet this criterion: if no utterance contained more than two morphemes, the amount of context needed to disambiguate the communicated meaning would be massive. (Ever notice that toddlers often struggle to make themselves understood? This is, in part, why.) In information-theoretic terms, the "entropy" -- or residual ambiguity -- in the signal would be far too high.
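To make the intuition concrete, here is a back-of-the-envelope sketch in Python. The vocabulary size and the number of expressible meanings are invented for illustration; the point is only to show how capping utterance length pushes disambiguation work onto context.

```python
import math

def residual_entropy_bits(num_meanings, vocab, max_morphemes):
    # Count the distinct utterances available with at most `max_morphemes`
    # morphemes drawn from a vocabulary of size `vocab` (order matters).
    utterances = sum(vocab ** k for k in range(1, max_morphemes + 1))
    if utterances >= num_meanings:
        return 0.0  # every meaning can, in principle, get its own utterance
    # If meanings are spread evenly over utterances, each utterance is
    # ambiguous between ~num_meanings/utterances readings; context must
    # supply roughly log2 of that many bits to disambiguate.
    return math.log2(num_meanings / utterances)

# Invented numbers: a toddler-sized vocabulary of 200 morphemes, and a
# (conservatively small) space of 10 million propositions a speaker
# might want to express.
for cap in (1, 2, 3, 4):
    bits = residual_entropy_bits(10_000_000, 200, cap)
    print(f"max {cap} morphemes -> {bits:.1f} bits left to context")
```

On these toy numbers, a two-morpheme cap leaves several bits of uncertainty for context to resolve on every utterance, while allowing even a couple more morphemes drives the residual ambiguity toward zero -- which is the sense in which Bacca is information-theoretically sub-optimal.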
Fagga does not meet this criterion either: it would require that its speakers be capable of producing -- and aurally discriminating -- as many phonemes as there are words in the language (to give you an idea of what would be required, there are several hundred thousand words in English). The learning problem for children acquiring Fagga would be massive; just think about how many children learning English have articulatory problems (lisping, or confusing d/t, for example). How much more difficult would it be to learn a language that required the child to distinguish not several dozen phonemes, but thousands? This is a simple learning problem, not evidence for innate constraints.
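The same point can be made with simple counting. The sketch below (the English lexicon figure is a rough order-of-magnitude guess, and phonotactic restrictions are ignored) shows why sequencing a small phoneme inventory easily covers an English-sized vocabulary, while a Fagga-like scheme needs an inventory as large as the lexicon itself.

```python
def forms(phonemes, max_len):
    # Distinct word-forms available with `phonemes` contrastive sounds
    # and words up to `max_len` phonemes long (ignoring phonotactics).
    return sum(phonemes ** k for k in range(1, max_len + 1))

def min_inventory(lexicon_size, max_len):
    # Smallest phoneme inventory that keeps `lexicon_size` words distinct.
    p = 1
    while forms(p, max_len) < lexicon_size:
        p += 1
    return p

LEXICON = 300_000  # rough order of magnitude for English

# With words up to 8 phonemes long, a handful of phonemes would suffice
# in principle (real languages use a few dozen, with room to spare):
print(min_inventory(LEXICON, 8))

# A Fagga-like language, where each word is a single distinct sound,
# needs an inventory as large as the lexicon:
print(min_inventory(LEXICON, 1))
```

The combinatorics of sequencing is doing all the work here: it lets a small, learnable sound inventory name an enormous lexicon, which is exactly the efficiency Fagga throws away.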
Cadda is similarly implausible: at a glance, it resembles an avian communication system -- one in which an established list of calls (and/or songs) is used communicatively. But this is a poor fit with what we know about human learning and innovation. Surely we can invent new tools; why, when it comes to language, would we be unable to invent new words? (If language is "productive," so is all of human culture -- 'productivity,' at least as so understood, is hardly language-specific.)
So there you have it: these examples aren't good ones. But perhaps P&J could think of better ones. I don't think it would matter; the argument's structure is still shot through.
To see why, let's push their reasoning a little harder; let's extend it to another domain.
If you would -- stop for a moment and imagine all the 'possible' animal and plant species that might have existed on planet Earth if chance and fate had so conspired. The list of imaginables is infinite, while the list of the real is finite. Against the backdrop of infinite possibility, the diversity of species on this planet may seem somehow less 'remarkable' (and P&J would say the same about language and linguistic diversity). Conceptually, perhaps, we could buy into this kind of claim. But what would it have revealed to us? Why, nothing at all -- for it is just another way of characterizing the same thing; it is merely a new manner of looking at and turning over a familiar question.
Let us suppose, though, that we mistook this insight as profound, and carried it to still further conclusions -- such that we argued that all of our planet's flora and fauna were underwritten by the same design code, and that the design's particular expression (as an ape, or a dandelion) simply masked the underlying similarities. To some extent, I suppose, this would be "true": all life on earth shares the same chemical building blocks. But what has this told us about the ape or the dandelion? Merely something trivial: that apes and dandelions are creatures of our planet.
What linguistic universals tell us about language is similarly trivial: that languages are, by design, spoken by humans. But this has not thereby told us anything about humans -- whether they come with language acquisition devices or no; whether they are stocked with innate concepts or merely with the desire to think they are; whether there is truly a hardwired "tool kit" (as P&J are wont to believe) that strongly biases the development of certain grammars. (A lovely theory, and yet -- almost certainly unfalsifiable.)
They drum it into you in statistics: "Correlation does not imply causation." With 'universals,' we're still lost searching for the cause.
For the curious
I should note that the existence of true linguistic universals is heavily contested in comparative linguistics. In recent years, the most high-profile paper on this topic is the BBS article "The Myth of Language Universals" by Nicholas Evans and Stephen Levinson (2009). For a rather more aggressive articulation, you may be interested in the (recently deceased) linguist Larry Trask, a specialist in Basque. The Guardian (2003) summarized his stance thusly:
"He rejects the specific, and controversial theories of Chomsky and his followers - that the brain encodes some kind of "universal grammar", which underlies all the languages that anyone speaks. Trask says any competent linguist can find counter-examples to all the rules Chomskyans propose. "I have no time for Chomskyan theorising and its associated dogmas of 'universal grammar'. This stuff is so much half-baked twaddle, more akin to a religious movement than to a scholarly enterprise. I am confident that our successors will look back on UG as a huge waste of time. I deeply regret the fact that this sludge attracts so much attention outside linguistics, so much so that many non-linguists believe that Chomskyan theory simply is linguistics, that this is what linguistics has to offer, and that UG is now an established piece of truth, beyond criticism or discussion. The truth is entirely otherwise."
While the first writing system was invented almost 6,000 years ago in Mesopotamia, literacy rates were extremely low in most countries up until the Industrial Revolution. In some instances, that carried over into the 20th century: for example, the literacy rate in India was just 12% in 1947. In more recent times, the lowest literacy rates are observed in sub-Saharan Africa and southeast Asia. For more on reading, check out "Your Brain on Books," a Scientific American interview with neuroscientist Stanislas Dehaene.
There is some fascinating -- albeit highly controversial -- work in historical linguistics linking language change to population genetics (see, e.g., Joseph Greenberg, Merritt Ruhlen & Luigi Luca Cavalli-Sforza). However, this work does not speak to the problem of specifying how hypothesized language-specific genes have changed over time, and among (or between) populations. Indeed, such 'language-specific genes' -- which presumably specify the 'universals' that are said to make up a universal grammar -- have yet to be discovered.
An example of an information-theoretic approach to understanding language:
"For language production, there are at least two pressures that speakers have to balance to achieve effective communication. On the one hand, speakers want to successfully convey a message (where by message, we do not mean the literal message, but whatever set of directly or indirectly intended effects the speaker wants to achieve). On the other hand, speakers need to produce language efficiently. Pressure for efficient communication may come from several sources, such as limited attentional or memory resources, or other interlocutors who are competing for the ground (i.e. a speaker may be interrupted if information is conveyed too inefficiently). A rational production system, then, is one which maximizes the likelihood of efficient and successful communication, taking into account the limitations imposed by the speaker, listener, and environment." (Frank & Jaeger, 2008)
Information-theoretic approaches would be particularly interesting in the case of implicational universals (i.e., if a language has feature A, it must also have feature B). It may be that having feature A is an optimal solution only if accompanied by feature B.
Here are the two other examples P&J give:
The grammar of Daffa corresponds to quantificational logic, distinguishing only predicates and arguments, and using morphological markers for quantifiers, parentheses, and variables.
If we stick to the metaphor that natural languages operate like 1950s computer software, I suppose we could ask why they tend to resemble some kinds of software programs and not others. But if you don't buy into that metaphor (and there are many good reasons not to), this example is tantamount to saying "dogs could be more like cats, but shockingly, they're not."
Gahha is a musical language, which uses melodic motifs for words, major and minor keys for polarity, patterns of tension and relaxation for modalities, and so on.
Well, I do find this example intriguing! (Though I'm guessing Pinker didn't; he famously called music "auditory cheesecake" and an "evolutionary accident.") Language and music are similar along quite a number of interesting dimensions: for one, we appear to learn musical conventions and pitch categories in similar ways to how we learn verbal distributions and phonemic categories; for another, (trained) listeners can predict upcoming notes in music in much the same way that they can predict upcoming words in speech. So why aren't more sophisticated musical structures part of language? This one I can't answer with any certainty. But it might have something to do with the incidence of tone-deafness in the general population (about 5%), or with the vocal and melodic abilities that would be required for such a language, which not all humans possess. In any case, there are many reasonable explanations that do not necessitate a universal grammar.
Evans, N., & Levinson, S. C. (2009). The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences, 32(5). PMID: 19857320
Pinker, S., & Jackendoff, R. (2009). The reality of a universal language faculty. Behavioral and Brain Sciences, 32(5). DOI: 10.1017/S0140525X09990720