In the long term and in immersion contexts, second-language (L2) learners starting acquisition early in life – and staying exposed to input and thus learning over several years or decades – undisputedly tend to outperform later learners. Apart from being misinterpreted as an argument in favour of early foreign language instruction, which takes place in wholly different circumstances, this general age effect is also sometimes taken as evidence for a so-called ‘critical period’ (cp) for second-language acquisition (sla). Derived from biology, the cp concept was famously introduced into the field of language acquisition by Penfield and Roberts in 1959 and was refined by Lenneberg eight years later. Lenneberg argued that language acquisition needed to take place between age two and puberty – a period which he believed to coincide with the lateralisation process of the brain. (More recent neurological research suggests that different time frames exist for the lateralisation process of different language functions. Most, however, close before puberty.) However, Lenneberg mostly drew on findings pertaining to first language development in deaf children, feral children or children with serious cognitive impairments in order to back up his claims. For him, the critical period concept was concerned with the implicit “automatic acquisition” [2, p. 176] in immersion contexts and did not preclude the possibility of learning a foreign language after puberty, albeit with much conscious effort and typically less success.
sla research adopted the critical period hypothesis (cph) and applied it to second and foreign language learning, resulting in a host of studies. In its most general version, the cph for sla states that the ‘susceptibility’ or ‘sensitivity’ to language input varies as a function of age, with adult L2 learners being less susceptible to input than child L2 learners. Importantly, the age–susceptibility function is hypothesised to be non-linear. Moving beyond this general version, we find that the cph is conceptualised in a multitude of ways . This state of affairs requires scholars to make explicit their theoretical stance and assumptions , but has the obvious downside that critical findings risk being mitigated as posing a problem to only one aspect of one particular conceptualisation of the cph, whereas other conceptualisations remain unscathed. This overall vagueness concerns two areas in particular, viz. the delineation of the cph's scope and the formulation of testable predictions. Delineating the scope and formulating falsifiable predictions are, needless to say, fundamental stages in the scientific evaluation of any hypothesis or theory, but the lack of scholarly consensus on these points seems to be particularly pronounced in the case of the cph. This article therefore first presents a brief overview of differing views on these two stages. Then, once the scope of their cph version has been duly identified and empirical data have been collected using solid methods, it is essential that researchers analyse the data patterns soundly in order to assess the predictions made and that they draw justifiable conclusions from the results. As I will argue in great detail, however, the statistical analysis of data patterns as well as their interpretation in cph research – and this includes both critical and supportive studies and overviews – leaves a great deal to be desired. 
Reanalysing data from a recent cph-supportive study, I illustrate some common statistical fallacies in cph research and demonstrate how one particular cph prediction can be evaluated.
Delineating the scope of the critical period hypothesis
First, the age span for a putative critical period for language acquisition has been delimited in different ways in the literature . Lenneberg's critical period stretched from two years of age to puberty (which he posits at about 14 years of age) , whereas other scholars have drawn the cutoff point at 12, 15, 16 or 18 years of age . Unlike Lenneberg, most researchers today do not define a starting age for the critical period for language learning. Some, however, consider the possibility of the critical period (or a critical period for a specific language area, e.g. phonology) ending much earlier than puberty (e.g. age 9 years , or as early as 12 months in the case of phonology ).
Second, some vagueness remains as to the setting that is relevant to the cph. Does the critical period constrain implicit learning processes only, i.e. only the untutored language acquisition in immersion contexts or does it also apply to (at least partly) instructed learning? Most researchers agree on the former , but much research has included subjects who have had at least some instruction in the L2.
Third, there is no consensus on what the scope of the cp is as far as the areas of language are concerned. Most researchers agree that a cp is most likely to constrain the acquisition of pronunciation and grammar and, consequently, these are the areas primarily looked into in studies on the cph. Some researchers have also tried to define distinguishable cps for the different language areas of phonetics, morphology and syntax and even for lexis (see  for an overview).
Fourth and last, research into the cph has focused on ‘ultimate attainment’ (ua) or the ‘final’ state of L2 proficiency rather than on the rate of learning. From research into the rate of acquisition (e.g. –), it has become clear that the cph cannot hold for the rate variable. In fact, it has been observed that adult learners proceed faster than child learners at the beginning stages of L2 acquisition. Though theoretical reasons for excluding the rate can be posited (the initial faster rate of learning in adults may be the result of more conscious cognitive strategies rather than of less conscious implicit learning, for instance), rate of learning might from a different perspective also be considered an indicator of ‘susceptibility’ or ‘sensitivity’ to language input. Nevertheless, contemporary sla scholars generally seem to concur that ua and not rate of learning is the dependent variable of primary interest in cph research. These and further scope delineation problems relevant to cph research are discussed in more detail by, among others, Birdsong, DeKeyser and Larson-Hall, Long, and Muñoz and Singleton.
Formulating testable hypotheses
Once the relevant cph's scope has satisfactorily been identified, clear and testable predictions need to be drawn from it. At this stage, the lack of consensus on what the consequences or the actual observable outcome of a cp would have to look like becomes evident. As touched upon earlier, cph research is interested in the end state or ‘ultimate attainment’ (ua) in L2 acquisition because this “determines the upper limits of L2 attainment” [9, p. 10]. The range of possible ultimate attainment states thus helps researchers to explore the potential maximum outcome of L2 proficiency before and after the putative critical period.
One strong prediction made by some cph exponents holds that post-cp learners cannot reach native-like L2 competences. Identifying a single native-like post-cp L2 learner would then suffice to falsify all cphs making this prediction. Assessing this prediction is difficult, however, since it is not clear what exactly constitutes sufficient nativelikeness, as illustrated by the discussion on the actual nativelikeness of highly accomplished L2 speakers. Indeed, there exists a real danger that, in a quest to vindicate the cph, scholars set the bar for L2 learners to match monolinguals ever higher – up to Swiftian extremes. Furthermore, the usefulness of comparing the linguistic performance in mono- and bilinguals has been called into question. Put simply, the linguistic repertoires of mono- and bilinguals differ by definition and differences in the behavioural outcome will necessarily be found, if only one digs deep enough.
A second strong prediction made by cph proponents is that the function linking age of acquisition and ultimate attainment will not be linear throughout the whole lifespan. Before discussing what this function would have to look like in order for it to constitute cph-consistent evidence, I point out that the ultimate attainment variable can essentially be considered a cumulative measure dependent on the actual variable of interest in cph research, i.e. susceptibility to language input, as well as on such other factors as duration and intensity of learning (within and outside a putative cp) and possibly a number of other influencing factors. To elaborate, the behavioural outcome, i.e. ultimate attainment, can be assumed to be integrative to the susceptibility function, as Newport correctly points out. Other things being equal, ultimate attainment will therefore decrease as susceptibility decreases. However, decreasing ultimate attainment levels in and by themselves represent no compelling evidence in favour of a cph. The form of the integrative curve must therefore be predicted clearly from the susceptibility function. Additionally, the age of acquisition–ultimate attainment function can take just about any form when other things are not equal, e.g. duration of learning (Does learning last up until time of testing or only for a more or less constant number of years or is it dependent on age itself?) or intensity of learning (Do learners always learn at their maximum susceptibility level or does this intensity vary as a function of age, duration, present attainment and motivation?). The integral of the susceptibility function could therefore be of virtually unlimited complexity and its parameters could be adjusted to fit any age of acquisition–ultimate attainment pattern. It therefore seems astonishing that the distinction between level of sensitivity to language input and level of ultimate attainment is rarely made in the literature. 
Implicitly or explicitly , the two are more or less equated and the same mathematical functions are expected to describe the two variables if observed across a range of starting ages of acquisition.
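The distinction between susceptibility and ultimate attainment can be made concrete with a small sketch. It assumes, purely for illustration, that learning proceeds at full susceptibility for a fixed duration $D$ after the onset of acquisition; the symbols $s$ (susceptibility as a function of age $t$), $D$ and $\mathrm{ua}$ are labels introduced here and are not taken from the studies discussed.

```latex
\mathrm{ua}(\mathrm{aoa}) = \int_{\mathrm{aoa}}^{\mathrm{aoa} + D} s(t)\,\mathrm{d}t,
\qquad
\frac{\mathrm{d}\,\mathrm{ua}}{\mathrm{d}\,\mathrm{aoa}} = s(\mathrm{aoa} + D) - s(\mathrm{aoa}).
```

Under these assumptions, the slope of the aoa–ua function reflects the difference in susceptibility between the end and the start of the learning period rather than the susceptibility level itself, so the two curves need not share the same functional form. Letting duration or intensity vary with age would change the integral further.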
But even when the susceptibility and ultimate attainment variables are equated, there remains controversy as to what function linking age of onset of acquisition and ultimate attainment would actually constitute evidence for a critical period. Most scholars agree that not any kind of age effect constitutes such evidence. More specifically, the age of acquisition–ultimate attainment function would need to be different before and after the end of the cp. According to Birdsong , three basic possible patterns proposed in the literature meet this condition. These patterns are presented in Figure 1. The first pattern describes a steep decline of the age of onset of acquisition (aoa)–ultimate attainment (ua) function up to the end of the cp and a practically non-existent age effect thereafter. Pattern 2 is an “unconventional, although often implicitly invoked” [9, p. 17] notion of the cp function which contains a period of peak attainment (or performance at ceiling), i.e. performance does not vary as a function of age, which is often referred to as a ‘window of opportunity’. This time span is followed by an unbounded decline in ua depending on aoa. Pattern 3 includes characteristics of patterns 1 and 2. At the beginning of the aoa range, performance is at ceiling. The next segment is a downward slope in the age function which ends when performance reaches its floor. Birdsong points out that all of these patterns have been reported in the literature. On closer inspection, however, he concludes that the most convincing function describing these age effects is a simple linear one. Hakuta et al.  sketch further theoretically possible predictions of the cph in which the mean performance drops drastically and/or the slope of the aoa–ua proficiency function changes at a certain point.
Figure 1. Three possible critical period effects.
Although several patterns have been proposed in the literature, it bears pointing out that the most common explicit prediction corresponds to Birdsong's first pattern, as exemplified by the following crystal-clear statement by DeKeyser, one of the foremost cph proponents:
[A] strong negative correlation between age of acquisition and ultimate attainment throughout the lifespan (or even from birth through middle age), the only age effect documented in many earlier studies, is not evidence for a critical period…[T]he critical period concept implies a break in the AoA–proficiency function, i.e., an age (somewhat variable from individual to individual, of course, and therefore an age range in the aggregate) after which the decline of success rate in one or more areas of language is much less pronounced and/or clearly due to different reasons. [22, p. 445].
DeKeyser and before him among others Johnson and Newport  thus conceptualise only one possible pattern which would speak in favour of a critical period: a clear negative age effect before the end of the critical period and a much weaker (if any) negative correlation between age and ultimate attainment after it. This ‘flattened slope’ prediction has the virtue of being much more tangible than the ‘potential nativelikeness’ prediction: Testing it does not necessarily require comparing the L2-learners to a native control group and thus effectively comparing apples and oranges. Rather, L2-learners with different aoas can be compared amongst themselves without the need to categorise them by means of a native-speaker yardstick, the validity of which is inevitably going to be controversial . In what follows, I will concern myself solely with the ‘flattened slope’ prediction, arguing that, despite its clarity of formulation, cph research has generally used analytical methods that are irrelevant for the purposes of actually testing it.
Inferring non-linearities in critical period research: An overview
In this section, I present a non-exhaustive overview of studies that have either claimed to have found evidence relevant to the ‘flattened slope’ prediction or that have been cited by others in this context. These studies can be split up into three broad and partially overlapping categories. The first category consists of studies in which statistical tools to compare means or proportions, e.g. $t$- and $\chi^2$-tests and anovas, were used. Studies in which the correlation coefficients of the aoa–ua relationship were compared between younger and older arrivals make up the second category. Lastly, studies in the third category used regression methods to address the ‘flattened slope’ prediction. I will demonstrate that the analyses used in the first two categories rest on statistical fallacies, rendering them useless for the purposes of addressing the ‘flattened slope’ prediction. Regression models, I argue, present the only valid alternative, provided they are fitted correctly and interpreted judiciously.
Group mean or proportion comparisons
The first broad category consists of studies in which the aoa continuum is discretised into bins (e.g. aoa 3–7, 8–10, 11–15 and 17–39 years in a study by Johnson and Newport), whose ua scores or nativelikeness ratings are subsequently compared with one another and sometimes with those of native speakers using a series of $t$- or $\chi^2$-tests or an anova. Inferences about discontinuities in the aoa–ua function are then made on the basis of whether such comparisons reach significance or not. (To prevent any misunderstandings, note that the terms ‘discontinuity’ and ‘non-continuity’ are often used in cph research, even though the predicted patterns (see Figure 1) do not contain discontinuities in the mathematical sense. In mathematics, a discontinuity is a ‘jump’ in the function.) A fairly recent paper by Abrahamsson and Hyltenstam is a case in point. The authors split up the aoa continuum into five bins (aoa up to 5, 6–11, 12–17, 18–23 and 24–47 years), carried out an anova with pairwise post-hoc tests on nativelikeness ratings and inferred the presence of a critical point in adolescence on the basis thereof:
[T]he main differences can be found between the native group and all other groups – including the earliest learner group – and between the adolescence group and all other groups. However, neither the difference between the two childhood groups nor the one between the two adulthood groups reached significance, which indicates that the major changes in eventual perceived nativelikeness of L2 learners can be associated with adolescence. [15, p. 270].
Similar group comparisons aimed at investigating the effect of aoa on ua have been carried out by both cph advocates and sceptics (among whom Bialystok and Miller [25, pp. 136–139], Birdsong and Molis [26, p. 240], Flege [27, pp. 120–121], Flege et al. [28, pp. 85–86], Johnson [29, p. 229], Johnson and Newport [23, p. 78], McDonald [30, pp. 408–410] and Patkowski [31, pp. 456–458]). To be clear, not all of these authors drew direct conclusions about the aoa–ua function on the basis of these group comparisons, but their group comparisons have been cited as indicative of a cph-consistent non-continuous age effect, as exemplified by the following quote by DeKeyser:
Where group comparisons are made, younger learners always do significantly better than the older learners. The behavioral evidence, then, suggests a non-continuous age effect with a “bend” in the AoA–proficiency function somewhere between ages 12 and 16. [22, p. 448].
The first problem with group comparisons like these and drawing inferences on the basis thereof is that they require that a continuous variable, aoa, be split up into discrete bins. More often than not, the boundaries between these bins are drawn in an arbitrary fashion, but what is more troublesome is the loss of information and statistical power that such discretisation entails (see  for the extreme case of dichotomisation). If we want to find out more about the relationship between aoa and ua, why throw away most of the aoa information and effectively reduce the ua data to group means and the variance in those groups?
Second, I strongly suspect that the underlying assumption when using $t$- and $\chi^2$-tests and anovas to infer the shape of the underlying aoa–ua function is one of the gravest fallacies in all of inferential statistics: the belief that non-significant test results indicate that the group means or proportions are essentially identical. To quote Schmidt, this notion is “the most devastating of all to the research enterprise” [33, p. 126]. Yet, judging by the snippet quoted above, Abrahamsson and Hyltenstam's reasoning seemed to be that the lack of a statistical difference between the childhood groups and between the adulthood groups indicates that these groups perform at roughly the same level, whereas the presence of a statistical difference between the adolescence group and all other groups indicates a steep drop in perceived nativelikeness. Such reasoning ignores the issue that when the default null hypothesis of no difference is adopted as or integrated into the research hypothesis, the statistical power of the tests, i.e. the probability of finding a statistically significant difference when the actual population means differ by a prespecified minimum effect size, should be substantially higher than what tends to be the case in the social sciences.
In order to illustrate the gravity of this problem, I computed the power that Abrahamsson and Hyltenstam would actually have had to detect a significant difference between their two childhood groups (, ) if the underlying population effect size had, in fact, been medium-sized ($d = 0.5$, see ). These power computations were carried out with the pwr.t2n.test() function in the pwr package for R. (R is an open source program and programming language for statistical computing and can be downloaded freely from http://www.r-project.org/. All add-on packages used for the analyses in this paper can be installed from within R, see the ‘supporting information’ section. For a highly accessible introductory text to power analysis, see Cohen's Power primer.) It turned out that Abrahamsson and Hyltenstam's power was about 0.73 assuming a two-tailed $t$-test with $\alpha$ fixed at 0.05. While this is better than what is typically found in social science papers, it still means that in 27% of cases, even a medium-sized effect would have gone undetected. Since Abrahamsson and Hyltenstam used post-hoc tests that corrected the individual $\alpha$-levels downwards to maintain the familywise Type I error rate, their actual power was even lower. To clarify, I am not arguing against maintaining the familywise $\alpha$ level; the point is merely that these power computations are generous. In the case of Johnson and Newport's oft-cited study, which claimed that participants with aoas between 3 and 7 years () did not behave differently from native speakers () and on that basis surmised the presence of a non-continuity, this lack of power is even more pronounced at a mere 0.20, assuming a medium-sized effect size and a two-tailed $t$-test with $\alpha$ fixed at 0.05. This means that in a whopping 80% of cases a medium-sized effect would have gone undetected. 
Note that Sedlmeier and Gigerenzer suggest that researchers have a power level of 0.95 before they accept null hypotheses, which is equivalent to the typical requirement of needing a $p$-value lower than 0.05 before rejecting the null hypothesis in favour of a non-null research hypothesis, but which would require about 105 participants per group (assuming $d = 0.5$).
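The logic of these power computations can be sketched in a few lines. The sketch below is not the pwr.t2n.test() routine used for the figures above: it relies on a normal approximation to the two-sample $t$-test, which slightly overstates power in small samples, and it takes a medium-sized effect to be Cohen's $d = 0.5$. The group sizes in the loop are hypothetical and chosen only for illustration; the function names are likewise introduced here.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power(n1, n2, d, alpha=0.05):
    """Approximate power of a two-tailed two-sample test (normal approximation)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Noncentrality: standardised difference scaled by the effective sample size.
    delta = d * sqrt(n1 * n2 / (n1 + n2))
    # Probability of landing in either rejection region when the effect is d.
    return STD_NORMAL.cdf(delta - z_crit) + STD_NORMAL.cdf(-delta - z_crit)


def approx_n_per_group(d, power, alpha=0.05):
    """Approximate equal group size needed to reach the requested power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / d) ** 2)


if __name__ == "__main__":
    # Hypothetical group sizes, not taken from any of the studies discussed.
    for n1, n2 in [(10, 10), (50, 50), (100, 100)]:
        print(n1, n2, round(approx_power(n1, n2, d=0.5), 2))
    # Group size needed for a power of 0.95 with a medium-sized effect.
    print(approx_n_per_group(d=0.5, power=0.95))
```

Under this approximation, the group size needed for a power of 0.95 comes out close to the figure of about 105 per group mentioned above; an exact computation based on the noncentral $t$ distribution gives a slightly larger number than the normal approximation.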
Thus, within an ‘orthodox’ frequentist framework, group mean or proportion comparisons are fine for establishing that a difference does likely exist between two groups (though subject to a host of caveats, see – and many others), but using them to infer that a difference does not exist is highly suspect. The only reliable inference that they by themselves allow in cph research is that younger learners tend to outperform older learners in some domains of language (e.g. pronunciation and syntax), which all scholars involved in the debate essentially agree on. In sum, inferring the precise shape of a bivariate relationship using $t$-tests, anovas or $\chi^2$-tests is at the very least cumbersome and prone to errors.
Comparison of correlation coefficients
The second broad category, which is not mutually exclusive with the first category, consists of studies that address the discontinuity hypothesis by computing and comparing correlation coefficients between aoa and ua for two or more aoa subgroups. In a sense, this approach represents an improvement over group mean or proportion comparisons as the aoa data are treated as a continuous variable. Nevertheless, this approach, too, rests on a fallacious assumption, namely that differences in correlation coefficients are indicative of differences in slopes. I suspect that the correlation-based approach dates back to Johnson and Newport's 1989 study, in which they split up their participants into two aoa-defined groups and found that ua as measured using a gjt correlated strongly and significantly with aoa in the early arrivals (age 3–15, , ) but not in the older arrivals (age 17–39, , ). Johnson and Newport took this to suggest that “language learning ability slowly declines as the human matures and plateaus at a low level after puberty” [23, p. 90].
Correlation-based inferences about slope discontinuities have similarly been made explicitly by cph advocates and sceptics alike, e.g. Bialystok and Miller [25, pp. 136 and 140], DeKeyser and colleagues and Flege et al. [45, pp. 166 and 169]. Others did not explicitly infer the presence or absence of slope differences from the subset correlations they computed (among others Birdsong and Molis, DeKeyser, Flege et al. and Johnson), but their studies nevertheless featured in overviews discussing discontinuities. Indeed, the most recent overview draws a strong conclusion about the validity of the cph's ‘flattened slope’ prediction on the basis of these subset correlations:
In those studies where the two groups are described separately, the correlation is much higher for the younger than for the older group, except in Birdsong and Molis (2001) [ = , JV], where there was a ceiling effect for the younger group. This global picture from more than a dozen studies provides support for the non-continuity of the decline in the AoA–proficiency function, which all researchers agree is a hallmark of a critical period phenomenon. [22, p. 448].
In Johnson and Newport's specific case , their correlation-based inference that ua levels off after puberty happened to be largely correct: the gjt scores are more or less randomly distributed around a near-horizontal trend line . Ultimately, however, it rests on the fallacy of confusing correlation coefficients with slopes, which seriously calls into question conclusions such as DeKeyser's (cf. the quote above).
For clarity's sake, let's briefly review the difference between correlation coefficients and slopes. The slope of a function is defined as the increment with which and the direction in which the value on the $y$-axis changes when the value on the $x$-axis is increased by one increment. In a linear regression model of the form $\hat{y} = \alpha + \beta x$, $\alpha$ is the value of $\hat{y}$ (i.e. the expected $y$-value according to the model) when $x = 0$, i.e. the intercept. The coefficient that $x$ takes in this equation, $\beta$, represents the slope of the regression function, i.e. it expresses how $\hat{y}$ changes when $x$ is increased by one increment. In principle, $\beta$ can take any value between negative and positive infinity.
The Pearson correlation coefficient, $r$, on the other hand, expresses the strength of the linear relationship between two variables. It is bound between $-1$ (perfect negative relationship) and 1 (perfect positive relationship). If $r$ equals $-1$ or 1, a straight line captures all the data points; the closer $r$ comes to zero, the farther from such a straight line the data points are scattered. In simple linear functions, $\beta$ and $r$ are linked to each other in that $\beta$ is $r$ times the ratio of the sample standard deviations of the $y$- and $x$-variables: $\beta = r \frac{s_y}{s_x}$. Crucially, however, the relationships between two pairs of variables can be characterised by the same functional regression form but still have radically different $r$ coefficients, and the other way around (see Figure 2).
Figure 2. Illustration of the difference between correlation coefficients and slopes.
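The same point can be checked numerically. The following sketch uses made-up data, not data from any of the studies discussed: two data sets are built around the same line but with different amounts of scatter, using a residual pattern constructed to be uncorrelated with the predictor so that the fitted slopes are identical. The helper name slope_and_r is introduced here for illustration.

```python
from math import sqrt


def slope_and_r(xs, ys):
    """Return the least-squares slope and the Pearson correlation of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Made-up data: same underlying line (y = 10 - 2x), different amounts of scatter.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
# Residual pattern that sums to zero and is uncorrelated with xs,
# so scaling it changes the scatter but not the fitted slope.
pattern = [1, -1, -1, 1, 1, -1, -1, 1]
tight = [10 - 2 * x + 0.5 * e for x, e in zip(xs, pattern)]
loose = [10 - 2 * x + 5.0 * e for x, e in zip(xs, pattern)]

slope_tight, r_tight = slope_and_r(xs, tight)
slope_loose, r_loose = slope_and_r(xs, loose)

if __name__ == "__main__":
    print(f"tight scatter: slope = {slope_tight:.2f}, r = {r_tight:.2f}")
    print(f"loose scatter: slope = {slope_loose:.2f}, r = {r_loose:.2f}")
```

Both fitted slopes equal −2, whereas the correlation coefficient is about −0.99 for the tightly scattered data and about −0.68 for the loosely scattered data. A difference in correlation coefficients therefore says nothing by itself about a difference in slopes.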
What this boils down to is that a hypothesis concerning the slope of a function must be addressed by comparing slope coefficients computed using regression techniques rather than by comparing correlation coefficients. But then why are the aoa–ua correlations typically weaker in the older arrivals than in the younger ones? Assuming, for the sake of the argument, that the slope of the aoa–ua function is identical in both groups (Eq. 1), we can replace the slope coefficients with the correlation coefficients multiplied by the ratio of the relevant sample standard deviations (Eq. 2).
It can then straightforwardly be deduced that, other things equal, the aoa–ua correlation in the older group decreases as the ua variance in the older group increases relative to the ua variance in the younger group (Eq. 3).
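The three steps referred to as Eq. 1 to Eq. 3 can be written out as follows. This is a reconstruction from the verbal description, using the relation that a regression slope equals the correlation coefficient times the ratio of the standard deviation of the outcome to that of the predictor; the subscripts are labels chosen here, with $s_{\text{ua}}$ and $s_{\text{aoa}}$ denoting the sample standard deviations of ua and aoa in the younger and older groups.

```latex
\begin{align}
\beta_{\text{young}} &= \beta_{\text{old}} \tag{1} \\
r_{\text{young}} \, \frac{s_{\text{ua, young}}}{s_{\text{aoa, young}}}
  &= r_{\text{old}} \, \frac{s_{\text{ua, old}}}{s_{\text{aoa, old}}} \tag{2} \\
r_{\text{old}}
  &= r_{\text{young}} \, \frac{s_{\text{ua, young}}}{s_{\text{ua, old}}} \,
     \frac{s_{\text{aoa, old}}}{s_{\text{aoa, young}}} \tag{3}
\end{align}
```

With the aoa standard deviations held constant, the factor $s_{\text{ua, young}} / s_{\text{ua, old}}$ shrinks as the ua variability in the older group grows, which pulls $r_{\text{old}}$ towards zero even though the slopes are identical by assumption.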
Lower correlation coefficients in older aoa groups may therefore be largely due to differences in ua variance, which have been reported in several studies , , ,  (see  for additional references). Greater variability in ua with increasing age is likely due to factors other than age proper , such as the concomitant greater variability in exposure to literacy, degree of education, motivation and opportunity for language use, and by itself represents evidence neither in favour of nor against the cph.
Having demonstrated that neither group mean or proportion comparisons nor correlation coefficient comparisons can directly address the ‘flattened slope’ prediction, I now turn to the studies in which regression models were computed with aoa as a predictor variable and ua as the outcome variable. Once again, this category of studies is not mutually exclusive with the two categories discussed above.
In a large-scale study using self-reports and approximate aoas derived from a sample of the 1990 U.S. Census, Stevens found that the probability with which immigrants from various countries stated that they spoke English ‘very well’ decreased curvilinearly as a function of aoa. She noted that this development is similar to the pattern found by Johnson and Newport  but that it contains no indication of an “abruptly defined ‘critical’ or sensitive period in L2 learning” [48, p. 569]. However, she modelled the self-ratings using an ordinal logistic regression model in which the aoa variable was logarithmically transformed. Technically, this is perfectly fine, but one should be careful not to read too much into the non-linear curves found. In logistic models, the outcome variable itself is modelled linearly as a function of the predictor variables and is expressed in log-odds. In order to compute the corresponding probabilities, these log-odds are transformed using the logistic function. Consequently, even if the model is specified linearly, the predicted probabilities will not lie on a perfectly straight line when plotted as a function of any one continuous predictor variable. Similarly, when the predictor variable is first logarithmically transformed and then used to linearly predict an outcome variable, the function linking the predicted outcome variables and the untransformed predictor variable is necessarily non-linear. Thus, non-linearities follow naturally from Stevens's model specifications. Moreover, cph-consistent discontinuities in the aoa–ua function cannot be found using her model specifications as they did not contain any parameters allowing for this.
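A brief numerical sketch may help to see why such curvature follows from the model specification alone. It uses a plain binary logistic model rather than the ordinal model in Stevens's study, and the intercept and slope are hypothetical values chosen only for illustration. The log-odds are specified as exactly linear in log-transformed aoa, yet the predicted probabilities do not change by a constant amount per step in aoa.

```python
from math import exp, log


def logistic(log_odds):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + exp(-log_odds))


def steps(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical coefficients, chosen only for illustration.
INTERCEPT = 3.0
SLOPE = -1.2  # change in log-odds per unit of log(aoa)

aoas = list(range(2, 42, 4))  # equally spaced ages: 2, 6, ..., 38
log_aoas = [log(a) for a in aoas]
log_odds = [INTERCEPT + SLOPE * x for x in log_aoas]
probs = [logistic(lo) for lo in log_odds]

if __name__ == "__main__":
    for a, p in zip(aoas, probs):
        print(f"aoa = {a:2d}: predicted probability = {p:.3f}")
    # Equal steps in aoa produce unequal steps in predicted probability.
    print([round(s, 3) for s in steps(probs)])
```

The model contains no breakpoint or curvature parameter, and still the predicted probabilities trace a curve when plotted against untransformed aoa. The curvature is a by-product of the log transformation and the logistic link, not evidence about the shape of the age effect.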
Using data similar to Stevens's, Bialystok and Hakuta found that the link between the self-rated English competences of Chinese- and Spanish-speaking immigrants and their aoa could be described by a straight line. In contrast to Stevens, Bialystok and Hakuta used a regression-based method allowing for changes in the function's slope, viz. locally weighted scatterplot smoothing (lowess). Informally, lowess is a non-parametric method that relies on an algorithm that fits the dependent variable for small parts of the range of the independent variable whilst guaranteeing that the overall curve does not contain sudden jumps (for technical details, see ). Hakuta et al. used an even larger sample from the same 1990 U.S. Census data on Chinese- and Spanish-speaking immigrants (2.3 million observations). When lowess curves were fitted, no discontinuities in the aoa–ua slope could be detected. Moreover, the authors found that piecewise linear regression models, i.e. regression models containing a parameter that allows a sudden drop in the curve or a change of its slope, did not provide a better fit to the data than did an ordinary regression model without such a parameter.
Summarising, Bialystok and Hakuta and Hakuta et al. found no evidence supporting a cph account for the aoa–self-ratings relationship. The pertinence of these studies to the cph has, however, been questioned for a number of reasons. These concern (1) the exclusion of immigrants who reported that they only spoke English at home from the data set, (2) the possibility that the immigrants believed that second-language competence decreases monotonically as a function of age of learning and that the self-ratings are shaped by this belief, (3) the coarseness of the aoa variable retrieved from the census, and (4) the assumption that the self-ratings could be considered a continuous variable. While I recognise the potential of all four points to obscure a cp effect in the aoa–ua function, I fail to grasp another point of Stevens's criticism of Hakuta et al.'s study. This point concerns the practice of comparing simple linear regression fits to fits of piecewise linear regressions. She argues that since the aoa–proficiency relationship is negative when viewed over the whole lifespan, there is hardly any variance left to be explained by the breakpoints. This is, of course, the whole point of the enterprise: parsimony dictates that if the breakpoints do not add sufficiently to the model fit, they should be left out! That said, the necessity of including a breakpoint in the model can be assessed by means other than the coefficient of determination ($R^2$), e.g. relative goodness-of-fit measures such as the Akaike Information Criterion or the Bayesian Information Criterion, or $F$-tests. Such measures can in principle indicate better model fits even if the increase in $R^2$ is minimal.
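The comparison between a simple linear fit and a piecewise linear fit can be sketched as follows. This is a minimal illustration on made-up data with a built-in change of slope at aoa 16, not a reanalysis of any of the studies discussed. The breakpoint is found by a simple grid search, the piecewise model is continuous at the breakpoint (a change of slope, no sudden drop), and the two fits are compared using the Akaike Information Criterion. All names and values are assumptions made for the purpose of the example.

```python
from math import log, sin


def ols_rss(columns, ys):
    """Fit ys on the predictor columns (plus an intercept) by least squares
    and return the residual sum of squares."""
    n = len(ys)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [
        [sum(row[a] * row[b] for row in design) for b in range(k)]
        + [sum(row[a] * y for row, y in zip(design, ys))]
        for a in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    coefs = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(c * v for c, v in zip(coefs, row)) for row in design]
    return sum((y - f) ** 2 for y, f in zip(ys, fitted))


def aic(rss, n, n_params):
    """AIC for a Gaussian linear model, up to an additive constant."""
    return n * log(rss / n) + 2 * n_params


def hinge(xs, breakpoint):
    """Predictor that is zero up to the breakpoint and rises linearly after it."""
    return [max(0.0, x - breakpoint) for x in xs]


# Made-up data: steep decline up to aoa 16, shallow decline afterwards,
# plus small deterministic pseudo-noise.
TRUE_BREAK = 16.0
aoas = [float(a) for a in range(1, 41)]
uas = [
    100 - 3 * a + 2.5 * max(0.0, a - TRUE_BREAK) + 0.5 * sin(1.7 * i)
    for i, a in enumerate(aoas)
]
n = len(aoas)

# Simple linear model: intercept, slope and error variance.
rss_linear = ols_rss([aoas], uas)
aic_linear = aic(rss_linear, n, n_params=3)

# Piecewise model: grid search over candidate breakpoints.
candidates = [float(b) for b in range(5, 36)]
rss_by_break = {b: ols_rss([aoas, hinge(aoas, b)], uas) for b in candidates}
best_break = min(rss_by_break, key=rss_by_break.get)
# Two extra parameters: the change in slope and the breakpoint itself.
aic_piecewise = aic(rss_by_break[best_break], n, n_params=5)

if __name__ == "__main__":
    print(f"estimated breakpoint: {best_break}")
    print(f"AIC linear: {aic_linear:.1f}, AIC piecewise: {aic_piecewise:.1f}")
```

A lower AIC for the piecewise model indicates that the breakpoint improves the fit by more than the penalty for the extra parameters. Counting the breakpoint as an estimated parameter is a design choice made here; if the breakpoint were fixed in advance on theoretical grounds, only the change in slope would be added.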
To my knowledge, regression models capable of highlighting non-linearities have only been fitted in two studies looking into the relationship between aoa and ua variables extracted using tasks rather than self-ratings. Flege et al. measured ua in English for 240 Korean participants using foreign-accent ratings and a grammaticality judgement task (gjt). They fitted both linear and cubic functions to the aoa–ua data. The cubic function explained somewhat more variance than did the linear function for the foreign-accent ratings (increase in R²: 1.9%), but follow-up analyses failed to find support for a non-linearity in puberty. A cubic function likewise explained somewhat more variance compared to a linear function for the gjt scores (increase in R²: 1.2%), but this time follow-up analyses revealed a change in slope at an aoa of about 12 years. In my opinion, however, Flege et al.'s follow-up analyses are not quite ideal as they entail fitting models on aoa-defined subsets and checking whether the cubic term still contributed significantly to the model fit in those subsets; I refer the reader to the original publication for details on this procedure. (Moreover, pinpointing the location of a slope change in a cubic function is mathematically speaking impossible: the function's slope changes continuously (expressed by the first derivative, which itself is a continuous quadratic function) as does the rate by which it changes (expressed by the second derivative, which is a continuous linear function). One could pinpoint the aoa at which the change in slope starts to slow down or speed up (i.e. the point at which the sign of the second derivative changes), but one should be aware that one is dealing with a continuous phenomenon.)
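The parenthetical argument can be written out explicitly. In the notation below, which is mine rather than Flege et al.'s, x stands for aoa and the β coefficients are those of the fitted cubic:

```latex
\begin{align*}
\widehat{\mathrm{ua}}(x) &= \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3, \\
\frac{d\,\widehat{\mathrm{ua}}}{dx} &= \beta_1 + 2\beta_2 x + 3\beta_3 x^2, \\
\frac{d^2\,\widehat{\mathrm{ua}}}{dx^2} &= 2\beta_2 + 6\beta_3 x, \\
x^{*} &= -\frac{\beta_2}{3\beta_3} \quad (\beta_3 \neq 0).
\end{align*}
```

The slope (second line) takes a different value at every x, so there is no single aoa at which it "changes". The only natural landmark is x*, the inflection point at which the second derivative (third line) changes sign.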
Instead, I prefer the analytical approach used by Birdsong and Molis, who, like Hakuta et al., fitted piecewise linear regression models and checked whether the breakpoint parameter contributed enough to the model to offset the resultant loss of parsimony. Birdsong and Molis's study was a replication of Johnson and Newport's but used Spanish L1 speakers rather than Korean- and Chinese-speaking participants. These authors found a breakpoint in the aoa–ua slope that contributed significantly to the model fit, but this breakpoint was located at aoa 27.5 years – well beyond a putative critical period. Reanalysing Johnson and Newport's data, the authors further found that a breakpoint could improve the model fit for this data set, too. This time, however, the breakpoint was located at aoa 18 years. Importantly, the breakpoints had different functions in the two data sets: whereas the breakpoint marked the beginning of a flatter part of the curve in the Johnson and Newport data set (as in the left panel of Figure 1), it actually marked the onset of a steeper part of the curve in the Birdsong and Molis study (as in the middle panel of Figure 1). In other words, the age effect in ua actually became more pronounced for the older arrivals. (Birdsong and Molis did not mention by how much R² increased when breakpoint parameters were included in their models.)
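The logic of this approach can be sketched in Python as follows. The sketch is not Birdsong and Molis's or Hakuta et al.'s code: it assumes a continuous piecewise model of the form ua = b0 + b1·aoa + b2·max(0, aoa − bp), searches a grid of candidate breakpoints bp for the smallest residual sum of squares, and then compares the result with a straight-line model using AIC. The function names, the made-up data, the integer grid and the choice to count the breakpoint as an estimated parameter are all assumptions made for illustration.

```python
import math

def ols(columns, y):
    """Least squares via the normal equations; returns (coefficients, RSS)."""
    p = len(columns)
    # Augmented normal-equation matrix [X'X | X'y]
    a = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(p)]
         + [sum(u * v for u, v in zip(columns[i], y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        for r in range(p):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    rss = sum((yi - sum(b * col[i] for b, col in zip(beta, columns))) ** 2
              for i, yi in enumerate(y))
    return beta, rss

def fit_piecewise(x, y, candidates):
    """Grid search over breakpoints; assumes distinct x values on both sides."""
    ones, best = [1.0] * len(x), None
    for bp in candidates:
        # Skip breakpoints that leave fewer than two observations on either side
        if sum(xi < bp for xi in x) < 2 or sum(xi > bp for xi in x) < 2:
            continue
        hinge = [max(0.0, xi - bp) for xi in x]  # zero before bp, linear after
        beta, rss = ols([ones, list(x), hinge], y)
        if best is None or rss < best[2]:
            best = (bp, beta, rss)
    return best

def aic(rss, n, k):
    """AIC for a least-squares model, up to an additive constant."""
    return n * math.log(rss / n) + 2 * k

# Made-up data: the slope steepens after aoa 16, plus small deterministic "noise".
aoa = list(range(1, 41))
noise = [0.1 * ((2 * a) % 5 - 2) for a in aoa]
gjt = [100 - 0.5 * a - 1.5 * max(0, a - 16) + e for a, e in zip(aoa, noise)]

n = len(aoa)
beta_lin, rss_lin = ols([[1.0] * n, aoa], gjt)
best_bp, beta_pw, rss_pw = fit_piecewise(aoa, gjt, range(3, 39))
aic_lin = aic(rss_lin, n, 2)  # intercept, slope
aic_pw = aic(rss_pw, n, 4)    # intercept, slope, slope change, breakpoint
print(best_bp, beta_pw, aic_lin, aic_pw)
```

The sign of the third coefficient in `beta_pw` indicates whether the curve becomes steeper or flatter after the breakpoint, which is exactly the distinction drawn above between the Johnson and Newport and the Birdsong and Molis data sets.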
To sum up, I have argued at length that regression approaches are superior to group mean and correlation coefficient comparisons for the purposes of testing the ‘flattened slope’ prediction. Acknowledging the reservations vis-à-vis self-estimated uas, we still find that while the relationship between aoa and ua is not necessarily perfectly linear in the studies discussed, the data do not lend unequivocal support to this prediction. In the following section, I will reanalyse data from a recent empirical paper on the cph by DeKeyser et al. The first goal of this reanalysis is to further illustrate some of the statistical fallacies encountered in cph studies. Second, by making the computer code available I hope to demonstrate how the relevant regression models, viz. piecewise regression models, can be fitted and how the aoa representing the optimal breakpoint can be identified. Lastly, the findings of this reanalysis will contribute to our understanding of how aoa affects ua as measured using a gjt.
The critical period hypothesis is the subject of a long-standing debate in linguistics and language acquisition over the extent to which the ability to acquire language is biologically linked to age. The hypothesis claims that there is an ideal time window to acquire language in a linguistically rich environment, after which further language acquisition becomes much more difficult and effortful.
The critical period hypothesis states that the first few years of life are the crucial time in which an individual can acquire a first language if presented with adequate stimuli. If language input does not occur until after this time, the individual will never achieve a full command of language, especially of grammatical systems.
The evidence for such a period is limited, and support stems largely from theoretical arguments and analogies to other critical periods in biology such as visual development, but nonetheless is widely accepted. The nature of such a critical period, however, has been one of the most fiercely debated issues in psycholinguistics and cognitive science in general for decades. Some writers have suggested a "sensitive" or "optimal" period rather than a critical one; others dispute the causes (physical maturation, cognitive factors). The duration of the period also varies greatly in different accounts.
In second-language acquisition, the strongest empirical evidence for the critical period hypothesis is in the study of accent, where most older learners do not reach a native-like level. However, under certain conditions, native-like accent has been observed, suggesting that accent is affected by multiple factors, such as identity and motivation, rather than a critical period biological constraint.
The critical period hypothesis was first proposed by Montreal neurologist Wilder Penfield and co-author Lamar Roberts in their 1959 book Speech and Brain Mechanisms, and was popularized by Eric Lenneberg in 1967 with Biological Foundations of Language.
Lenneberg's critical period hypothesis states that there are maturational constraints on the time a first language can be acquired. First-language acquisition relies on neuroplasticity. If language acquisition does not occur by puberty, some aspects of language can be learned but full mastery cannot be achieved.
Support for the critical period theory stems largely from theoretical arguments and analogies to other critical periods in biology such as visual development. Strictly speaking, the experimentally verified critical period relates to a time span during which damage to the development of the visual system can occur, for example if animals are deprived of the necessary binocular input for developing stereopsis. It has however been considered "likely", and has in many cases been flatly presented as fact, that experimental evidence would point to a comparable critical period also for recovery of such development and treatment; however this is a hypothesis. Recently, doubts have arisen concerning the validity of this critical period hypothesis with regard to visual development, in particular since the time it became known that neuroscientist Susan R. Barry and others have achieved stereopsis as adults, long after the supposed critical period for acquiring this skill.
Recently, it has been suggested that if a critical period does exist, it may be due at least partially to the delayed development of the prefrontal cortex in human children. Researchers have suggested that delayed development of the prefrontal cortex and an associated delay in the development of cognitive control may facilitate convention learning, allowing young children to learn language far more easily than cognitively mature adults and older children. This pattern of prefrontal development is unique to humans among similar mammalian (and primate) species, and may explain why humans—and not chimpanzees—are so adept at learning language.
The theory has often been extended to a critical period for second-language acquisition (SLA), although this is much less widely accepted. Certainly, older learners of a second language rarely achieve the native-like fluency that younger learners display, despite often progressing faster than children in the initial stages. David Singleton states that in learning a second language, "younger = better in the long run," but points out that there are many exceptions, noting that five percent of adult bilinguals master a second language even though they begin learning it when they are well into adulthood—long after any critical period has presumably come to a close.
While the window for learning a second language never completely closes, certain linguistic aspects appear to be more affected by the age of the learner than others. For example, adult second-language learners nearly always retain an immediately identifiable foreign accent, including some who display perfect grammar. A possible explanation for why this foreign accent remains is that pronunciation, or phonology, is susceptible to the critical period. The pronunciation of speech sounds relies on neuromuscular function. Adults learning a new language are unlikely to attain a convincing native accent since they are past the prime age of learning new neuromuscular functions, and therefore pronunciations. Writers have suggested a younger critical age for learning phonology than for morphemes and syntax. Singleton & Lengyel (1995) report that there is no critical period for learning vocabulary in a second language because vocabulary is learned consciously using declarative memory. The attrition of procedural memory with age results in the increased use of declarative memory to learn new languages, which is an entirely different process from L1 (first language) learning. The plasticity of procedural memory is argued to decline after the age of 5. The attrition of procedural memory plasticity inhibits the ability of an L2 user to speak their second language automatically. It can still take conscious effort even if they are exposed to the second language as early as age 3. This effort is observed by measuring brain activity. L2-users that are exposed to their second language at an early age and are everyday users show lower levels of brain activity when using their L1 than when using their L2. This suggests that additional resources are recruited when speaking their L2 and it is therefore a more strenuous process.
The critical period hypothesis in SLA follows a "use it then lose it" approach, which dictates that as a person ages, excess neural circuitry used during L1 learning is essentially broken down. If these neural structures remained intact they would cost unnecessary metabolic energy to maintain. The structures necessary for L1 use are kept. On the other hand, a second "use it or lose it" approach dictates that if an L2 user begins to learn at an early age and continues on through his life, then his language-learning circuitry should remain active. This approach is also called the "exercise hypothesis".
There is much debate over the timing of the critical period with respect to SLA, with estimates ranging between 2 and 13 years of age. These estimates tend to vary depending on what component of the language learning process a researcher considers. For instance, if an SLA researcher is studying L2 phonological development, they will likely conclude that the critical period ends at around age 3. If another SLA researcher is studying L2 syntactical development, they may conclude that the critical period ends at a much later age. These differences in research focus are what create the critical period timing debate.
Some writers have argued that the critical period hypothesis does not apply to SLA, and that second-language proficiency is determined by the time and effort put into the learning process, and not the learner's age. Robertson (2002) observed that factors other than age may be even more significant in successful second-language learning, such as personal motivation, anxiety, input and output skills, and the learning environment. A combination of these factors often leads to individual variation in second-language acquisition experiences.
On reviewing the published material, Bialystok and Hakuta (1994) conclude that second-language learning is not necessarily subject to biological critical periods, but "on average, there is a continuous decline in ability [to learn] with age."
Experimental and observational studies
How children acquire native language (L1) and the relevance of this to foreign language (L2) learning has long been debated. Although evidence for L2 learning ability declining with age is controversial, a common notion is that children learn L2s easily, whilst older learners rarely achieve fluency. This assumption stems from ‘critical period’ (CP) ideas. A CP was popularised by Eric Lenneberg in 1967 for L1 acquisition, but considerable interest now surrounds age effects on second-language acquisition (SLA). SLA theories explain learning processes and suggest causal factors for a possible CP for second language acquisition. These SLA-CP theories mainly attempt to explain apparent differences in language aptitudes of children and adults by distinct learning routes, and clarify these differences by discussing psychological mechanisms. Research explores these ideas and hypotheses, but results are varied: some demonstrate pre-pubescent children acquire language easily, and some that older learners have the advantage, whilst others focus on existence of a CP for SLA. Recent studies (e.g. Mayberry and Lock, 2003) have recognised certain aspects of SLA may be affected by age, whilst others remain intact. The objective of this study is to investigate whether capacity for vocabulary acquisition decreases with age.
Other work has challenged the biological approach; Krashen (1975) re-analysed clinical data used as evidence and concluded cerebral specialisation occurs much earlier than Lenneberg calculated. Therefore, if a CP exists, it does not coincide with lateralisation. Despite concerns with Lenneberg’s original evidence and the dissociation of lateralisation from the language CP idea, however, the concept of a CP remains a viable hypothesis, which later work has better explained and substantiated.
Effects of aging
A review of SLA theories and their explanations for age-related differences is necessary before considering empirical studies. The most reductionist theories are those of Penfield and Roberts (1959) and Lenneberg (1967), which stem from L1 and brain damage studies. Children who suffer impairment before puberty typically recover and (re-)develop normal language, whereas adults rarely recover fully, and often do not regain verbal abilities beyond the point reached five months after impairment. Both theories agree that children have a neurological advantage in learning languages, and that puberty correlates with a turning point in ability. They assert that language acquisition occurs primarily, possibly exclusively, during childhood as the brain loses plasticity after a certain age. It then becomes rigid and fixed, and loses the ability for adaptation and reorganisation, rendering language (re-)learning difficult. Penfield and Roberts (1959) claim children under nine can learn up to three languages: early exposure to different languages activates a reflex in the brain allowing them to switch between languages without confusion or translation into L1 (Penfield, 1964). Lenneberg (1967) asserts that if no language is learned by puberty, it cannot be learned in a normal, functional sense. He also supports Penfield and Roberts’ (1959) proposal of neurological mechanisms responsible for maturational change in language learning abilities. This, Lenneberg maintains, coincides with brain lateralisation and left-hemispherical specialisation for language around age thirteen: infants’ motor and linguistic skills develop simultaneously, but by age thirteen the cerebral hemispheres’ functions separate and become set, making language acquisition extremely difficult (Lenneberg, 1967).
Deaf and feral children
Cases of deaf and feral children provide evidence for a biologically determined CP for L1. Feral children are those not exposed to language in infancy/childhood due to being brought up in the wild, in isolation and/or confinement. A classic example is 'Genie', a victim of child abuse who was deprived of social interaction from birth until discovered aged thirteen. Her father had judged her retarded at birth and had chosen to isolate her. She was kept strapped to a potty chair and forced to wear diapers. She was completely without language. Her case presented an ideal opportunity to test the theory that a nurturing environment could somehow make up for the total lack of language past the age of 12. After seven years of rehabilitation Genie still lacked linguistic competence, although the degree to which she acquired language is disputed. Another case is 'Isabelle', who was incarcerated with her deaf-mute mother until the age of six and a half (pre-pubescent). She also had no language skills, but, unlike Genie, quickly acquired normal language abilities through systematic specialist training. Detractors of the critical period hypothesis point out that in these examples and others like them (see feral children), the child is hardly growing up in a nurturing environment, and that the lack of language acquisition in later life may be due to the results of a generally abusive environment rather than being specifically due to a lack of exposure to language. Such studies are problematic; isolation can result in general retardation and emotional disturbances, which may confound conclusions drawn about language abilities.
Studies of deaf children learning American Sign Language (ASL) have fewer methodological weaknesses. Newport and Supalla studied ASL acquisition in deaf children differing in age of exposure; few were exposed to ASL from birth, and most first learned it at school. Results showed a linear decline in performance with increasing age of exposure; those exposed to ASL from birth performed best, and 'late learners' worst, on all production and comprehension tests. Their study thus provides direct evidence for language learning ability decreasing with age, but it does not add to Lenneberg's CP hypothesis as even the oldest children, the 'late learners', were exposed to ASL by age four, and had therefore not reached puberty, the proposed end of the CP. In addition, the declines were shown to be linear, with no sudden 'drop off' of ability at a certain age, as would be predicted by a strong CP hypothesis. That the children performed significantly worse may suggest that the CP ends earlier than originally postulated. However, this decline in performance may also be attributed in part to limitations of second language acquisition for hearing parents learning ASL.
Contrary to biological views, behavioural approaches assert that languages are learned as any other behaviour, through conditioning. Skinner (1957) details how operant conditioning forms connections with the environment through interaction and, alongside O. Hobart Mowrer (1960), applies the ideas to language acquisition. Mowrer hypothesises that languages are acquired through rewarded imitation of ‘language models’; the model must have an emotional link to the learner (e.g. parent, spouse), as imitation then brings pleasant feelings which function as positive reinforcement. Because new connections between behaviour and the environment are formed and reformed throughout life, it is possible to gain new skills, including language(s), at any age.
To explain observed language learning differences between children and adults, children are postulated to create countless new connections daily, and may handle the language learning process more effectively than do adults. This assumption, however, remains untested and is not a reliable explanation for children’s aptitude for L2 learning. Problematic of the behaviourist approach is its assumption that all learning, verbal and non-verbal, occurs through the same processes. A more general problem is that, as Pinker (1995) notes, almost every sentence anybody voices is an original combination of words, never previously uttered, therefore a language cannot consist only of word combinations learned through repetition and conditioning; the brain must contain innate means of creating endless amounts of grammatical sentences from a limited vocabulary. This is precisely what Chomsky (1965) (reprinted as Chomsky (1969)) argues with his proposition of a universal grammar (UG).
Chomsky (1969) asserts that environmental factors must be relatively unimportant for language emergence, as so many different factors surround children acquiring L1. Instead, Chomsky claims language learners possess innate principles building a 'language acquisition device' (LAD) in the brain. These principles denote restricted possibilities for variation within the language, and enable learners to construct a grammar out of 'raw input' collected from the environment. Input alone cannot explain language acquisition because it is degenerated by characteristic features such as stutters, and lacks corrections from which learners discover incorrect variations.
Singleton and Newport (2004) demonstrate the function of UG in their study of 'Simon'. Simon learned ASL as his L1 from parents who had learned it as an L2 after puberty and provided him with imperfect models. Results showed Simon learned normal and logical rules and was able to construct an organised linguistic system, despite being exposed to inconsistent input. Chomsky developed UG to explain L1 acquisition data, but maintains it also applies to L2 learners who achieve near-native fluency not attributable solely to input and interaction (Chomsky 1969).
Although it does not describe an optimal age for SLA, the theory implies that younger children can learn languages more easily than older learners, as adults must reactivate principles developed during L1 learning and forge an SLA path: children can learn several languages simultaneously as long as the principles are still active and they are exposed to sufficient language samples (Pinker, 1995). The parents of Singleton and Newport's (2004) patient also had linguistic abilities in line with these age-related predictions; they learned ASL after puberty and never reached complete fluency.
Problems within UG theory for L2 acquisition
There are, however, problems with the extrapolation of the UG theory to SLA: L2 learners go through several phases of types of utterance that are not similar to their L1 or the L2 they hear. Other factors include the cognitive maturity of most L2 learners, that they have different motivation for learning the language, and already speak one language fluently. Other studies also highlight these problems: Stanislas Dehaene has investigated how cerebral circuits used to handling one language adapt for the efficient storage of two or more. He reports observations of cerebral activation when reading and translating two languages. They found the most activated brain areas during the tasks were not those generally associated with language, but rather those related to mapping orthography to phonology. They conclude that the left temporal lobe is the physical base of L1, but the L2 is 'stored' elsewhere, thus explaining cases of bilingual aphasia where one language remains intact. They maintain that only languages learned simultaneously from birth are represented, and cause activity, in the left hemisphere: any L2 learned later is stored separately (possibly in the right hemisphere), and rarely activates the left temporal lobe.
This suggests that L2 may be qualitatively different from L1 due to its dissociation from the 'normal' language brain regions, thus the extrapolation of L1 studies and theories to SLA is placed in question. A further disadvantage of UG is that supporting empirical data are taken from a limited sample of syntactic phenomena: a general theory of language acquisition should cover a larger range of phenomena. Despite these problems, several other theorists have based their own models of language learning on it. These ideas are supported by empirical evidence, which consequently supports Chomsky's ideas. Due to this support and its descriptive and explanatory strength, many theorists regard UG as the best explanation of language, and particularly grammar, acquisition.
UG and the critical period hypothesis
A key question about the relationship of UG and SLA is: is the language acquisition device posited by Chomsky and his followers still accessible to learners of a second language? The critical period hypothesis suggests that it becomes inaccessible at a certain age, and that learners increasingly depend on explicit teaching. In other words, although all of language may be governed by UG, older learners might have great difficulty in gaining access to the target language's underlying rules from positive input alone.
Piaget (1926) is one psychologist reluctant to ascribe specific innate linguistic abilities to children: he considers the brain a homogeneous computational system, with language acquisition being one part of general learning. He agrees this development may be innate, but claims there is no specific language acquisition module in the brain. Instead, he suggests external influences and social interaction trigger language acquisition: information collected from these sources constructs symbolic and functional schemata (thought or behaviour patterns). According to Piaget, cognitive development and language acquisition are lifelong active processes that constantly update and re-organise schemata. He proposes children develop L1 as they build a sense of identity in reference to the environment, and describes phases of general cognitive development, with processes and patterns changing systematically with age. Piaget assumes language acquisition is part of this complex cognitive development, and that these developmental phases are the basis for an optimal period for language acquisition in childhood. Interactionist approaches derived from Piaget’s ideas support his theory. Some studies (e.g. Newport and Supalla) show that, rather than abrupt changes in SLA ability after puberty, language ability declines with age, coinciding with declines in other cognitive abilities, thus supporting Piaget.
Although Krashen (1975) also criticises this theory, he does not deny the importance of age for second-language acquisition. Krashen (1975) proposed theories for the close of the CP for L2 at puberty, based on Piaget’s cognitive stage of formal operations beginning at puberty, as the ‘ability of the formal operational thinker to construct abstract hypotheses to explain phenomena’ inhibits the individual’s natural ability for language learning.
The term "language acquisition" became commonly used after Stephen Krashen contrasted it with formal and non-constructive "learning." Today, most scholars use "language learning" and "language acquisition" interchangeably, unless they are directly addressing Krashen's work. However, "second-language acquisition" or "SLA" has become established as the preferred term for this academic discipline.
Though SLA is often viewed as part of applied linguistics, it is typically concerned with the language system and learning processes themselves, whereas applied linguistics may focus more on the experiences of the learner, particularly in the classroom. Additionally, SLA has mostly examined naturalistic acquisition, where learners acquire a language with little formal training or teaching.
Other directions of research
Effect of illiteracy
Virtually all research findings on SLA to date build on data from literate learners. Tarone, Bigelow & Hansen (2009) find significantly different results when replicating standard SLA studies with low literate L2 learners. Specifically, learners with lower alphabetic literacy levels are significantly less likely to notice corrective feedback on form or to perform elicited imitation tasks accurately. These findings are consistent with research in cognitive psychology showing significant differences in phonological awareness between literate and illiterate adults. An important direction for SLA research must therefore involve the exploration of the impact of alphabetic literacy on cognitive processing in second-language acquisition.
Empirical research has attempted to account for variables detailed by SLA theories and provide an insight into L2 learning processes, which can be applied in educational environments. Recent SLA investigations have followed two main directions: one focuses on pairings of L1 and L2 that render L2 acquisition particularly difficult, and the other investigates certain aspects of language that may be maturationally constrained. Flege, Mackay & Piske (2002) looked at bilingual dominance to evaluate two explanations of L2 performance differences between bilinguals and monolingual-L2 speakers, i.e. a maturationally defined CP or interlingual interference.
Flege, Mackay & Piske (2002) investigated whether the age at which participants learned English affected dominance in Italian-English bilinguals, and found the early bilinguals were English (L2) dominant and the late bilinguals Italian (L1) dominant. Further analysis showed that dominant Italian bilinguals had detectable foreign accents when speaking English, but early bilinguals (English dominant) had no accents in either language. This suggests that, though interlingual interference effects are not inevitable, their emergence, and bilingual dominance, may be related to a CP.
Sebastián-Gallés, Echeverría & Bosch (2005) also studied bilinguals and highlight the importance of early language exposure. They looked at vocabulary processing and representation in Spanish-Catalan bilinguals exposed to both languages simultaneously from birth in comparison to those who had learned L2 later and were either Spanish- or Catalan-dominant. Findings showed 'from birth bilinguals' had significantly more difficulty distinguishing Catalan words from non-words differing in specific vowels than Catalan-dominants did (measured by reaction time).
These difficulties are attributed to a phase around age eight months where bilingual infants are insensitive to vowel contrasts, despite the language they hear most. This affects how words are later represented in their lexicons, highlighting this as a decisive period in language acquisition and showing that initial language exposure shapes linguistic processing for life. Sebastián-Gallés, Echeverría & Bosch (2005) also indicate the significance of phonology for L2 learning; they believe learning an L2 once the L1 phonology is already internalised can reduce individuals’ abilities to distinguish new sounds that appear in the L2.
Age effects on grammar learning
Most studies into age effects on specific aspects of SLA have focused on grammar, with the common conclusion that it is highly constrained by age, more so than semantic functioning. Harley (1986) compared attainment of French learners in early and late immersion programs. She reports that after 1000 exposure hours, late learners had better control of French verb systems and syntax. However, comparing early immersion students (average age 6.917 years) with age-matched native speakers identified common problem areas, including third person plurals and polite ‘vous’ forms. This suggests grammar (in L1 or L2) is generally acquired later, possibly because it requires abstract cognition and reasoning.
B. Harley also measured eventual attainment and found the two age groups made similar mistakes in syntax and lexical selection, often confusing French with the L1. The general conclusion from these investigations is that different aged learners acquire the various aspects of language with varying difficulty. Some variation in grammatical performance is attributed to maturation; however, all participants began immersion programs before puberty and so were too young for a strong critical period hypothesis to be directly tested.
This corresponds to Noam Chomsky’s UG theory, which states that while language acquisition principles are still active, it is easy to learn a language, and the principles developed through L1 acquisition are vital for learning an L2.
Scherag et al. (2004) also suggest that the learning of some syntactic processing functions and lexical access may be limited by maturation, whereas semantic functions are relatively unaffected by age. They studied the effect of late SLA on speech comprehension in German immigrants to the U.S.A. and American immigrants to Germany. They found that native English speakers who learned German as adults were disadvantaged on certain grammatical tasks but performed at near-native levels on lexical tasks.
Semantic functions acquisition
One study that specifically addresses the acquisition of semantic functions is that of Weber-Fox & Neville (1996). Their results showed that Chinese-English bilinguals who had been exposed to English after puberty learned vocabulary to a higher level of competence than the syntactic aspects of the language. They do, however, report that judgment accuracy in detecting semantic anomalies was altered in subjects who were exposed to English after sixteen years of age, although to a lesser degree than the grammatical aspects of language. Neville & Bavelier (2001) and Scherag et al. (2004) have speculated that semantic aspects of language are founded on associative learning mechanisms, which allow lifelong learning, whereas syntactic aspects are based on computational mechanisms, which can only be constructed during certain age periods. Consequently, it is reasoned, semantic functions are easier to access during comprehension of an L2 and therefore dominate the process: if these are ambiguous, understanding of syntactic information is not facilitated. These suppositions would help explain the results of Scherag et al.'s (2004) study.
Advantages of bilingual education for children
It is commonly believed that children are better suited to learning a second language than adults are. However, general second-language research has failed to support the critical period hypothesis in its strong form (i.e., the claim that full language acquisition is impossible beyond a certain age). According to Linda M. Espinosa, the number of children growing up with Spanish rather than English as their home language is constantly increasing, especially in the United States. These children therefore have to learn English as a second language before kindergarten. It is better for young children to maintain both their home language and their second language. By cultivating their home language, children build their own cultural identity and become aware of their roots. This raises the question of whether the ability to speak two languages helps or harms young children. Research shows that the acquisition of a second language in early childhood confers several advantages, especially a greater awareness of linguistic structures. Furthermore, growing up bilingually is advantageous for young children because they do not need to be taught systematically but learn languages intuitively. How fast a child can learn a language depends on several personal factors, such as interest and motivation, and on the learning environment. Communication should be facilitated rather than forcing a child to learn a language with strict rules. Early childhood education can lead to effective educational achievement for children from various cultural environments.
Another aspect worth considering is that bilingual children often code-switch, which does not mean that the child is unable to separate the languages. The reason for code switching is a gap in the child's vocabulary in a given situation. The acquisition of a second language in early childhood broadens children's minds and enriches them more than it harms them. They are thus not only able to speak two languages despite being very young, but also acquire knowledge about the different cultures and environments. One language may come to dominate, depending on how much time is spent learning each.
In order to provide evidence for the evolutionary functionality of the critical period in language acquisition, Hurford (1991) generated a computer simulation of plausible conditions of evolving generations, based on three central assumptions:
- Language is an evolutionary adaptation that is naturally selected for.
- Any given individual’s language can be quantified or measured.
- Various aspects of maturation and development are under genetic control, which determines the timing for critical periods for certain capacities (i.e. polygenic inheritance).
According to Hurford's evolutionary model, language acquisition is an adaptation that has survival value for humans, and knowing a language correlates positively with an individual’s reproductive advantage. This view is in line with those of other researchers such as Chomsky and Pinker & Bloom (1990). For example, Steven Pinker and Paul Bloom argue that because language is a complex design that serves a specific function that cannot be replaced by any other existing capacity, the trait of language acquisition can be attributed to natural selection.
However, while arguing that language itself is adaptive and "did not 'just happen'" (p. 172), Hurford suggests that the critical period is not an adaptation, but rather a constraint on language that emerged due to a lack of selection pressures that reinforce acquiring more than one language. In other words, Hurford explains the existence of a critical period with genetic drift, the idea that when there are no selection pressures on multiple alleles acting on the same trait, one of the alleles will gradually diminish through evolution. Because the simulation reveals no evolutionary advantage of acquiring more than one language, Hurford suggests that the critical period evolved simply as a result of a lack of selection pressure.
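The drift argument can be illustrated with a toy Wright-Fisher simulation. The sketch below is not Hurford's simulation: the function name `neutral_drift`, the population size, the starting frequency and the generation limit are all assumptions chosen for illustration. It only shows that an allele under no selection pressure tends, by chance alone, to be either lost or fixed in a finite population.

```python
import random

def neutral_drift(pop_size=100, start_freq=0.5, max_generations=10_000, seed=0):
    """Toy Wright-Fisher drift of one neutral allele (illustrative assumption,
    not Hurford's model). Returns (generations elapsed, final allele frequency)."""
    rng = random.Random(seed)
    count = round(pop_size * start_freq)  # carriers of the allele
    for generation in range(1, max_generations + 1):
        freq = count / pop_size
        # No selection: each offspring carries the allele with a probability
        # equal to the allele's frequency in the parent generation.
        count = sum(1 for _ in range(pop_size) if rng.random() < freq)
        if count == 0 or count == pop_size:
            # The allele has been lost (0.0) or fixed (1.0) by chance alone.
            return generation, count / pop_size
    return max_generations, count / pop_size

generations, final_freq = neutral_drift(pop_size=50, seed=1)
print(f"allele frequency {final_freq} after {generations} generations")
```

Running the function with different seeds gives different outcomes, which is the point of the illustration: without selection, which allele disappears is a matter of chance.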
Komarova and Nowak's dynamical system
Komarova & Nowak (2001) supported Hurford's model, yet pointed out that it was limited in the sense that it did not take into account the costs of learning a language. Therefore, they created their own algorithmic model, with the following assumptions:
- Language ability correlates with an individual’s reproductive fitness.
- The ability to learn language is inherited.
- There are costs to learning a language.
Their model consists of a population of constant size, in which language ability is a predictor of reproductive fitness. The learning mechanism in their model is based on the linguistic theories of Chomsky (1980, 1993): the language acquisition device (LAD) and the notion of universal grammar. The results of their model show that the critical period for language acquisition is an "evolutionarily stable strategy (ESS)" (Komarova & Nowak, 2001, p. 1190). They suggest that this ESS is due to two competing selection pressures. First, if the period for learning is short, language does not develop as well, which decreases the evolutionary fitness of the individual. Conversely, if the period for learning language is long, it becomes so costly that it reduces reproductive opportunity for the individual and therefore limits reproductive fitness. The critical period is therefore an adaptive mechanism that keeps these pressures in equilibrium and aims at optimal reproductive success for the individual.
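The trade-off between these two pressures can be illustrated with a toy calculation. The sketch below is not Komarova and Nowak's dynamical system: the saturating benefit curve, the linear cost, the function names and all parameter values are assumptions made purely for illustration. It shows only that when the benefit of learning levels off while its cost keeps growing, net fitness peaks at an intermediate learning period.

```python
import math

def net_fitness(learning_period, learning_rate=0.5, cost_per_year=0.05):
    """Toy fitness (illustrative assumption): saturating language benefit
    minus a linear cost of time spent learning."""
    # A longer learning period improves language ability, with diminishing returns.
    benefit = 1.0 - math.exp(-learning_rate * learning_period)
    # Every year spent learning reduces reproductive opportunity.
    cost = cost_per_year * learning_period
    return benefit - cost

def best_learning_period(max_years=30, **params):
    """Return the whole-number learning period (in years) with the highest toy fitness."""
    return max(range(max_years + 1), key=lambda years: net_fitness(years, **params))

print(best_learning_period())
```

Under these assumed parameters, very short and very long learning periods both score lower than an intermediate one; raising `cost_per_year` shifts the optimum towards shorter periods.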
- Barry, Susan R. (January 2010). "Thwarted at every turn, Guest Editorial". Optometry — Journal of the American Optometric Association. 81 (1): 2–3. doi:10.1016/j.optm.2009.10.003.
- Birdsong, David, ed. (1999). Second language acquisition and the critical period hypothesis : [August 1996 ... symposium entitled "New Perspectives on the Critical Period for Second Language Acquisition"] (1 ed.). Mahwah (NJ): Erlbaum. ISBN 0-8058-3084-7.
- Castro-Caldas, A.; Petersson, A.; Reis, S.; Stone-Elander, S.; Ingvar, M (1998). "The illiterate brain: Learning to read and write during childhood influences the functional organization of the adult brain". Brain. 121 (6): 1053–63. doi:10.1093/brain/121.6.1053. PMID 9648541.
- Chomsky, Noam (15 March 1969). Aspects of the Theory of Syntax. MIT Press. ISBN 978-0-262-26050-3.
- Chomsky, Noam; Huybregts, Riny; Riemsdijk, Henk C. van (1982). The Generative Enterprise: A Discussion. Foris.
- Chomsky, Noam (1993). "A minimalist program for linguistic theory". In Hale, Kenneth; Keyser, Samuel J. The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger. MIT Press. ISBN 978-0-262-58124-0.
- Dehaene, S.; Spelke, E.; Pinel, P.; Stanescu, R.; Tsivkin, S. (1999). "Sources of mathematical thinking: Behavioral and brain-imaging evidence" (PDF). Science. 284 (5416): 970–974. doi:10.1126/science.284.5416.970. PMID 10320379. Archived from the original (PDF) on 2013-07-21.
- Dye, Melody (February 9, 2010). "The Advantages of Being Helpless". Scientific American.
- Espinosa, L. M. (2007). "Second language acquisition in early childhood". In New, Rebecca Staples; Cochran, Moncrieff. Early Childhood Education: An International Encyclopedia. Westport, CT: Praeger Publishers. ISBN 978-0-313-33100-8.
- Fawcett, Sherry L.; Wang, Yi-Zhong; Birch (February 2005). "The Critical Period for Susceptibility of Human Stereopsis". Investigative Ophthalmology & Visual Science. 46 (2): 521–525. doi:10.1167/iovs.04-0175.
- Flege, James Emil; Mackay, Ian R. A.; Piske, Thorsten (2002). "Assessing bilingual dominance". Applied Psycholinguistics. 23 (4): 567–598. doi:10.1017/S0142716402004046.
- Harley, Birgit (1986). Age in second language acquisition. College-Hill Press. ISBN 978-0-88744-269-8.
- Hurford, J. R. (1991). "The evolution of critical period for language acquisition". Cognition. 40 (3): 159–201. doi:10.1016/0010-0277(91)90024-X. PMID 1786674.
- Jones, Peter E. "Contradictions And Unanswered Questions In The Genie Case: A Fresh Look At The Linguistic Evidence". FeralChildren.com. Archived from the original on 2016-03-12.
- Komarova, N. L.; Nowak, M. A. (2001). "Natural selection of the critical period for language acquisition". Proceedings: Biological Sciences. 268 (1472): 1189–1196. doi:10.1098/rspb.2001.1629.
- Lenneberg, E.H. (1967). Biological Foundations of Language. Wiley. ISBN 0-89874-700-7.
- Loewen, Shawn; Reinders, Hayo (2011). Key concepts in second language acquisition. Houndmills, Basingstoke, Hampshire: Palgrave Macmillan. ISBN 978-0-230-23018-7.
- Neville, H.J.; Bavelier, D. (March 2001). "Variability of developmental plasticity". In McClelland, J.; Siegler, R. Mechanisms of cognitive development: Behavioral and neural perspectives. Carnegie Mellon Symposia on Cognition (1 ed.). Psychology Press. ISBN 978-0-8058-3276-1.
- Oyama, S. (1976). "A sensitive period for the acquisition of a nonnative phonological system". Journal of Psycholinguistic Research. 5 (3): 261–285. doi:10.1007/BF01067377.
- Paradis, Michel (1999). Neurolinguistic aspects of bilingualism. Amsterdam: J. Benjamins. pp. 59–60. ISBN 90-272-4127-9.
- Penfield, W.; Roberts, L. (1959). Speech and Brain Mechanisms. Princeton: Princeton University Press. ISBN 0-691-08039-9.
- Pinker, S.; Bloom, P. (1990). "Natural language and natural selection". Behavioral and Brain Sciences. 13 (4): 707–784. doi:10.1017/s0140525x00081061.
- Pinker, S. (1994). The Language Instinct. New York: Morrow. ISBN 84-206-6732-3.
- Ramscar, M.; Gitcho, N. (2007). "Developmental change and the nature of learning in childhood". Trends in Cognitive Sciences. 11 (7): 274–279. doi:10.1016/j.tics.2007.05.007. PMID 17560161.
- Reis, A.; Castro-Caldas, A. (1997). "Illiteracy: A case for biased cognitive development". Journal of the International Neuropsychological Society. 3 (5): 444–450. PMID 9322403.
- Robertson, P. (2002). "The Critical Age Hypothesis". Asian EFL Journal.
- Sacks, Oliver