Introduction
separate the two languages during acquisition without ever addressing the question
of why either case should apply at all.
Moreover, this debate kept the focus away from a characteristic of BFLA
which I believe to be of major importance, namely the fragmentary nature of
transfer effects. As soon as this is considered it becomes evident that, strictly
speaking, neither of the two traditional hypotheses can be correct. If it were the
case that the two languages are completely separate, the child’s utterances should
feature absolute absence of any apparent translations unless these are a
consequence of a particular developmental stage within one language, in which case
they will also be present in the speech of monolinguals. On the other hand, if the
two languages gave rise to a hybrid system, transfer effects should be present
indiscriminately across the board and in copious amounts. Neither of these
scenarios reflects the facts. For example, De Houwer (1990) showed that the
subject of her case study, Kate, used each language in a manner largely consistent
with the language of her interlocutors, though she arguably displayed some transfer
effects. In the same vein, Genesee (1989) discussed a series of experiments which
showed, rather conclusively, that although bilinguals display occasional transfer
effects, they systematically use each language predominantly with speakers of that
language. Furthermore, Genesee et al. (1995) reviewed a number of cases
showing that the path bilingual children follow during the development of their two
languages closely resembles the one followed by their monolingual peers.
This type of evidence has been largely taken to indicate that the Single
System Hypothesis is untenable, although I believe it also raises questions about
the defensibility of the Separate Development Hypothesis. Most of all, there seems
to be a necessity for formulating the idea of separation in a manner which allows
some type of contact between the two languages. However, a precise reformulation
of the Separate Development Hypothesis is yet to be provided.
Nevertheless, as the Separate Development Hypothesis gained wider
support, researchers have taken a different perspective on BFLA. For example,
Döpke (2000) points out that the sporadic nature of transfer effects should not
serve as an excuse for disregarding them, since they could be invaluable clues as to
the cognitive processes involved in the simultaneous acquisition of two languages.
In the same vein, Müller (1998) suggests that, although the two languages may
develop separately, they still may have some influence on each other and that it is
the domain of the bilingual researcher to investigate how this influence process
might function. A new and more specific question is thus raised with regard to the
architecture of the acquisition process in general and how it can result in the
development of two separate, yet connected, language systems.
The aim of this thesis is to contribute towards an understanding of these
issues by proposing a potential explanation for the occurrence of transfer effects
and investigating the theoretical and empirical consequences of this explanation.
Firstly, I will be concerned with the structure of the acquisition device. The
bulk of chapter 1 will be dedicated to this purpose. Based on assumptions that are
mostly drawn from standard linguistic theory, I will propose a theoretical model of
language acquisition which might shed some light on the workings of bilingual
acquisition and enable us to make some predictions as to what particular structures
may be subject to transfer effects. In particular, I will argue that the mechanism
responsible for the emergence of transfer effects is the very same one that
underlies the well-documented overgeneralisation process in monolinguals. This link
between transfer effects and overgeneralisation is arguably the main motivation
behind the model to be discussed in this thesis.
The necessity to link BFLA to monolingual acquisition has sometimes been
acknowledged elsewhere in the literature. For example, Hulk and Müller (2000)
suggest that transfer is “more likely to occur in exactly those areas which are also
problematic - albeit to a lesser extent - for monolingual children” (2000:228). This
statement refers to their idea that bilingual and monolingual children alike have
‘problems’ with structures usually associated with the C-domain (e.g. verb second,
complementizers, and topicalization). According to this view, transfer effects result
from a ‘relief strategy’ which bilinguals are said to apply in order to deal with the
problems that arise from the alleged vulnerability of the C-domain.
Nevertheless, I believe that if a theory of BFLA is to achieve explanatory
adequacy it must necessarily regard transfer effects as a consequence of the
workings of the language acquisition process. Hypotheses based on claims that
bilinguals employ some ‘special’ principles or ‘relief strategies’ are undesirable as
they only provide a formal description of the problem. Moreover, if the language
behaviour of bilinguals results from the application of special principles then
bilingualism is of little interest to the acquisitional theorist since it sheds no light on
the workings of the acquisition process per se. On the other hand, if we take such
language behaviour to be a direct consequence of the very system that underlies
acquisition, its investigation will contribute to our understanding of that system.
Taking this as a point of departure, I will try to derive transfer effects from the
mechanisms that underlie language acquisition in general. This will lead us to the
second aim of this thesis, namely that of assessing the empirical consequences of
the acquisitional model to be proposed. This will involve discussion of various data-
sets from the literature as well as of newly conducted research and will be the focus
of the remaining three chapters, to be organised as follows.
In chapter 2 I will introduce examples of transfer effects from two case-
studies which involve two different language pairs, namely English-Italian and
Cantonese-English. I will then show how these can be explained by appealing to the
architecture presented in chapter 1. The same chapter also involves discussion of
some non-adult utterances which, prima facie, appear to be manifestations of two
different phenomena, namely transfer effects and delay. I will suggest that,
contrary to traditional views, these do not involve separate cognitive mechanisms
and can be accommodated as cases of transfer, despite the fact that they appear to
be radically different on the surface. Some issues concerning overgeneralisation in
monolinguals will also be addressed in this chapter.
In the third chapter I will discuss some experimental evidence which
indicates that transfer effects do not affect the development of Principle B of Binding
Theory (Chomsky, 1981). I will then argue that this follows from the architecture
proposed in chapter 1. I will also present some newly obtained experimental
evidence in support of the same analysis but with regard to monolingual
development.
The fourth chapter discusses two other areas within which transfer effects
have been argued to be absent, namely Pro-drop and Root Infinitives. I will argue
that this is in fact predicted by our analysis since these phenomena are related to
principles that reside outside the domain of application assumed for our model.
Chapter 5 gives some general conclusions. An appendix is also included with
details of the experimental material employed as part of the experiment discussed
at the end of chapter 3.
CHAPTER 1
On the Acquisition of Lexical Properties
1. Introduction
Developing a theory of language acquisition is a massive undertaking and I am not
hoping to address every aspect of language development, or of the mechanisms
involved in it. However, it is my intention to present a view sufficiently precise to
help us make some headway in the domain of lexical generalisation and, ultimately,
transfer effects.
The aim of this chapter is threefold. Firstly, I will introduce a particular
architecture of the language system, based on the work of Jackendoff (1997). I will
then outline a specific model of language acquisition which involves updating of
lexical items followed by systematic generalisation of newly acquired properties.
Finally, I will show how this may help us understand some well-known cases of
overgeneralisation and suggest that the same system can also provide an
explanation for the transfer effects found in bilinguals. More precisely, I will argue
that given certain assumptions about the organisation of the lexicon, transfer
effects are a necessary consequence of the acquisition process.
2. The Acquisition Process
2.1 Preamble
In addressing the issue of language acquisition I will assume that humans are
endowed with innate linguistic knowledge in the form of a Universal Grammar
(Chomsky 1962, 1981). This assumption will provide an essential scaffold on which
a potentially successful theory of language acquisition can be developed. In
particular, it will play a fundamental part within the context of UG-compatibility, a
notion which will become central to our discussion when addressing well-known
learnability issues (cf. section 2.4 and section 5 below).
Adopting a theory of UG necessarily implies assuming that there are some
linguistic components which are built into what Chomsky (1986b) calls the initial
state of language acquisition. However, there has been much debate in the
literature with regard to how much should be assumed to be available to the child.
A variety of positions have been advocated, often within either a Continuity (see for
example Pinker 1984, Poeppel and Wexler 1993, Wexler 1999) or a Maturational
perspective (Borer and Wexler 1987, Radford 1988, 1990, Rizzi 1994, Wexler 1994,
among many others). Both Continuity and Maturational views come in a variety of
formulations. In its strongest formulation, the Maturation hypothesis claims that
language development involves not only the acquisition of lexical items but also the
addition or development of some specific categories and principles. For example,
Radford (1990) argued that the relative absence of elements such as determiners
and auxiliaries in early child speech can be explained by assuming that functional
categories are not represented in early child grammar and that these mature at a
later stage (see also Meisel 1994, Platzack 1990, Tsimpli 1996). A different
maturational account has been proposed by Rizzi (1993/1994) and Wexler (1994,
1996) who suggested that children’s linguistic behaviour does not necessarily lead
to the conclusion that functional categories are absent at the early stages. Instead,
they suggest that what needs to be assumed is the maturation of some principle
that disallows optionality. In Wexler’s terms, this would be a principle which dictates
that Tense must be obligatorily projected. Similarly, Rizzi suggests maturation of a
principle stating that root clauses must necessarily consist of a full Complementiser
Phrase. What sets these proposals apart from Radford’s is their commitment to the
claim that the initial state includes all the syntactic categories allowed by UG,
including functional categories.
As for the Continuity hypothesis, there are at least two formulations of it,
often referred to as ‘weak’ and ‘strong’ continuity, respectively. Strong continuity
accounts maintain that all UG properties and principles are available to the child
from the initial state. The obvious differences that exist between child language and
the adult target are put down to putatively UG-external factors. These include
constraints on phonological production (Demuth 1994, Gerken 1994), interface
problems between different linguistic components (e.g. Phillips 1996), or processing
limitations (Poeppel and Wexler 1993). However, it is not obvious why some of
these alleged solutions should qualify as UG-external. For example, it is at best
unclear why some phonological constraints may be assumed to be subject to
developmental stages while syntactic ones may not.
It is perhaps with this kind of objection in mind that some researchers have
opted for a weaker formulation of the Continuity hypothesis. This involves assuming
that although properties and principles do not mature, they may be present in an
underspecified version when compared to the target language. In other words, the
fact that child language may deviate from the target is explained solely in terms of
parameter/feature setting. This position is taken by, among others, Déprez (1994)
who suggests that optionality in child grammar can be explained by assuming that
functional categories may be underspecified in early child language (see also
Radford 1995, Vainikka 1993/1994). Since under-specification is a possible setting
in adult grammars too (see also section 2.2 below), it can be maintained that child
language is consistent with adult language, though not necessarily with the target.
In other words, every instance of child language corresponds to a possible adult
language. Consequently, the weak continuity hypothesis is largely compatible with
most versions of Maturation, as most maturational accounts assume that there is no
principle which is peculiar to child language (see for example the maturational views
proposed in Borer and Wexler (1987) and Radford (1988, 1990)).
At first glance, it seems inevitable that any theory of acquisition must
necessarily be developed with one of these assumptions in mind. Nevertheless, in
this thesis I will abstract away from the Continuity/Maturation debate as I believe
that the model I will be defending is compatible with either assumption, for the
following reasons. The focus of this thesis is on how the language acquisition device
deals with a lexical feature/property once this has been associated with some items
(cf. section 2.4.3) and the impact this has on the rest of the lexicon (cf. section 3).
Whether the acquisition of the feature/property at issue is dependent on
maturational principles or on triggering experiences does not seem to be of
consequence. Because the model I will be presenting only makes claims about the
mechanisms that affect a feature/property once this comes to be hosted by some
lexical items, what happens before this process takes place (i.e. the feature is
immature, dormant, below threshold etc.) does not affect our line of argumentation.
Indeed, some of the assumptions I will be making (e.g. UG-compatibility, cf. section
2.4.5) have been previously made by proponents of both Continuity and Maturation.
At the same time, as the discussion develops I will suggest that certain
specific properties should be assumed to be present from the initial state (see for
example sections 2.2 and 3.2.1 with regard to person features and conceptual
structures respectively). This is not equivalent to supporting the Continuity
hypothesis, as Maturational approaches also postulate the existence of certain
properties from the initial state. Nevertheless, it does entail that the model I
propose may be incompatible with certain specific Maturational accounts. For
example, the discussion in section 2.2 suggests that the model to be proposed is
incompatible with any account that assumes maturation of person and number
features (though I am not aware of any work within the universalist tradition that is
in line with this assumption, thus the above point may be a trivial one).
In sum, in those cases where I will be arguing that certain properties are
present from the initial state, I will be defending the position that there is
developmental ‘continuity’ as far as those properties are concerned. This view is
compatible with the Continuity hypothesis as well as with a number of Maturational
accounts, since lack of maturation for a given group of properties does not
contradict Maturation per se; it simply suggests that such properties do not mature,
while others still may.
A more immediate question, and one which is not necessarily dependent on
the Continuity/Maturation debate, regards the identity of the properties that might
be present from the initial state of language acquisition. I will address this in section
3.2. First, however, I will lay out some assumptions concerning the architecture of
the acquisition process as well as the nature of the mechanisms involved.
2.2 Lexical Properties and their Value
During acquisition, the crucial task for the language system is to detect the
presence of a particular feature/property in the input and, consequently, to work
towards its acquisition. Following much current work I will assume that lexical
features are of a binary nature. With regard to this, some clarification is in order. It
has sometimes been suggested (e.g. Adger 2003, Brody 2002) that the most
economical distinction we can postulate is of a monadic nature, i.e. based on the
presence vs. absence of a feature. Although a possible representational choice, this
assumption raises a serious learnability problem. Consider a case in which the
acquisition device has concluded that some lexical item has a negative setting for a
feature f. Within a monadic system, this is equivalent to acquiring absence of f. This
process offers no obvious way of preventing the acquisition device from considering
addition of f again and potentially re-acquiring the same f feature which it had
previously concluded should be absent. Therefore, it seems that a monadic system
is unable to recognise whether it has achieved a target setting, namely absence of
f, or whether it has not yet been exposed to such a feature at all. The only way of
enabling such a distinction is to introduce some memory device whereby deletion of
f is perceived differently from absence of f, essentially introducing a binary system.
I will briefly return to this in section 5 (see also Fodor and Sakas 2005 for extensive
discussion of how memory-less acquisition systems are bound to fail). I therefore
embrace the view that a feature must be set to either +f or -f and I will take the
absence of a feature as indicating the system’s lack of knowledge with regard to
that particular feature.
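
The learnability argument above can be illustrated with a minimal sketch. The representation is an assumption made purely for exposition and is not part of the model itself: a monadic entry is encoded as a bare set of features, a binary entry as a mapping from each known feature to '+' or '-'. The function names are hypothetical.

```python
# Illustrative sketch only: the data structures and function names are
# assumptions made for exposition, not part of the proposed model.

def monadic_has_decided(entry: set, feature: str) -> bool:
    """A monadic system records presence only, so a decision about a
    feature is visible only when that feature is present."""
    return feature in entry


def binary_has_decided(entry: dict, feature: str) -> bool:
    """A binary system records both '+' and '-', so a negative setting
    is distinguishable from no exposure at all."""
    return entry.get(feature) in ("+", "-")


# Monadic: 'f considered and rejected' and 'f never encountered' are the
# same object, namely an entry that lacks f.
monadic_rejected = set()
monadic_unexposed = set()

# Binary: the negative setting is stored, so the two cases differ.
binary_rejected = {"f": "-"}
binary_unexposed = {}
```

On this encoding the two monadic entries are indistinguishable, so nothing stops a learner from reconsidering f, whereas the binary entries keep the two situations apart.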
As has been convincingly argued within morphology (Ackema 2001,
Andrews 1990, Blevins 1995), however, this cannot be the full picture since the
presence of a feature does not necessarily entail the presence of a specific value. In
particular, Blevins (1995), building on the work of Andrews (1990), proposes that
morphological syncretism should be taken as arising from feature
underspecification. The features associated with each paradigm member, he argues, must
necessarily be defined in direct opposition with other members if feature-based
analyses are to move beyond simple descriptive formalism. The English verbal
paradigm illustrates this point:
1. English Inflectional Paradigm
   walk     1-sg     walk   1-pl
   walk     2-sg     walk   2-pl
   walk-s   3-sg     walk   3-pl
Besides creating some unnecessary redundancy, associating the form walk with five
separate feature specifications is conceptually unrevealing to say the least. A much
more parsimonious alternative is to recognise that the two separate forms stand in
direct opposition:
2. Paradigmatic Opposition: English
walk-s marked: 3-sg
walk unmarked, general form
It can then be assumed, following Andrews (1990), that a condition on
morphological blocking will prevent the general form from applying when a more
specified one exists1, hence the ungrammaticality of examples like *he walk.
A natural formalisation of this idea would be to assume that the language
learner applies some form of the Biuniqueness Principle (Dressler 1985) which
entails that every morpheme corresponds to only one feature specification and
every feature specification to only one morpheme. Within morphology, the mapping
of a single phonological string onto two separate feature bundles (cf. 1) is
disallowed, as is the co-existence of two separate phonological strings mapped onto
the same feature bundle2 (hence preventing the formation of he walk where walk
would have the same specification as walks3). These are illustrated by (3a) and (3b)
respectively:
3. a. *  f, f1  <--  /phon/  -->  f2, f3
   b. *  /phon/  -->  f, f1  <--  /phon1/
The emergence of general forms can therefore be viewed as a consequence of the
principle in (3a). The principle in (3b), on the other hand, is a formalisation of the
blocking principle mentioned above and it is responsible for the absence of free
variation between specified (e.g. walks) and general forms (e.g. walk). Several
formulations of this principle have been proposed in the morphological literature
(see, for example, Aronoff 1976, Lapointe 1980, Pinker 1984)4.
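
The two configurations excluded in (3) can be stated as a simple check over pairs of phonological strings and feature bundles. The following is a minimal sketch under stated assumptions: the pair-list encoding, the function name `biuniqueness_violations`, and the use of an empty bundle for the general form are all illustrative choices, not claims about how the learner actually represents a paradigm.

```python
from collections import defaultdict


def biuniqueness_violations(pairs):
    """Check (phonological string, feature bundle) pairs against (3).

    Returns a tuple of two sets:
      - strings mapped onto more than one feature bundle, as in (3a)
      - feature bundles mapped onto more than one string, as in (3b)
    """
    bundles_by_phon = defaultdict(set)
    phons_by_bundle = defaultdict(set)
    for phon, bundle in pairs:
        key = frozenset(bundle)  # hashable, order-independent bundle
        bundles_by_phon[phon].add(key)
        phons_by_bundle[key].add(phon)
    one_string_many_bundles = {p for p, bs in bundles_by_phon.items() if len(bs) > 1}
    one_bundle_many_strings = {b for b, ps in phons_by_bundle.items() if len(ps) > 1}
    return one_string_many_bundles, one_bundle_many_strings


# Fully specified paradigm, as in (1): walk carries five separate bundles.
FULL_PARADIGM = [
    ("walk", {"1", "sg"}), ("walk", {"2", "sg"}), ("walks", {"3", "sg"}),
    ("walk", {"1", "pl"}), ("walk", {"2", "pl"}), ("walk", {"3", "pl"}),
]

# Opposition as in (2): one marked form and one general form. The empty
# bundle stands in for the general form here (a simplification).
OPPOSITION = [("walks", {"3", "sg"}), ("walk", set())]

# Free variation: walk given the same specification as walks.
FREE_VARIATION = [("walks", {"3", "sg"}), ("walk", {"3", "sg"})]
```

On this encoding, `FULL_PARADIGM` is flagged under (3a) because walk is paired with five bundles, `FREE_VARIATION` is flagged under (3b), and `OPPOSITION` passes both checks.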
Another important reason against the analysis in (1), and therefore in favour
of the principle in (3a), comes from learnability considerations. Pinker (1984)
argues that unconstrained proliferation of feature specifications to be mapped onto
homophonous strings leads to serious learnability problems5. A theory that allows
zero-inflection to occur indiscriminately has no way of preventing children from
hypothesising grammaticalisable features for every single lexical item that belongs
to a potentially inflectable category. Needless to say, this leads to massive
computational waste and potential failure to achieve the target language.
Consequently, Pinker suggests a development of Slobin’s (1984) hypothesis
and proposes that postulation of feature specifications must be limited to those
cases where the input provides a morphological contrast in the form of some overtly
realised feature bundle(s), as is the case in the English verbal paradigm. This would
successfully solve the computational problem. Because the child encounters the –s
affix in third person singular cases, s/he will be able to postulate a specific 3rd
person singular form and contrast this with an underspecified ‘elsewhere’ case. In
cases where no morphological marking is present at all, however, there is no
contrast to trigger the postulation of specific features. Therefore, the number of
specific features postulated by the language learner is directly proportional to the
number of distinct morphological forms found in the input (see also Koeneman
2000). Application of the morphological principles in (3) further simplifies the
learner’s task of mapping morphological content onto phonological strings.
Inevitably, however, the claim in (2) raises the question of how this kind of
generality is formally encoded. One possibility is that of analysing generality as a
consequence of feature disjunction (cf. Karttunen 1984). On this view, the English
verbal paradigm would look as follows:
4. Generality as disjunction
walk-s 3-sg
walk 1-sg ∨ 2-sg ∨ pl
This is hardly an improvement over analyses such as (1) above as it allows the
amalgamation of features that do not form a natural class, leading to essentially
“arbitrary formal specifications” (Blevins 1995:124)6. This type of analysis also fails
to capture an important crosslinguistic generalisation, namely that syncretism does
not occur randomly. For example, Williams (1981) points out that, in the nominal
domain, syncretism is widely attested between nominative singular and accusative
singular but not between nominative singular and accusative plural (see also
Carstairs 1983).
A more promising alternative is to analyse the general case as
underspecified for Person and Number features, which gives the opposition shown
below7. Following a long tradition (dating back to Chomsky and Halle 1968), alpha-
notation is used to indicate underspecification:
5. Underspecification
walk-s 3-sg
walk αperson, βnumber
This offers a satisfactory solution to overspecification problems that arise with
analyses such as the one given in (1) above without having to resort to arbitrary
specifications.
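
The interaction between the underspecified entry in (5) and morphological blocking can be sketched as a selection procedure: among the forms compatible with a given context, the most specified one wins. This is a minimal illustration under stated assumptions. The dictionary encoding, the `ALPHA` sentinel, and the function names are hypothetical, and the variables α and β of (5) are collapsed into a single placeholder, since only the distinction between specified and underspecified values matters here.

```python
# Illustrative sketch only: the encoding and all names are assumptions.

ALPHA = "α"  # placeholder for an underspecified value (α, β in (5))


def specificity(spec):
    """Number of features carrying a specified (non-ALPHA) value."""
    return sum(1 for value in spec.values() if value != ALPHA)


def select_form(paradigm, context):
    """Return the most specified form compatible with the context.

    A form is compatible when each of its specified values matches the
    context; underspecified values match anything. Raises ValueError if
    no form is compatible.
    """
    compatible = [
        (form, spec) for form, spec in paradigm
        if all(v == ALPHA or context.get(f) == v for f, v in spec.items())
    ]
    form, _ = max(compatible, key=lambda item: specificity(item[1]))
    return form


# The opposition in (5).
ENGLISH_PRESENT = [
    ("walk-s", {"person": 3, "number": "sg"}),
    ("walk", {"person": ALPHA, "number": ALPHA}),
]
```

For a third person singular context both forms are compatible, and the more specified walk-s is selected, which mirrors the blocking of *he walk. In every other context only the general form is compatible.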
I will therefore adopt the view that features can be underspecified as a result
of the application of the Biuniqueness Principle which I will assume to be at work
within morphological paradigms (but see Leiss 1997 for a different position which
advocates the Biuniqueness Principle being active across grammatical categories).
We are now ready to lay out what possible feature settings may be relevant
to the acquisition process. These correspond to either [+f] or its counterpart [–f] as
well as the underspecified [αf]. Crucially, the underspecified case cannot be taken
to result from the total absence of a feature since, as mentioned above, total
absence of a feature indicates that such a feature has not yet been acquired. I will
therefore make use of the notation [αf] to indicate that a feature is underspecified.
In sum, there are four distinct states of affairs that can obtain with regard to
any given feature:
6. a.      No knowledge of f
   b. αf   Knowledge of f but underspecified value for f
   c. +f   Knowledge of f specified as +
   d. -f   Knowledge of f specified as -
The cases represented in (6) will have the same effect on language production
independently of whether they obtain during acquisition or as part of the adult
knowledge. However, the states represented in (6a) and (6b) can have different
implications due to the different status that the child’s and the adult’s system have
with regard to accessibility. As far as the child is concerned, absence of f and
underspecified f can be two successive stages that precede a complete setting for
either [+f] or [-f]. If a feature is temporarily absent in the child’s knowledge, it
might later be detected in the input and therefore be made part of such knowledge.
When this happens, however, the child might not yet have derived the value for it.
S/he will therefore set it as underspecified (i.e. [αf]) until further evidence is
gathered and a value can be set. Indeed, it could turn out that the target language
has no specification for f and therefore there will be no further value to add. As for
the adult, the four cases represented in (6) are independent of each other, having
no chronological connection.
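
The developmental reading of (6) can be summarised as a small state machine. The following is a minimal sketch, not a claim about the actual acquisition device: the names `FeatureState` and `child_update` are hypothetical, and two simplifying assumptions are made, namely that a value can be set directly when the evidence supplies one, and that a value, once set, is not revised.

```python
from enum import Enum


class FeatureState(Enum):
    UNKNOWN = "no knowledge of f"   # (6a)
    UNDERSPECIFIED = "αf"           # (6b)
    PLUS = "+f"                     # (6c)
    MINUS = "-f"                    # (6d)


def child_update(state, evidence):
    """Return the next state of f for the child.

    evidence is None when f is detected in the input but no value can
    yet be derived, or '+' / '-' when the input supports a value.
    Assumption of this sketch: a value, once set, is not revised.
    """
    if evidence in ("+", "-"):
        if state in (FeatureState.UNKNOWN, FeatureState.UNDERSPECIFIED):
            return FeatureState.PLUS if evidence == "+" else FeatureState.MINUS
        return state
    # f detected without a derivable value
    if state is FeatureState.UNKNOWN:
        return FeatureState.UNDERSPECIFIED
    return state
```

If the target language has no specification for f, no valued evidence ever arrives and the feature simply remains in the underspecified state. For the adult, by contrast, the four members of `FeatureState` are unordered alternatives, and no update function applies.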
2.3 Language Variation
Following the generally accepted assumption that language variation is restricted to
the lexicon (Borer 1984, Elliott and Wexler 1985, Chomsky 1995) I will view
language acquisition as the process involving the setting of the idiosyncratic
combination of properties that makes up the identity of each individual lexical item.
On this view, acquiring a lexical item is equivalent to setting the semantic, morpho-
syntactic and phonological properties that will determine the item’s behaviour within
each of the linguistic components.
Following Jackendoff (1987, 1997) I will take these components to be
Representationally Modular. This view takes the three components as independent
of one another, each with its own set of “combining principles” and distinct
“representational formats” (Jackendoff 1997:41). Consequently, grammaticality