BOOT-LA

Bootstrapping in Language Acquisition

Psychological, Linguistic and Computational Aspects


Scope

Bootstrapping approaches to language acquisition have been extensively addressed by researchers in domains such as psycholinguistics, computational and theoretical linguistics, cognitive science, and machine learning.

This workshop aims at bringing together researchers in these fields.

Location

Indiana University, Bloomington campus, Bloomington, Indiana.

Time

21st to 23rd of April, 2003

Areas of interest

The workshop seeks to provide a forum for presentation and discussion of original research on different aspects of bootstrapping including, but not limited to:

its role in language acquisition for different domains, from a theoretical, computational and psycholinguistic point of view
empirical evidence for bootstrapping strategies from psycholinguistic, psychological or cognitive science
concepts, algorithms, techniques, strategies and consequences from work on:

grammar induction
learnability theory
parameter setting approaches
constraint-based approaches
cue-based learning

computational approaches to learning subcategorization frames
how much knowledge can be bootstrapped from function words
strategies for lexical/sense disambiguation.

Invited speakers are:

Morten Christiansen (Cornell University)
Michael Gasser (Indiana University)
William Sakas (City University of New York)
Linda Smith (Indiana University)
Juan Uriagereka (University of Maryland and University of the Basque Country)
Jürgen Weissenborn (Universität Potsdam)
Charles Yang (Yale University)

Organisational Issues

Organisation

Damir Cavar
Malgorzata Cavar
Laurent Dekydtspotter
Khaled Elghamry
Steven Franks
Rex Sprouse

Contact

Damir Cavar <dcavar@iu.edu>

Indiana University
Linguistics Dept.
Memorial Hall, Room 402
1021 E. Third Street
Bloomington, IN 47405-7005
phone: (812) 855-3268
fax: (812) 855-5363

Program

Isabelle Barriere, Marjorie Lorch	Morphological bootstrapping in the acquisition of Argument Structure
Aleka A. Blackwell	The effect of the semantic typology of a lexical class on its acquisition: Evidence from young children’s acquisition of English adjectives	23rd, 11:15-12:00
Morten Christiansen	Syntactic Bootstrapping through Multiple-Cue Integration	21st, 14:15-15:00
Hamid R. Ekbia, Joshua Goldberg, David Landy	Starting Large or Small? An Unresolved Dilemma for Connectionist Models of Language Learning	21st, 15:00-15:45
Michael Gasser	tba
Judith A. Gierut, Holly L. Storkel, Michele L. Morrisette	The syllable as a phonological bootstrap: Linguistic categorization in typical and delayed development
Annette Hohenberger	Bootstrapping into recursive structures: Compounds, SCs, and phrasal syntax
Gaja E. Jarosz	Separating Structure and Category Learning: A Model of Phrasal Categories
Christopher Johnson	Bootstrapping and prototypes
Jacqueline van Kampen	Bootstrapping syntactic categories
Fatma Nihan Ketrez	Is it possible to bootstrap any lexical category information from word order in a flexible-word-order language?	22nd, 10:15-11:00
Sean McLennan	Schema Theorem in Language Acquisition: A Rags to Riches Story	23rd, 10:15-11:00
William G. Sakas	tba	22nd, 18:00-18:45
Linda Smith, Eliana Colunga, Hanako Yoshida	Statistical Bootstrapping: Correlations between the lexicon and perception create "ontological" kinds	21st, 13:00-13:45
Melanie Soderstrom, Amanda Seidl, Deborah Kemler Nelson, Jim Morgan	Infants are sensitive to the prosodic contours of phrases	22nd, 11:15-12:00
Juan Uriagereka	Categorial Dimensions and Sub-case Conditions in Syntactic Bootstrapping	23rd, 12:00-12:45
Jürgen Weissenborn	Prosodic and lexical aspects of the acquisition of syntax in German infants	22nd, 09:30-10:15
Charles D. Yang	tba	21st, 16:15-17:00

21st of April

12:45-13:00	welcome
13:00-13:45	Linda Smith, Eliana Colunga, Hanako Yoshida: "Statistical Bootstrapping: Correlations between the lexicon and perception create 'ontological' kinds"
break
14:15-15:00	Morten Christiansen: "Syntactic Bootstrapping through Multiple-Cue Integration"
15:00-15:45	Hamid R. Ekbia, Joshua Goldberg, David Landy: "Starting Large or Small? An Unresolved Dilemma for Connectionist Models of Language Learning"
break
16:15-17:00	Charles D. Yang: "tba"
17:00-17:45	tba

22nd of April

09:30-10:15	Jürgen Weissenborn: "Prosodic and lexical aspects of the acquisition of syntax in German infants"
10:15-11:00	Fatma Nihan Ketrez: "Is it possible to bootstrap any lexical category information from word order in a flexible-word-order language?"
break
11:15-12:00	Melanie Soderstrom, Amanda Seidl, Deborah Kemler Nelson, Jim Morgan: "Infants are sensitive to the prosodic contours of phrases"
12:00-12:45	tba
break
14:15-15:00	tba
15:00-15:45	tba
break
16:15-17:00	tba
17:00-17:45	tba
break
18:00-18:45	William G. Sakas: "tba"

23rd of April

09:30-10:15	tba
10:15-11:00	Sean McLennan: "Schema Theorem in Language Acquisition: A Rags to Riches Story"
break
11:15-12:00	Aleka A. Blackwell: "The effect of the semantic typology of a lexical class on its acquisition: Evidence from young children’s acquisition of English adjectives"
12:00-12:45	Juan Uriagereka: "Categorial Dimensions and Sub-case Conditions in Syntactic Bootstrapping"

Abstracts

Statistical Bootstrapping: Correlations between the lexicon and perception create "ontological" kinds

Linda Smith, Eliana Colunga and Hanako Yoshida (Indiana University)

Notions of animals, objects and substances are important to and reflected in the structure of language. In this work, we show how these notions are partly a product of language, constructed out of correlations between the lexicon, the perceptual properties of things in the world, and linguistic cues. We show how children’s learning of names for specific things creates abstract knowledge that transcends those experienced instances and also informs children’s subsequent judgments of novel instances, and their acquisition of new nouns. We use a connectionist network to show that these “rules” emerge as second-order generalizations over perceptual and linguistic regularities through ordinary processes of associative learning and generalization by similarity. And we show that language learning matters by comparing children learning English and Japanese.

tba

Michael Gasser (Indiana University)

Alongside proposals that the language learner is bootstrapped in learning through regularities within the language itself or in the non-linguistic world are proposals that focus on constraints that originate in putative properties of the cognitive architecture and the interaction of this architecture with properties of the stimulus. In this paper I will discuss a proposal of this last type, a neural network model that starts with a simple form of competitive learning applied to the general problem of learning the mapping between forms and meanings. Given the largely arbitrary nature of the mapping, the model converges on mostly local representations of words, which in turn provide the basis for the fundamental symbolic properties of language: hierarchical representations and compositionality.

Categorial Dimensions and Sub-case Conditions in Syntactic Bootstrapping

Juan Uriagereka (University of Maryland and University of the Basque Country)

It is well-known that faced with an ambiguous situation which can be associated to a new lexical token x, such that x could denote either a boundless concept (e.g. an abstract or mass term, a state or an activity) or instead a concept which in some sense is bounded (e.g. a corresponding concrete count term, a telic event), children normally go for the second alternative. It is essentially only when terms denoting (perceptually ambiguous) concrete objects/events have already been acquired that children concentrate on more abstract objects/states, assuming that a new term y applied to the relevant percept cannot denote the very same notion that a distinct term x does. The question is how children achieve this result. How do they know to take a concrete interpretation when an unknown term is used to denote an (ambiguous) percept?

It may be thought that children inductively abstract from concrete instances, which could putatively explain the above state of affairs. However, that strategy leads to a rather different pattern. When faced with a new term x for an ambiguous percept, a seriously inductive learner ought to assume that x names a token percept of some sort, generalizing to some abstraction over other elements like x only when discovering that a related, though different, token percept is also denoted as x. But in that sense, it ought to be equally easy for a child to generalize over token masses or states as it apparently is to generalize over token objects or events. That is, there is nothing a priori more or less natural about thinking of x as the name of a particular thing or event than it is to think of x as the name of a particular mass or state (unless we want to beg the question and claim that only events or things can be named, which is certainly false in merely logical terms). If anything, it should be easier to generalize over unstructured masses or states precisely because of their lack of structure; intuitively, it is easier to see that two token wines are that, wine, than it is to discover that two token animals are both dogs.

The (controversial) observation that mass terms are in some sense formally less complex than corresponding count terms is found, in some form, in Quine. It certainly squares well with the observation that in many languages counting requires additional grammatical formatives. The same conclusions (with similar grammatical evidence) can be reached for verbal elements, which leads to the major distinction between telic events and more basic elements in a Vendler-style hierarchy. Suppose, then, that on grammatical terms we find the basic hierarchical cuts (Abstract/mass Terms < Count Elements) and (States/activities < Telic Events). Does this have a bearing on the acquisition sequence?

On first examination, it would seem to work backwards: shouldn’t grammatically simpler terms be acquired before more complex ones? Well, not if the hierarchies above are innate, and what is acquired is merely an arbitrary term corresponding to them. Then it is actually reasonable for the acquirer to bet (in situations of ambiguous percepts) on a pairing with the more complex structuring first. That is the safest assumption to make, which could be corrected by positive data if it turned out to be wrong (the ‘Sub-case situation’). The opposite assumption is falsifiable only through negative evidence, which by hypothesis is unavailable to human learners prior to the acquisition process.

Starting Large or Small? An Unresolved Dilemma for Connectionist Models of Language Learning

Hamid R. Ekbia, Joshua Goldberg and David Landy (Indiana University)

The issue of negative evidence in language learning is at the core of the nativist-empiricist debate. Connectionist models of language learning have entered this debate from a constructivist perspective. The claim is that certain connectionist architectures can successfully learn simple grammars in the absence of negative evidence in the training set (Elman 1990). What makes this possible, according to these accounts, is the mechanism of starting small. Roughly, this is the idea that the network, having started the learning process with simple inputs (with short-range dependencies), can bootstrap by gradually developing the right representations for processing more complex inputs. This is reportedly achieved by the manipulation of either the input data or the processing capacity of the network (e.g., the size of the input buffer or of the recurrent memory). On the other hand, attempts by other connectionists to replicate this experiment have either failed or led to rather contrary results (Rohde & Plaut 1999), leaving an unresolved dilemma for connectionist research. The dilemma for connectionism is how to model the learning of language by children. Is the model of Elman or that of Rohde and Plaut accurate to the situation faced by children? Or is neither relevantly accurate?

Our purpose in this presentation is to argue that this dilemma cannot be resolved within the connectionist framework for a number of reasons — most notably:

The impoverished notion of “semantics” embraced by connectionist models of language learning
The narrow notion of “context” embraced by connectionists as something that only deals with the “linguistic” context
The limited notion of “time” as, for instance, captured in the recurrent connections of Elman nets

A thorough treatment of the debate requires revisiting the above notions in a fundamental manner that, we believe, would take one outside of the accepted principles of current connectionism.

Morphological bootstrapping in the acquisition of Argument Structure

Isabelle Barriere (Johns Hopkins University & University of Hertfordshire) and Marjorie Lorch (Applied Linguistics, Birkbeck College, London)

The investigation of the acquisition of Argument Structure has led researchers to propose two bootstrapping hypotheses. According to the Semantic bootstrapping hypothesis (Pinker, 1989), children rely on semantic cues in order to predict Argument Structure Alternation. In contrast, according to the Syntactic bootstrapping hypothesis (Gleitman, 1990), children make use of the arguments in order to acquire Argument Structure Alternation.

The first part of this paper demonstrates that these two hypotheses fail to consider the role of valency-marking morphemes in the process of acquisition of verb argument structure. In morphologically rich languages such as Inuktitut (Allen), Kiche (Pye, 1994) and Romance languages, the valency of the verbs is morphologically marked. However, when investigating the acquisition of these languages, most researchers have implicitly assumed that the thematic role assignment applied to the (sometimes ambiguous) morphological markers by children is appropriate (see Allen, 1996, among others). Thus children’s overgeneralizations have typically been classified into two categories: increased valency and decreased valency (Figueira, 1984; Allen, 1996). This classification fails to consider instances of unadult-like mapping between appropriate thematic role assignment and the inappropriate valency-marking morphological pattern associated with a verb which children have been shown to apply (see Allen, 1996 for examples in Inuktitut and Pye, 1988 & 1994 for Kiche). In light of this problem, we propose an alternative classification which includes an additional category, Maintained Valency, which refers to instances in which the thematic role assignment of the verb is maintained while the valency-marking morphological pattern associated with the verb is inappropriate.

The consideration of these issues and the application of this new classification require the use of complementary research strategies when investigating the acquisition of Argument Structure (AS). We illustrate our argument by presenting a study of Argument Structure Alternation (ASA) in French-speaking children which relies on a range of methodological procedures (speech production, comprehension and grammaticality judgments, using real and nonce-verbs) and enables us to investigate children’s assignment of thematic roles to an ambiguous construction, namely the clitic SE+V which gives rise to reflexive, reciprocal, neuter/ergative (no implied agent) and middle-passive (implied agent) interpretations. The results obtained show that while children use semantic cues such as animacy in order to interpret these constructions, their representations do not match those of adults until at least age 5, as the middle-passive interpretation is not available to them.

We conclude by discussing the need to consider ambiguous valency-marking morphemes in the acquisition of Argument Structure and by proposing a developmental account of children’s acquisition of SE-constructions which integrates semantic, syntactic and morphological cues.

The effect of the semantic typology of a lexical class on its acquisition: Evidence from young children’s acquisition of English adjectives

Aleka A. Blackwell (Middle Tennessee State University)

The issue of referential indeterminacy (Quine, 1960) has raised possibly the most important question in the study of word learning: How do children overcome this indeterminacy? Several word learning theories have appeared in the literature in recent years to account for children’s ability to learn the meaning of new words at what seems to be a very rapid rate. Addressing this issue, the present study presents evidence in support of the hypothesis that word learning is largely determined by the properties of semantic subcategories represented by a lexical class (Gasser & Smith, 1998). The study analyzes the composition of the adjective lexicon in the language of two children ages 2;3–5;2 (Adam) and 2;3–5;0 (Sarah) and their mothers (MacWhinney, 2000). The data are 7,449 child utterances and 6,437 maternal utterances containing one of 305 adjectives. Child and maternal adjective use was examined in terms of six age categories (<2;6, 2;6–2;11, 3;0–3;5, 3;6–3;11, 4;0–4;5, 4;6–5;0). The semantic composition of the adjective lexicons of children and their mothers was analyzed in terms of Dixon’s (1982) semantic types: age, dimension, value, color, physical property, human propensity. Adjective use by the children and their mothers was evaluated both in terms of type/token ratios for each semantic category (overall and at each developmental level) and in terms of frequencies of individual semantic types following Sandhofer et al. (2000). The analyses indicate that mothers and children produce a somewhat similar number of adjective types (Sarah=127, Adam’s mother=138, Adam=143, Sarah’s mother=174), with Adam’s mother and Sarah producing larger numbers of tokens (Adam’s mother=4,402, Sarah=3,734, Sarah’s mother=3,047, Adam=2,703).
However, proportionately, color adjectives, which represent 10 to 16% of the adjective tokens in the two children’s speech respectively, are practically absent from the mothers’ speech (0 and 1% respectively), and value and human propensity adjectives represent a larger proportion of the adjective tokens in adult speech whereas physical property adjectives represent a larger proportion of adjective tokens in child speech. The patterns also differ at each developmental stage, with the children using more color, dimension, and physical property adjectives and the mothers using more value and human propensity adjectives. I argue that the specific properties of the semantic categories to be learned by young children, even within one lexical class, play a significant role in children’s lexical development.
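
As a rough illustration of the measures mentioned above, the following Python sketch computes, for each semantic type, the number of adjective types and tokens, the type/token ratio, and the share of all adjective tokens. The function name, the data layout and the toy tokens are assumptions made for illustration only; they are not the study's data or its actual procedure.

```python
from collections import Counter, defaultdict

def summarize(tokens):
    """Summarize adjective use per semantic type.

    tokens: list of (adjective, semantic_type) pairs, one pair per
    adjective token found in a transcript (hypothetical layout).
    """
    token_counts = Counter(sem for _, sem in tokens)
    types = defaultdict(set)
    for adj, sem in tokens:
        types[sem].add(adj)  # distinct adjectives seen for this semantic type
    total = len(tokens)
    return {
        sem: {
            "types": len(types[sem]),
            "tokens": n,
            "type_token_ratio": len(types[sem]) / n,
            "share_of_tokens": n / total,
        }
        for sem, n in token_counts.items()
    }

# Invented toy sample, not taken from the corpus.
toy = [
    ("red", "color"), ("red", "color"), ("blue", "color"),
    ("big", "dimension"), ("big", "dimension"), ("big", "dimension"),
    ("little", "dimension"),
    ("good", "value"),
]
summary = summarize(toy)
for sem, stats in summary.items():
    print(sem, stats)
```

Running the same summary separately for each speaker and age band would give the kind of per-category comparison described in the abstract.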

References

Bloom, P. (2002). How children learn the meaning of words. Cambridge, MA: MIT Press.

Dixon, R.M.W. (1982). Where have all the adjectives gone?: and other essays in semantics and syntax. Berlin: Mouton.

Gasser, M. & Smith, L.B. (1998). Learning nouns and adjectives: A connectionist account. Language and Cognitive Processes, 13 (2/3), 269-306.

MacWhinney, B. (2000). The CHILDES database: Tools for analyzing talk. 3rd Edition. Vol 2: The Database. Mahwah, NJ: Lawrence Erlbaum Associates.

Quine, W.V.O. (1960). Word and object. Cambridge, MA: MIT Press.

Sandhofer, C.M., Smith, L.B., & Luo, J. (2000). Counting nouns and verbs in the input: differential frequencies, different kinds of learning? Journal of Child Language 27, 561-585.

Syntactic Bootstrapping through Multiple-Cue Integration

Morten Christiansen (Cornell University)

tba

The syllable as a phonological bootstrap: Linguistic categorization in typical and delayed development

Judith A. Gierut (Indiana University), Holly L. Storkel (University of Kansas), Michele L. Morrisette (Indiana University)

Phonological bootstrapping hypothesizes that language learners detect linguistically relevant phonological properties in the native language input, and then make use of these properties to organize their representation of that input. The syllable has been identified as one such phonological bootstrap, receiving robust support in both perception and production across a broad developmental span. Interestingly, however, few studies have directly examined what children know about the internal structural representation of syllables. In a series of 7 experiments, we address this question from dual perspectives of perception and production. Fifty preliterate preschool children, whose native language was English, were recruited and assigned to 1 of 2 groups: those with typically developing productive sound systems and those with delayed systems, as based on elicited productions. Children then participated in oddity tasks to determine their judgments of the perceived similarity of syllable structure, with particular emphasis on the onset. Four dimensions of structural similarity were manipulated across experiments, following from the purported linguistic representation of a syllable onset (cf. Clements & Hume, 1995 for overview). These were (1) number of segments in onset position, (2) occurrence of onset-internal branching structure, (3) location of branching structure, and (4) conformity to the Sonority Sequencing Principle. In game format, the oddity task required a child to listen to unique legal triplets of syllables corresponding to the ‘names’ of identically pictured characters, and to judge which 2 of 3 were ‘friends’ (cf. Treiman & Baron, 1981). Two main findings converged across experiments. First, the primary dimension that children used in judgments of the structural similarity of onsets was the Sonority Sequencing Principle and secondarily, the occurrence of branching. 
This suggests that children have access to details about subsyllabic structure that wholly accords with fully developed linguistic systems. Second, there were no differences in the similarity judgments of the two groups of children, despite obvious differences in their productive sound systems. These findings have implications for our understanding of the linguistic bases of categorization in development, and underscore long noted asymmetries in the relationship between perception and production. These asymmetries cannot be adequately handled by ‘spotlighting hypotheses’ associated with some phonological bootstrapping models (Peters & Strömqvist, 1996), but are more parsimonious with contemporary constraint-based accounts of phonological development (Smolensky, 1996). [Supported by NIDCD 01694, NIDCD 04781]

References

Clements, G. N., & Hume, E. V. (1995). The internal organization of speech sounds. In J. A. Goldsmith (Ed.), The handbook of phonological theory (pp. 245-306). Cambridge, MA: Blackwell.

Peters, A. M., & Strömqvist, S. (1996). The role of prosody in the acquisition of grammatical morphemes. In J. L. Morgan & K. Demuth (Eds.), Signal to syntax: Bootstrapping from speech to grammar in early acquisition (pp. 215-232). Mahwah, NJ: Erlbaum.

Smolensky, P. (1996). On the comprehension/production dilemma in child language. Linguistic Inquiry, 27, 720-731.

Treiman, R., & Baron, J. (1981). Segmental analysis ability: Development and relation to reading ability. In G. E. MacKinnon & T. G. Waller (Eds.), Reading research: Advances in theory and practice (pp. 159-198). New York: Academic Press.

Bootstrapping into recursive structures: Compounds, SCs, and phrasal syntax

Annette Hohenberger (Universität Frankfurt a.M.)

Recursion is the major characteristic of syntax. In this paper, I want to show by which bootstrapping processes the child eventually attains fully recursive syntax.

The lexicon as the earliest available linguistic working regime is a starting point for syntactic bootstrapping processes and also the fall-back option for not yet fully accomplished syntactic operations. Compounds are ideal candidates to investigate this liminal stage. They are morphologically complex though syntactically simplex. They are an early means to express propositional content within a single word.

The monolingual German child T, uttering compounds such as (1a) and (2a), expresses the same proposition shortly afterwards as a small clause (SC) (1b) or a complex NP (2b) (# indicating morpheme boundaries):

(1) (a) [sand#mund]_N    T (2;1)
        ‘sand mouth’
    (b) [sand [im mund]_PP]_SC    T (2;1)
        ‘sand in-the mouth’

(2) (a) raesse [papa#buch]_N    T (2;5)
        ‘big daddy book’
    (b) [grosse buch [vom papa]_PP]_NP    T (2;6)
        ‘big book from daddy’

(Hohenberger 2002)

Compounds are attractive bootstraps because they only require embedding within the same module – the lexicon. Syntax, on the other hand, requires the joint cooperation of two modules – lexicon and syntax. In compounds, the child initiates crucial processes needed for full recursion, namely embedding through merger, but not yet move.

Full recursion is attained when the child also acquires Functional Categories (FCs) and masters syntactic movement and phrasal embedding. Through a whole series of bootstrapping processes the child’s grammar bifurcates into two modules – lexicon and syntax, with a clearly defined division of labour between them.

Once the child has started ‘putting words together’, her operations are restricted by interface constraints, most notably Phonetic Form (PF-) constraints regulating the proper serialization of elements in a linear string (Kayne 1994, Uriagereka 1998, Moro 2000). The recursive nature of binary branching phrase structures thus reflects the basic task of grammar: mapping non-linear thoughts onto a strictly serial output channel.

Hohenberger, Annette (2002): Functional categories in language acquisition: Self-organization of a dynamical system. Linguistische Arbeiten 456. Tuebingen: Niemeyer.

Kayne, R. (1994): The antisymmetry of syntax. Cambridge, Mass.: MIT Press.

Moro, A. (2000): Dynamical antisymmetry. Linguistic Inquiry Monograph 38. Cambridge, Mass.: MIT Press.

Uriagereka, J. (1998): Rhyme and reason: An introduction to minimalist syntax. Cambridge, Mass.: MIT Press.

Separating Structure and Category Learning: A Model of Phrasal Categories

Gaja E. Jarosz (Johns Hopkins University)

Standard approaches to grammar induction employ probabilistic context-free grammars, and extensions thereof, usually in combination with the Inside-Outside algorithm [1, 2, 3]. Such research has not typically been concerned with the process by which children acquire grammatical knowledge. However, research in grammar acquisition supports a fundamentally different model, which is further supported by the limited success of the standard computational approaches.

The acquisition process may be divided into two separate processes: a structure learning component and a category learning component. The prosodic bootstrapping hypothesis states that children rely on various prosodic cues, such as intonation, pauses, syllable length, and pitch to identify syntactic constituents [4]. On the other hand, learning the categories of phrases depends on phonological, distributional, semantic, and syntactic cues [5]. In contrast, the grammars used in grammar induction do not make this distinction; they assign probabilities to the structure, or bracketing, and category labels of parse trees simultaneously.

This work describes a probabilistic grammar, RPCFG, which models the phrase structure categories component of grammar. Whereas a PCFG assigns a probability to the parse of a sentence, RPCFG (“Reverse” PCFG) assigns a probability to a labeling of a tree, given the structure. Specifically, RPCFG assigns a probability Pr(P→X Y | X Y) to a parent label P, given the structure and labels of its children: X Y. The probability of the tree labeling is the product of the probabilities of the individual rules.
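
As a rough illustration of this definition, the following Python sketch estimates Pr(P→X Y | X Y) by relative frequency from a handful of labeled binary trees and scores a labeling as the product of its rule probabilities. The tree encoding, the function names, the relative-frequency estimator and the toy trees are all assumptions made for illustration; the abstract does not say how the probabilities are estimated or how leaf categories are treated, so leaf labels are simply taken as given here.

```python
from collections import Counter
from fractions import Fraction

# A tree is either a leaf label (str) or a tuple (parent_label, left, right).

def label(tree):
    """Return the category label at the root of a tree."""
    return tree if isinstance(tree, str) else tree[0]

def rules(tree):
    """Yield (parent, left_label, right_label) for every internal node."""
    if isinstance(tree, str):
        return
    parent, left, right = tree
    yield parent, label(left), label(right)
    yield from rules(left)
    yield from rules(right)

def train(trees):
    """Relative-frequency estimate of Pr(P -> X Y | X Y)."""
    rule_counts = Counter()
    pair_counts = Counter()
    for tree in trees:
        for parent, x, y in rules(tree):
            rule_counts[(parent, x, y)] += 1
            pair_counts[(x, y)] += 1
    return {
        rule: Fraction(count, pair_counts[rule[1:]])
        for rule, count in rule_counts.items()
    }

def labeling_probability(tree, probs):
    """Product of rule probabilities over the internal nodes of a tree."""
    result = Fraction(1)
    for rule in rules(tree):
        result *= probs.get(rule, Fraction(0))  # unseen rules get probability 0
    return result

# Invented toy treebank; FRAG is a deliberately noisy label for (Det, N).
t1 = ("S", ("NP", "Det", "N"), ("VP", "V", ("NP", "Det", "N")))
t2 = ("S", ("NP", "Det", "N"), ("VP", "V", ("NP", "Adj", "N")))
t3 = ("FRAG", "Det", "N")

probs = train([t1, t2, t3])
print(probs[("NP", "Det", "N")])        # 3/4
print(labeling_probability(t1, probs))  # 9/16
```

Because the conditioning is on the children X Y, the probabilities for each child pair sum to one over parent labels, which is the sense in which the model is the reverse of a PCFG.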

To compare the performance of RPCFG and PCFG, the CKY parsing algorithm was adapted to search only the correct structures. Grammars trained on 100, 1,000, and 10,000 sentences from a binarized version of the Wall Street Journal Treebank were tested on 1,000 sentences. Accuracy of labeling is determined by the number of correct labels divided by the total number of labels. On the largest training size, the grammars performed comparably: the PCFG achieved 80.0% accuracy and the RPCFG achieved 79.7%. However, on the smaller training sizes, RPCFG significantly outperformed the PCFG. Trained on 100 sentences the RPCFG achieved 68.7% accuracy, while the PCFG achieved only 26.0%. This means that after training on only 100 sentences, the RPCFG reached 85% of its best performance!
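
The accuracy measure and the relative-performance figure can be checked with a few lines of Python. This is only a sketch: the function name and the toy label sequences are assumptions, while the percentages are the ones reported above. The ratio for the RPCFG comes out at roughly 86%, in line with the approximate figure quoted in the text.

```python
def labeling_accuracy(predicted, gold):
    """Number of correct labels divided by the total number of labels."""
    if len(predicted) != len(gold):
        raise ValueError("label sequences must have the same length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

# Invented toy labels: three of four node labels match.
toy_accuracy = labeling_accuracy(["NP", "VP", "S", "PP"], ["NP", "VP", "S", "NP"])

# Accuracies reported in the abstract (percent).
rpcfg_100, rpcfg_10000 = 68.7, 79.7
pcfg_100, pcfg_10000 = 26.0, 80.0

rpcfg_relative = rpcfg_100 / rpcfg_10000  # share of best RPCFG performance
pcfg_relative = pcfg_100 / pcfg_10000     # share of best PCFG performance

print(f"toy accuracy: {toy_accuracy:.2f}")
print(f"RPCFG at 100 sentences: {rpcfg_relative:.1%} of its best accuracy")
print(f"PCFG at 100 sentences: {pcfg_relative:.1%} of its best accuracy")
```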

These results suggest that bootstrapping between a successful structure-inducing system, such as Klein and Manning’s Generative Constituent-Context Model [6], and a label-inducing system based on RPCFG may be a more cognitively plausible, computationally feasible, and successful grammar induction system.

1) Chen, Stanley F (1995). Bayesian Grammar Induction for Language Modeling. Proceedings of the 33rd Annual Meeting of the ACL, 228-235.

2) Lari, K., and S. J. Young (1990). The estimation of stochastic context-free grammars using the inside-outside algorithm. Computer Speech and Language, 4:35-56.

3) Carroll, G., and E. Charniak. (1992). Two experiments on learning probabilistic dependency grammars from corpora. In C. Weir, S. Abney, R. Grishman, and R. Weischedel, editors, Working Notes of the Workshop on Statistically-Based NLP Techniques, 1-13. AAAI Press.

4) Gleitman, L., Gleitman, H., Landau, B., & Wanner, E. (1988). Where learning begins: Initial representations for language learning. In F.J. Newmeyer (Ed.), Language: Psychological and biological aspects: Volume III. Linguistics: The Cambridge Survey, 150-193. New York: Cambridge University Press.

5) Cartwright, T. A., and M. R. Brent. (1997). Syntactic categorization in early language acquisition: Formalizing the role of distributional analysis. Cognition, 62, 121-170.

6) Klein, D., and C. Manning (2002). A Generative Constituent-Context Model for Improved Grammar Induction. Proceedings of the 40th Annual Meeting of the ACL, 128-135.

Bootstrapping and prototypes

Christopher Johnson (University of Chicago)

Formal syntactic categories correlate significantly yet imperfectly with semantic categories (noun with object, subject with agent, etc.). Theories of “semantic bootstrapping” maintain that such correlations are innate “inductive bases” for learning about formal categories (Grimshaw 1981, Pinker 1984). Other approaches maintain that such imperfect correlations provide evidence for prototype structure in grammatical categories (Lakoff 1987). This talk evaluates the explanatory value of these two views in the context of language acquisition, and proposes a synthesis that combines advantages of both views.

The bootstrapping problem (Pinker 1984) arises because children need to learn distributional facts about formal categories without having any direct way to identify instances of those categories in their input. In semantic bootstrapping, children solve this problem by using innate knowledge of implications such as “If an expression denotes an object, then it is a noun or NP” (Grimshaw 1981). This view provides a mechanism by which children acquire new knowledge from experience in a way that is consistent with the Subset Principle. However, it posits a type of innate cognitive category (e.g. N or NP) that is unnatural because it is not characterized by criteria for membership. This view also fails to explain the relation between innate form-meaning correlations and the instances of adult grammatical categories that do not conform to them. Prototype accounts suggest principles of extension relating canonical and non-canonical instances, but do not associate those principles with strategies for acquiring knowledge from experience.

I propose a view in which children start acquiring formal categories by identifying the specific formal means used to encode innate semantic notions like ‘object’ and ‘agent’, which are assumed to be especially amenable to linguistic encoding for pragmatic reasons. There are no innate formal categories. Children begin to extend their proto-grammatical forms in principled ways from early exemplars, as in a prototype account. The main mechanism of extension is not comparison to a prototype, however, but identification of phenomena that correlate with instances of innate semantic categories and are therefore exemplified by the earliest pairings of grammatical form and meaning. For example, human agents are especially salient examples of causes; frequent encoding of agents by subjects will lead children to associate subjects with the notion ‘cause’ as well. In this view, innate and pragmatically accessible semantic categories “bootstrap” less accessible semantic categories associated with grammatical forms; principles of extension are therefore also strategies for acquiring knowledge from experience.

References

Grimshaw, Jane. 1981. Form, function, and the language acquisition device. In C.L. Baker and J. J. McCarthy (eds.), The logical problem of language acquisition, 165-182. Cambridge, MA: The MIT Press.

Lakoff, George. 1987. Women, fire, and dangerous things: What categories reveal about the mind. Chicago: University of Chicago Press.

Pinker, Steven. 1984. Language learnability and language development. Cambridge, MA: Harvard University Press.

Bootstrapping syntactic categories

Jacqueline van Kampen (UiL OTS, Utrecht University)

If all grammar resides in the grammatical properties assigned to a lexical item (Borer 1984, Chomsky 1995, and of course Categorial Grammar), it is important to see how grammatical categories are identified and assigned in first language acquisition.

This paper will show how two highly frequent cues, the <+fin> marking of predicates and the <+D> marking of arguments, lead to four category assignments.

(1) a. cue <+D>, new category <+N> and <+free anaphors>

b. cue <+I/+fin>, new category <+V> and <+nominative>

These are plausibly the first grammatical learning steps in most grammars, as will be argued on the basis of an analysis of French and Dutch child language.

The <+fin> marking of predicates and the <+D> marking of arguments suggest the introduction of the universal <+V> and <+N> by a context sensitive learning step.

(3) category-neutral X ⇒ <+V> / I/<+fin> __

category-neutral X ⇒ <+N> / D __

This is an alternative to Pinker’s (1984) semantic bootstrapping for <+N> and <+V>, and parallel to the proposal by Buszkowski (1987) in Categorial Grammar. A further and almost immediate effect of <+fin> is the <+nominative> subject in Spec,I. A further and almost immediate effect of <+D> is the rise of discourse (free) anaphors.
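The context-sensitive learning step in (3) can be sketched roughly as follows. This is a minimal illustration, not the author's procedure: the function name `assign_categories`, the table `CUE_TO_CATEGORY`, and the English-like toy tokens are all assumptions made for the example, and treating the <+fin> cue as a separate word is a simplification.

```python
# Hypothetical mapping from an already identified cue type to the category
# it assigns to the category-neutral item X that follows it, as in (3).
CUE_TO_CATEGORY = {"D": "+N", "I": "+V"}

def assign_categories(utterance, cue_of):
    """Assign <+N>/<+V> to items that directly follow a known cue.

    utterance: list of word tokens.
    cue_of: dict mapping already identified cue words to "D" or "I".
    Returns a dict from each category-neutral item to its assigned categories.
    """
    lexicon = {}
    for previous, current in zip(utterance, utterance[1:]):
        cue = cue_of.get(previous)
        # The step applies only when the left context is a known cue and
        # the current item is not itself a cue word.
        if cue is not None and current not in cue_of:
            lexicon.setdefault(current, set()).add(CUE_TO_CATEGORY[cue])
    return lexicon

# Invented toy input; not taken from the French or Dutch child data.
toy_cues = {"the": "D", "is": "I"}
toy_lexicon = assign_categories(["the", "ball", "is", "rolling"], toy_cues)
```

In this toy run, the item after the D cue receives <+N> and the item after the I cue receives <+V>. An item that never follows an identified cue stays category-neutral.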

The triggering contexts <+fin> and <+D> are systematically construed with a category-neutral item. They systematically signify, respectively, a characterizing intention for I+X or a naming intention for D+X. They function as bootstrapping cues due to their high text frequency. The general perspective of this paper is that cues for category assignment follow from a highly repetitive language-specific context. This suggests that:

(4) a. Cues become effective only if part of the context has already been identified.

b. Acquisition steps are ordered accordingly.

c. Language types are shaped by the different possibilities of learning steps.

References

Borer, H. (1984) Parametric Syntax: Case Studies in Semitic and Romance Languages, Dordrecht: Foris.

Buszkowski, W. (1987) ‘Discovery procedures for categorial grammars’, in: E. Klein and J. van Benthem (eds.) Categories, Polymorphism and Unification, University of Amsterdam.

Buszkowski, W. and G. Penn (1990) ‘Categorial grammars determined from linguistic data by unification’, in: Studia Logica 49, 431-454.

Chomsky, N. (1995) The Minimalist Program, Cambridge Mass./London: MIT Press.

Pinker, S. (1984) Language Learnability and Language Development, Cambridge Mass./London: Harvard University Press.

Is it possible to bootstrap lexical category information from word order in a flexible-word-order language?

Fatma Nihan Ketrez (University of Southern California, Los Angeles)

This study investigates whether there is any significant regularity in the distribution of nouns and verbs in Turkish child-directed speech that may help a child acquire lexical categories.

According to Maratsos & Chalkley (1980), the input language can provide learners with sufficient information about lexical categories. For example, any word that appears between “the” and “is” in the frame “the __ is” is a noun. Analyses of English-speaking parents’ speech provide evidence for this proposal by showing that words belonging to the same category cluster together according to their preceding and following contexts (Cartwright & Brent 1997; Redington et al. 1998; Mintz et al. 2002). This proposal and the supporting evidence are based on English, which has relatively fixed word order compared to languages such as Turkish. The present study looks at the linear distribution of nouns and verbs in Turkish child-directed speech in order to investigate the cross-linguistic plausibility of this proposal. Turkish is especially interesting and challenging for a distributional analysis not only because of its relatively flexible word order, which is considered nothing but a puzzle for young learners (Küntay & Slobin 1996), but also because of its pro-drop mechanism and its lack of function words such as auxiliaries and articles.

The corpora consist of 15,000 adult utterances directed to four children between the ages of 1:0 and 2:0. A series of three analyses is conducted on each child’s corpus individually. The first analysis examines the target words in their immediate contexts. In the later phases, the context size is increased to two and then six words on each side of the target words. The results obtained in each phase are compared quantitatively and qualitatively. The quantitative results show that the distribution of verbs is regular and better than chance in the immediate contexts (t(3)=3.87, p<.05). As the context size increases, their clustering improves further. The verb results provide cross-linguistic support for distributional approaches, suggesting that distributional properties can provide a learner with useful grammatical information even in a language with flexible word order. In the noun distribution, however, there is no significant regularity in any context (t(3)=0.90, p=.430), which is problematic for a distributional analysis that looks at words without taking into account other cues such as their morphological structure. The qualitative analyses further reveal that rich morphological structure plays a role in clustering.
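The kind of context-window analysis described above might be sketched as follows. This is only an illustration under stated assumptions: the function names `context_profiles` and `cosine`, the `<edge>` padding symbol, the use of cosine similarity, and the English-like toy utterances are all hypothetical, and the study's actual clustering and significance-testing procedure is not reproduced here.

```python
from collections import Counter, defaultdict

def context_profiles(utterances, targets, window=1):
    """Map each target word to counts of its (offset, neighbor) contexts.

    window is the number of words considered on each side of the target.
    """
    profiles = defaultdict(Counter)
    for utterance in utterances:
        tokens = utterance.split()
        for i, token in enumerate(tokens):
            if token not in targets:
                continue
            for offset in range(-window, window + 1):
                if offset == 0:
                    continue
                j = i + offset
                # Positions outside the utterance are recorded as "<edge>".
                neighbor = tokens[j] if 0 <= j < len(tokens) else "<edge>"
                profiles[token][(offset, neighbor)] += 1
    return profiles

def cosine(a, b):
    """Cosine similarity between two context-count profiles."""
    dot = sum(a[key] * b[key] for key in set(a) & set(b))
    norm_a = sum(value * value for value in a.values()) ** 0.5
    norm_b = sum(value * value for value in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Invented toy utterances, not drawn from the Turkish corpora.
toy = ["the dog is big", "the cat is small", "you eat it", "you see it"]
profiles = context_profiles(toy, {"dog", "cat", "eat", "see"}, window=1)
noun_noun = cosine(profiles["dog"], profiles["cat"])
noun_verb = cosine(profiles["dog"], profiles["eat"])
```

In the toy data, the two nouns share all of their immediate contexts and the noun and verb share none, so `noun_noun` is close to 1 and `noun_verb` is 0. Raising `window` to 2 or 6 would correspond to the larger context sizes in the later phases.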

References

Cartwright, T. A. & Brent, M. R. (1997) Syntactic categorization in early language acquisition: Formalizing the role of distributional analysis. Cognition 63, 121-170.

Küntay, A. & Slobin, D. I. (1996) Listening to a Turkish Mother: Some Puzzles for Acquisition. Social Interaction, Social Context & Language: Essays in Honor of Susan Ervin-Tripp, Hillsdale: Lawrence Erlbaum Associates Publishers, 265-286.

Mintz, T., Newport, E. & Bever, T. (2002) The distributional structure of grammatical categories in speech to young children. Cognitive Science 26, 393-424.

Maratsos, M. & Chalkley, M. A. (1980) The internal language of children’s syntax: The ontogenesis and representation of syntactic categories. In Nelson (Ed.) Children’s Language, Vol. 2. New York: Gardner Press.

Redington, M., Chater, N. & Finch, S. (1998) Distributional information: A powerful cue for acquiring syntactic categories. Cognitive Science 22, 425-469.

Schema Theorem in Language Acquisition: A Rags to Riches Story

Sean McLennan (Indiana University)

The "Poverty of the Stimulus" (POS) argument, which maintains that the input a child receives from the environment is not sufficient to infer a productive grammar (Miller and Chomsky, 1963), has been central to the strong view of Universal Grammar. However, increasing evidence from other disciplines, such as Cognitive Science and Psychology, suggests that the POS may not be valid. Statistical models can learn tasks that they were previously thought incapable of learning (e.g. Chalmers, 1990; Elman, 1995). Moreover, it appears that acquisition may progress more formulaically than has been believed (Tomasello, 2000).

Despite these findings, the intuitions of linguists concerning the course of L1 acquisition have not significantly changed, perhaps because, although there may be new evidence, the nature of the input, observations of which originally gave rise to the POS, has not changed. Nor does it seem that those initial observations were mistaken. Where, then, is the "extra" information hiding?

"Schema Theorem", originally proposed to deal with genetic algorithms (Holland, 1975; Mitchell, 1996), may provide insight into the "missing" information. Simply stated, Schema Theorem claims that a process that operates on a population of tokens simultaneously and implicitly operates on every category that those tokens instantiate. Schema Theorem, as generalized and applied to language, highlights the "wealth of the stimulus" that has heretofore been overlooked.
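The counting idea behind this paraphrase can be made concrete with a small sketch. It assumes the bit-string setting used for genetic algorithms, where a schema is a pattern with "*" as a wildcard; the function names `schemata_of` and `implicit_categories` and the toy population are hypothetical. The sketch shows only how many categories a handful of tokens instantiate, not the theorem's claim about how selection changes schema frequencies.

```python
from itertools import product

def schemata_of(token):
    """Return every schema (pattern with '*' wildcards) the token instantiates."""
    # Each position either keeps its symbol or is generalized to '*'.
    options = [(symbol, "*") for symbol in token]
    return {"".join(choice) for choice in product(*options)}

def implicit_categories(population):
    """Count how many tokens in the population instantiate each schema."""
    counts = {}
    for token in population:
        for schema in schemata_of(token):
            counts[schema] = counts.get(schema, 0) + 1
    return counts

# Invented toy population of three tokens of length 3.
population = ["101", "100", "111"]
counts = implicit_categories(population)
```

Each token of length 3 instantiates 2^3 = 8 schemata, so processing three tokens touches 16 distinct schemata in this toy population. The schema "1**" is sampled by all three tokens and "10*" by two of them.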

Examining the language learning environment through the lens of Schema Theorem has two important benefits. First, it helps resolve the tension between theoretical linguistics and other disciplines on the question of language learnability by expanding our perspective on the quantity and quality of L1 input. Second, it reduces the burden on biology for an explanation of language learnability, which is more consistent with what we know about genetics and the brain.

References

Chalmers, David. 1990. Syntactic Transformations on Distributed Representations. Connection Science, 2/1-2:53-62.

Elman, Jeffrey. 1995. Language as a Dynamical System. In Robert Port and Timothy van Gelder (eds.). Mind as Motion: Explorations in the Dynamics of Cognition. Cambridge, MA: MIT Press.

Holland, John. 1975. Adaptation in Natural and Artificial Systems. Ann Arbor, MI: University of Michigan Press.

Miller, G.A. and Noam Chomsky. 1963. Finitary Models of Language Users. In: R.D. Luce, R.R. Bush, and E. Galanter (eds.). Handbook of Mathematical Psychology, Vol. 2. New York, NY: Wiley.

Mitchell, Melanie. 1996. An Introduction to Genetic Algorithms. Cambridge, MA: MIT Press.

Tomasello, M. 2000. First Steps Toward a Usage-Based Theory of Language Acquisition. Cognitive Linguistics, 11/1-2:61-82.

tba

William G. Sakas (City University of New York)

Infants are sensitive to the prosodic contours of phrases

Melanie Soderstrom (Brown University), Amanda Seidl (Johns Hopkins University), Deborah Kemler Nelson (Swarthmore College), Jim Morgan (Brown University)

Prosodic cues in the speech stream may provide infants with important information about the structure of their language. Considerable evidence suggests that infants use prosodic cues to segment speech at clause boundaries (e.g. Nazzi et al., 2000), but whether and how infants use prosodic cues to phrase boundaries is less clear. Prosodic cues coincident with phrase boundaries are less reliable, and infants show variable sensitivity to these boundaries, depending upon the methodology used (cf. Jusczyk et al., 1992; Soderstrom et al., in press) as well as specific acoustic/syntactic properties of the stimuli (Soderstrom et al., in press; Gerken et al., 1994).

One reason for such variable results may concern whether infants pay attention only to prosodic boundaries (dispreferring potential parses that contain them), or to the prosodic contours of the phrases (preferring globally well-formed parses). The current study disentangles these possibilities by comparing infants' recognition of target word sequences (“children were crying”) that constitute either two adjacent phrasal units (“As a result, [children] [were crying] afterwards”) or two adjacent phrasal fragments (“Later on, [sad children] [were crying salty tears]”). In both cases, the word sequence contains a major phrase boundary. However, only in the second case is the target sequence syntactically ill-formed. Measurements of pause and preboundary lengthening at the sequence-internal major phrase boundary showed no differences between the "well-formed" and "ill-formed" target sequences in either stimulus set.

Preliminary results with 25 infants revealed a significant preference for the test passage containing the well-formed target sequence over the passage containing the ill-formed sequence in one of two stimulus sets tested. It may be that contour cues were stronger in the well-formed sequence of the successful stimulus set than that of the unsuccessful set. Or it may be that boundary cues were stronger in the unsuccessful set, such that the sequence-internal boundary was more strongly heard in the well-formed sequence for that set, masking any contour effects. Acoustic analyses found similarities in pitch accent cues across the two stimulus sets. However, there was a short pause (48 ms) prior to the start of the well-formed target sequence of the successful set that was not present in the other stimuli, which might help to signal the beginning of a phrase. In sum, these results suggest infants are (at least sometimes) sensitive to the prosodic well-formedness of the contour of the word sequence, and not just to the major phrase boundary.

References

Gerken, L.-A., Jusczyk, P.W., & Mandel, D.R. (1994). When prosody fails to cue syntactic structure: Nine-month-olds’ sensitivity to phonological versus syntactic phrases. Cognition, 51, 237-265.

Jusczyk, P.W., Hirsh-Pasek, K., Kemler Nelson, D., Kennedy, L., Woodward, A., & Piwoz, J. (1992). Perception of acoustic correlates of major phrasal units by young infants. Cognitive Psychology, 24, 252-293.

Nazzi, T., Kemler Nelson, D.G., Jusczyk, P.W., & Jusczyk, A.M. (2000). Six-month-olds’ detection of clauses embedded in continuous speech: Effects of prosodic well-formedness. Infancy, 1, 123-147.

Soderstrom, M., Seidl, A., Kemler Nelson, D.G., & Jusczyk, P.W. (in press). The prosodic bootstrapping of phrases: Evidence from prelinguistic infants.

Prosodic and lexical aspects of the acquisition of syntax in German infants

Jürgen Weissenborn (Humboldt-Universität zu Berlin)

In my contribution I will present findings from our ongoing research with German infants, which focuses on the role of the prosody-syntax and the lexicon-syntax interfaces in language acquisition during the first and second year of life. Our studies provide additional evidence for the assumption that from early on prosodic as well as lexical knowledge helps the child to identify and categorize the syntactic units in the speech input, e.g. clauses, phrases and words. Furthermore, I will present results which support the suggestion that aspects of the acquisition of word order and of the distinction between arguments and adjuncts may also be related to regularities in the prosodic domain.

tba

Charles D. Yang

tba

Hotel Information

For on campus accommodation, please see the LSRL page here.

Here is an information site: Bloomington Area Guide

Hotel List

LSRL 33

Following this workshop will be "The 33rd Linguistic Symposium on Romance Languages" (Indiana University, Bloomington). Details on the LSRL can be found here:

LSRL 33 homepage