

The phoneme in cognitive phonology: episodic memories of both meaningful and meaningless units?

Riitta Välimaa-Blum

Abstract

If one accepts the fundamentally symbolic nature of language, as cognitive linguistics proposes, the question arises whether, in addition to the symbolic units, speakers form memory structures some of which are devoid of meaning, such as the phonemes. Within exemplar theory, the basic unit of phonological representation is viewed in two different ways. One approach proposes that speakers retain in memory innumerable exemplars of phonemes, while the other takes the word as the basic unit. Psycholinguistic studies show that the mental representation of the lexicon contains exemplars of words fully specified for all phonological features. Neurolinguistic findings argue for an auditory rather than an articulatory lexicon. In this article I seek to show that speakers retain only exemplars of symbolic units, and that the formation of myriads of autonomous exemplars of phonemes is not only unmotivated but also problematic.


Editorial notes

Submitted 22/10/2006
Revised version received 08/04/2007
Final version June 2007


Disclaimer: this article contains special characters encoded in the Unicode format. If you have trouble reading this text, you should probably download and install a Unicode font; you can find a free software Unicode font at the following URL: http://dejavu-fonts.org/.


I am grateful to two anonymous referees of CogniTextes for their helpful comments on the first version of this paper. I tried to take them all into account, but any remaining errors or misconceptions are mine alone.

1. Introduction

The term phoneme has many senses in the literature, but whatever the case may be, it is a truism that phonemes do not exist as discrete entities in speech, for words cannot be cut up into sequences of sounds that can be reorganized to form other words without the new words sounding unnatural. This is due to several factors: one is co-articulation, and another is that different phonetic contexts involve different phonetic variants of the phonemes. There is also a great deal of inter- and intra-speaker variability due to speaking styles, dialects, age, gender, foreign accents, pathologies, etc., which further means that the limits of variability of a given phoneme are impossible to demarcate. All this entails that lexical access cannot be a function of invariance in the sounds but depends rather on the distinctiveness in the acoustic signal (Lindblom 1990: 404). Even this is not sufficient, however, for speech sounds are often deleted or reduced in continuous speech, so that, while distinctiveness obviously is a key factor, there are other considerations as well.

An approach to the phoneme and the lexicon that is based on exemplar theory places some of the traditional assumptions of phonology in a new perspective. Exemplar theory has a long history in psychology (Goldinger 1998: 251), but it is emerging in phonological research as well (Johnson 1997, 2005a, 2005b; Nosofsky 1998; Pierrehumbert 1994, 2001, 2002; Pierrehumbert, Beckman and Ladd 2000) and in linguistic research in general (Goldberg 2006). It proposes that human beings categorize and memorize their experiences in terms of so-called exemplar clouds, large clusters of remembered episodes of individual experiences. In phonology, exemplar theory proposes that we do not form abstractions of the formal properties of the phonemes and words but remember individual occurrences of them, just as they are perceived. These episodic tokens thus give rise to a mental lexicon containing highly detailed information about both the predictable and non-predictable properties of the sounds, and even information relating to individual speakers, speaking styles and dialects is preserved.

For the speech sounds, then, the proposal is that speakers store them in labeled exemplar clouds. The prototype of each category is found in the part of the cloud that contains the greatest number of exemplars (Pierrehumbert 2001, 2002), that is, where the most frequent tokens are concentrated. The number of memorized episodes is considerable and in constant flux. With time, however, some exemplars will be forgotten, and those tokens that are identical to one another, within the perceptual limits, will consolidate into a single item in the cloud, for the exemplar clouds are granular (Pierrehumbert 2001) rather than ‘powder-like.’ In this view, then, a labeled exemplar cloud of speech sounds largely corresponds to the traditional phoneme as a distinctive sound, so that what gives a given cloud its linguistic unity is its distinctive role, and we may say that the label identifies the phoneme category. Pierrehumbert (2001: 148) explicitly assumes that there are exemplar clouds of both phonemes and words, whereas Johnson (2005b: 298) takes words to be the basic exemplar units. I will argue that there are no exemplar clouds of the phonemes per se but of the meaningful units only.
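To make the cloud mechanics concrete, here is a minimal Python sketch of this kind of memory structure; the two-dimensional feature space, the perceptual threshold `grain` and the consolidation rule are my own simplifying assumptions for illustration, not part of the cited proposals.

```python
import math

class ExemplarCloud:
    """Illustrative exemplar cloud: a labeled set of remembered tokens.

    Tokens closer together than `grain` (a stand-in for the perceptual
    limit) consolidate into a single weighted grain, so the cloud is
    granular rather than 'powder-like'.
    """
    def __init__(self, label, grain=0.05):
        self.label = label
        self.grain = grain
        self.grains = []            # list of [features, count]

    def store(self, token):
        for g in self.grains:
            if math.dist(g[0], token) < self.grain:
                g[1] += 1           # consolidate with an existing grain
                return
        self.grains.append([list(token), 1])

    def prototype(self):
        # The prototype sits where the most tokens have accumulated.
        return max(self.grains, key=lambda g: g[1])[0]

# Toy usage: hypothetical F1/F2-like values for one vowel category.
cloud = ExemplarCloud('/i/')
for token in [(0.30, 2.2), (0.31, 2.2), (0.30, 2.2), (0.50, 1.9)]:
    cloud.store(token)
print(cloud.prototype())   # the densest region, here near (0.30, 2.2)
```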

While the phonetic speech signal itself is continuous, the phonological representations are usually assumed to consist of discrete entities. However, if the mental lexicon is indeed exemplar-based, and if phonology purports to investigate what speakers represent about the speech sounds in the mind, phonology can only be as continuous as phonetics. Thus, strictly speaking, there are no discrete phonological units in the mental lexicon, but only continuous phonological shapes, which are, of course, further associated with meaning and grammar. What we learn in the course of language acquisition is meaningful linguistic form, words being the first units that emerge in production. Infants, however, can apparently also figure out abstract syntactic arrangements as early as the age of seven months (Marcus et al. 1999), which demonstrates that the higher-level morphosyntactic patterns probably develop in the infants’ minds alongside the mono-morphemic words.

It is an established fact that comprehension precedes production, and the very early extraction of grammatical schemas apparently belongs to the same phenomenon, i.e., while children give an overt manifestation of the lexicon first, it is nevertheless likely that constructional forms and meanings are present in their minds long before that. The important point is that while children demonstrate understanding of meaningful words and grammatical structures very early on, in terms of production, it is generally assumed that a child has said his first words only when adults understand them as linguistic objects. It would thus seem that meaning dominates language acquisition just as it dominates adult language use. McNeill and Lindig actually observe that “[I]n normal language use, the focus of attention is the meaning of an utterance. Subordinate levels become the focus of attention only under special circumstances” such as linguistic experimentation (McNeill and Lindig 1973: 430).

Cognitive linguistics sees languages as systems entirely geared toward the expression of meaning (Lakoff 1987; Langacker 1987; Taylor 2002; Talmy 2000; Wierzbicka 1988), and this must be included in the conception of phonology as well. The cognitive standpoint also underlies Construction Grammar, where the basic units of language include grammatical constructions, which are symbolic structures ranging from polymorphemic words to complex sentences (Croft 2001; Fillmore, Kay and O’Connor 1988; Goldberg 1995, 2006; Kay 1997; Lakoff 1987; Langacker 1987). Grammar expresses meaning by semantically structuring the lexical content (Talmy 1988, 2000), so that the meanings of the two in fact match and reinforce one another (Goldberg 1995, 2006). Given that grammatical constructions are symbolic entities, phonology, morphology and syntax are necessarily intimately intertwined in them, for the meanings they express are based on simultaneous phonological, morpholexical and/or syntactic material.

There are two basic approaches as to the nature of the phoneme in the cognitive framework. One suggests that speakers store each phoneme in a form that is an abstract, schematic summary representation of all of its allophones, and in a speech event, this schematic form is then ‘modified’ according to the phonetic contexts (Langacker 1987; Mompeán-González 2004; Nathan 1996; Taylor 2002). This schematic phoneme, abstracted from actually observed instances of speech, constitutes the prototype, and the incoming stimuli are compared to it for similarity. The other approach to the phoneme stems from the exemplar account, which assumes that speakers store innumerable exemplars of observed phoneme tokens, the most frequent ones being more strongly ‘implanted’ in the mind and thus serving as the point of comparison with incoming stimuli, and they thus form the prototype (Nosofsky 1998; Pierrehumbert 2001; Pierrehumbert, Beckman and Ladd 2000; K. Johnson 1997).
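The difference between the two approaches can be sketched in code. In the toy comparison below, a one-dimensional 'VOT-like' space and all numbers are invented; the exponential similarity gradient is borrowed from Nosofsky-style exemplar models. The summary view compares a stimulus with one schematic prototype per category, while the exemplar view sums its similarity to every stored token, so category frequency can pull an ambiguous stimulus its way.

```python
import math

def schematic_classify(stimulus, prototypes):
    """Summary view: one abstract prototype per phoneme."""
    return min(prototypes, key=lambda p: math.dist(stimulus, prototypes[p]))

def exemplar_classify(stimulus, clouds, c=4.0):
    """Exemplar view: summed similarity to every stored token
    (exponential similarity gradient, Nosofsky-style)."""
    def support(tokens):
        return sum(math.exp(-c * math.dist(stimulus, t)) for t in tokens)
    return max(clouds, key=lambda label: support(clouds[label]))

prototypes = {'/b/': [0.15], '/p/': [0.70]}
clouds = {'/b/': [[0.10], [0.20]],
          '/p/': [[0.55], [0.60], [0.70], [0.75], [0.80]]}

stimulus = [0.40]
print(schematic_classify(stimulus, prototypes))  # '/b/' (nearest prototype)
print(exemplar_classify(stimulus, clouds))       # '/p/' (frequent cloud wins)
```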

There is a fundamental difference between these two views, which involves the mental representation of the lexicon (Välimaa-Blum 2005: 64). If we assume that the mental sound shapes of the morphemes are represented in terms of phonemes and that the phoneme is a schematic entity, encapsulating only some central properties of the actually occurring allophones, this position necessarily leads to underspecification in the lexicon. That is, if the morphemes are stored in the mind in terms of sequences of schematic sounds, this entails that certain properties of the segments are not specified. This goes against the evidence indicating that the mental lexicon is not underspecified but includes even extremely fine phonetic detail, both predictable and non-predictable (Miller 1994; Fougeron and Steriade 1997; Bybee 2000; Pierrehumbert, Beckman and Ladd 2000; Ohala and Ohala 1995). If, however, the phoneme is a class of sounds, along the lines of the exemplar model, then all the allophones of each phoneme class are fully present in the mental representation of the morphemes, according to their phonetic contexts. This necessarily takes us to the stance that morphemes are fully specified in the lexicon, and the phonemes that make them up cannot thus be schematic. This also means that, given the absence of underspecification or schematicity in phonology, the lexicon must contain at least all the non-automatic allomorphs of each morpheme (Välimaa-Blum 2005).

However, the lexical entries, morphemes, co-occur with word formation schemas, which are partly schematic, partly filled. Like most other grammatical constructions, word schemas never occur as such in speech, but they are instantiated only with a specific lexical content. As a child is acquiring language, it is these word schemas that emerge as abstractions that the child constructs on the basis of the data. The word schemas themselves are meaningful and have their own grammatical properties, so that what also emerges in the child’s mental grammar are the connections between the schematic word patterns and the fully specified lexical entries, for the child needs to learn which word schemas and which lexical stems are semantically and syntactically compatible.

2. Phonemes and Words

When children start saying their first words, these tend to be pronounced as if they were unique, holistic entities, the individual sounds in them being produced in rather disparate ways (Studdert-Kennedy 1987). The different sounds become more distinctly articulated roughly at the moment of the vocabulary spurt, at around the age of two. When a child hears and uses a word frequently, it gradually becomes solidly entrenched in the language system in his mind (Bybee 1995; Kemmer and Barlow 2000; Langacker 1987, 2000; Taylor 2002), and at the same time, the production of the word becomes highly automatized, procedural. Speakers possess two kinds of knowledge concerning language: procedural and propositional (Anderson 1983, 1993; Boyland 1996; Bybee 1998, 2001). In the non-linguistic domain, riding a bicycle and swimming represent procedural knowledge; that is, we know how to do them but we cannot explain how we do them, because they involve a high level of subconscious automaticity. In the domain of language, for example, the phonetic similarity of the allophones of a given phoneme is based on their sharing procedural knowledge. As for propositional knowledge, we can talk about it, so that we can state, for example, what the meanings of the words bicycle and swim are, and stating these meanings represents propositional, explicit knowledge.

Johnson (2005b: 297) discusses recognition memory and declarative memory, which resemble certain features of procedural and propositional knowledge. Declarative memory, just like propositional knowledge, has to do with explicit real-world knowledge that we can make statements about. Recognition memory is based on highly detailed observations made of real-world experiences, but this memory type is implicit, hence difficult to put into words. Both kinds of knowledge and both types of memory are obviously central in language. Procedural knowledge makes possible the effortless, automatic production of speech, and recognition memory, as Johnson observes, makes possible the effortless, automatic recognition of all aspects of speech, which also underlies the mental storage of the same. Since speakers possess highly detailed exemplar memories of speech events, these are used by recognition memory in speech perception, so that the incoming stimuli are matched for similarity with already stored exemplars.

However, everyday speech perception is not simply a matter of comparing speech stimuli for formal similarity with memorized instances; it furthermore involves the speaker’s global experience with language, for phonology is embodied, just like the rest of language (Johnson 1987; Lakoff 1987; Lakoff and Johnson 1980). What this means is that the procedural knowledge speakers possess of speech production and the concomitant auditory effects together give rise to consistent, semantically and grammatically significant ‘consequences,’ i.e., meaningful expressions. In other words, in the minds of speakers, there is a well-established association between the auditory percepts and the stored tokens of the corresponding units, and when speaking, automatized articulatory movements are employed to reproduce the same entities. It is this first-hand linguistic experience, i.e., the production and perception of speech sounds in meaningful words, that engenders the embodiment of phonology. The extremely fine-tuned ability to produce and auditorily discriminate among words that differ from one another by only one sound or even just a single phonological feature empowers the speaker with the intimate knowledge that minute articulatory and auditory differences are significant.

We may say that the highly entrenched exemplar clouds of the words are based on speech perception, on what we hear others say and on the feedback loop of what we say ourselves, and the procedural knowledge that we have of words is based on our own productions of them. This entrenchment and procedural knowledge of the sound shapes of words are combined in the cognitive language system with the corresponding meaning and grammatical information. When a child suddenly appears to have a word for everything and starts to pronounce the sounds much more distinctly than he did in the holistic first words, it is because the child’s mastery of speech production has reached a level of automaticity that approaches procedural knowledge, and because this procedural knowledge is deeply enmeshed with the corresponding, fine recognition memory. As the child’s lexicon gradually gets more and more firmly entrenched, the same happens to his grammatical constructions, and this too is due to the interaction of recognition memory and procedural knowledge. In this global process, however, the child never needs to learn the language sounds independently of the symbolic units that the sounds give form to. I would like to claim that our mastery of morphemes, words and grammatical constructions does not even entail a mastery of the segmentation of speech into sequences of sounds. In fact, segmenting words into phonological categories with the objective of getting at their meaning is not even feasible, as we will see in the sections below.

Knowledge of writing in general is not relevant to language learning in that it is an independent, metalinguistic skill, learned through explicit teaching after the mastery of language, and it is a skill that people are proficient in to unequal degrees. Also, not all writing systems are even based on speech sounds. Speakers of languages that use alphabetic writing tend to place great importance on segments, but segmentation of speech is apparently not something that comes to us naturally in the course of cognitive maturation. For example, problems in learning to read are often due to a difficulty with segmentation (Liberman et al. 1980). It has also been shown that Chinese adults who could read and write only with logographic characters had difficulty adding individual consonants to spoken words or removing them, but those who knew the logographic system enriched with phonetic symbols had no problems with the same task (Read et al. 1986). The conclusion of this study was that segmenting speech into sounds is an artifact of having learned to read and write alphabetically.

The fact that alphabetic writing is so widespread is not in itself evidence for an independent cognitive status of phonemes, but simply reflects the fact that, already thousands of years ago, people realized that words are composed of sequences of meaningless sounds. Sapir argued for the psychological reality of the phoneme in stating that “what the naïve speaker hears is not phonetic elements but phonemes” (Sapir 1949: 47), and we may say that this translates into a search for meaningfulness in the signal. But, of course, listeners are also capable of making within-category distinctions among co-allophones, and this information may even play a role in lexical access (Bybee 2000: 44–45). The extensive use of alphabetic writing is also due to the fact that the same symbols can be adapted relatively easily to all languages, while, e.g., this adaptation would be more difficult for syllabaries, since sound inventories, actual syllable shapes and prosody are not shared with the same generality across languages.

I will now assume that each lexical entry in the mental lexicon corresponds to an exemplar cloud of tokens of at least its non-automatic allomorphs, including speaker- and variety-dependent variants as well (Välimaa-Blum 2005). Distributed in these lexical memory structures, phonemes are not underspecified or schematic abstractions, but fully specified sounds, including all their co-articulatory properties (Miller 1994; Fougeron and Steriade 1997; Bybee 2000; Pierrehumbert, Beckman and Ladd 2000; Ohala and Ohala 1995). In keeping with the cognitive framework and the centrality of meaning, I will now pursue the idea that phonemes as mental entities exist only in the lexical entries and argue that there are no exemplar clouds of the meaningless units, but of the meaningful ones only.

3. Problems for the Formation of Exemplar Clouds of Phonemes¹

  • 1  Parts of the following sections are based on a talk given at a meeting of the Réseau français de p (...)

Johnson, Flemming and Wright (1993) found that the vowels listeners produced by manipulating a speech synthesizer to match vowels in example words were different from those in the words. The synthetic vowels were much more extreme, peripheral in the vowel space, than the ones actually produced in the words. The authors propose that the phonetic targets of the isolated sounds correspond to hyperarticulated speech (Lindblom 1990) and the actual productions of the words with the same vowels represent hypospeech, that is, hyperspeech that has undergone reduction processes, and this would explain why the two vowel sets were different from one another. These results also suggest that vowels in words and vowels in isolation do not correspond to the same mental entity. Words are holistic units, in which the sounds are fully integrated into their surrounding phonetic environments, while isolated sounds are part of the metalinguistic knowledge that speakers have learned through explicit training, typically in connection with writing and language studies, and in phonetics classes. The exemplars we hear of individual sounds are not the same entities we hear when the ‘same’ sounds occur in words, and this probably explains the observed differences. It would thus seem that the hyperspeech form of an individual word always resembles some carefully pronounced exemplar of it, with all the accompanying coarticulation phenomena, not one where the component sounds resemble isolated sounds. Isolated sounds would thus not be likely candidates for the hyperspeech variants of speech sounds, nor for the prototype exemplars of phonological categories (Välimaa-Blum 2007).

Pierrehumbert discusses the updating of exemplar clouds and notes the following: “If an incoming stimulus is so ambiguous that it can’t be labeled, then it is ignored rather than stored. That is, the exemplar cloud is only updated when the communication is successful to the extent that the speech signal was analyzable” (2001: 152). If the stimulus is a word that one does not understand, it is clear that no exemplar cloud can be updated. But if there actually were exemplar clouds of speech sounds, then the mere identification of the sound is not sufficient for the labeling and updating of the memory structures. I will now discuss problems that a language learner and an adult speaker would face if he were forming exemplar clouds of speech sounds as such, along with those of meaningful elements.
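Pierrehumbert's updating condition can be paraphrased in a few lines of Python; the confidence score, the threshold and the toy labeler below are placeholders of my own, standing in for whatever the perceptual system actually computes.

```python
def update_clouds(stimulus, labeler, clouds, threshold=0.5):
    """Sketch of the updating condition: a stimulus too ambiguous to
    label is ignored rather than stored."""
    label, confidence = labeler(stimulus)
    if confidence < threshold:
        return None                        # ignored: no cloud is updated
    clouds.setdefault(label, []).append(stimulus)
    return label

# Toy labeler: 'recognizes' only stimuli near 1.0.
toy = lambda s: ('/a/', 1.0 - abs(s - 1.0))
clouds = {}
update_clouds(0.97, toy, clouds)   # analyzable: stored under '/a/'
update_clouds(3.00, toy, clouds)   # too ambiguous: ignored
print(clouds)                      # {'/a/': [0.97]}
```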

3.1 Reductions and Deletions

First of all, we have to assume that if a listener indeed segments speech into sound units, he presumably does it in order to understand what is being said. However, segmenting alone does not take the listener to the meaning, and segmenting alone does not enable him to construct labeled exemplar clouds of phonemes. If the listener were segmenting hyperspeech words, he might be expected to have a reasonable outcome, since all the phonemes are present in these forms, but at the level of hypospeech, segments are frequently reduced and even deleted, and thus segmentation per se does not lead to the meaning. To illustrate the difficulty, we may consider the German word morgen /ˈmɔrgən/ ‘tomorrow,’ which may reduce to just a single syllable, e.g., [mɔ̃] or [mɔ̃ŋ] (Kohler 1996: 13). The deleted syllable is an unstressed one with a schwa, so we might assume that it carries less information than the stressed syllable with its full vowel, and this may well justify its deletion. In a specific linguistic context, the meaning of the reduced [mɔ̃] is fully recoverable, as in morgen vormittag ‘tomorrow morning.’ Likewise, in morgen abend ‘tomorrow night,’ the following vocalic context triggers the retention of a postvocalic consonant in the reduced word, [mɔ̃ŋ], but the meaning is prompted by the presence of abend. If one were to say [mɔ̃] to a German speaker out of context, it is not obvious that morgen would be understood, or that the segmentation would even be successful.

Kohler notes that the smooth operation of the articulators is a central factor in deletions and reductions, but nevertheless, the context is a crucial factor for the interpretation. Also, as Butcher (1996) notes, not all continuous speech processes are universal, for there are language-specific reduction phenomena as well. Now, what would the exemplar clouds be that a listener updates when he hears [mɔ̃ŋ]? As long as the meaning is understood, the morpheme cloud corresponding to morgen is certainly updated. But if there were exemplar clouds of phonemes as well, three clouds would presumably be updated, those corresponding to [m], [ɔ̃] and [ŋ], and nothing for the deleted sounds. However, it is not clear where [ŋ] would go. It could be added to the memorized episodes of /n/, for in this word, [ŋ] is an assimilated outcome of [g] and [n]. Or would [ŋ] go with /ŋ/, which is another phoneme? Of course, it might also be possible that the evocation of the word morgen suffices for the updating of all of its full-form segments, but this can only take place through the meaning of morgen, which must be retrieved first. And it would mean that, after understanding the word, the speaker performs a simultaneous double updating, one of the lexicon and the other of the phoneme clouds.
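The dilemma can be stated schematically as below: under word-sized exemplars, [mɔ̃ŋ] updates a single cloud, whereas under phoneme-sized exemplars the final nasal has no unique home. The candidate mappings are my own illustration of the discussion above, not a processing model.

```python
surface = ['m', 'ɔ̃', 'ŋ']                     # the reduced form of morgen

# Word-sized exemplars: one unambiguous update.
word_clouds = {'morgen': []}
word_clouds['morgen'].append(surface)

# Phoneme-sized exemplars: where does each sound go?
phoneme_candidates = {
    'm': ['/m/'],
    'ɔ̃': ['/ɔ/'],          # nasalized allophone of an oral vowel
    'ŋ': ['/n/', '/ŋ/'],    # assimilated /g/+/n/? or the phoneme /ŋ/?
}
for sound, homes in phoneme_candidates.items():
    if len(homes) > 1:
        print(f'[{sound}]: no unique cloud to update -> {homes}')
```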

Johnson, Flemming and Wright (1993: 305) provide an example from English with the sentence did you eat yet, which has the possible hypospeech form [ʤiʔjɛʔ]. The affricate is an articulatorily motivated rendition of the adjacent consonants /d/ and /j/ in did you, but the vowels are deleted in these two words, whereas eat and yet are largely intact, with the glottal stops realizing the /t/. Which exemplar clouds would be updated in the case of [ʤiʔjɛʔ]? The individual morpheme clouds and the constructional schema of yes/no-questions would certainly be reinforced. But as for updating phoneme clouds, it is not clear whether the affricate would go to the exemplar cloud labeled /ʤ/ or to those with /d/ and /j/ labels, or to some other cloud altogether. The status of the glottal stop will be discussed presently.

The fact that in many languages there are more deletions in unstressed than stressed syllables has no explanation if we assume that speakers operate on segments when they speak. Stress is a property of syllables in words, not of segments, and hence it is not clear why unstressed vowels would undergo deletion more often than the stressed ones. If deletions concerned segments, we would expect all sounds to be deleted with roughly equal frequency. But if we assume that this phenomenon actually applies to words, then stressedness becomes a relevant factor. If we see unstressed syllables as less informative than the stressed ones, then we understand that a schwa deletion, e.g., in garden /'ɡɑdən/, would be no loss to comprehension, but deleting the stressed /ɑ/ would be. This is, of course, what we observed in the German morgen.

3.2 Graded Categories

Graded phonetic characteristics pose another problem for the formation of labeled exemplar clouds of speech sounds. Let us first consider French vowel nasalization. This language has contrastive oral and nasalized vowels, as demonstrated by the minimal pair tôt ‘early’ /to/ and ton ‘tone’ /tõ/. However, phonologically oral vowels in French have nasalized allophones before nasal consonants (Basset et al. 2002). Thus the phonologically oral vowel is nasalized in tonne ‘ton’ [tõn]. How would a language learner distinguish the two kinds of oral vowels, and how in particular would the two types of nasalized vowels be assigned to their respective exemplar clouds? Would the assignment be done on the basis of just the phonetic characteristics of the sounds? If so, then all the nasalized vowels might be lumped together, separately from the fully oral ones, with the consequence that the phonologically relevant labeling would be impossible. Or would they be assigned to their respective clouds on the basis of the meanings of the words? If so, then words must be identified before the sounds can be placed in their categories. Perhaps distributional facts could be used, given that the contrastively nasal vowels never occur before a tautosyllabic nasal consonant, but this alone seems implausible. Whatever we do, it would be difficult to assign the French vowels to the nasal and oral categories without meaning.
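A small sketch makes the sorting problem explicit. The degree-of-nasalization numbers are invented and the phonological forms simplified: sorting by phonetics alone lumps ton's contrastive /õ/ together with the nasalized allophone in tonne, and only the lexical identity separates them.

```python
# Hypothetical tokens: (word, surface vowel, degree of nasalization 0-1).
tokens = [
    ('tôt',   'o', 0.0),   # phonologically oral, oral context
    ('ton',   'õ', 1.0),   # phonologically nasal
    ('tonne', 'õ', 0.8),   # oral vowel nasalized before /n/
]

# Phonetics-only labeling lumps both nasalized vowels together...
by_phonetics = {}
for word, vowel, nasality in tokens:
    key = 'nasalized' if nasality > 0.5 else 'oral'
    by_phonetics.setdefault(key, []).append(word)
print(by_phonetics)   # {'oral': ['tôt'], 'nasalized': ['ton', 'tonne']}

# ...whereas the phonological category is recoverable only via the word.
lexicon = {'tôt': '/to/', 'ton': '/tõ/', 'tonne': '/tɔn/'}
for word, vowel, nasality in tokens:
    print(word, '-> phonologically',
          'nasal' if 'õ' in lexicon[word] else 'oral')
```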

In American English, it may be that vowel nasalization is in the process of becoming lexicalized in that, before tautosyllabic nasals, for many speakers it actually gets underway already at the vowel onset and is thus not limited to just the portion immediately preceding the following nasal consonant (Solé 1992, 2003). This means that the global articulatory commands of these vowels must include the lowering of the velum, and in this case, the nasalization is not an anticipatory, co-articulatory gesture, but an integral part of the vowel itself. Hence, in some lexical items, nasalization is a feature coinciding with the entire vowel; in others, it begins at some point before the following nasal consonant; and in still others, there is no nasalization at all. Bybee discusses evidence that speakers actually use vowel nasalization as a cue to morphemic identity (Bybee 2001: 43–44), which supports the view that nasalization is in the lexicon, be it lexicalized or not. Now, if speakers were to form labeled exemplar clouds of speech sounds on the basis of phonetic characteristics only, they would have three types of vowels, fully oral, fully nasalized and partially nasalized, and these would have to be located in three different memory structures, in which case the phonological category of the last one would be lost. If we assume that there are no exemplar clouds of speech sounds but only of words, the facts of nasalization would be found in the lexical items simply according to their actual distribution in them.

As another example of graded phonetic properties, we may consider segmental duration in Finnish. In this language, duration is contrastive in both vowels and consonants, so that on average, long phonemes are roughly twice as long as the short ones (Lehtonen 1970; Välimaa-Blum 1987), as the contrasts in (1) show. There is evidence that the long segments are actually sequences of two phonemes (Välimaa-Blum 2003), and I thus mark them with two identical letters; the final vowels in (1a), (1d) and (1e) have an extra length, marked with a raised dot, to be discussed presently.

(1)
a. tuli ['tuliˑ] ‘fire’
b. tulli ['tulli] ‘customs’
c. tuuli ['tuuli] ‘(the) wind’
d. tapa ['tɑpɑˑ] ‘custom’
e. tapaa ['tɑpɑɑˑ] ‘meets 3sg’
f. tappaa ['tɑppɑɑ] ‘to kill’
g. tyytymättömänäkin [ˈtyytyˌmættøˌmænæˑkin] ‘even being dissatisfied’

One obviously wants to know whether phonologically short and long segments belong to the same exemplar cloud or to two different ones. If the exemplar structures are based on vowel quality only, then short and long phonemes would be in the same cloud, but this would mean that the duration factor is lost to the learner, which is not conceivable since duration is contrastive. If there were separate clouds for short and long phonemes, they could not be kept apart without meaning.

Finnish also has a so-called half-long vowel, which poses further problems for the formation of exemplar clouds of short and long vowels. In Finnish, every odd-numbered syllable carries some degree of stress, but the primary stress is systematically on the first syllable of the word, and the stressed vowels are generally longer than the corresponding unstressed ones. However, in the disyllabic word structure (C)V1CV2, the unstressed V2 is longer than the stressed V1, and this is marked with the raised dot in (1a) and (1d). This short V2 is the half-long vowel, and it is about 1.5 times longer than the preceding, stressed short vowel (Lehtonen 1970; Välimaa-Blum 1987). Actually, after a stressed (C)V1, long vowels too are longer than they are after other stressed syllable types (Lehtonen 1970; Välimaa-Blum 1987). We can compare (1e) and (1f): in the former, the final long vowel is extra long. The half-long short vowel is also found after secondary-stressed CV syllables in the middle of a word, as we can observe in (1g), where the penultimate vowel is half-long.

As we can see, phonetic vowel duration in Finnish clearly forms a continuum from short to extra-long, and the consonants likewise range between short and long, with a slight lengthening effect after a tautosyllabic stressed short vowel. If we assume that a child learning Finnish forms labeled exemplar clouds of speech sounds, it is difficult to see how he would go about sorting out the half-long vowels from the long ones in order to place them together with the phonologically short vowels. And if there were exemplar clouds of phonemes based on segmental quality alone, it would mean that about half of the Finnish phonemes go uncategorized. The distinctive segmental duration cannot be established without the meanings of the words in which the segments occur, and the segments with extra length cannot be identified without considering the syllable structure and the stressedness of the preceding vowel. My argument is that there simply are no exemplar clouds of speech sounds but only of meaningful elements, for the phonemic identity of a sound in continuous speech frequently depends on the identification of the meaning of the word it occurs in.
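The point can be illustrated with a toy classifier. The duration ratios follow the rough proportions cited above (short = 1, half-long ≈ 1.5, long ≈ 2), but the cutoffs and the tokens are my own invention.

```python
def phonetic_category(duration):
    """Classify a vowel by raw duration alone (ratios relative to a
    plain short vowel; the cutoffs are invented for illustration)."""
    if duration < 1.3:
        return 'short'
    if duration < 1.8:
        return 'half-long'
    return 'long'

# The same phonological length surfaces with different durations by context:
tokens = [
    ('tapa',  'V2 after stressed (C)V syllable', 1.5),   # short /ɑ/
    ('tulli', 'V2 after a (C)VC syllable',       1.0),   # short /i/
    ('tapaa', 'final /ɑɑ/ after (C)V syllable',  2.4),   # long, extra length
]
for word, context, dur in tokens:
    print(word, '|', context, '->', phonetic_category(dur))
# Duration alone puts tapa's final vowel in a category of its own;
# only the word, and the stress of the preceding syllable, recover
# that it is phonologically short.
```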

The above examples demonstrate that, in the case of graded phonetic characteristics, it is impossible to say where the phonological category boundaries lie without considering the entire words with their meaning and/or form. Separating sounds from meaningful elements also seems pointless. The knowledge speakers have of the hyperspeech forms of words and of the general hypospeech processes is drawn on in speech perception, but the active search for meaning dominates nevertheless. In speech perception, when a listener hears an isolated word, he certainly listens for the sounds, but only in order to determine the meaning expressed. The so-called McGurk effect is instructive in this sense. If a listener actually hears [ka] but sees the speaker close his lips, which cues him in to [pa], what the listener reports is [ta] (McGurk and MacDonald 1976). This demonstrates that what a listener thinks he hears overrides what he actually hears. Since he saw the lip closing but knows that he did not hear the sound that would go with the visual information, he will report something that seems compatible with both what he heard and what he saw. If the listener were to update an exemplar cloud of speech sounds in this case, he would presumably update the exemplar cloud of /t/, since this is what he concluded he heard, not that of /k/, which is what he actually heard. It would seem reasonable to say that, in normal conversational conditions, what gets updated are the exemplar clouds of the meaningful units identified by the listener, that is, the lexicon and grammatical constructions.

3.3 Overlapping Categories

If speakers were indeed to form labeled exemplar clouds of speech sounds, how would they decide on category membership in cases where the same sound instantiates two different distinctive categories? The standard example of this is the English /t/–/d/ neutralization in words like latter and ladder, both pronounced as [læɾər], though this neutralization may not be as complete as is usually assumed, for, as Johnson (2005b) demonstrates, even so-called homonyms are not necessarily homonymous. To place the flap into the respective categories in these two instances, one needs to consider the semantics of the words. In experimental conditions, the listener may also use the spelling, as in the case of ladder, which would certainly facilitate the task. However, if sounds are not categorized independently of meaning, the need to find out whether the flap in ladder instantiates /t/ or /d/ does not arise, for the listener uses the context to identify the word in question. But if speech sounds were stored in autonomous, labeled exemplar clouds, then there is no reason not to place the flap into its own exemplar cloud, labeled [ɾ]. In this case, this sound would sometimes be called for when the contrastive unit /t/ occurs and at other times it would instantiate /d/, for while [ɾ] overlaps with both of these phonemes in certain contexts, it is also largely in complementary distribution with them in others. In the case of an exemplar cloud labeled [ɾ], both /t/ and /d/ would then lack one allophone.
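A hypothetical two-entry lexicon shows why the flap's phonemic identity is undecidable from the signal alone; the forms are simplified, and `phoneme_behind_flap` is an illustrative helper, not a claim about processing.

```python
# Simplified phonological forms for the two lexical items.
lexicon = {
    'latter': ['l', 'æ', 't', 'ər'],
    'ladder': ['l', 'æ', 'd', 'ər'],
}

def phoneme_behind_flap(word):
    """Recover the phoneme the flap instantiates: possible only once
    the word has been identified via meaning or context."""
    return lexicon[word][2]

surface = ['l', 'æ', 'ɾ', 'ər']       # what the listener actually hears
print(phoneme_behind_flap('latter'))   # 't'
print(phoneme_behind_flap('ladder'))   # 'd'
# Given only `surface`, with no lexical identification, the choice
# between a /t/-cloud and a /d/-cloud is undecidable.
```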

As an example that resembles overlapping categories, we may also consider the glottal stop in English. It is not a distinctive sound, but it occasionally instantiates the voiceless alveolar stop, as in hat [hæʔ]. Would this stop be placed in the /t/-cloud or would it have one of its own? If it belongs to the /t/-cloud, then the pre-vocalic glottal stop in ever [ˈʔevər] should not be in the same cloud, since it does not belong to any independent phoneme category. In these cases, the glottal stop is only part of an abstract syllable structure schema, which introduces a syllable onset when the phonological make-up of the word has none. If there were an exemplar cloud containing all the glottal stops, then some of its tokens would have to be made available to the /t/-phoneme, and the cloud of /t/ exemplars would be missing at least one of its instantiations.

Category overlap poses a problem for the formation of labeled exemplar clouds of speech sounds as long as we assume that the clouds correspond to the contrastive phonological categories. Without this assumption, however, speakers would have exemplar clouds of speech sounds that do not correspond to their linguistic function, in which case it is not clear how speakers could make use of them in speaking and listening. But assuming that the mental lexicon contains myriads of memorized instances of meaningful units, there is no need for any independent inventories of speech sounds.

3.4 Multilingual Speakers

Let us assume now that when a child learns a language, he forms labeled exemplar clouds of the speech sounds. For this categorization and labeling to be possible, the sounds belonging to the same cloud must be classified as the same sound. As we have seen, it is impossible to form and label speech sounds into these sorts of clouds without meaning, and with meaning, independent memory structures of speech sounds become pointless. There is a further problem for the formation of exemplar clouds of sounds that arises in the context of multilingual speakers. Would the minds of multilingual speakers store the sounds of each language in separate networks, or would articulatorily similar sounds go together into the same cloud? Let us first consider a monolingual speaker of Finnish. In this language, there are only two fricative phonemes, /s/ and /h/. Since there is only a single coronal fricative, speakers of Finnish tend to manifest a great deal of variability in its phonetic realization. The word cat is kissa in Finnish, phonologically /kissɑ/, but in actual pronunciation, one can easily hear at least all of the following: [ˈkissɑ], [ˈkizzɑ], [ˈkiɕɕɑ] and [ˈkiʃʃɑ]. Now, if exemplar clouds of sounds are labeled according to their contrastive function, we may assume that in Finnish, the sounds [s z ɕ ʃ] would all be in the one that is labeled /s/.

Let us next assume that our Finnish speaker learns English. In English, /s z ʃ/ contrast, and [ɕ] is a contextual variant of at least /s/. A native speaker of English would be assumed to have three exemplar clouds for the coronal fricatives. Would the native speaker of Finnish form a separate set of exemplar clouds for English? If so, would they be labeled differently from the Finnish one? If not, we have to ask the same question as above: how does one label an exemplar cloud that contains items that belong to different phonological categories and, in this case, also to different languages? Furthermore, the coronal sounds, among them /s/, are dental in Finnish and alveolar in English. Would these differences be marked in the case of a single cloud for the two languages, given that in both languages the allophones of /s/ include variations in the place of articulation?

What if our Finn were to learn Polish as well? In this language, all of /s z ɕ ʃ/ are contrastive. How many exemplar clouds would the trilingual Finn possess, one set for each language or one set for all coronal fricatives? How would they be labeled? If there is only one cloud, how would the speaker know which sounds to use when he speaks a given language, if he indeed uses these exemplar clouds in speech production (Pierrehumbert 2001)? If speech sounds were marked for frequency, social context, style, etc. (Pierrehumbert 2001), then we would have to say that speech sounds must be labeled according to their language as well, with all the complicating factors this entails.
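The labeling problem can be summarized with the inventories just discussed, simplified here to the coronal fricatives; the allophone lists are illustrative only. Merging articulatorily similar sounds into one cloud detaches the cloud from any single language's contrastive system.

```python
# Illustrative coronal fricative inventories for the three languages:
# phoneme label -> allophones assumed for the sketch.
inventories = {
    'Finnish': {'s': ['s', 'z', 'ɕ', 'ʃ']},   # one category, wide variation
    'English': {'s': ['s', 'ɕ'], 'z': ['z'], 'ʃ': ['ʃ']},
    'Polish':  {'s': ['s'], 'z': ['z'], 'ɕ': ['ɕ'], 'ʃ': ['ʃ']},
}

# A single merged cloud per phonetic sound loses the contrastive
# function: the same token [ɕ] belongs to /s/ in Finnish and English
# but to /ɕ/ in Polish.
merged = {}
for language, phonemes in inventories.items():
    for phoneme, allophones in phonemes.items():
        for a in allophones:
            merged.setdefault(a, set()).add(f'{language} /{phoneme}/')
print(merged['ɕ'])   # {'Finnish /s/', 'English /s/', 'Polish /ɕ/'}
```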

If children were to form exemplar clouds of speech sounds in the course of language acquisition, we would expect them to do it from the very start and form either just one cloud or several, according to the number of their languages. Vihman (2002) asks whether bilingual children begin with one or two phonological systems and concludes that this question is actually out of place, for up to the first 50–100 words, children have no phonological system at all, which is implicit in Studdert-Kennedy’s (1987) observation that the first words are said in rather disparate ways. After these first words, children develop various word templates, which at first seem to be idiosyncratic and overlapping in the languages, and only later do they develop patterns matching the adult ones. If children indeed have no phonological system at first, it would point to their initially learning holistic meaningful elements, which are not composed of smaller sound segments. Given that a bilingual older child or adult speaker may well sound native in both languages, articulation must make use of distinct procedural knowledge in each language, but these articulatory routines emerge gradually. This is, of course, predicted by entrenchment in usage-based grammars, but it does not entail that there must be independent or merged sets of exemplar clouds of the phonemes of the two languages.

Weinreich (1953) proposed that there are three types of bilinguals, co-ordinate, compound and subordinate, differentiated according to the degree of interference from one language to another. In the first type, there is no interference among the languages; in the second, the interference is bidirectional; and in the last type, the interference is exclusively from the first language, and these three types are based on the language learning histories of the speakers. Recent research suggests that at least the languages learned through explicit teaching, as in Weinreich’s subordinate type, incorporate metalinguistic knowledge that will mark them apart more or less permanently in that speakers remain in possession of the explicit propositional knowledge about these languages (Paradis 2002: 2).

  • 2  I am grateful to Shona Whyte and Alexia Mady for some of the references on bilingualism.

Paradis (2002: 1) notes that procedural memory is task specific, and if this is indeed true, it would explain the native-like pronunciation of co-ordinate bilinguals in both of their languages, but it would still not mean that there necessarily are separate exemplar clouds of the phonemes. Paradis (2002: 2) furthermore observes that vocabulary has a different status from the rest of the language structure in that, e.g., chimpanzees and gorillas can learn large numbers of signed words and that children deprived of language during early infancy can later learn vocabulary but very little morphosyntax. Nevertheless, studies on bilingual language processing tend to focus on the lexicon (de Bot 1992, 2004; Costa, Miozzo and Caramazza 1999; Costa, Colomé and Caramazza 2000; Finkbeiner, Gollan and Caramazza 2006; La Heij 2005; Green 1998²). It seems that the semantics of the lexical items may well be shared in the form of translation equivalents in different languages, and lexical access may then use the shared semantic properties, but these studies do not explicitly indicate the independence of the phonemes from the meaningful units.

The problem of lexical access may be further complicated by the fact that a speaker may speak more than two languages (de Bot 2004). Speakers also have specific language modes, monolingual or bilingual, which make one language more available or activated than another at a given moment, although, even when speaking in a monolingual mode, none of the other languages is actually completely switched off (Grosjean 1998), which additionally clouds the issue of retrieving speech sounds from independent exemplar clouds, be they shared or separate. Also, research on multilingualism is not in agreement as to whether and to what extent the different languages are stored together in the brain (Fabbro 2001; Gomez-Tortosa et al. 1995; Hernandez, Li and MacWhinney 2005; Paradis 1996) or whether the representation is variable, even depending on the processing level and language (Hernandez and Bates 1999). None of this elucidates the question as to whether there are exemplar clouds of speech sounds or not.

A modular account of bilingual lexical access in a picture-naming task is outlined by Costa, Colomé and Caramazza (2000) as follows. Lexical access first evokes the shared meaning of the translation equivalents, next decides on the word in the target language, then retrieves the relevant sounds, presumably from some exemplar cloud or another, and finally translates the phonological shape into an articulatory one. In a modular system like this, the retrieval of the sounds from a separate module presupposes a process that indeed requires independent exemplar clouds of the speech sounds.

Hernandez, Li and MacWhinney (2005) propose an emergentist view of bilingualism, which elucidates the differences among the types of bilingual speakers by taking into account, e.g., the different degrees of entrenchment in language learning, and this, of course, accords better with cognitive linguistics. In the cognitive framework, things are very different from the modular view in the previous paragraph in that, when speaking, whether the languages are stored together in the brain or not, and regardless of the mode of lexical access, speakers make use of ready-made schematic construction patterns and phonetically fully-specified lexical elements. This means that the lexical items are available as such to the speaker, without further abstraction or process. There is no need to posit individual or multiple exemplar clouds of speech sounds per se, but only fully specified memory structures of lexical items. We can assume now that a trilingual Finn retrieves the lexical items as such from an exemplar-type lexicon, which may or may not be semantically overlapping across the languages, but his procedural articulatory movements are specific to each language, at least as long as he is a co-ordinate or compound bilingual; otherwise, he uses the same procedural knowledge for all of his languages.

3.5 Twisting Language

3.5.1 Speech Errors

In the previous sections I considered categorization problems for the formation of labeled exemplar clouds of speech sounds. In this section I take up language games, which might suggest that speakers store and thus ought to have access to speech sounds independently of the meaningful units. First, though, a few words on speech errors, which in the past have been used as evidence of the psychological reality of the phoneme and distinctive features. In normal speech, spoonerisms are typical speech errors involving phonology. They are not intentional, for a speaker may not even become aware of having made one. In spoonerisms, syllables, segments or just features are switched around across words, as in yellow bird /ˈjɛloʊ ˈbɜːrd/, which might become /ˈbɛloʊ ˈjɜːrd/. It seems that these types of errors can be explained as a matter of the procedural commands going ‘haywire,’ whereby the articulatory organs actually receive the intended information for the production of the sound sequences, but due to a variety of factors, e.g., an ‘excessive’ rate or momentary sluggishness in one articulatory organ or another, the execution of the motor commands fails to realize the planned sequence.
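On this view, a spoonerism is a sequencing fault in an otherwise intact plan, something like the following sketch; it is orthographic and deliberately naive, and the `split` helper and vowel list are my own simplifications.

```python
def spoonerize(phrase):
    """Swap the onsets of the first two words: a sketch of a
    mis-ordered motor plan, not of any lexical operation. All the
    planned segments are present; only their sequencing goes astray."""
    vowels = 'aeiou'
    def split(word):
        # Index of the first vowel letter separates onset from rime.
        i = next((k for k, ch in enumerate(word) if ch in vowels), 0)
        return word[:i], word[i:]
    (o1, r1), (o2, r2) = (split(w) for w in phrase.split())
    return f'{o2}{r1} {o1}{r2}'

print(spoonerize('yellow bird'))   # 'bellow yird'
```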

These types of distortions of words obey the general phonological principles of the language in question, but they do not demonstrate that sounds are stored separately from the meaningful entities. These errors are also based on the words in question, so that the sounds are not retrieved independently of them. The fact that speakers make speech errors at all demonstrates that they can certainly say things for which there is no model and which have perhaps never been heard before. But, as just noted, this is a matter of procedural knowledge. Speech errors do not violate the existing schematic patterns of possible syllables and words, but match them, and thus the procedural commands, even when not following the target patterns, still tend to conform to the global phonological shapes of existing word patterns.

3.5.2 Pig Latin and the Like

Language games have a long history as external evidence in phonology. For example, Chao (1934) uses a game to justify a decision about assimilation in Peiping Chinese, Sherzer (1970) uses them to analyze voiceless stops, and the Finnish Kontti-kieli or knapsack language has been extensively used as external evidence for vowel harmony (Campbell 1981, 1986; Vago 1988), for whether vowel duration is a property of the syllable or of the segments (Vago 1985), and for the psychological reality of underlying and other forms (Campbell 1986). I myself am fluent in Kontti-kieli and find that the researchers many times seem to accept, as Kontti-kieli, forms that do not actually follow the rules that they themselves posit (Campbell 1981, 1986; Bertinetto 1988; Vago 1985), so that perhaps some informants were not fully fluent in the language game they had to play, or else there simply were several variants of the rules (Välimaa-Blum 2003).

When I was a child we would say words backwards, but in the rules of the game, we were explicitly manipulating letters, not sounds. Accompanied by big laughs, my name thus became Attiir Aamiläv, but since we were playing with letters, those of us who could not read yet were totally lost, and many readers had problems as well. Language games generally concern manipulations of meaningful words in the sense that their meaning must be fully preserved, for otherwise the games would become pointless. Thus, it is not random sounds that are used to generate the game words, but the existing lexical-phonological resources of the source words are used in novel ways. In Pig Latin, the basic principle is to move the initial consonant (cluster) to the end of the word and add the diphthong /eɪ/; in the absence of a word-initial consonant, the diphthong is added as described but with minor variations on the consonant preceding it. For example, with these rules, satin /ˈsætɪn/ becomes /ˈætɪnseɪ/ and eat /it/ would become /ˈiteɪ/. This clearly demonstrates that, once one knows the rules, the meaning is fully recoverable from the game words, and the added syllable is an existing syllable, realized by the general procedural commands based on the phonological structure of English words. As such, games of this type thus do not demonstrate anything in favor of exemplar clouds of speech sounds.
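The rules just stated fit in a few lines. The sketch below is orthographic rather than phonological and ignores the consonant variants some speakers insert before the ending of vowel-initial words.

```python
def pig_latin(word):
    """Pig Latin as described above: move the initial consonant
    cluster to the end of the word and add 'ay' (/eɪ/)."""
    vowels = 'aeiou'
    # Index of the first vowel letter marks the end of the onset cluster.
    i = next((k for k, ch in enumerate(word) if ch in vowels), 0)
    return word[i:] + word[:i] + 'ay'

print(pig_latin('satin'))   # 'atinsay'  (/ˈætɪnseɪ/)
print(pig_latin('eat'))     # 'eatay'    (/ˈiteɪ/)
```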

While language games may indeed provide external evidence for some phonological phenomena, their role should be evaluated extremely carefully, in part, as just noted, because not all speakers play language games with equal ease, and because there may be small differences in the game rules. Zwicky and Pullum actually point out that “there is a sense in which play languages are a special class, from which not every kind of data would be judged suitable as the basis of linguistic generalizations” (Zwicky and Pullum 1987: 4), for language games may well introduce phonological processes which actually exist in no language. For example, there can be no language with a phonological principle that adds an identical segment at the end of all syllables or words, like the different versions of Pig Latin do. It also seems to me that, e.g., the use of nonsense words is not appropriate when studying prosody (cf. Iivonen 1998: 317), for the outcome is unnatural (Välimaa-Blum 2003). Globally then, language games do not provide unambiguous evidence of linguistic phenomena, and certainly not of the independence of exemplar clouds of phonemes.

3.5.3 Reiterant Speech

The basic assumption I am making is that if there were indeed exemplar clouds of speech sounds, we should be able to access the sounds with the same ease and in the same variety as we access them in words, that is, we should be able to create nonsense words containing the same diversity of sounds as existing words. Reiterant speech is a method frequently used in phonetic experiments to study prosody. This is the type of language play that might be taken to indicate the independence of speech sounds from meaningful elements. In reiterant speech, the subjects are asked to replace the syllables of a model sentence by a nonsense syllable, typically /bɑ/. Thus the sentence the boat is afloat would become something like [bə ˈbɑ bə bə ˈbɑ]. In reiterant speech, however, the segments used are determined beforehand by the experimenter, so that the speaker is not retrieving any sounds randomly or even in an ordered fashion from existing memory structures of sounds.
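As a sketch, the mapping keeps only the prosodic skeleton; here just the stress marks of a hand-syllabified utterance are preserved (a real experiment would also care about duration and pitch).

```python
def reiterant(syllables):
    """Replace every syllable of a model utterance with the nonsense
    syllable /bɑ/, keeping the stress pattern; unstressed syllables
    are rendered with a reduced [bə], as in the example above."""
    return ' '.join('ˈbɑ' if s.startswith('ˈ') else 'bə' for s in syllables)

# 'the boat is afloat', syllabified with stress marks by hand:
model = ['ðə', 'ˈboʊt', 'ɪz', 'ə', 'ˈfloʊt']
print(reiterant(model))   # bə ˈbɑ bə bə ˈbɑ
```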

Reiterant speech avoids the problem of segmental effects on the prosody (Larkey 1983), though it may still be uncertain what properties of language reiterant speech actually captures (Okobi 2006: 16). Not all speakers are good at this either; I can attest to this myself, having seen my experimental subjects do it only with difficulty and after a good dose of explicit training. Also, listening to reiterant speech may not be without complications, as noted by Rilliard and Aubergé (1998), who say that this kind of speech is not easy ‘ecological material’ for listeners, and it is not quite clear what the listeners are retrieving in this kind of metalinguistic undertaking. Perhaps these inherent difficulties are what lead experimenters to verify that the technique is reliable for each subject separately before their speech is even recorded for analysis (Adisasmito-Smith and Cohn 1996). This same proviso applies to language games in general, as observed above, for there are great inter-speaker differences in speaking languages like Pig Latin and the Finnish Kontti-kieli. If speakers indeed had independent exemplar clouds of phonemes, all sounds should be easy to access in any situation, just as words and grammatical constructions are readily available at will in natural speaking conditions, but this does not seem to be the case.

3.5.4 Scat Singing

Another language play that might lend support to separate exemplar clouds of speech sounds is scat singing, vocal music generated by the vocal tract but lacking distinct words (Fitch 2006). Scat singing, which Pinker (2004: 15) calls a meaningless display of virtuosity, consists of improvising with nonsense syllables. When we listen to performances by those who excelled in scat, e.g., Ella Fitzgerald and Louis Armstrong, it is obvious that the singing does not employ vowels and consonants in the same diversity as normal speech. The vocalizations rather resemble reiterant speech in that, while the syllables in scat singing are not 100% identical, they consist of only a small number of different consonants and vowels, and not at all in the same variety of combinations as found in normal language use.

Accessing speech sounds with the same ease as we produce them in existing words is not easy, for it takes a considerable mental effort to create nonsense words that are not repetitive. Speech becomes extremely slow and deliberate, precisely because the speaker needs to search for the sounds and syllables in order to avoid using the same ones repeatedly. The sounds do not come forth spontaneously, which is what they ought to do if they were stored in separate exemplar clouds from which we retrieved them in normal speaking. It would seem that both scat singing and reiterant speech are based on procedural knowledge of the production of speech, where the singer or speaker is using a limited number of existing, meaningless syllables. Speakers do have both schematic and procedural knowledge about the possible syllables, word shapes and phonotactics in general that enables them, e.g., to pronounce both new words and loans, but this is no evidence of independent exemplar clouds of speech sounds.

3.5.5 Jabberwocky

51What I am assuming is that speech sounds exist only as elements that give form to meaningful units and that they do not correspond to any independent memory structures. The form of complex constructions like polymorphemic words and sentences is meaningful as well (Fillmore, Kay and O’Connor 1988; Croft 2001; Goldberg 1995, 2006; Kay 1997; Lakoff 1987). The meanings of grammatical constructions, which are independent of lexical items, underlie Jabberwocky, an exercise in the use of grammatical constructions with nonsense words, as the following verse demonstrates (Carroll 1960: 134).

(2)

‘Twas brillig, and the slithy toves

Did gyre and gimble in the wabe:

All mimsy were the borogoves,

And the mome raths outgrabe.

52It is the symbolic nature of grammar that explains the apparent meaningfulness of (2). One clearly has the feeling that something meaningful is being said, despite understanding nothing. We may cite Alice herself in support of this reaction: “It seems very pretty [...] but it’s rather hard to understand! [...] Somehow it seems to fill my head with ideas – only I don’t exactly know what they are!” (Carroll 1960: 134–135). This is precisely the point: grammar evokes meaningful thought, for it modifies the lexical content of the construction in question, and the grammar of this poem clearly evokes noun phrases with entity-like semantics, adjectives with property-like meanings, and actions and states in verbal structures. If the lexical content is nonsense, then we get just the meaning of the grammar, which is not easy to make explicit, since it does not represent propositional knowledge in the way lexical items do; indeed, certain aspects of grammar may actually be procedural (Bybee 2001: 40).

53Creating Jabberwocky is not a trivial task, for, even though grammar is meaningful, it is very hard to retrieve that grammatical meaning without a matching lexical content. Grammatical constructions, with their somewhat intractable meanings, and the lexical meanings reinforce one another, in that constructions attract a specific lexical content (Goldberg 1995, 2006). In ordinary speech, it is the propositional knowledge of word meanings that implicitly brings to light the constructional meanings as well. Carroll must have spent considerable time creating his Jabberwocky, and he also took great care that the nonsense words had a naturally varying range of sounds.

54When we read the Jabberwocky passage above aloud, we are not speaking English but only using the procedural knowledge that we have of the phonological and grammatical structuring of the language. When Ella Fitzgerald produces scat singing, is she singing in English? No, for while scat singing does make use of the procedural knowledge of the production of English syllables, the actual productions are not English. Jabberwocky and scat singing thus provide no evidence of a body of knowledge of speech sounds that would be separate from meaning. In the absence of the propositional knowledge associated with meaningful words, these language games mainly use the highly automatized, procedural knowledge that lies beneath the production of speech sounds and syllables.

4. Discussion

4.1 Auditory or Articulatory Lexicon?

55If there are no exemplar clouds of speech sounds but only of meaningful entities, what is the nature of the mental lexicon? A lexical entry is a meaningful element that has one or more sound shapes, i.e., its co-allomorphs; it has a prototype-centered meaning and, at least in some languages, grammatical information and spelling as well. In exemplar approaches, it is assumed that the sound shapes of lexical entries include information about both predictable and distinctive features, plus speaker- and variety-specific detail. Coleman (1998) reviews neuroscience evidence bearing on whether the phonological aspects of lexical representations are articulatory or auditory. He explains that memories are not simply located somewhere or other in the brain; rather, there is an intimate relationship between the storage location and the areas associated with the corresponding peripheral input and output modalities: “For example, memories of sounds and music are stored close to the primary auditory areas, memory of shapes and colours next to the primary visual areas, and memories of hand-operated tools close to sensorimotor areas. This organization permits extensive and close connection between storage and processing of memories in each modality to be rapid and efficient: it is possible that storage and processing of memories are not clearly separable...” (Coleman 1998: 300).

56Coleman demonstrates that the phonological lexicon, i.e., the one related to the sound shapes of lexical entries, has a single location in the brain, surrounding the primary auditory areas and distant from the motor areas. He suggests that the same location is used in both perception and production, and that articulatory representations are generated “on-the-fly.” Coleman concludes that the lexical representations of the phonological lexicon are primarily auditory. He also notes that, since the lexical memories are auditory, they are compatible with prototype-based, exemplar-type models of word memories, which incorporate low-level phonetic information. This concurs with the idea that lexical items are not processed sequentially from individual sounds but are stored and retrieved as wholes.

57Given that speech perception is not a matter of left-to-right segmentation of the input but a much more complex cognitive process of simultaneous comparison, matching and interpretation of the speech stimuli using existing knowledge sources and the context, the individual sound segment naturally loses some of its import in speech perception as well as in production. And given the symbolic nature of languages, it seems that the basic units would have to be semantic entities, not meaningless sounds or features. In utterances, sound sequences are of course structured into hierarchical units such as syllables, feet and phonological words and phrases, and there are various universal patterning schemes concerning them, such as the tendency to syllabify utterances into CVCV structures (Scheer 2004) and various assimilatory phenomena (Clements and Hume 1995). But these relate to the high degree of automaticity of speech production, i.e., to procedural knowledge, and they tend to be independent of semantics.

58Port (2004) and Port and Leary (2005) present arguments against words and morphemes being sequences of discrete, formal sound units. They suggest that the segmental approaches of the IPA, of SPE-style phonology and of many other frameworks stem from the deep-seated tradition of alphabetic writing, which represents words as successions of isolated letters and which has led linguists to think that this is what morphemes and words correspond to in the mind as well. They emphasize that there is no evidence that this kind of discreteness is actually present in the mental representation of language. These authors advocate a so-called social phonology, based on a community of speakers who use a given language over considerable time spans; in this phonology there are no segments, only holistic memory structures of words and morphemes, which is what I have been proposing as well in assuming exemplar-based word memories. Port and Leary, however, adopt articulatory phonology, a gestural approach to speech production, which they use to model the social phonology of a language. But if Coleman (1998) is right in assuming that the phonological part of the mental lexicon is primarily auditory and that articulatory representations are generated “on-the-fly,” phonology cannot just be “a model control system for the speech apparatus of the linguistic community” (Port 2004: 19). In this I follow Coleman, but at some point the articulatory settings must of course also be modeled, and, as Port and Leary assume, Browman and Goldstein’s (1986, 1992) articulatory phonology may well be suitable for this.

59We may conclude this section by saying that, if the phonological aspects of lexical memories are indeed auditory, it is not likely that speech is segmented into sound sequences in either production or perception; both involve holistic words. If the targets of speech production and perception were individual segments, it would be difficult to understand why there can be so many deletions; but if the target is the meaningful word, even the most extreme cases of hypospeech are easy to understand, because all the speaker wants is for the intended meaning to get across, not the individual sounds. In continuous speech, then, the sound shapes of words are fitted into the complex prosodic and structural patterns of polymorphemic words and larger constructions, and the potential reductions and deletions depend on both linguistic and extra-linguistic factors.

4.2 Basic Exemplar Units: Speech Sounds or Words?

60There are at least two different points of view as to what the basic unit of representation in exemplar-based phonologies is. For one, it is the speech sound (Pierrehumbert 2001: 148); for the other, it is the word (Johnson 2005b: 298). On the first outlook, speakers form and conserve exemplar clouds of speech sounds in their long-term memory, and these are probably labeled (Pierrehumbert 2001: 140). The effect of this labeling is to group together those speech sounds that are similar to one another in phonetic form and linguistic function; in other words, the labeled clouds must correspond to something like the traditional phoneme. Without labeling, there would be no category structure in the mental representation of speech sounds, and in that case it is difficult to see how speakers could benefit from them in speech production and perception. On the other hand, labeling can only be performed by taking the distinctive value of the sounds into consideration. In the second approach, the basic exemplar units are assumed to be words. If we accept that there is a mental lexicon in the first place, and if we accept the proposal that the phonological aspects of this lexicon are primarily auditory (Coleman 1998), then it is not clear why speakers would have separate exemplar clouds of speech sounds at all. The evidence of Goldinger and Azuma (2003) that speech perception makes simultaneous use of both top-down and bottom-up information equally supports the view that the basic exemplar units are words rather than sounds.
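The contrast between the two architectures can be made concrete with a small Python sketch. It rests on illustrative assumptions only: toy feature vectors, a naive similarity metric and invented tokens; it is in no way a reconstruction of Pierrehumbert’s or Johnson’s models. Exemplars are stored either in labeled sound clouds or in whole-word clouds, and a new token is classified by its closest remembered exemplar.

import math
from collections import defaultdict

def similarity(a, b):
    """Inverse Euclidean distance: higher means more similar."""
    return 1.0 / (1.0 + math.dist(a, b))

# Architecture 1: labeled exemplar clouds of speech sounds.
sound_clouds = defaultdict(list)                # label -> remembered tokens
sound_clouds["i"] += [(280.0, 2250.0), (300.0, 2300.0)]   # toy (F1, F2) values
sound_clouds["a"] += [(700.0, 1220.0), (730.0, 1090.0)]

def classify_sound(token):
    return max(sound_clouds, key=lambda lab: max(similarity(token, ex)
                                                 for ex in sound_clouds[lab]))

# Architecture 2: exemplar clouds of whole words, matched holistically.
word_clouds = {"beat": [(280.0, 2250.0, 120.0)],   # toy whole-word traces
               "bat":  [(700.0, 1220.0, 140.0)]}

def classify_word(token):
    return max(word_clouds, key=lambda w: max(similarity(token, ex)
                                              for ex in word_clouds[w]))

print(classify_sound((290.0, 2280.0)))         # -> i
print(classify_word((710.0, 1200.0, 135.0)))   # -> bat

Note that the first architecture needs a label for every single token, and that labeling step is exactly what, as argued above, presupposes the distinctive, meaning-bearing value of the sounds.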

61As noted above, I assume that, for each lexical entry, the lexicon contains an exemplar cloud of at least all the non-automatic allomorphy. These allomorphs, in their various social, dialectal and stylistic variants, serve as magnets for the granulation of the memorized episodes of each word. If we accept that there is a mental lexicon, whatever its nature, we have to ask what kind of relation it would bear to the putative exemplar clouds of speech sounds. If there were indeed exemplar clouds of speech sounds in addition to the mental lexicon, this would mean that there is a process-like transformation involving the real-time sequencing of discrete sounds into words. It is not clear what the phonological form of the lexical entries would then be, and separate exemplar clouds of speech sounds would make lexical retrieval a somewhat obscure process. An auditory lexicon, on the other hand, entails that speech production is not process-like but the realization of memorized, ready-made sound shapes that include detailed information about the auditory properties of the words. Spoken words are not built out of discrete, left-to-right sound sequences but come largely as such from exemplar clouds that, already in long-term memory, manifest the typical fusing of adjacent sounds into one another. Exemplar memories actually contradict the conventional assumption that phonetic sequences are continuous and phonology is discrete, for they remove discreteness from phonology altogether.

62We may also consider the fact that a given speech sound may behave differently, both synchronically and diachronically, depending on the frequency of the word in which it occurs (Bybee 2001; Solé and Ohala 1991). A given phoneme in a frequent word may be pronounced in one way and in an infrequent word in another. If the units of speech production were sounds extracted from exemplar clouds in long-term memory (Pierrehumbert 2001), this would necessarily entail that both words and sounds be marked for frequency; and not only that, but individual sounds would also have to be marked for variables such as dialect, gender, age, style, language, etc. It is difficult to see why a speaker would do this, for the episodic memories of words already carry all of this information.
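The bookkeeping objection can be stated in code. In the minimal Python sketch below, again under illustrative assumptions (the fields and values are invented), each word-level episode already records speaker, dialect and style, and the token frequency of a word is simply the size of its exemplar cloud; separate sound-level clouds would have to duplicate every one of these annotations.

from dataclasses import dataclass

@dataclass
class WordEpisode:
    word: str
    speaker: str       # indexical detail comes for free with the episode
    dialect: str
    style: str
    trace: tuple       # schematic whole-word auditory trace

lexicon = {}           # word -> list of WordEpisode

def remember(ep):
    lexicon.setdefault(ep.word, []).append(ep)

remember(WordEpisode("memory", "speaker-A", "northern", "casual", (0.1, 0.5)))
remember(WordEpisode("memory", "speaker-B", "southern", "careful", (0.2, 0.4)))

# Word frequency falls out of the cloud size; no separate counters are
# needed, either for words or for the sounds inside them.
print(len(lexicon["memory"]))   # -> 2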


63Associating exemplar models with cognitive linguistics is fully compatible with the fundamental assumption that languages are symbolic systems, in which everything is geared toward the expression of meaning. Cognitive approaches also assume usage-based models, so that, whatever the innate cognitive potential of the human being, languages emerge, become entrenched and automatized through frequent usage, and this assumption too makes it questionable whether, in this process, language learners would form memory structures of sounds independently of meaning. Given that languages use the oral-auditory channel, we have to see phonology as something indispensable, without which meaning would not be observable,3 but this does not grant the individual sounds an independent status.

4.3 Stochastic Variability in Speech

64Normal speech is subject to a high degree of stochastic variability; in fact, variability is the norm rather than the exception, and no search for phonetic invariance in speech sounds has been fruitful (Lindblom 1990: 404). Since the speech sounds in words are so far from invariant that they are often deleted altogether, the question necessarily arises as to what exactly is subject to this variability: the individual speech sounds or the words? If the targets of speech production were the individual sounds, it would be amazing how much inconsistency they tolerate. The same must be said about speech perception: if the target of speech perception were the discerning of individual sounds, how is it that listeners can understand words in which several sounds or entire syllables are deleted? This leads to the absurd question of whether the stylistic absence of a phoneme from a word is part of the regular alternation pattern of that segment. How can we account for the variability in any general way if we assume that its primary targets are the segments? If, however, we consider that we store words in fully specified form in a mental lexicon that is primarily auditory, and that these episodic memories are the main targets of speech production and perception, then the observed stochastic variability is better understood: it does not concern sounds per se but words.
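A final sketch, again merely illustrative, shows why holistic targets tolerate deletion. Stored forms are fully specified, and a reduced token is matched against whole words by a global similarity score; here difflib’s ratio over broad transcriptions serves as a crude stand-in for auditory similarity, and the word inventory is invented. No segment-by-segment decision is ever made.

from difflib import SequenceMatcher

# Fully specified stored word shapes (broad transcriptions standing in
# for detailed auditory exemplars).
stored = ["probably", "properly", "problem", "family"]

def best_match(reduced):
    """Match a reduced token holistically against whole stored words."""
    return max(stored, key=lambda w: SequenceMatcher(None, reduced, w).ratio())

# Whole syllables can be missing, yet the global shape still selects
# the intended word.
print(best_match("probly"))   # -> probably
print(best_match("famly"))    # -> family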

65Not all deletions and reductions obey one phonological or articulatory principle or another; many cannot be accounted for by phonology at all. Instead, many deletions and reductions can be identified only in terms of their minor semantic import, and even in terms of pragmatic principles such as the Maxim of Relevance (Grice 1975): a speaker can delete anything that does not jeopardize the meaning. Also, as Kohler (1989) notes, when meaning is not jeopardized, reductions are more frequent in familiar than in unfamiliar words. This means that, especially in frequently occurring words, the exact hyperarticulated form is not crucially important for lexical access. It is sufficient to preserve only the most central aspects of the global phonological shape, which, together with the constructional frame, guides the interpretation of even the deleted elements. This explains, for example, why function words tend to be reduced more often than content words.

4.4 Conclusion

66Phonemes are indispensable for the overt manifestation of meaningful language, for they give it its observable sound shape, but in a given word, as long as the meaning is recoverable, it is not obligatory for all the phonemes to be present. Any kind of left-to-right segmentation of words into sounds is unlikely in both speech production and perception. In fact, to assume that in the process of speech perception a listener forms and updates exemplar clouds of speech sounds leads to circularity. On the one hand, the placing of sounds into labeled, linguistically significant categories presupposes that the meaning of the word is understood, for without it, in the absence of phonetic invariance, the labeling of sounds into phoneme categories is not possible. On the other hand, if a listener were to segment a word into sounds, it would presumably be in order to understand the meaning of what is being said. It therefore seems both pointless and unfeasible to form labeled exemplar clouds of speech sounds. If the sound shapes of lexical entries are conserved as auditory memories (Coleman 1998), then it is these holistic items that are retrieved in both perception and production. If the perception of words is indeed non-analytic (Goldinger 1998), and if there is only one mental lexicon, which is auditory rather than articulatory or any combination of the two (Coleman 1998), how could we even have exemplar clouds of phonemes? On what grounds would exemplar clouds of speech sounds emerge, when the meaningful units are perceived and executed as holistic entities? If, as cognitive linguistics assumes, languages are indeed symbolic systems, then phonology has to follow suit. There is no motivation for speakers to form independent exemplar clouds of meaningless sounds when what they learn and use is always meaningful.


Bibliographie

Adisasmito-Smith, N. & A. C. Cohn 1996. Phonetic correlates of primary and secondary stress in Indonesian: a preliminary study. Working Papers of the Cornell Phonetics Laboratory 11: 1–15.

Anderson, J. 1983. The Architecture of Cognition. Cambridge, MA: Harvard University Press.

Anderson, J. 1993. Rules of the Mind. Hillsdale, NJ: Erlbaum.

Basset, P., A. Amelot, J. Vaissière & B. Roubeau 2002. Nasal airflow in French spontaneous speech. Journal of the International Phonetic Association 31(1): 87–100.

Bertinetto, P. M. 1988. The use and misuse of external evidence in phonology. In W. U. Dressler, H. C. Luschützky, O. E. Pfeiffer & J. R. Rennison (eds.), Phonologica, Proceedings of the 6th International Phonology Meeting. Cambridge: Cambridge University Press, 33–47.

Boyland, J. T. 1996. Morphosyntactic change in progress: a psycholinguistic approach. Ph.D. dissertation, Department of Linguistics, University of California, Berkeley, CA.

Browman, C. P. & L. Goldstein 1986. Towards an articulatory phonology. Phonology Yearbook 3: 219–252.

Browman, C. P. & L. Goldstein 1992. Articulatory phonology: An overview. Phonetica 49: 155–180.

Butcher, A. 1996. Some connected speech phenomena in Australian languages: universals and idiosyncrasies. In A. P. Simpson & M. Pätzold (eds.), Sound Patterns of Connected speech – Description, Models and Explanation, Kiel: Arbeitsberichte (AIPUK) no. 31, 83–104.

Bybee, J. L. 1995. Regular morphology and the lexicon. Language and Cognitive Processes 10: 425–455.

Bybee, J. L. 1998. The emergent lexicon. In M. C. Gruber, D. Higgins, K.S. Olson & T. Wysocki (eds.) Chicago Linguistic Society 34: The Panels. Chicago: Chicago Linguistic Society, 421–435.

Bybee, J. L. 2000. The phonology in the lexicon: Evidence from lexical diffusion. In M. Barlow & S. Kemmer (eds.), Proceedings of the Rice Symposium on Usage-Based Models of Language. Stanford, CA: CSLI Publications, 65–85.

Bybee, J. L. 2001. Phonology and Language Use. Cambridge: Cambridge University Press.

Campbell, L. 1981. Generative phonology vs. Finnish phonology: Retrospect and prospect. In D. L. Goyvaert (ed.), Phonology in the 1980’s. Ghent: Story-Scientia, 147–182.

Campbell, L. 1986. Testing Phonology in the Field. In J. J. Ohala & J. J. Jaeger (eds.) Experimental Phonology. New York: Academic Press, 163–173.

Carroll, L. 1960. Alice’s Adventures in Wonderland & Through The Looking Glass, New York and Scarborough, Ont.: New American Library.

Chao, Y-R. 1934. The non-uniqueness of phonemic solutions of phonetic systems. In M. Joos (ed.), Readings in Linguistics 1. Chicago: University of Chicago Press, 38–54.

Clements, G. N. & E. V. Hume 1995. The Internal Organization of Speech Sounds. In J. Goldsmith (ed.), The Handbook of Phonological Theory. Oxford: Blackwell, 245–300.

Coleman, J. 1998. Cognitive reality and the phonological lexicon: A review. Journal of Neurolinguistics 11(3): 295–320.

Costa, A., A. Colomé & A. Caramazza 2000. Lexical access in speech production: The bilingual case. Psicológica 21: 403–437.

Costa, A., M. Miozzo & A. Caramazza 1999. Lexical selection in bilingual speech production: Do words in the bilingual’s two lexicons compete for selection? Journal of Memory and Language 41: 365–397.

Croft, W. 2001. Radical Construction Grammar: Syntactic Theory in Typological Perspective. New York: Oxford University Press.

De Bot, K. K. 1992. Bilingual processing model: Levelt’s ‘speaking model’ adapted. Applied Linguistics 13: 1–24.

De Bot, K. K. 2004. The multilingual lexicon: Modeling selection and control. The International Journal of Multilingualism 1(1): 17–32.

Fabbro, F. 2001. The bilingual brain: Cerebral representation of languages. Brain and Language 79: 211–222.

Fillmore, C. 1976. Frame semantics and the nature of language. The Annals of the New York Academy of Sciences 280: 20–32.

Fillmore, C. 1982. Frame semantics. In Linguistic Society of Korea (ed.) Linguistics in the Morning Calm. Seoul: Hanshin, 111–138.

Fillmore, C. 1985. Syntactic intrusions and the notion of grammatical construction. Berkeley Linguistic Society 11: 73–86.

Fillmore, C., P. Kay & M. O’Connor 1988. Regularity and idiomaticity in grammatical constructions: The case of let alone. Language 64(3): 501–538.

Finkbeiner, M., T. H. Gollan & A. Caramazza 2006. Lexical access in bilingual speakers: What’s the (hard) problem? Bilingualism: Language and Cognition 9: 153–166.

Fitch, W. T. 2006. The biology and evolution of music: A comparative perspective. http://www.st-andrews.ac.uk/~wtsf/Fitch2006BiomusicCognition.pdf.

Fougeron, C. & D. Steriade 1997. Does deletion of French schwa lead to neutralization of lexical distinctions? Proceedings of the 5th European Conference on Speech Communication and Technology. Patras: University of Patras, vi., 943–946.

Goldberg, A.E. 1995. Constructions: A Construction Grammar Approach to Argument Structure. Chicago and London: University of Chicago Press.

Goldberg, A.E. 2006. Constructions at Work: The Nature of Generalizations in Language. Oxford: Oxford University Press.

Goldinger, S.D. 1998. Echoes of echoes? An episodic theory of lexical access. Psychological Review 105(2): 251–279.

Goldinger, S.D. & T. Azuma 2003. Puzzle-solving science: the quixotic quest for units in speech perception. Journal of Phonetics 31: 305–320.

Grosjean, F. 2001. The bilingual’s language modes. In J.L. Nicol (ed.), One Mind, Two Languages: Bilingual Language Processing. Oxford: Blackwell, 1–22.

Green, D.W. 1998. Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition 1: 67–81.

Grice, H.P. 1975. Logic and Conversation. In P. Cole & J.L. Morgan (eds.), Syntax and Semantics 3: Speech Acts. New York: Academic Press, 41–58.

Gomez-Tortosa, E., E. Martin, M. Gaviria, F. Charbel & J. Ausman 1995. Selective deficit in one language in a bilingual patient following surgery in the left perisylvian area. Brain and Language 48: 320–325.

Hernandez, A. & E. Bates 1999. Bilingualism and the brain. In R.A. Wilson & F.C. Keil (eds.), The MIT Encyclopedia of the Cognitive Sciences. Cambridge, MA: MIT Press, 80–81.

Hernandez, A., P. Li & B. MacWhinney 2005. The emergence of competing modules in bilingualism. Trends in Cognitive Sciences 9(5): 220–225.

Iivonen, A. 1998. Intonation in Finnish. In D. Hirst & A. De Cristo (eds.), Intonation Systems, A Survey of Twenty Languages. Cambridge: Cambridge University Press, 311–327.

Johnson, K. 1997. Speech perception without speaker normalization. In K. Johnson & J. Mullennix (eds.), Talker Variability in Speech Processing. San Diego, CA: Academic Press, 145–165.

Johnson, K. 2005a. Resonance in an exemplar-based lexicon: The emergence of social identity and phonology. UC Berkeley Phonology Lab Annual Report, 95–128. http://linguistics.berkeley.edu/phonlab/annual_report.html.

Johnson, K. 2005b. Decisions and mechanisms in exemplar-based phonology. UC Berkeley Phonology Lab Annual Report, 289–311. http://linguistics.berkeley.edu/phonlab/annual_report.html.

Johnson, K., E. Flemming & R. Wright 1993. The hyperspace effect: Phonetic targets are hyperarticulated. Language 69(3): 505–528.

Johnson, M. 1987. The Body in the Mind: The Bodily Basis of Meaning, Imagination and Reason. Chicago: University of Chicago Press.

Kay, P. 1997. Words and the Grammar of Context. Stanford, CA: CSLI.

Kemmer, S. & M. Barlow 2000. Introduction: Usage-based conception of language. In M. Barlow & S. Kemmer (eds.), Proceedings of the Rice Symposium on Usage-Based Models of Language. Stanford, CA: CSLI Publications, vii–xxviii.

Kohler, K. 1989. Segmental reduction in connected speech in German: Phonological facts and phonetic explanations. In W.J. Hardcastle & A. Marchal (eds.), Speech Production and Speech Modeling. Dordrecht: Kluwer, 69–92.

Kohler, K. 1996. The phonetic realization of schwa syllables in German. In A.P. Simpson & M. Pätzold (eds.), Sound Patterns of Connected speech – Description, Models and Explanation. Kiel: Arbeitsberichte (AIPUK) Nr. 31, 11–14.

La Heij, W. 2005. Selection Processes in Monolingual and Bilingual Lexical Access. In J.F. Kroll & A. M.B. de Groot (eds.), Handbook of Bilingualism. Oxford: Oxford University Press, 289–307.

Larkey, L.S. 1983. Reiterant speech: An acoustic and perceptual validation. Journal of the Acoustical Society of America 73(4): 1337–1345.

Lakoff, G. 1987. Women, Fire and Dangerous Things: What Categories Reveal about the Mind. Chicago and London: The University of Chicago Press.

Lakoff, G. & M. Johnson 1980. Metaphors We Live By. Chicago: The University of Chicago Press.

Langacker, R. 1987. Foundations of Cognitive Grammar. Volume 1: Theoretical Prerequisites. Stanford, CA: Stanford University Press.

Langacker, R. 2000. A Dynamic Usage-Based Model. In M. Barlow & S. Kemmer (eds.), Proceedings of the Rice Symposium on Usage-Based Models of Language. Stanford, CA: CSLI Publications, 65–85.

Lehtonen, J. 1970. Aspects of Quantity in Standard Finnish. Studia Philologica Jyväskyläensia VI. Jyväskylä: University of Jyväskylä.

Liberman, I.Y., A.M. Liberman, I. Mattingly & D.P. Schankweiler 1980. Orthography and the beginning reader. In J.F. Kavanagh & R.L. Venezky (eds.), Orthography, Reading, and Dyslexia. Baltimore: University Park Press, 137–153.

Lindblom, B. 1990. Explaining Phonetic Variation: A Sketch of the H&H Theory. In W.J. Hardcastle & A. Marchal (eds.), Speech Production and Modeling. Amsterdam: Kluwer Academic Publishers, 403–439.

Marcus, G.F., S. Vijayan, S. Bandi Rao & P.M. Vishton 1999. Rule learning by seven-month-old infants. Science 283 (5398): 77–80.

McGurk, H. & J. MacDonald 1976. Hearing lips and seeing voices. Nature 264 (December 23/30): 746–748.

McNeill, D. & K. Lindig 1973. The perceptual reality of phonemes, syllables, words, and sentences. Journal of Verbal Learning and Verbal Behavior 12: 419–430.

Miller, J. 1994. On the internal structure of phonetic categories: a progress report. Cognition 50: 271–285.

Mompeán-González, J.A. 2004. Category overlap and neutralization: The importance of speakers’ classifications in phonology. Cognitive Linguistics 15 (4): 429–469.

Nosofsky, R.M. 1988. Similarity, frequency and category representation. Journal of Experimental Psychology: Learning, Memory and Cognition 14: 54–65.

Ohala, J. J. & M. Ohala 1995. Speech perception and lexical representation: the role of vowel nasalization in Hindi and English. In B. Connell & A. Arvaniti (eds.), Phonology and Phonetic Evidence, Papers in Laboratory Phonology. Cambridge: Cambridge University Press, 41–60.

Okobi, A.O. 2006. Acoustic Correlates of Word Stress in American English. Doctoral dissertation, Harvard–MIT Health Sciences and Technology.

Paradis, M. 1996. Selective deficit in one language is not a demonstration of different anatomical representation: Comments on Gomez-Tortosa et al. 1995. Brain and Language 54: 170–173.

Paradis, M. 2002. Neurolinguistics of bilingualism and the teaching of languages. http://www.semioticon.com/virtuals/talks/paradis_txt.htm

Pierrehumbert, J.B. 1994. Knowledge of variation. Papers from the Parasession on Variation, 30th Meeting of the Chicago Linguistic Society, vol. 2. Chicago: Chicago Linguistic Society, 232–256.

Pierrehumbert, J.B. 2001. Exemplar dynamics: word frequency, lenition and contrast. In J. Bybee & P. Hopper (eds.), Frequency and the Emergence of Linguistic Structure. Amsterdam: John Benjamins, 137–157.

Pierrehumbert, J. B. 2002. Word-specific phonetics. In C. Gussenhoven & N. Warner (eds.) Laboratory Phonology 7. Berlin and New York: Mouton de Gruyter, 101–139.

Pierrehumbert, J.B., M.E. Beckman & D.R. Ladd 2000. Conceptual foundations of Phonology as a Laboratory Science. In N. Burton-Roberts, P. Carr & G. Docherty (eds.) Phonological Knowledge, Conceptual and Empirical Issues. Oxford: Oxford University Press, 273–303.

Pinker, S. 2003. Language as an adaptation to the cognitive niche. In M. H. Christiansen & S. Kirby, (eds.), Language Evolution: The States of the Art. Oxford: Oxford University Press, 16–37.

Port, R.F. 2004. Social, Sensory and Symbolic Aspects of Phonology. http://www.cs.indiana.edu/~port/pubs.html

Port, R.F. & A.P. Leary 2005. Against formal phonology. Language 81 (4): 927–964.

Read, C., Z. Yun-Fei, N. Hong-Yin & D. Bao-Qing 1986. The ability to manipulate speech sounds depends on knowing alphabetic writing. Cognition 24: 31–44.

Rilliard, A. & V. Aubergé 1998. Reiterant speech for the evaluation of natural vs. synthetic prosody. Third ESCA/COCOSDA Workshop on Speech Synthesis, Blue Mountains, NSW, Australia. http://www.slt.atr.jp/cocosda/jenolan/Proc/r21/r21.pdf.

Sapir, E. 1949. The psychological reality of phonemes. In D. Mandelbaum (ed.) Selected Writings of Edward Sapir. Berkeley and Los Angeles, CA: University of California Press, 46–60.

Scheer, T. 2004. A lateral theory of phonology. Vol. 1: What is CVCV, and why should it be? Berlin: Mouton de Gruyter.

Sherzer, J. 1970. Talking backwards in Cuna: the sociological reality of phonological descriptions. Southwestern Journal of Anthropology 26: 343–353.

Solé, M.-J. 1992. Phonetic and phonological processes: the case of nasalization. Language and Speech 35: 29–43.

Solé, M.-J. 2003. Is variation encoded in phonology? In M.-J. Solé, D. Recasens & J. Romero (eds.) Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS), Barcelona, 289–292.

Solé, M.-J. & J.J. Ohala 1991. The phonological representation of reduced forms. Proceedings of the ESCA workshop Phonetics and Phonology of Speaking Styles: Reduction and Elaboration in Speech Communication, Barcelona, 49–55.

Studdert-Kennedy, M. 1987. The phoneme as a perceptuomotor structure. In A.A. Allport, D.G. MacKay, W. Prinz & E. Scheerer (eds.), Language, Perception and Production. New York: Academic Press, 67–84.

Talmy, L. 1988. The relation of grammar to cognition. In B. Rudzka Ostyn (ed.), Topics in Cognitive Linguistics. Amsterdam and Philadelphia: John Benjamins, 165–205.

Talmy, L. 2000. Toward a Cognitive Semantics (2 vols.). Cambridge, MA: The MIT Press.

Taylor, J. 2002. Cognitive Grammar. Oxford: Oxford University Press.

Vago, R. M. 1985. The treatment of long vowels in word games. Phonology Yearbook 2: 329–342.

Vago, R.M. 1988. Vowel harmony in Finnish word games. In H. van der Hulst & N. Smith (eds.), Features, Segmental Structure and Harmony Processes, Part II. Dordrecht: Foris, 185–205.

Välimaa-Blum, R. 1987. Phonemic quantity, stress and the half-long vowel in Finnish. In M. Beckman & G. Lee (eds.), Papers from the Linguistics Laboratory 1985–1987, OSU Working Papers in Linguistics 36. 101–119.

Välimaa-Blum, R. 2003. Do language games provide natural prosodic data: New evidence from Finnish. Talk at the 5èmes journées internationales du GDR Phonologie, University of Montpellier III, June 2003.

Välimaa-Blum, R. 2005. Cognitive Phonology in Construction Grammar: Analytic Tools for Students of English. Berlin and New York: Mouton de Gruyter.

Välimaa-Blum, R. 2007. Where are the phonemes in the mind and what do speakers know about them? Talk at the Second International AFLiCo Conference, Lille, 9-12 May 2007.

Vihman, M. 2002. Getting started without a system: From phonetics to phonology in bilingual development. International Journal of Bilingualism 6 (3): 239–254.

Weinreich, U. 1953. Languages in Contact. New York: Publications of the Linguistic circle of New York.

Wierzbicka, A. 1988. The Semantics of Grammar. Amsterdam and Philadelphia: John Benjamins.

Zwicky, A.M. & G.K. Pullum 1987. Plain morphology and expressive morphology. In J. Aske, N. Beery, L. Michaelis & H. Filip (eds.), Berkeley Linguistics Society: Proceedings of the Thirteenth Annual Meeting, General Session and Parasession on Grammar and Cognition. Berkeley, CA: Berkeley Linguistics Society, 330–340.


Notes

1  Parts of the following sections are based on a talk given at a meeting of the Réseau français de phonologie, at EHESS, Paris, in November 2005. I am grateful to the participants for their judicious comments. I am also indebted to Jacques Durand for his questions after a talk I gave in Toulouse at an ERSS seminar (Opération ‘Phonologie : Corpus, variation et universaux’), October 2004. None of these people are, of course, responsible for anything I am presenting here.

2  I am grateful to Shona Whyte and Alexia Mady for some of the references on bilingualism.

3  In the absence of the oral-auditory channel, I leave human sign languages aside in this discussion.
