Psycholinguistics
James Myers
February 27, 2004

Review of (or introduction to) linguistics and cognitive psychology

OVERVIEW:
1. Cognitive psychology
   1.1 Cognitive psychology as information processing
   1.2 Serial vs. parallel processing
   1.3 Top-down vs. bottom-up processing
   1.4 Automatic vs. controlled processes
   1.5 Modularity
2. Linguistics
   2.1 Grammar and PRODUCTIVITY
   2.2 The symbol and the LEXICON
   2.3 Phonology and DUALITY OF PATTERNING
   2.4 Syntax and PHRASE STRUCTURE

=======================================================

1. Cognitive Psychology

1.1 Cognitive psychology is any psychology that can be described with INFORMATION PROCESSING MODELS:

1.1.1 "Information processing" is like "word processing" or "food processing": it takes information and changes it into a different form, or "(RE)ENCODES" it. Thus even "social cognition" involves information that is perceived, behaviors that are produced, and memory storage.

1.1.2 An information processing model of memory: [see Carroll, p. 47]
  Based on Atkinson and Shiffrin (1968)

1.1.2.1 Sensory stores:
  >Rich in information
  >Very short life (visual about 1 s; auditory about 4 s)
  >Visual sensory stores may not be in the brain, but in the retina(s) of the eye(s)

1.1.2.2 Working (short-term) memory:
  >Relatively short life (perhaps up to 30 seconds without rehearsal)
  >Small processing capacity: only 7+/-2 "chunks" (Miller 1956)
   "CHUNKING": breaking a large set into a smaller set of units so that you only need to process at most 5-9 units at one time:
      00214149648320 (14 digits: too many!)
      002-1-414-964-8320 (small chunks: easier!)
      (this is how I would call my family's old house in Milwaukee, Wisconsin)

1.1.2.3 Permanent (long-term) memory:
  >Limitless (?) size
  >Never (?) decays

1.1.2.4 Control processes:
  >Everything else that might affect memory processing

1.1.2.5 Evidence for this model:
  >Visual sensory stores: partial report technique (Sperling 1960): when subjects are flashed a large array of items for a very short time (using a tachistoscope), they cannot report all of it, because the information fades too quickly; but they can report any randomly chosen part of it; therefore they must store all of it (for a short time)
  >Short-term memory (STM) and long-term memory (LTM) are distinct:
    >DOUBLE DISSOCIATION: A and B are distinct if A can be dissociated from B (i.e. you can have A without B) and B can be dissociated from A
    >STM without LTM: (Milner 1966): a patient who could recall a series of numbers momentarily presented but could not retain it over long periods of time
    >LTM without STM: (Warrington and Shallice 1969): a patient who could only recall one digit reliably, but could remember new material for a long time

1.1.2.6 Problems for the model:
  Milner's patient had normal PROCEDURAL memory ("how-to" memory): he improved on a task involving motor control even though he didn't remember practicing!

1.1.2.7 Two major types of permanent memory
  >EPISODIC memory: memory of specific events, linked to a time and a place
  >SEMANTIC memory: "knowledge" not linked to a time and a place (e.g. knowledge of words)

1.2 Serial vs. parallel processing.

1.2.1 SERIAL MODEL: information is processed in separate steps, usually shown as boxes linked by arrows (e.g. the above memory model).
  >A serial model for recognizing written English words:
      Visual feature processing --> Letter processing --> Word recognition

1.2.2 PARALLEL MODEL: information of different kinds is processed at the same time in different places in the model.
  >A parallel model for recognizing written English words: [see Carroll, p. 98]
  >NOTE: both types of models operate over time.
   For example, a model of sentence processing will have to process words at the beginning of the sentence before words at the end of the sentence. THIS IS NOT THE SAME AS "SERIAL" PROCESSING!
  >Evidence for parallel processing in reading: [see Carroll, p. 53]
   The same visual image is recognized as different letters depending on the context, implying that letter processing and whole-word processing occur at the same time (in parallel).

1.2.3 CONNECTIONISM, or "parallel distributed processing" (PDP), or "artificial neural networks": Parallel processing models inspired by real neural networks in the brain. Information is represented by spreading activation along connections between nodes. Such models are influential in cognitive psychology, but often they are just used metaphorically, rather than actually being run on a computer. (We will read about some examples this semester.)
  >Example: the production of regular inflection in English (e.g. walk, walked)
      Serial model:  STEM --> "ADD -ED"
                     (e.g. Pinker 1991)
      PDP model:     Stem forms (feature nodes)
                     \/\/\/\/\/\/\ (connections)
                     Suffixed forms (feature nodes)
                     (e.g. Rumelhart and McClelland 1986)

1.3 Top-down vs. bottom-up processing

1.3.1 BOTTOM-UP processing: there is a one-way flow of information from input to more abstract levels of encoding.
  >Serial models of perception are usually bottom-up.

1.3.2 TOP-DOWN processing: information flows downward too, from more abstract levels to the levels closer to the input
  >Parallel models of perception usually are partly top-down. (e.g. processing of whole words can influence perception of letters)

1.4 Automatic vs. controlled processes

1.4.1 CONTROLLED processes require extra processing resources, and so occupy the processing capacity of working memory
  >Examples: driving a car for the first time, understanding speech in a second language...

1.4.2 AUTOMATIC processes do not require (much) processing capacity.
  >Examples: driving a car if you're very experienced, understanding speech in your native language...

1.4.3 A related distinction:

1.4.3.1 ON-LINE (or REAL-TIME) processing handles material as it is being input or output, e.g. during language comprehension or language production.
  >Experimental tasks are "on-line" if they tap into ongoing processing (e.g. subjects press buttons while listening to sentences)

1.4.3.2 OFF-LINE processing happens after the material has been put into memory, not while information is being input or output.
  >Experimental tasks are "off-line" if there is a delay that allows the use of controlled processes and long-term memory access (e.g. subjects fill out forms after they have heard the sentences)

1.5 Modularity: Does the mind work as a whole? Or is it MODULAR: broken into separate, independent processors (MODULES)? Modules are like the boxes in a serial model.

1.5.1 Four of the defining properties of modules (Fodor 1983):
  1.5.1.1 Automatic: fast, computationally efficient, obligatory
  1.5.1.2 Domain-specific: only deals with one kind of information
  1.5.1.3 Informationally encapsulated: there are only very specific, limited ways of getting information into and out of each module (i.e. the arrows)
  1.5.1.4 Neurologically distinct: a module is realized in the brain by a distinct neural subnetwork

1.5.2 Is language a module?
  >Automatic: native speakers process their language quickly, efficiently and obligatorily (cannot "turn off" language comprehension!)
  >Domain-specific: language processors can only process language, not other kinds of sounds or structures...?
  >Informationally encapsulated: beliefs don't affect language structures, or vice versa...?
  >Neurologically distinct: there are people who seem to show double dissociations between language and intelligence

1.5.3 NOTE: the "modularity" question is independent of the "nativism" question!
  >E.g. adult readers seem to have a "reading" module, but that is obviously not innate.
1.5.4 All of this is controversial:
  Pinker (1994) believes that language is a module
  Elman et al. (1996) do not

2. Linguistics: What's special about human language?

2.1 GRAMMAR: the compact mental "device" that allows people to produce and understand an unlimited number of utterances; describes one's "knowledge of language"

2.1.1 When you "know" a language, you know two things:
  (a) a finite set of basic elements stored in memory (LEXICON)
  (b) a set of rules/principles that allows you to combine these basic elements to create an unlimited number of new structures (GRAMMAR proper)

2.1.2 PRODUCTIVITY: We can generate an unlimited number of new sentences just using this finite set of elements.

2.1.3 Speakers know when these rules are not being obeyed: GRAMMATICALITY JUDGMENTS

2.2 The symbol, morphology, and the lexicon.

2.2.1 An arbitrary form-meaning pairing: the SYMBOL.
  Symbols are at the heart of language, though modern linguists often dismiss them as boring. But many mysteries remain. For example, how do babies learn symbols? Can other animals use symbols? We'll talk about these issues later.

2.2.2 Symbols must be stored in an internal, mental dictionary: the LEXICON.

2.2.3 The basic unit is the MORPHEME: minimal unit of meaning.
  >MORPHOLOGY: the internal structure of words.
  >Morphological rules are productive: e.g. "Taiwanification"
  >Are whole words stored in the lexicon, or just morphemes? (e.g. "grabbed", 字典 (zìdiǎn, 'dictionary'), 水準 (shuǐzhǔn, 'level'))
   We'll discuss this issue later.

2.3 Phonology.
  >Four central aspects of phonology:
    >Duality of Patterning
    >Abstract phonemes
    >Rules and constraints
    >Prosody

<1> DUALITY OF PATTERNING (Hockett, 1960): only human language has patterns both at the level of meaningful units (syntax & morphology), and at the level of meaningless units (phonology).
  >蛋 (dàn, 'egg'), 等 (děng, 'wait'), 的 (de, a grammatical particle) all include the "same sound", but they have nothing in common in meaning.
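The 蛋、等、的 point can also be made with a small sketch. The toy lexicon below is hypothetical (the English words, the rough phoneme symbols, and the function name are illustrative assumptions, not data from the handout). It simply shows that forms built from the same meaningless units need not share anything in meaning.

```python
from collections import defaultdict

# Hypothetical toy lexicon: each form is a tuple of (rough) phoneme symbols,
# paired with a gloss that stands in for its meaning.
LEXICON = {
    ("k", "æ", "t"): "cat (animal)",
    ("æ", "k", "t"): "act (do something)",
    ("t", "æ", "k"): "tack (small nail)",
    ("d", "ɔ", "g"): "dog (animal)",
}


def group_by_phonemes(lexicon):
    """Group glosses of all forms that are built from the same phonemes."""
    groups = defaultdict(list)
    for form, gloss in lexicon.items():
        # Sorting discards the ORDER of the phonemes, keeping only which
        # meaningless units the form is made of.
        groups[tuple(sorted(form))].append(gloss)
    return dict(groups)


groups = group_by_phonemes(LEXICON)
same_sounds = groups[tuple(sorted(("k", "æ", "t")))]
for gloss in same_sounds:
    print(gloss)
```

"cat", "act", and "tack" end up in one group because they reuse the same three phonemes, yet their glosses are unrelated; "cat" and "dog" share part of their meaning but no phonemes. Patterning at the level of sound is independent of patterning at the level of meaning.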
  >PHONEMES: "Minimal units of sound" (parallel to morphemes)
   Traditionally they are the "size" of a letter
  >DISTINCTIVE FEATURES: the true minimal units of sound (smaller than a "letter").
   E.g.: [aspirated] = produced with a puff of air
      [-aspirated]    [+aspirated]
      ㄅ (b)          ㄆ (p)
      ㄉ (d)          ㄊ (t)
      ㄍ (g)          ㄎ (k)
      ㄐ (j)          ㄑ (q)
      ㄓ (zh)         ㄔ (ch)
      ㄗ (z)          ㄘ (c)
  >Are phonemes and features really used in language processing? We'll talk about this.

<2> Phonemes are abstract: they are not physical things!
  >Sapir (1933) "The Psychological Reality of Phonemes": native speakers of a language are not aware of all the sounds of their language, but rather just the distinctive sounds.
  >E.g. [aspirated] is not distinctive in English.
    >Yet English speakers produce unaspirated consonants all the time, e.g. in "stop", "spot", "school".
    >If you record these words and then cut off the /s/, instead of "top", "pot", "cool" you hear "dop", "bot", "gool".
  >What native speakers perceive can be quite different from the physical reality (phonetics) of their speech:
      "cat", "can", "can't"
      為什麼 (wèishénme, 'why'), 以為 (yǐwéi, 'to think, to assume')

<3> Rules and constraints: Needed to describe the connection between abstract phonemic/phonological representations and concrete phonetic representations.
  >Phonology has rules:
    >English example: /s/ -> [z] after voiced consonants
        cats ("s" is voiceless [s])
        dogs ("s" is voiced [z])
    >Chinese example: 買馬 (mǎi mǎ, 'buy a horse') sounds like 埋馬 (mái mǎ, 'bury a horse')
        Rule: Tone3+Tone3 -> Tone2+Tone3
  >The rules are productive: they can apply to fake words:
        splorg[z]    Bach[s]
  >Isn't this just physics or physiology?
    >No, because the rules are sensitive to mental objects like words and phrases:
        dog sale ("s" is still [s], not [z])
        狗他不買,馬他也不買。 ('Dogs, he doesn't buy; horses, he doesn't buy either.')
  >Rules are not enough; also need constraints:
    >PHONOTACTICS: constraints that tell you how phonemes can be arranged in a morpheme.
        /bl/ is OK in English: blink, blue, ...
        */bn/ is bad: *bnick, ...
    >Phonological theory with ONLY constraints, no rules: Optimality Theory (very popular these days)
    >Possible nonwords: fake words that violate no phonotactic constraints.
        blick, grue, spam, ...
        ㄆㄡˋ (pòu), ㄅㄡ (bōu) ?
  >Does phonology involve semantic memory, episodic memory, or procedural memory?
   Good question, but nobody has studied this carefully yet....

<4> PROSODY: the rhythmic structure of language.
  >Features and phonemes are arranged in syllables:
      字典 (zìdiǎn, 'dictionary'), 葡萄 (pútáo, 'grape'), 一點兒 (yìdiǎnr, 'a little')...
  >Speakers usually have very strong intuitions about syllables, even if they cannot read or write.
  >Syllables are arranged in larger prosodic units called metrical feet, with boundaries often indicated by stress:
      linGUIstics          baNAna
      [lin][GUIstics]      [ba][NAna]
  >Thus prosody exists in order to provide chunking!
  >Do all languages have metrical feet? Think about how you pronounce these words:
      原子 (yuánzǐ, 'atom')    院子 (yuànzi, 'courtyard')
      [原][子]                 [院子]
      (two feet)               (one foot)

2.4 Syntax: the structure of sentences.
  >Three central aspects of syntax:
    >Syntactic constituents
    >Argument structure
    >Discontinuous dependencies

<1> CONSTITUENTS: the syntactic units, sequences of words
      The mean dog chased the nice cat
   is a sentence, so:
      [S The mean dog chased the nice cat ]
   contains a noun phrase (NP) and a verb phrase (VP), so:
      [S [NP The mean dog] [VP chased the nice cat ] ]
   but the VP also contains an NP, so:
      [S [NP The mean dog] [VP chased [NP the nice cat] ] ]
  >RECURSION: Constituents can be put inside constituents of the same type.
      [S He said [S she said [S he said [S she said [S that's nice ]]]]]
  >Recursive structures sound bad when they contain too much CENTER-EMBEDDING: constituents placed right in the middle (see Chomsky & Miller, 1963).
      (1) The dog bit the cat.
      (2) The cat [the dog bit] chased the rat.
      (3) The rat [the cat [the dog bit] chased] ate the cheese.
      (Can you think of Chinese examples?)
  >Do they sound bad because they are ungrammatical (linguistics) or because they are difficult to process (psycholinguistics)?
   Well, it's apparently because they are difficult to process, since other sentences with the same structure sound much better:
      (4) The people [the man [I punched] called] chased me.
      (Can you think of a Chinese example?)
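The center-embedding pattern in (1)-(3) can be sketched as a short recursive procedure. This is only an illustration, not a model from the handout or from Carroll: the function names and the bracket-counting measure of depth are assumptions made for the example.

```python
def embed(nouns, rc_verbs):
    """Build a noun phrase with center-embedded relative clauses.

    nouns[0] is the head noun; each later noun is the subject of a relative
    clause modifying the noun before it. rc_verbs[i] is the verb of the
    relative clause that modifies nouns[i].
    """
    if len(rc_verbs) != len(nouns) - 1:
        raise ValueError("need exactly one relative-clause verb per embedded noun")
    head = "the " + nouns[0]
    if len(nouns) == 1:
        return head
    # The recursive step: a constituent is placed inside a constituent
    # of the same type, right in the middle of it.
    inner = embed(nouns[1:], rc_verbs[1:])
    return f"{head} [{inner} {rc_verbs[0]}]"


def sentence(nouns, rc_verbs, predicate):
    """Attach the main predicate and capitalize the first letter."""
    s = f"{embed(nouns, rc_verbs)} {predicate}."
    return s[0].upper() + s[1:]


def embedding_depth(text):
    """Count the deepest level of bracket nesting in a sentence."""
    depth = deepest = 0
    for ch in text:
        if ch == "[":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "]":
            depth -= 1
    return deepest


s1 = sentence(["dog"], [], "bit the cat")
s2 = sentence(["cat", "dog"], ["bit"], "chased the rat")
s3 = sentence(["rat", "cat", "dog"], ["chased", "bit"], "ate the cheese")
for s in (s1, s2, s3):
    print(embedding_depth(s), s)
```

The same rule produces all three sentences, so the grammar itself places no limit on depth. That fits the view that the badness of (3) comes from processing difficulty, not from ungrammaticality.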
  >PHRASE-STRUCTURE RULES: one way to express syntactic constituents.

<2> ARGUMENT STRUCTURE.
  >Verbs, prepositions, etc. are like functions in math: f(x), g(x,y); "x", "y", etc. are the ARGUMENTS
    >"eat" can take two arguments: the SUBJECT and the (DIRECT) OBJECT
    >"give" can take three arguments: the subject, direct object, and INDIRECT OBJECT
  >The argument requirements of a verb are sometimes arbitrary, and so must be stored in the lexicon:
    >"talk": 1 argument (subject)
    >"shout": 2 arguments (subject and object)
    >"tell": 3 arguments (subject, object, and indirect object)
  >Lexical-Functional Grammar: a theory where syntax is described (almost) entirely by using these lexical argument requirements. Some psychologists prefer it, since it places a heavy emphasis on memory (which linguists often ignore). E.g.:
      *John slept the book.
      *The kid grabbed.
   These sentences violate lexical properties (the argument requirements of "sleep" and "grab").

<3> DISCONTINUOUS DEPENDENCIES: Words may be separated from their arguments (or other things that go with them).
  >"Grab" requires an object:
      *Did the kid grab?
  >So "what" must be the object of "grab" here, even though they are separated:
      WHAT did the kid GRAB?
  >Syntactic theories differ in how they handle discontinuous dependencies.
  >Government-Binding (GB) Theory describes them with a single general process called movement.
    >In order to preserve argument structure, GB marks the GAP left by the moved element with an abstract, silent "trace" (t). The moved element and the trace are coindexed (with "i") to show that they refer to the same thing.
        [Which cat]_i did the kid grab t_i?
        (Can you think of a Chinese example?)
    >GB also has a theory of coindexing without movement, called "binding theory", which describes how pronouns are matched with nouns:
        Mary_i told us many things about herself_i.
  >Questions of movement and binding can be quite complex.
   For example, linguists want to know why the "b" sentences below sound ungrammatical to native speakers (marked with "*").
      (1) a. What_i did John claim that Bill saw t_i?
          b. *What_i did John believe the claim that Bill saw t_i?
      (2) a. Bill_i told John to tell Sally about him_i.
          b. *Bill_i told John to tell Sally about himself_i.
  >NOTE: The terms "deep structure", "surface structure", and "transformations", mentioned in Carroll, are no longer used in GB and more recent syntactic theories.
    >This makes syntactic theory more psychologically realistic, since the derivational theory of complexity has been proven false, as Carroll discusses.

REFERENCES

Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation: Advances in research and theory, Vol. 2 (pp. 89-195). Academic Press.

Chomsky, N., & Miller, G. (1963). Introduction to the formal analysis of natural languages. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology, Vol. 2 (pp. 269-321). John Wiley.

Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996). Rethinking innateness: A connectionist perspective on development. MIT Press.

Fodor, J. A. (1983). The modularity of mind. MIT Press.

Hockett, C. F. (1960). The origin of speech. Scientific American, 203, 88-96.

Miller, G. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81-97.

Milner, B. (1966). Amnesia following operation on the temporal lobes. In C. Whitty & O. Zangwill (Eds.), Amnesia (pp. 109-133). Butterworth.

Pinker, S. (1991). Rules of language. Science, 253, 530-535.

Pinker, S. (1994). The language instinct. William Morrow.

Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tenses of English verbs. In J. L. McClelland, D. E. Rumelhart, & the PDP Research Group (Eds.), Parallel distributed processing, Vol. 2: Psychological and biological models (pp. 216-271). MIT Press.

Sapir, E. (1933). The psychological reality of phonemes. Reprinted in D. Mandelbaum (Ed.), Selected writings of Edward Sapir in language, culture, and personality (pp. 46-60). University of California Press.

Sperling, G. (1960). The information available in brief visual presentations. Psychological Monographs, 74, 1-29.

Warrington, E. K., & Shallice, T. (1969). The selective impairment of auditory verbal short-term memory. Brain, 92, 885-896.