, A “Sound” Approach to Teaching Listening, Lateral Communications

This is a guest post from Marisa Ueda (www.listening-marisa.com)

Here, two fundamental theories of the Ueda Methods (the effective methods of instruction and learning for EFL learners based on two theories, both of language understanding and human information processing, and also based on statistical data and its analysis) are introduced.

Cognitive Psychology Theory Anderson (2010)

, A “Sound” Approach to Teaching Listening, Lateral CommunicationsAnderson (2010) claims that language learning involves certain steps and proposes a cognitive framework of language comprehension based on perception, parsing and utilisation. Although these three phases are interrelated, recursive and possibly concurrent, they differ from one another. At the lowest cognitive level of listening, perception is the decoding of acoustic input that involves extracting phonemes from a continuous stream of speech.


With regard to the first stage, Anderson (2010) argues that there are at least two problems in speech perception or recognition, i.e. segmentation and co-articulation. The first problem, segmentation, occurs when the phonemes need to be identified, but unlike printed text, speech is not broken into discrete units. Speech is a continuous stream of sounds with no noticeable word boundaries. Thus, any new learner of English normally experiences this problem. Anderson defines phonemes as the minimal units of speech that can result in a difference in the spoken message (p. 51). Words are divided into two categories, i.e. content and function words. Nouns, verbs, adjectives, adverbs and demonstrative pronouns are categorized as content words (Gimson, 1980, p. 256); they convey relevant information unlike function words such as prepositions, conjunctions and determiners. Thus, function words are not generally stressed in listening. Furthermore, the segmentation problem and unstressed words are firmly related. Examples of the segmentation problem include assimilation, contraction, deletion, elision, liaison/linking and reduction (Yoshida, 2002, p. 32).

According to Ladefoged (1982, p. 99), assimilation occurs when one sound is changed into another because of the influence of a neighbouring sound (e.g. ‘Red Cross’ can be heard as /reg kros/ and ‘hot pie’ as /hop pai/).

Contraction is defined as a vowel-less weak form by Knowles (1987, p. 146). Examples of contractions in sentences, especially in rapid speech, include ‘going to’ which becomes ‘gonna’, as in ‘I’m gonna do it tomorrow’; ‘got to’, which becomes ‘gotta’, as in ‘I’ve gotta go’ and ‘I would’, which becomes ‘I’d’, as in ‘I’d say so’.

Deletion is the removal of a part of the pronunciation. For example, in rapid speech, ‘because’ becomes ‘cuz’, as in ‘I’m studying English cuz I’m going abroad’, and ‘them’ becomes ‘em’, as in ‘Why don’t you go with em?’

Rost and Wilson (2013, p. 305) use ‘elided’ to describe elision, which is defined as the omission of sounds in rapid connected speech. They also state that this is usually the result of one word ‘sliding’ into another, and the sound omitted is usually an initial or final sound in a word (e.g. ‘soft pillow’ can be heard as /sof pilow/ and ‘old man’ as /oul man/).’

According to Cutler (2012), liaison is ‘a final sound pronounced only when the following word begins with a vowel…it interacts with segmentation of the speech stream’ (p. 206). Examples include ‘I’ll need to think about it’, ‘The sheep licked up the milk’ and ‘Not at all’.

Finally, as an example of reduction, which reduces the number of vowels that occur in unaccented syllables (Knowles, 1987, p. 97), Yoshida (2002) introduces a sentence such as ‘You dropped your handkerchief’ in which the word ‘your’ is not stressed (p. 32). This phenomenon occurs because the word ‘your’ is a function word and is unstressed.

The second problem in speech perception involves a phenomenon known as co-articulation (Liberman, 1970). Ladefoged (1982, p. 52) defines co-articulation as the overlapping of adjacent articulations; that is, as the vocal tract is producing one sound, it moves towards the shape for the following phoneme. For example, the sound of /b/ itself and the /b/ in ‘bag’ are different. Thus, when pronouncing /b/ in ‘bag’, the vocal tract is already moving towards the next sound /a/. In addition, when pronouncing /a/ in ‘bag’, the root of our tongue is raised to produce the /g/. These segmentation problems pose complications for any learner of English, since an independent phenomenon of segmentation does not usually occur in a single sentence. Rather multiple phenomena of segmentation might occur in just a single sentence. Moreover, these difficulties exist only in perception, the lowest cognitive level of listening. Anderson (2010, p. 52) describes that speech perception poses information-processing demands that are, in many ways, greater than what is involved in other types of auditory perception.

Many Japanese learners of English encounter these segmentation problems. Ikemura (2003) indicates that the auditory recognition of words is one of the major problems at the speech perception level for Japanese learners of English. This is because reading and writing are generally emphasized at schools in Japan; this is evidenced by the fact that it was only since 2006 when a listening comprehension test was introduced in the national examination of Japanese universities.


Next, the second stage in Anderson’s cognitive psychology theory (2010) is parsing. In parsing, words are transformed into a mental representation of the combined meaning of the words. This occurs when a listener segments an utterance according to syntactic structures or meaning cues. According to Anderson (2010), people use the syntactic cues of word order and inflection to interpret a sentence (p. 366). Thus, when a sentence is presented both with and without a major constituent boundary, it is more difficult to comprehend the latter form.


The third and final stage is utilisation. In this stage, it is sometimes necessary for a listener to make different types of inferences to complete an interpretation of an utterance, especially since the actual meaning of an utterance is not always the same as what is stated. That is, to completely understand a sentence, a listener sometimes needs to make inferences and connections so that s/he can make the sentence more meaningful. In addition, mental representation is also required to comprehend the speaker’s actual meaning.

Controlled and Automatic Human Information Processing Schneider and Shiffrin (1977)

, A “Sound” Approach to Teaching Listening, Lateral CommunicationsSchneider and Shiffrin (1977) propose that learning includes two types of cognitive processing, i.e. controlled and automatic human information processing. Controlled processing involves a sequence of cognitive activities under active control which draw the conscious attention of the subject.

Conversely, automatic processing involves a sequence of cognitive activities that automatically occur without active control and generally without conscious attention. This theory is supported by numerous studies (Lynch, 1998; Goh, 2000; Buck, 2001; Anderson, 2010). Buck (2001) adeptly illustrates both types of processing by comparing them to the scenario of learning to drive a car.

In this regard, initially, the entire learning process is controlled, thus requiring conscious attention to every action. After more experience, certain parts of the process become relatively automatic and are performed subconsciously. Eventually, the entire process becomes automatic to the extent that, under normal circumstances, one has the ability to drive a car well and without much thought. The figure shown above demonstrates the hierarchical model of controlled and automatic human information processing, following Schneider and Shiffrin (1977).

Based on this theory, dictation in listening is categorized as controlled processing (bottom-up processing) since it involves phonemic decoding, which requires conscious attention to phonemes, the smallest segments of sound (Ladefoged, 1982). In contrast, from a listening strategy perspective, the identification of individual words is mainly regarded as automatic processing (top-down processing), because it can only be possible after phonemic decoding occurs automatically without active control and conscious attention.

Thus, the less automatic an activity becomes, the more time and cognitive energy it requires. In this regard, when learners take more time in phonemic decoding, their overall comprehension suffers.