Friday, July 28, 2017

Why do we hear with our eyes?



Do we really separate our senses? Do we really hear solely with our ears? I don't think so, and here is why.

"Speech is multimodal and is produced with the mouth, the vocal tract, the hands and the entire body and perceived not only with our ears but also with our eyes"
Marion Dohen
Speech through the Ear, the Eye, the Mouth and the Hand
(Multimodal Signals: Cognitive and Algorithmic Issues, Springer, pp. 24-39)

Auditory-visual (AV) speech integration has been steadily growing in importance and has certainly benefited from recent advances in neuroscience and multisensory research. AV speech integration has raised questions about the computational rules we need in order to put together information through one sense or across all senses. It has also made scientists wonder about the form in which speech information is encoded in the brain (auditory vs. articulatory), and about how AV speech interacts with the linguistic system as a whole.
After correcting the umpteenth student who mispronounced a word because he was reading it, I had a sort of revelation. After spending a few months reading about AV speech integration and becoming fascinated with it, I feel confident enough to say: we hear with our eyes, we listen with our eyes, we make our mouths produce sounds based on what our eyes see. Or at least on the share our eyes claim alongside our ears in AV speech integration.

Think about it. With very few exceptions, every pronunciation correction I have given in the last couple of years addressed an error that came from focusing on visual cues. Learners were basing their expectations and their production of sounds on the visual representation of the word. Now, I'm not saying it's wrong to use your eyes and prior knowledge to anticipate pronunciation; that is actually something you should be doing, given how our brains already function. But what happened was not integration but rather overriding. The eyes, and the expectations they created, coupled with L1 interference and were filtered into English through a sound-producing apparatus trained by my learners' mother tongue.

Any correction given to learners while they were still looking at the written word usually produced short-lived results, and sometimes not even those.
Why? Because neurolinguistic research has shown that the brain learns to process different linguistic stimuli at different levels, depending on what your L1 is, how your senses developed when you were a child, whether you have had a brain injury, and so on. What that means for learners is that something sometimes has to give. Their ears give way to their eyes, and they listen with their eyes. This is why learners are so comfortable listening with the transcript in front of them. Our job is to break that pattern and help them develop a (hopefully somewhat) balanced AV speech integration that can help their brains decode and encode English correctly.

So what can we do about it? I've started experimenting with taking away the visual stimulus or introducing a positive one.
I alternate between having learners say words with their eyes closed, count sounds and syllables, decide on stress, and think about and focus on how sounds are produced inside their mouths.

We record words together with their IPA transcription, I teach them how to read a dictionary entry, we analyze the graphic differences between letters and sound symbols, and we look at how letters combine to create predictable sound patterns.
All small steps that could go a very long way.
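As an aside for the technically inclined: if, like me, you prepare these word lists in advance, a short script can speed up the syllable and stress work. Here is a minimal sketch, my own addition rather than anything from the research cited above, that uses the CMU Pronouncing Dictionary through the third-party Python package "pronouncing". Note that the CMU dictionary uses ARPAbet symbols rather than IPA, but the syllable counts and stress patterns carry over; the word list at the bottom is just a hypothetical example.

```python
# A minimal sketch, assuming the third-party "pronouncing" package is installed
# (pip install pronouncing); it wraps the CMU Pronouncing Dictionary.
import pronouncing

def describe(word):
    """Print phones, syllable count and stress pattern for a word, if the dictionary knows it."""
    pronunciations = pronouncing.phones_for_word(word.lower())
    if not pronunciations:
        print(f"{word}: not in the CMU dictionary")
        return
    phones = pronunciations[0]                   # take the first listed pronunciation
    syllables = pronouncing.syllable_count(phones)
    stresses = pronouncing.stresses(phones)      # "1" = primary stress, "0" = unstressed
    print(f"{word}: {phones} | syllables: {syllables} | stress pattern: {stresses}")

# Hypothetical word list a teacher might want to prepare in advance.
for w in ["comfortable", "vegetable", "record"]:
    describe(w)
```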

What suggestions do you have for improving your students' AV speech integration capacity?

The Sound Eater
