146th Meeting of the Acoustical Society of America, Austin, TX, USA, 10.-14.11.2003.

Part IIa: Effects of congruency on localization of audiovisual three-dimensional speech sounds


Klaus A J Riederer


Laboratory of Acoustics and Audio Signal Processing
Helsinki University of Technology (HUT)
P.O. Box 3000, FIN-02015 HUT, Finland
Tel: +358 9 451 2494; Fax: +358 9 460 224
Email: Klaus.Riederer@hut.fi, URL: http://www.hut.fi/~kar




Part two of the current study [Riederer, J. Acoust. Soc. Am., this issue] investigated localization of virtual audiovisual speech under exactly the same conditions. Perceived directions were indicated by pressing keypad buttons. Inside-the-head localization occurred almost exclusively for median-plane stimuli, did not differ significantly between stimulus types (7.62% congruent, 9.38% incongruent and 6.54% auditory-only), and was excluded from further analyses. The mean rate of “correct” answers was 46.81%. A factorial within-subjects ANOVA showed no significant effect of acoustic stimulus (/ipi/, /iti/) or stimulus type, but a strong effect of direction (p = 0.000015) and of its interactions with acoustic stimulus (p = 0.015374) and stimulus type (p = 0.00812). Reaction times also depended strongly on direction (p = 0.000002). Of the 384 responses to frontal locations (azimuths 0°, ±40°), 25.52% of congruent, 28.39% of incongruent and 28.65% of auditory-only presentations were front-back confused; for 0° azimuth alone the corresponding values were 28.13%, 28.13% and 35.94%. Back-front confusions were 13.80%, 9.64% and 8.85% (azimuths 180°, ±130°), and 18.75%, 14.06% and 14.06% (azimuth 180° alone). Seeing the (congruently) talking face biased localization toward the front, especially for the median-plane backward sounds. Evidently, vision overrides the weaker monaural localization cues, as in the ventriloquism effect [Driver, Nature 381, 66-68 (1996)]. [Work supported by the Graduate School of Electronics, Telecommunication and Automation.]
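
The following is a minimal sketch of the kind of factorial within-subjects (repeated-measures) ANOVA described above, written in Python with statsmodels. It is illustrative only: the subject count, the simulated percent-correct values, and the use of AnovaRM are assumptions for demonstration, not the study's actual data or analysis software.

```python
# Illustrative sketch only: repeated-measures ANOVA with the within-subject
# factors named in the abstract (direction, stimulus type, acoustic stimulus).
# All data below are simulated; the listener count is hypothetical.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
subjects = [f"S{i}" for i in range(1, 9)]                # hypothetical listeners
directions = ["0", "+40", "-40", "+130", "-130", "180"]  # azimuths from the abstract
stim_types = ["congruent", "incongruent", "auditory-only"]
acoustic = ["ipi", "iti"]

rows = []
for s in subjects:
    for d in directions:
        for t in stim_types:
            for a in acoustic:
                rows.append({
                    "subject": s, "direction": d,
                    "stim_type": t, "acoustic": a,
                    # simulated percent-correct score for this cell
                    "pct_correct": rng.normal(47, 10),
                })
df = pd.DataFrame(rows)

# Fully factorial within-subjects ANOVA: main effects of direction,
# stimulus type and acoustic stimulus, plus their interactions.
res = AnovaRM(df, depvar="pct_correct", subject="subject",
              within=["direction", "stim_type", "acoustic"]).fit()
print(res)
```

The front-back and back-front confusion rates quoted above could be tabulated from the same response table by comparing the presented and perceived hemifields per stimulus type; that step is not shown here.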