The cocktail-party problem revisited: early processing and selection of multi-talker speech
  • Author: Adelbert W. Bronkhorst
  • Keywords: Attention; Auditory scene analysis; Cocktail-party problem; Informational masking; Speech perception
  • Journal: Attention, Perception, & Psychophysics
  • Publication year: 2015
  • Publication date: July 2015
  • Volume: 77
  • Issue: 5
  • Pages: 1465-1487
  • Full-text size: 799 KB
  • Author affiliations: Adelbert W. Bronkhorst (1) (2)

    1. TNO Human Factors, POB 23, 3769 ZG, Soesterberg, The Netherlands
    2. Department of Cognitive Psychology, Vrije Universiteit, van den Boechorststraat 1, 1081 BT, Amsterdam, The Netherlands
  • Journal subject: Cognitive Psychology
  • Publisher: Springer US
  • ISSN: 1943-393X
Abstract
How do we recognize what one person is saying when others are speaking at the same time? This review summarizes widespread research in psychoacoustics, auditory scene analysis, and attention, all dealing with early processing and selection of speech, which has been stimulated by this question. Important effects occurring at the peripheral and brainstem levels are mutual masking of sounds and "unmasking" resulting from binaural listening. Psychoacoustic models have been developed that can predict these effects accurately, albeit using computational approaches rather than approximations of neural processing. Grouping (the segregation and streaming of sounds) represents a subsequent processing stage that interacts closely with attention. Sounds can be easily grouped, and subsequently selected, using primitive features such as spatial location and fundamental frequency. More complex processing is required when lexical, syntactic, or semantic information is used. Whereas it is now clear that such processing can take place preattentively, there is also evidence that the processing depth depends on the task relevance of the sound. This is consistent with the presence of a feedback loop in attentional control, triggering enhancement of to-be-selected input. Despite recent progress, there are still many unresolved issues: there is a need for integrative models that are neurophysiologically plausible, for research into grouping based on cues other than spatial or voice-related ones, for studies explicitly addressing endogenous and exogenous attention, for an explanation of the remarkable sluggishness of attention focused on dynamically changing sounds, and for research elucidating the distinction between binaural speech perception and sound localization.
