Auditory Scene Analysis: The Neural Mechanisms of Sound Processing in the Cortex
Prof. Shihab Shamma
The fundamental goal of Shamma's research is to understand and apply biologically inspired principles of auditory cortical processing to the analysis and representation of sound. Physiological and psychoacoustic experiments are conducted to explore how sound is represented in the auditory cortex and how that representation might provide a rich and flexible description of a sound's perceptually significant features. Such a cortical representation can serve as a convenient framework for sound analysis in a variety of applications ranging from speech and music processing to industrial diagnostics and robotics.
The single most critical application encompassing all aspects of our research is so-called “auditory scene analysis,” more commonly known as the Cocktail Party Problem: the question of how to separate multiple speech and sound streams under varied environmental conditions. Ongoing projects within this framework include the design of intelligent hearing aids and automatic speech recognition systems, assessment of speech intelligibility, complex audio classification, and exploration of the adaptive processes in the brain that underlie our ability to detect, discriminate, and localize sound sources.
The unique aspect of this work is its formulation of a cortical representation of sound that is mathematically tractable and can be adapted to a variety of complex applications. In the past, biologically motivated approaches have been largely restricted to models of the peripheral hearing stages or of the psychoacoustics of sound. Our methods instead integrate these approaches with extensive new experimental findings in humans and animals to formulate a unique model of cortical processing. Preliminary studies conducted by our group have confirmed the advantages of this representation. They have also affirmed that speech cannot be treated as just any signal: its unique source and production constraints, the perceptual limitations and abilities of the receiving (auditory) system, and the rich syntactic and semantic structure it embodies must all be taken into account. The same applies to other highly structured acoustic signals such as music, animal vocalizations, and various environmental sounds.
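Cortical representations of this kind are commonly modeled as a bank of spectro-temporal modulation filters applied to a time–frequency (“auditory spectrogram”) input. The Python sketch below is an illustration of that general idea only, not the laboratory's actual model: it correlates a “moving ripple” stimulus (a drifting sinusoidal spectral profile, a stimulus class widely used to probe cortical tuning) with a small bank of 2D Gabor filters, each tuned to a temporal rate in Hz and a spectral scale in cycles/octave. All function names, grid sizes, and parameter values are hypothetical.

```python
import numpy as np

def gabor_strf(rate_hz, scale_cyc_oct, t, f):
    """Hypothetical spectro-temporal receptive field (STRF): a 2D Gabor
    on a (time x log-frequency) grid, tuned to one temporal modulation
    rate (Hz) and one spectral scale (cycles/octave)."""
    T, F = np.meshgrid(t, f, indexing="ij")
    env = np.exp(-((T - t.mean()) / (np.ptp(t) / 2)) ** 2
                 - ((F - f.mean()) / (np.ptp(f) / 2)) ** 2)
    return env * np.cos(2 * np.pi * (rate_hz * T + scale_cyc_oct * F))

def cortical_features(spectrogram, rates, scales, t, f):
    """Correlate a (time x frequency) spectrogram with every STRF in the
    bank; return one response magnitude per (rate, scale) channel."""
    feats = np.empty((len(rates), len(scales)))
    for i, r in enumerate(rates):
        for j, s in enumerate(scales):
            feats[i, j] = abs(np.sum(gabor_strf(r, s, t, f) * spectrogram))
    return feats

# Toy input: a moving ripple at 8 Hz temporal rate, 2 cycles/octave.
t = np.linspace(0.0, 0.5, 64)   # 0.5 s of time
f = np.linspace(0.0, 4.0, 64)   # 4 octaves of log-frequency
T, F = np.meshgrid(t, f, indexing="ij")
ripple = np.cos(2 * np.pi * (8.0 * T + 2.0 * F))

rates = [2.0, 4.0, 8.0, 16.0]
scales = [0.5, 1.0, 2.0, 4.0]
feats = cortical_features(ripple, rates, scales, t, f)
i, j = np.unravel_index(np.argmax(feats), feats.shape)
# The most responsive (rate, scale) channel matches the ripple's parameters.
```

Because each filter acts as a matched filter for its own rate and scale, the peak channel recovers the ripple's (8 Hz, 2 cycles/octave) parameters; a realistic cortical model would additionally use quadrature (complex-valued) filters and a genuine auditory-spectrogram front end rather than a raw ripple.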
Sources of Funding:
The most significant funding comes from NIH, NSF, and DoD (ONR, DARPA, AFOSR, ARL). Other sources include Honda, SWRI, and non-profit foundations.
Please visit the laboratory’s website for much more detail and graphics — http://www.isr.umd.edu/Labs/NSL/nsl.html — as well as the website of the Center for Auditory and Acoustics Research — http://www.isr.umd.edu/CAAR/caar.html — for downloads of information and programs.