Booz Allen Hamilton Colloquium: "Algorithms for Audio, Speech, and Gestures in Microsoft Kinect"

Friday, April 20, 2012
3:00 p.m.
1110 Jeong H. Kim Engineering Bldg.
Carrie Hilmer
301 405 4471
chilmer@umd.edu
http://www.ece.umd.edu/colloquium

Booz Allen Hamilton Distinguished Colloquium in Electrical and Computer Engineering

"Algorithms for Audio, Speech, and Gestures in Microsoft Kinect"

Dr. Alex Acero
Research Area Manager, Natural Language Processing Group & Communication and Collaboration Systems Group
Microsoft Research

Abstract:

In this talk, I will describe some of the technologies developed at Microsoft Research that went into Microsoft Kinect, the fastest-selling consumer electronics device, and the challenges we faced. I will describe audio processing techniques such as microphone design, multi-channel echo cancellation, array beamforming, noise reduction, and speech recognition that made it possible to do hands-free speech recognition at 3 meters. I will also describe algorithms used for depth sensing, pose estimation, skeletal tracking, and face and gaze tracking that made it possible to play games without a controller. I will illustrate with a few scenarios where these technologies are useful, such as Avatar Kinect and voice search, and with applications outside of gaming made possible by the Kinect SDK.

Biography:

Alex Acero is a Principal Researcher at Microsoft Research. He manages a team conducting research on audio, speech, and language understanding. He has also managed research groups in natural language processing, information retrieval, multimedia signal processing, and computer vision. Dr. Acero is also an affiliate Professor of Electrical Engineering at the University of Washington. He received an M.S. degree from the Polytechnic University of Madrid in 1985, an M.S. degree from Rice University in 1987, and a Ph.D. degree from Carnegie Mellon University in 1990, all in Electrical Engineering. Dr. Acero worked in Apple Computer's Advanced Technology Group during 1990-1991. In 1992, he joined Telefonica I+D, Madrid, Spain, as Manager of the speech technology group. Since 1994 he has been with Microsoft Research.

Dr. Acero is a Fellow of IEEE and ISCA. He is the author of the books "Spoken Language Processing" (Prentice Hall, 2001) and "Acoustical and Environmental Robustness in Automatic Speech Recognition" (Kluwer, 1993), and has written invited chapters in 6 edited books and over 225 journal and conference papers. He holds 110 US patents. He is President-Elect of the IEEE Signal Processing Society and has served the society as Vice President Technical Directions (2007-2009), Director Industrial Relations (2009-2011), 2006 Distinguished Lecturer, member of the Board of Governors (2004-2005 and 2010-2012), Associate Editor for IEEE Signal Processing Letters (2003-2005) and IEEE Transactions on Audio, Speech, and Language Processing (2005-2007), and member of the editorial board of IEEE Journal of Selected Topics in Signal Processing (2006-2008) and IEEE Signal Processing Magazine (2008-2010). He also served as member (1996-2000) and Chair (2000-2002) of the Speech Technical Committee of the IEEE Signal Processing Society. He was Publications Chair of ICASSP'98, Sponsorship Chair of the 1999 IEEE Workshop on Automatic Speech Recognition and Understanding, General Co-Chair of the 2001 IEEE Workshop on Automatic Speech Recognition and Understanding, and 2012 ESPA Technical Program Chair. Dr. Acero has served as a member of the editorial board of Computer Speech and Language and as a member of the Carnegie Mellon University Dean's Leadership Council for the College of Engineering.

Audience: Campus, Clark School, All Students, Graduate, Undergraduate, Prospective Students, K-12, Faculty, Staff, Post-Docs, Alumni, Corporate, Donors, Press
