![]() 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing March 18-23, 2005 • Pennsylvania Convention Center/Marriott Hotel • Philadelphia, PA, USA |
|
ICASSP |
Tutorial TUT-3: Data compression, Good-Turing estimation, and language modelingInstructorsA. Orlitsky; University of California, San Diego Time & LocationFriday, March 18, 14:00 - 17:00, Location: CC: Room 113-A AbstractCompression, estimation, and modeling address essentially the same technical problem. Yet they have been studied in their respective disciplines---information-theory, statistics, and linguistics/signal processing---using vastly divergent techniques. Taking a unified universal-compression viewpoint, we relate these approaches and use results derived in each to explain phenomena observed in the others. We begin with classical data-compression definitions and results, present several popular universal-compression algorithms, and explain why their performance degrades as the alphabet size increases to that encountered in language modeling. We then provide some of the intuition behind the cryptic Good-Turing estimator of low-probability events, describe recent developments on the optimality of some of its variations, and relate these results to those derived in language modeling for speech recognition and natural language processing. We conclude by comparing the performance of these and related algorithms on benchmark corpora. Audience: The tutorial is self-contained and aimed at researchers interested in statistical and information-theoretic views of language modeling and compression. Materials: Slides, summary, and references will be provided. Outline:
Presenter Information![]() Alon Orlitsky received B.Sc. degrees in Mathematics and Electrical Engineering from Ben Gurion University in 1980 and 1981, and M.Sc. and Ph.D. degrees in Electrical Engineering from Stanford University in 1982 and 1986. From 1986 to 1996 he was with the Communications Analysis Research Department of Bell Laboratories. He spent the following year as a quantitative analyst at D.E. Shaw and Company, an investment firm in New York city. In 1997 he joined the University of California, San Diego, where he is currently a professor of Electrical and Computer Engineering and of Computer Science and Engineering. Alon's research concerns information theory, learning, and speech recognition. He is a recipient of the 1981 ITT International Fellowship and of the 1992 IEEE W.R.G. Baker Paper Award. |
©2018 Conference Management Services, Inc. -||- email: webmaster@icassp2005.com -||- Last updated Friday, August 17, 2012 |