ICASSP 2005 Philadelphia

2005 IEEE International Conference on Acoustics, Speech, and Signal Processing

March 18-23, 2005 • Pennsylvania Convention Center/Marriott Hotel • Philadelphia, PA, USA

Human Language Technology: Applications and Challenges for Speech Processing

Organizers: M. Ostendorf, U. Washington; E. Shriberg, SRI International & ICSI; A. Stolcke, SRI International & ICSI

Over the last decade, the speech processing and natural language processing communities have developed largely independently, even though many of their algorithms stem from the same fundamental theory. As speech recognition improves, language processing of "spoken documents" is of increasing interest, raising the need to bridge this gap. One necessary step is a richer representation of audio that includes speaker and structural information in addition to word transcriptions, promulgated as "Rich Transcription" by the DARPA EARS program. The goal of this Special Session is to further advance this interdisciplinary effort by outlining specific ways in which speech technology can influence downstream natural language technology.

The session will introduce speech researchers to the many interesting and useful downstream applications that take speech recognition as input. In turn, human language technology (HLT) researchers will have an opportunity to help direct speech engineers toward these particular problems. Finally, the session will bring to light a wealth of shared algorithmic methods that could be useful in both fields and for which cross-fertilization is likely to provide mutual benefits. Such shared methods include statistical language modeling, dimensionality reduction, and machine learning in general.

Overview lecture:
  Title: Human language technology: Opportunities & challenges
  Authors: M. Ostendorf, U. Washington; E. Shriberg, SRI International & ICSI; A. Stolcke, SRI International & ICSI

Regular lectures:

  • Title: Approaches and applications of speech diarization
    Authors: D.A. Reynolds and P.A. Torres-Carrasquillo (MIT-LL)
  • Title: Structural metadata research in the EARS program
    Authors: Y. Liu (ICSI), E. Shriberg (SRI/ICSI), A. Stolcke (SRI/ICSI), B. Peskin (ICSI), J. Ang (ICSI), D. Hillard (UW), M. Ostendorf (UW), M. Tomalin (Cambridge U), M. Harper (Purdue)
  • Title: Metadata and parsing
    Authors: M. Lease, E. Charniak and M. Johnson (Brown)
  • Title: Machine translation
    Authors: D. Marcu and K. Knight (ISI/USC)
  • Title: Information extraction
    Authors: R. Weischedel and L. Ramshaw (BBN)
  • Title: Summarization
    Authors: K. McKeown and J. Hirschberg (Columbia)
  • Title: Real-world audio indexing systems
    Authors: J.M. Van Thong, B. Logan and D. Goddeau (HP/CRL)
  • Title: Combining text and audio-visual features in video retrieval and topic clustering
    Authors: S.-F. Chang (Columbia), R. Manmatha (UMass) and T.-S. Chua (NUS Singapore)
  • Title: Interactive dialog systems
    Authors: Y. Gao et al. (IBM)
  • Title: Measuring human readability of machine generated text
    Authors: D. Jones (MIT-LL) and T. Gibson (MIT)
  • Title: User-centered evaluation for machine translation of spoken language
    Authors: D. Palmer (Virage)

Last updated Wednesday, December 08, 2004