Using Speech-Specific Characteristics for Automatic Speech Summarization

By Gabriel Murray

While the field of text summarization has developed substantially over the past few decades, the related field of speech summarization is comparatively young and underdeveloped. This is partly due to the challenges inherent in working with spontaneous speech data: the speech is disfluent and ungrammatical, and the information density of spontaneous spoken dialogues can be quite low. Unlike text documents such as newswire articles or academic journals, spoken dialogues offer few structural cues to exploit. However, speech carries additional information that can be used for summarization: prosodic features such as the pitch, energy, and rate of speech of a word or utterance; conversational features such as speaker overlap; and speaker-related features such as a speaker's dominance and status in the dialogue.

This research explores the usefulness of speech-specific features for automatic summarization in the meetings domain. This domain is particularly interesting because meetings are a ubiquitous aspect of life, the speech itself is spontaneous and natural (unlike, e.g., much of Broadcast News), and real-world use cases for meeting summarization are clear. This talk will specifically address the use of speech features for term weighting, utterance extraction, and utterance compression.
