Using Speech-Specific Characteristics for Automatic Speech Summarization
By Gabriel Murray
While the field of text summarization has developed substantially over the past
few decades, the related field of speech summarization is comparatively young
and under-developed. This is partly due to the challenges inherent in working
with spontaneous speech data: the speech is disfluent and ungrammatical, and
the information density of spontaneous spoken dialogues can be quite low. Unlike
text documents such as newswire articles or academic journal papers, spontaneous
speech offers few structural cues to exploit. However, speech contains additional
information that can be used for summarization: prosodic features such as the
pitch, energy and rate-of-speech of a word or utterance, conversational features
such as speaker overlap, and speaker-related features such as a speaker's
dominance and status in the dialogue.
This research explores the usefulness of speech-specific features for automatic summarization in the meetings domain. This domain is particularly interesting because meetings are a ubiquitous aspect of life, the speech itself is spontaneous and natural (unlike, e.g., much of Broadcast News), and the real-world use cases for meeting summarization are clear. This talk will specifically address the use of speech features for term-weighting, utterance extraction, and utterance compression.