The BC3: British Columbia Conversation Corpus: The First Publicly Available Annotated Corpus for Email Summarization
The corpus consists of 40 email threads (3222 sentences) drawn from the W3C corpus. Each thread has been annotated by three different annotators. The annotation consists of the following:
The BC3 Annotation Software: An open-source tool for annotating email threads or other conversations
The BC3 corpus was annotated using a web-based annotation framework. The framework is open source and available for download for annotating conversations. It is built with Ruby on Rails and a MySQL database, so researchers can set up a web server to import and manage an email corpus. It also lets users annotate email threads for summaries and label email features.
The BC3 corpus is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
The BC3 framework is licensed under the MIT license.
If you have any questions or comments, please contact one of the following team members:
Previous team members include: