Summarizing Software Artifacts

While working with software artifacts, software developers often encounter a lack of abstraction; they have to deal with all details of the artifact. Examples include going through all the comments in a bug report or reading large subsets of code that constitute a software concern.

In this project, we are investigating approaches to generate reasonable summaries of software artifacts. The goal is to raise the level of abstraction and improve the productivity of software developers.

We have considered two kinds of software artifacts to date: those with largely natural language (e.g., bug reports) and those with largely structured information (e.g., source code).


  • Sarah Rastkar
  • Gail C. Murphy

Recent Publications

Sarah Rastkar, Gail C. Murphy, Alexander W.J. Bradley. Generating natural language summaries for crosscutting source code concerns. ICSM 2011. 

Sarah Rastkar, Gail C. Murphy, Gabriel Murray. Summarizing Software Artifacts: A Case Study of Bug Reports. ICSE 2010. [PDF]

© ACM, 2010. This is the author's version of the work. It is posted here by permission of the ACM for your personal use. Not for redistribution. The definitive version will be published in ICSE'2010.

The Bug Report Corpus

The corpus consists of 36 annotated bug reports. The bug reports have been selected from four different open-source software projects: Eclipse Platform, Gnome, KDE, and Mozilla. There are 9 bug reports from each project, a total of 2361 sentences.

The corpus can be downloaded in two parts: the original bug reports and the annotation.

Each bug report has been annotated by three different annotators. The annotation consists of the following:

  • Abstractive summaries with linked sentences
  • Extractive summaries
  • Labeling for sentences:
    • 'Problem', 'Suggestion', 'Fix', 'Agreement', 'Disagreement'
    • 'Meta' sentences

The BC3 annotation framework was used to help the annotators.
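Because each report carries three independent annotations, a common first step when working with the corpus is to combine them. The sketch below shows one way to represent a single annotator's work and to keep the sentences that several annotators chose for their extractive summaries. The class and field names, the sentence ID format, and the two-of-three threshold are all illustrative assumptions; they do not describe the actual file format of the corpus or the procedure used in the publications above.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set

# Sentence labels named in the annotation description above.
LABELS = {"Problem", "Suggestion", "Fix", "Agreement", "Disagreement", "Meta"}


@dataclass
class Annotation:
    """One annotator's work on one bug report (hypothetical layout)."""
    annotator: str
    abstractive_summary: str
    # IDs of the sentences this annotator put in the extractive summary.
    extractive_ids: Set[str]
    # Sentence ID -> labels assigned to that sentence (subset of LABELS).
    labels: Dict[str, Set[str]] = field(default_factory=dict)


def consensus_sentences(annotations: Iterable[Annotation],
                        min_votes: int = 2) -> Set[str]:
    """Return IDs of sentences selected by at least `min_votes` annotators.

    The default threshold of 2 is an assumption made for this sketch.
    """
    votes = Counter(sid for a in annotations for sid in a.extractive_ids)
    return {sid for sid, count in votes.items() if count >= min_votes}
```

With three annotations per report, calling `consensus_sentences(annotations)` keeps the sentences picked by a majority, while `min_votes=1` gives the union of all selections and `min_votes=3` gives only unanimous choices.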

Summarization of Crosscutting Code Concerns - User Study Materials

We conducted a user study to evaluate whether the concern summaries generated by our approach can help programmers perform software tasks. In the user study, each participant was asked to perform two change tasks, one in JHotDraw and one in jEdit. Four of the eight participants performed the jEdit task first; the other four performed the JHotDraw task first. The concern summary plug-in was enabled only for the second task.

jEdit task:

JHotDraw task:

A list of all the code elements that were counted as relevant for each task is available here (pdf).

The tutorial on the concern summary plug-in, given to the participants before the second task, can be downloaded here (pdf).

For more information about this project, email Sarah Rastkar.
