Baker: Deductive Online Snippet Parsing

Stack Overflow contains a large number of high-quality source code snippets. The quality of these snippets has been verified by users marking them as solving a specific problem. Stack Overflow treats source code snippets as plain text, and its search surfaces snippets as it would any other text. Unfortunately, plain text does not capture the structural qualities of these snippets; for example, snippets frequently refer to specific APIs (e.g., Android), but when the snippets are treated as text, the linkage to the Android API is not always apparent. We perform snippet analysis to extract structural information from the short plain-text snippets that are often found on Stack Overflow. This analysis is able to identify 253,137 method calls and type references from 21,250 Stack Overflow code snippets. We show how identifying these structural relationships from snippets could outperform lexical search over code blocks in practice.
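To make the gap between lexical and structural matching concrete, here is a toy sketch in Python. It is not Baker's actual algorithm: the two-entry `ORACLE` table, the regular expressions, and the `link_calls` helper are all hypothetical stand-ins chosen for illustration. The idea it shows is only that a call such as `tv.setText(...)` can be linked to a fully qualified API method once the declared type of `tv` is known, even though the qualified name never appears in the snippet text.

```python
import re

# Hypothetical, tiny "oracle": simple type name -> (qualified name, known methods).
# A real oracle would cover an entire API; this one exists only for the example.
ORACLE = {
    "TextView": ("android.widget.TextView", {"setText", "getText"}),
    "Button": ("android.widget.Button", {"setOnClickListener", "setText"}),
}

# Very rough patterns for Java-like declarations ("Type name =" or "Type name;")
# and method calls on a variable ("name.method(").
DECL = re.compile(r"\b([A-Z]\w*)\s+([a-z]\w*)\s*(?:=|;)")
CALL = re.compile(r"\b([a-z]\w*)\.(\w+)\s*\(")


def link_calls(snippet):
    """Return fully qualified method names for calls whose receiver type is known."""
    # Step 1: record the declared type of each variable, if the oracle knows it.
    var_types = {}
    for type_name, var in DECL.findall(snippet):
        if type_name in ORACLE:
            var_types[var] = type_name

    # Step 2: resolve each call through the receiver's declared type.
    links = []
    for var, method in CALL.findall(snippet):
        type_name = var_types.get(var)
        if type_name is None:
            continue  # receiver type unknown; a toy like this simply gives up
        qualified, methods = ORACLE[type_name]
        if method in methods:
            links.append(f"{qualified}.{method}")
    return links


snippet = """
TextView tv = (TextView) findViewById(R.id.title);
tv.setText("Hello");
"""

# A purely lexical search for the qualified name finds nothing in the text...
lexical_hit = "android.widget.TextView" in snippet
# ...while the structural pass links the call to the API method.
structural_links = link_calls(snippet)
```

Real snippets are far messier than this (missing declarations, ambiguous simple names, partial statements), which is why a regex-level approach like the one above would not be sufficient in practice.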

Using Baker

The Baker parser can be found online here.
Unfortunately, the machine that hosted Baker stopped functioning after I left Waterloo. I also no longer have access to the oracle used in the paper, so I am unable to restart the service myself.

Chrome Extension

To augment Stack Overflow's accepted answers and the Android documentation with API details, first install the Tampermonkey extension for Chrome. Next, load two scripts into Tampermonkey. The Stack Overflow script augments Stack Overflow posts with links to the Android API docs. The Android documentation script augments the Android API docs with links back to relevant Stack Overflow posts.

Publications

Several other resources are available online, including an overview poster and two short videos.

ICSE 2014 teaser video:

ICSE 2014 silent demo: