Technical Reports

The ICICS/CS Reading Room

UBC CS TR-2012-04 Summary

Efficient Extraction of Ontologies from Domain Specific Text Corpora, July 26, 2012. Tianyu Li, Pirooz Chubak, Laks V.S. Lakshmanan, and Rachel Pottinger. 12 pages.

Extracting ontological relationships (e.g., isa and hasa) from free-text repositories (e.g., engineering documents and instruction manuals) can improve users’ queries, as well as benefit applications built for these domains. Current methods to extract ontologies from text usually miss many meaningful relationships because they either concentrate on single-word terms and short phrases or neglect syntactic relationships between concepts in sentences. We propose a novel pattern-based algorithm to find ontological relationships between complex concepts by exploiting parsing information to extract multi-word concepts and nested concepts. Our procedure is iterative: we tailor the constrained sequential pattern mining framework to discover new patterns. Our experiments on three real data sets show that our algorithm consistently and significantly outperforms previous representative ontology extraction algorithms.
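
For readers unfamiliar with pattern-based extraction, the general idea can be illustrated with a minimal Python sketch. This is not the algorithm from the report: it applies one hand-written lexical pattern ("X such as Y") with no parsing and no pattern mining, and the function name `extract_isa` and the example sentence are hypothetical. It assumes, as a simplification, that the broader concept is the single word before "such as".

```python
import re

def extract_isa(sentence):
    """Return (hyponym, 'isa', hypernym) triples from one 'X such as Y' pattern."""
    text = sentence.lower().rstrip(".")
    if " such as " not in text:
        return []
    left, right = text.split(" such as ", 1)
    words = left.split()
    if not words:
        return []
    # Assumption: the hypernym is only the last word before "such as".
    hypernym = words[-1]
    # Split the enumeration on commas and on the standalone words "and" / "or".
    parts = re.split(r",|\band\b|\bor\b", right)
    return [(p.strip(), "isa", hypernym) for p in parts if p.strip()]

triples = extract_isa(
    "The kit includes fasteners such as hex bolts, lock nuts and washers."
)
for triple in triples:
    print(triple)
```

The sketch keeps multi-word terms on the right-hand side only because they happen to be delimited by commas and conjunctions. It cannot recover a multi-word or nested concept on the left-hand side, which is the kind of limitation the abstract attributes to methods that ignore syntactic structure.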
