We propose a novel approach for developing a two-stage document-level discourse parser. Our parser builds a discourse tree by applying an optimal parsing algorithm to probabilities inferred from two Conditional Random Fields: one for intrasentential parsing and the other for multisentential parsing. We present two approaches to combine these two stages of discourse parsing effectively. A set of empirical evaluations over two different datasets demonstrates that our discourse parser significantly outperforms the state of the art, often by a wide margin.
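To illustrate how a parsing algorithm can turn constituent probabilities into a single best tree, here is a minimal sketch that assumes a CKY-style dynamic program over discourse units. It is not the released parser's code. The function `best_tree`, the `span_prob(i, k, j)` interface, and the toy probability values are all hypothetical; in the parser described above, such probabilities would be inferred by the Conditional Random Fields. Relation labels and nuclearity are left out to keep the example short.

```python
import math
from functools import lru_cache


def best_tree(n, span_prob):
    """Find the most probable binary tree over units 0..n-1.

    span_prob(i, k, j) is a hypothetical callback returning the probability
    that units i..k and units k+1..j join into one constituent.
    Returns (log_probability, tree), where a leaf is a unit index and an
    internal node is a (left, right) tuple.
    """

    @lru_cache(maxsize=None)
    def solve(i, j):
        if i == j:
            return 0.0, i  # a single unit is a leaf with log-probability 0
        best = (-math.inf, None)
        for k in range(i, j):  # try every split point inside the span
            p = span_prob(i, k, j)
            if p <= 0.0:
                continue  # this split is impossible under the model
            left_lp, left = solve(i, k)
            right_lp, right = solve(k + 1, j)
            lp = left_lp + right_lp + math.log(p)
            if lp > best[0]:
                best = (lp, (left, right))
        return best

    return solve(0, n - 1)


# Toy probabilities for three units (made-up values, not model output).
toy = {
    (0, 0, 1): 0.9,  # units 0 and 1 attach strongly
    (1, 1, 2): 0.2,
    (0, 0, 2): 0.3,
    (0, 1, 2): 0.8,
}
log_prob, tree = best_tree(3, lambda i, k, j: toy.get((i, k, j), 0.0))
print(tree)  # ((0, 1), 2)
```

Because every span is scored once and reused, the search is exhaustive over binary trees yet runs in cubic time in the number of units, which is what makes an exact (rather than greedy) search practical in this kind of setup.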
Click here to run the Live demo of the Discourse Parser:
You can download the Discourse Parser here.
Shafiq Joty, Giuseppe Carenini, and Raymond Ng. CODRA: A Novel Discriminative Framework for Rhetorical Analysis. Computational Linguistics, Volume 41:3 (2015), MIT Press. [ pdf ]
Shafiq Joty, Giuseppe Carenini, Raymond Ng and Yashar Mehdad. Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL) 2013, Sofia, Bulgaria. [ pdf ]
Shafiq Joty, Giuseppe Carenini, and Raymond Ng. A Novel Discriminative Framework for Sentence-Level Discourse Analysis. In Proceedings of the Conference on Empirical Methods in NLP and the Conference on Natural Language Learning (EMNLP-CoNLL 2012), Jeju, Korea. [ pdf ]