Anaphora in Natural Language Understanding: A Survey
A problem that all computer-based natural language understanding (NLU) systems encounter is that of linguistic reference, and in particular anaphora (abbreviated reference). For example, in a text as simple as "Nadia showed Sue her new car. The seats were Day-Glo orange.", knowing that "her" probably means Nadia and not Sue, and that "the seats" means the seats of Nadia's new car, is not a simple task. This thesis is an extensive review of the reference and anaphor problem, and of the approaches to it that NLU systems have taken, from early systems such as STUDENT through to current discourse-oriented ones such as PAL. The problem is first examined in detail, and examples are given of many different types of anaphor, some of which have been ignored by previous authors. The approaches taken in traditional systems are then described and abstracted, and it is shown why they were inadequate and why discourse theme and anaphoric focus need to be taken into account. The strengths and weaknesses of current anaphora theories and approaches are evaluated. The thesis closes with a list of some remaining research problems. The thesis has been written so as to be as comprehensible as possible both to AI workers who know no linguistics and to linguists who have not studied artificial intelligence.