5.4 Knowledge Representation Issues

5.4.4 Knowledge-Level Debugging

Just as with other software, knowledge bases may have errors and omissions. Domain experts and knowledge engineers must be able to debug a knowledge base and add knowledge. In knowledge-based systems, debugging is difficult because the domain experts and users who have the domain knowledge required to detect a bug do not necessarily know anything about the internal working of the system, nor do they want to. Standard debugging tools, such as providing traces of the execution, are inappropriate because they require a knowledge of the mechanism by which the answer was produced. In this section, we show how the idea of semantics enables debugging facilities for knowledge-based systems. Whoever is debugging the system is required only to know the meaning of the symbols and whether specific atoms are true or not, and does not need to know the proof procedure. This is the kind of knowledge that a domain expert and a user may have.

Knowledge-level debugging is the process of finding errors in knowledge bases with reference only to what the symbols mean and what is true in the world. One of the goals of building knowledge-based systems that are usable by a range of domain experts is that a discussion about the correctness of a knowledge base should be a discussion about the knowledge domain. For example, debugging a medical knowledge base should involve questions of medicine that medical experts, who are not expected to be experts in AI, can answer. Similarly, debugging a knowledge base about house wiring should be with reference to the particular house, not about the internals of the system reasoning with the knowledge base.

Four types of non-syntactic errors arise in rule-based systems.

  • An incorrect answer is produced; that is, some atom that is false in the intended interpretation was derived.

  • An answer that was not produced; that is, the proof failed on a particular true atom when it should have succeeded.

  • The program gets into an infinite loop.

  • The system asks irrelevant questions.

Ways to debug the first three types of error are examined below. Irrelevant questions can be investigated using the why questions as described earlier.

Incorrect Answers

An incorrect answer is an answer that has been proved yet is false in the intended interpretation. It is also called a false-positive error. An incorrect answer is only produced by a sound proof procedure if an incorrect definite clause was used in the proof.

Assume that whoever is debugging the knowledge base, such as a domain expert or a user, knows the intended interpretation of the symbols of the language and is able to determine whether a particular proposition is true or false in the intended interpretation. The person does not have to know how the answer was computed. To debug an incorrect answer, a domain expert needs only to answer yes-or-no questions.

Suppose there is an atom g that was proved yet is false in the intended interpretation. Then there must be a rule ga1ak in the knowledge base that was used to prove g. Either

  • one of the ai is false in the intended interpretation, in which case it can be debugged in the same way, or

  • all of the ai are true in the intended interpretation; in this case, the definite clause ga1ak must be incorrect.

This leads to an algorithm, presented in Figure 5.6, to debug false positives, namely to find a false clause in a knowledge base when an atom that is false in the intended interpretation is derived.

1:procedure Debug_false(g,KB)
2:      Inputs
3:            KB a knowledge base
4:            g an atom: KBg and g is false in intended interpretation       
5:      Output
6:            clause in KB that is false
7:      Find definite clause ga1akKB used to prove g
8:      for each ai do
9:            ask user whether ai is true
10:            if user specifies ai is false then
11:                 return Debug_false(ai,KB)                   
12:      return ga1ak
Figure 5.6: An algorithm to debug false positive answers

This only requires the person debugging the knowledge base to be able to answer yes-or-no questions.

This procedure can also be carried out by the use of the how command. Given a proof for an atom g that is false in the intended interpretation, a user can ask how g was proved. This will return the definite clause that was used in the proof. If the clause was a rule, the user could use how to ask about an atom in the body that was false in the intended interpretation. This will return the rule that was used to prove that atom. The user repeats this until a definite clause is found where all of the elements of the body are true (or there are no elements in the body). This is the incorrect definite clause.

Example 5.17.

Consider Example 5.7, involving the electrical domain, but assume there is a bug in the knowledge base. Suppose that the domain expert or user had inadvertently said that whether w1 is connected to w3 depends on the status of s3 instead of s1 (see Figure 5.2). Thus, the knowledge includes the following incorrect rule:

live_w1live_w3up_s3.

All of the other axioms are the same as in Example 5.7. Given this axiom set, the atom lit_l1 can be derived, which is false in the intended interpretation. Consider how a user would go about finding this incorrect definite clause when they detected this incorrect answer.

Given that lit_l1 is false in the intended interpretation, they ask how it was derived, which will give the following rule:

lit_l1light_l1live_l1ok_l1.

They check the atoms in the body of this rule. light_l1 and ok_l1 are true in the intended interpretation, but live_l1 is false in the intended interpretation. So they ask

how 2.

The system presents the rule

live_l1live_w0.

live_w0 is false in the intended interpretation, so they ask

how 1.

The system presents the rule

live_w0live_w1up_s2.

live_w1 is false in the intended interpretation, so they ask

how 1.

The system presents the rule

live_w1live_w3up_s3.

Both elements of the body are true in the intended interpretation, so this is a buggy rule.

The user or domain expert can find the buggy definite clause without having to know the internal workings of the system or how the proof was computed. They only require knowledge about the intended interpretation and the disciplined use of how.

Missing Answers

The second type of error occurs when an expected answer is not produced. This manifests itself by a failure when an answer is expected. An atom g that is true in the domain, but is not a consequence of the knowledge base, is called a false-negative error. The preceding algorithm does not work in this case; there is no proof. We must look for why there is no proof for g.

An appropriate answer is not produced only if a definite clause or clauses are missing from the knowledge base. By knowing the intended interpretation of the symbols and by knowing what queries should succeed (i.e, what is true in the intended interpretation), a domain expert can debug a missing answer. Given a false negative, an atom that failed when it should have succeeded, Figure 5.7 shows how to debug the knowledge base to find an atom for which there is a missing definite clause.

1:procedure Debug_missing(g,KB)
2:      Inputs
3:            KB a knowledge base
4:            g an atom: KB⊬g and g is true in the intended interpretation       
5:      Output
6:            atom for which there is a clause missing
7:      if there is a definite clause ga1akKB such that all ai are true in the intended interpretation then
8:            select ai that cannot be proved
9:            return Debug_missing(ai,KB)
10:      else
11:            return g       
Figure 5.7: An algorithm for debugging missing answers (false negatives)

Suppose g is an atom that should have a proof, but which fails. Because the proof for g fails, the bodies of all of the definite clauses with g in the head fail.

  • Suppose one of these definite clauses for g should have resulted in a proof; this means all of the atoms in the body must be true in the intended interpretation. Because the body failed, there must be an atom in the body that fails. This atom is then true in the intended interpretation, but fails. So we can recursively debug it.

  • Otherwise, there is no definite clause applicable to proving g, so the user must add a definite clause for g.

A whynot question can be used by the user to ask why some g was not proved. The system can ask the relevant questions to implement Debug_missing.

Example 5.18.

Suppose that, for the axiomatization of the electrical domain in Example 5.7, the world of Figure 5.2 actually had s2 down. Thus, it is missing the definite clause specifying that s2 is down. The axiomatization of Example 5.7 fails to prove lit_l1 when it should succeed. Consider how to find the bug.

lit_l1 failed, so the system finds all of the rules with lit_l1 in the head. There is one such rule:

lit_l1light_l1live_l1ok_l1.

The user can then verify that all of the elements of the body are true. light_l1 and ok_l1 can both be proved, but live_l1 fails, so the user debugs this atom. There is one rule with live_l1 in the head:

live_l1live_w0.

The atom live_w0 cannot be proved, but the user verifies that it is true in the intended interpretation. So the system find the rules for live_w0:

live_w0live_w1up_s2.
live_w0live_w2down_s2.

The user can determine that the body of the second rule is true. There is a proof for live_w2, but there are no clauses for down_s2, so this atom is returned. The correction is to add an appropriate clause, by stating it as a fact or providing a rule for it.

Infinite Loops

Example 5.12 shows an example where the top-down derivation can loop. There can be an infinite loop only if the knowledge base is cyclic, where a knowledge base is cyclic if there is an atom a such that there is a sequence of definite clauses of the form

aa1
a1a2
ana

(where if n=0 there is a single definite clause with a in the head and body).

A knowledge base is acyclic if there is an assignment of natural numbers (non-negative integers) to the atoms so that the atoms in the body of a definite clause are assigned a lower number than the atom in the head. There cannot be an infinite loop in an acyclic knowledge base.

To detect a cyclic knowledge base, the top-down proof procedure can be modified to maintain the set of all ancestors for each atom in the proof.

Initially, the set of ancestors of each atom is empty. When the rule

aa1ak

is used to prove a, the ancestors of ai will be the ancestors of a together with a. That is,

ancestors(ai)=ancestors(a){a}.

The proof can fail if an atom is in its set of ancestors. This failure only occurs if the knowledge base is cyclic. This is a refined version of cycle pruning used in search, with each atom having its own set of ancestors.

A cyclic knowledge base is often a sign of a bug. When writing a knowledge base, it is often useful to ensure an acyclic knowledge base by identifying a value that is being reduced at each iteration. For example, in the electrical domain, the number of steps away from the outside of the house is meant to be reduced by one each time through the loop.

Note that the bottom-up proof procedure does not get into an infinite loop, because it selects a rule only when the head has not been derived.