Randomised Experiments

## Randomised Experiments

The following procedure was used to generate the random networks. Given the n variables, we impose a total ordering. We build a decision tree for each variable. The leaves of the decision trees correspond to contexts and a variable that the tree is for. We start with the empty decision tree for each variable. We randomly (with uniform probability) choose a leaf and a variable. If the variable is a possible split (i.e., it is a predecessor in the total ordering of the variable the leaf is for and the context corresponding to the leaf doesn't commit a value to that variable), we split that leaf on that variable. This is repeated until we have n+s leaves. Then for each leaf, we built a confactor that has the same context and each predecessor of the variable that the leaf is for is included in the confactor with probability p. The variable that the leaf is for is also in the confactor. We assign random numbers the probabilities (these numbers won't affect anything in the results assuming that the times for operations on floating point numbers isn't affected by the actual values of the numbers).

The algorithm to construct the random examples used in Section 6.2 is given in Figure *. Note that "choose random" means to choose randomly from a uniform distribution.

 generate_random_CBN(n,s,p): create n Boolean variables X1,...,Xn Let S = {<{},Xi>: i <= i <= n} {S is a set of context-variable pairs} repeat choose random in S choose random j such that 1 <= j < n if j in S with and until there are n+s elements of S for each in S Let V={Xi} for each j to the contextual belief network

Algorithm for constructing the randomised examples

For the examples biased towards using fewer variables, we replaced the line:

 replace in S with and
with
 if there is k in S with and else replace in S with and
where, if more than one such k exists, the k is chosen uniformly from all of the values that satisfy the condition.

Figure * shows some of the details of some of the data shown in Figure 11. All of the times are for a Java implementation running on a 700 MHz Pentium running Linux with 768 megabytes of memory. In order to allow for the variation in run times due to garbage collection, each evaluation was run three times and the smallest time was returned.

 n s p CBN BN CVE VE Size Size Time MTS Time MTS 30 5 (5) 0.2 10922 8398266 31388 2097152 84801 4194304 30 5 (5) 0.2 1846 132692 9653 393216 12418 524288 30 5 (5) 0.2 1964 13456 2022 131072 3748 131072 30 5 (5) 0.2 1600 4908 3922 196608 5572 524288 30 5 (5) 0.2 1668 8292 60304 614400 61612 2097152 30 5 (5) 0.2 904 1906 637 32768 1005 65536 30 5 (5) 0.2 1738 10786 720 42048 1340 131072 30 5 (4) 0.2 1744 18538 2223 131072 2546 262144 30 5 (4) 0.2 3060 87292 11681 524288 12298 1048576 30 5 (3) 0.2 2692 69602 5325 262144 9530 524288 30 10 (9) 0.2 3842 530622 22288 524288 31835 2097152 30 10 (9) 0.2 1262 36070 4063 147456 11038 524288 30 10 (10) 0.2 3908 80704 112537 1966080 31214 4194304 30 10 (8) 0.2 4904 33568176 81450 5111808 203284 16777216 30 10 (7) 0.2 10456 314126 31627 589824 44079 2097152 30 10 (8) 0.2 1790 28758 1590 98304 2974 262144 30 10 (9) 0.2 2054 24452 13642 262144 5253 262144 30 10 (8) 0.2 3608 58352 24948 819200 18574 1048576 30 10 (9) 0.2 6392 1188654 403347 5767168 359992 33554432 30 10 (8) 0.2 6180 42344 10078 253952 17501 1048576 30 15 (10) 0.2 2724 2104338 56636 1572864 49316 2097152 30 15 (11) 0.2 5896 8425520 185925 6389760 260645 16777216 30 15 (11) 0.2 2096 2239982 27065 825344 42180 2097152 30 15 (12) 0.2 3674 39928 6393 360448 9631 524288 30 15 (11) 0.2 2388 552768 8623 425984 13641 524288 30 15 (11) 0.2 1938 49388 11299 438272 27303 2097152 30 15 (13) 0.2 4188 351374 18602 432776 38843 2097152 30 15 (12) 0.2 2806 111632 3213 138240 5463 1048576 30 15 (12) 0.2 3512 126464 16118 258048 11479 1048576 30 15 (10) 0.2 1700 541498 5986 172032 8414 524288

Comparisons of random networks that exhibit CSI.

For each generated example, the table of Figure * shows

• n the number of variables
• s the number of splits and, in parentheses, the number of different variables on which the splits occur (different leaves can be split on the same variable).
• p the probability of splitting on a variable it is possible to split on.
• CBN size: the total size (summing over the size of the tables) of the contextual belief network representation.
• BN size: the total size of the factors for the corresponding belief network (i.e., assuming the probabilities are stored in tables).
• CVE time is the runtime (in msecs) of contextual variable elimination and CVE MTS is the maximum sum of the table sizes created for the elimination of a single variable.
• VE time: the runtime (in msecs) of variable elimination and VE MTS is the maximum table size created for the elimination of a single variable.

 n s p CBN BN CVE VE Size Size Time MTS Time MTS 30 5 (3) 0.2 1602 3658 662 49152 604 65536 30 5 (2) 0.2 6108 7240 1583 131072 1945 131072 30 5 (2) 0.2 17526 19632 6754 393216 8833 1048576 30 5 (2) 0.2 892 4164 1604 81920 3119 131072 30 5 (1) 0.2 1540 2640 261 16512 457 32768 30 5 (2) 0.2 1156 5572 608 40960 816 65536 30 5 (2) 0.2 2344 68676 12462 1048576 9370 1048576 30 5 (1) 0.2 1406 7284 2197 81920 4021 262144 30 5 (3) 0.2 7740 209652 17736 851968 18279 2097152 30 5 (2) 0.2 4184 8838 8053 528384 16437 1048576 30 10 (3) 0.2 3038 1119562 30669 1048576 53732 2097152 30 10 (3) 0.2 888 5176 565 24576 2625 131072 30 10 (2) 0.2 3372 4222408 21834 917504 108732 4194304 30 10 (3) 0.2 2240 30968 20308 524800 129885 4194304 30 10 (2) 0.2 1106 11660 741 32768 433 65536 30 10 (2) 0.2 778 4582 274 18432 656 65536 30 10 (3) 0.2 1888 72260 6659 262144 11986 524288 30 10 (2) 0.2 9740 413892 11271 868352 179564 8388608 30 10 (2) 0.2 1102 7744 313 16384 1395 65536 30 10 (3) 0.2 4078 298438 61140 1048576 79858 4194304 30 15 (2) 0.2 2710 50698 8845 524288 39265 2097152 30 15 (3) 0.2 1246 84836 1935 90112 3652 262144 30 15 (3) 0.2 2046 75956 54571 1310720 141386 8388608 30 15 (2) 0.2 1588 138888 14280 458752 43059 2097152 30 15 (3) 0.2 2260 20230 522 28672 1815 131072 30 15 (4) 0.2 2842 67385366 322274 10485760 726989 33554432 30 15 (3) 0.2 3074 533738 85480 2752512 89431 8388608 30 15 (3) 0.2 1834 278426 18560 753664 43554 1048576 30 15 (3) 0.2 4362 209186 22872 1441792 131704 4194304 30 15 (2) 0.2 3142 151160 4476 164096 26426 1048576

Some of the details of data from Figure 12 (biased towards fewer different variables)

David Poole and Nevin Lianwen Zhang,Exploiting Contextual Independence In Probabilistic Inference, Journal of Artificial Intelligence Research, 18, 2003, 263-313.

 Randomised Experiments