Rather than having 1024 single-node blocks, it is possible to use just two tree-structured blocks for the noisy X problem. There are many ways to do this, and the diagram below illustrates one of them (the red points on the lattice are one block, the blue points are another):

The red nodes form a tree-structured UGM conditioned on the blue nodes, and vice versa. In this case, the blocks form a *partition* of the nodes, meaning that each variable is in one and only one block. For the block mean field algorithm below, it is important that the blocks form a partition. For the block ICM and block Gibbs sampling algorithms, each variable must still be assigned to at least one block, but a variable may appear in more than one block.

In UGM, the blocks are stored in a cell array. We can make the blocks above for the noisy X problem as follows:

getNoisyX
nodeNums = reshape(1:nNodes,nRows,nCols);
blocks1 = zeros(nNodes/2,1);
blocks2 = zeros(nNodes/2,1);
b1Ind = 0;
b2Ind = 0;
for j = 1:nCols
    if mod(j,2) == 1
        % Odd columns: all but the last row go in block 1
        blocks1(b1Ind+1:b1Ind+nRows-1) = nodeNums(1:nRows-1,j);
        b1Ind = b1Ind+nRows-1;
        blocks2(b2Ind+1) = nodeNums(nRows,j);
        b2Ind = b2Ind+1;
    else
        % Even columns: all but the first row go in block 2
        blocks1(b1Ind+1) = nodeNums(1,j);
        b1Ind = b1Ind+1;
        blocks2(b2Ind+1:b2Ind+nRows-1) = nodeNums(2:nRows,j);
        b2Ind = b2Ind+nRows-1;
    end
end
blocks = {blocks1;blocks2};
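To see concretely why these two blocks form a partition, here is a small sketch that ports the construction above to Python (0-based indexing; the variable names mirror the MATLAB code, but this toy port is purely illustrative and not part of UGM) and checks that every node lands in exactly one block:

```python
nRows = nCols = 4  # small stand-in for the 32x32 noisy X lattice
nNodes = nRows * nCols
# nodeNums[r][c] = column-major node index, matching MATLAB's reshape
nodeNums = [[c * nRows + r for c in range(nCols)] for r in range(nRows)]

blocks1, blocks2 = [], []
for c in range(nCols):
    if c % 2 == 0:  # MATLAB's odd columns (1-based)
        # all but the last row go in block 1
        blocks1.extend(nodeNums[r][c] for r in range(nRows - 1))
        blocks2.append(nodeNums[nRows - 1][c])
    else:
        # all but the first row go in block 2
        blocks1.append(nodeNums[0][c])
        blocks2.extend(nodeNums[r][c] for r in range(1, nRows))

# each node appears in exactly one block, so the blocks partition the grid
assert sorted(blocks1 + blocks2) == list(range(nNodes))
assert len(blocks1) == len(blocks2) == nNodes // 2
```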

In UGM, the block ICM algorithm takes a function handle that performs the conditional decoding within each block. We can use UGM_Decode_Tree for this decoding as follows:

BlockICMDecoding = UGM_Decode_Block_ICM(nodePot,edgePot,edgeStruct,blocks,@UGM_Decode_Tree);

Below is an example of the results obtained with ICM compared to block ICM on a sample of the noisy X problem:

Even though both algorithms use the same initialization and perform simple greedy moves, the more global updates performed in the block method typically lead to better results.

We can also compare the relative 'strengths' of the local optima found by the ICM and block ICM methods. ICM finds a local optimum in a very weak sense: the approximate decoding found by ICM cannot be improved by changing any single state. In contrast, block ICM finds a local optimum in a stronger sense: the approximate decoding found by block ICM cannot be improved by changing any combination of states within any one block. Of course, this is still weaker than the decoding given by graph cuts, which cannot be improved by changing any combination of states (making it the global optimum). But block ICM can be applied in cases where graph cuts cannot, and we can strengthen the local optimum found by increasing the number of blocks that we consider.
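To illustrate the block-coordinate structure of these updates, here is a self-contained sketch of generic block ICM on a toy pairwise model, with brute-force decoding inside each block. This is an illustration of the idea only: the energy function is made up for the example, and UGM instead uses efficient tree decoding within each block.

```python
import itertools, random

random.seed(0)
n = 6
# random fields and pairwise couplings for a toy energy (hypothetical example,
# not the UGM noisy-X model)
h = [random.uniform(-1, 1) for _ in range(n)]
J = {(i, j): random.uniform(-1, 1) for i in range(n) for j in range(i + 1, n)}

def energy(x):
    return (sum(h[i] * x[i] for i in range(n)) +
            sum(J[i, j] * x[i] * x[j] for (i, j) in J))

blocks = [[0, 1, 2], [3, 4, 5]]  # a partition of the variables

def assign(x, block, vals):
    y = list(x)
    for i, v in zip(block, vals):
        y[i] = v
    return y

def block_icm(x):
    improved = True
    while improved:
        improved = False
        for block in blocks:
            # exactly minimize the energy over this block, others held fixed
            best = min(itertools.product([-1, 1], repeat=len(block)),
                       key=lambda vals: energy(assign(x, block, vals)))
            if list(best) != [x[i] for i in block]:
                x = assign(x, block, best)
                improved = True
    return x

x0 = [1] * n
x_star = block_icm(x0)
# the result cannot be improved by changing any one block's states
assert energy(x_star) <= energy(x0)
```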

We can apply the block mean field method, using UGM_Infer_Tree for the inference within the blocks, as follows:

[nodeBelBMF,edgeBelBMF,logZBMF] = UGM_Infer_Block_MF(nodePot,edgePot,edgeStruct,blocks,@UGM_Infer_Tree);

Below, we show the difference between the estimated marginals with the mean field and block mean field methods:

In principle we can apply the block mean field method even if the blocks don't form a partition of the nodes. However, this seems like it might further complicate the convergence of the method, since if the blocks don't form a partition then it isn't clear what objective function the updates would be optimizing.
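As a concrete illustration of the objective that block mean field optimizes when the blocks do form a partition, the sketch below runs coordinate ascent on the variational lower bound for a tiny toy model, updating each block's distribution exactly by enumeration; the bound should never decrease. This is toy Python code under made-up potentials, not UGM's implementation (UGM performs the within-block computations with tree inference).

```python
import itertools, math, random

random.seed(2)
n = 4
h = [random.uniform(-1, 1) for _ in range(n)]
J = {(i, j): random.uniform(-0.5, 0.5) for i in range(n) for j in range(i + 1, n)}

def logpot(x):  # unnormalized log-potential of the toy model
    return (sum(h[i] * x[i] for i in range(n)) +
            sum(J[i, j] * x[i] * x[j] for (i, j) in J))

blocks = [[0, 1], [2, 3]]  # a partition into two blocks
configs = list(itertools.product([0, 1], repeat=2))
# q[b] is a distribution over the configurations of block b (uniform init)
q = [{c: 0.25 for c in configs} for _ in blocks]

def joint(v0, v1):
    x = [0] * n
    for i, v in zip(blocks[0], v0): x[i] = v
    for i, v in zip(blocks[1], v1): x[i] = v
    return x

def elbo():  # E_q[log potential] + entropy of each block's distribution
    val = sum(q[0][v0] * q[1][v1] * logpot(joint(v0, v1))
              for v0 in configs for v1 in configs)
    for qb in q:
        val -= sum(p * math.log(p) for p in qb.values() if p > 0)
    return val

def update(b):  # q_b(x_b) proportional to exp(E_{q_other}[log potential])
    other = 1 - b
    logs = {vb: sum(q[other][vo] *
                    logpot(joint(vb, vo) if b == 0 else joint(vo, vb))
                    for vo in configs)
            for vb in configs}
    m = max(logs.values())
    w = {vb: math.exp(l - m) for vb, l in logs.items()}
    Z = sum(w.values())
    q[b] = {vb: wv / Z for vb, wv in w.items()}

vals = [elbo()]
for it in range(5):
    update(0); update(1)
    vals.append(elbo())
# coordinate ascent: the lower bound is monotonically non-decreasing
assert all(vals[k + 1] >= vals[k] - 1e-9 for k in range(len(vals) - 1))
```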

Finally, we can apply block Gibbs sampling, using UGM_Sample_Tree to sample within the blocks:

burnIn = 10;
edgeStruct.maxIter = 20;
samplesBlockGibbs = UGM_Sample_Block_Gibbs(nodePot,edgePot,edgeStruct,burnIn,blocks,@UGM_Sample_Tree);
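For intuition, here is a minimal block Gibbs sketch on a toy model, where each block is resampled exactly from its joint conditional by enumeration. The model and block sizes are made up for the example; in UGM, the per-block sampling is done efficiently by UGM_Sample_Tree on tree-structured blocks.

```python
import itertools, math, random

random.seed(1)
n = 4
h = [random.uniform(-1, 1) for _ in range(n)]
J = {(i, j): random.uniform(-0.5, 0.5) for i in range(n) for j in range(i + 1, n)}

def logpot(x):  # unnormalized log-potential of a toy pairwise model
    return (sum(h[i] * x[i] for i in range(n)) +
            sum(J[i, j] * x[i] * x[j] for (i, j) in J))

blocks = [[0, 1], [2, 3]]

def sample_block(x, block):
    # Enumerate the block's joint conditional given the rest of x.
    # (Brute force is fine for this tiny toy.)
    configs = list(itertools.product([0, 1], repeat=len(block)))
    weights = []
    for vals in configs:
        y = list(x)
        for i, v in zip(block, vals):
            y[i] = v
        weights.append(math.exp(logpot(y)))
    chosen = random.choices(configs, weights=weights)[0]
    for i, v in zip(block, chosen):
        x[i] = v

burnIn, maxIter = 10, 20  # mirrors the UGM call above
x = [0] * n
samples = []
for it in range(burnIn + maxIter):
    for block in blocks:
        sample_block(x, block)
    if it >= burnIn:
        samples.append(list(x))
```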

- Saul and Jordan (1995). Boltzmann chains and hidden Markov models. Advances in Neural Information Processing Systems.

- Tanner and Wong (1987). The Calculation of Posterior Distributions by Data Augmentation. Journal of the American Statistical Association.

- Hamze and de Freitas (2004). From Fields to Trees. Uncertainty in Artificial Intelligence.

