Contents

% a simple demo of how to do structure learning with data from a DBN

Demo configuration

rand('seed',0); % Deterministic randomness
randn('seed',0);

HMM = 1;
HUSMEIER = 2;

dbnType = HUSMEIER;

Create a DBN

if dbnType == HMM

elseif dbnType == HUSMEIER;

    % Reproduced from a subset of the network from figure 3, Husmeier.
    % "Sensitivity and specificity of inferring genetic
    % regulatory interactions from microarray experiments with dynamic
    % Bayesian networks". Bioinformatics, Vol. 19 no. 17 2003, pages
    % 2271–2282.
    % See "husmeier_network.jpg" image in Demos directory to see network
    % topology (compare to Husmeier, figure 3).

    % show the ground truth
    groundTruthImage = imread('husmeier_network.jpg');
    figure('Position',[100 100 size(groundTruthImage,2) size(groundTruthImage,1)]);
    image(groundTruthImage); axis('image'); axis('off'); set(gca,'Position',[0 0 1 1])
    title('Husmeier ground truth');

    % With this choice of inter-slice connections, we can expect to learn
    % the full casual bayes-net with observational-only data (assuming the
    % sampled parameters induce a faithful distribution).

    CLN2 = 1;
    CDC5 = 2;
    RNR3 = 3;
    SRO4 = 6;
    CLN1 = 4;
    SVS1 = 5;
    ALK1 = 7;
    CLB2 = 8;
    MY01 = 9;

    % There appears to be a bug in BNT mk_bnet automatically sorts nodes in topological order, while mk_dbn
    % does not (on the intra adjacency matrix)

    % From Husmeier's paper
    intra = zeros(9);
    intra(CLN2,CLN1)=1;
    intra(CLN2,SRO4)=1;
    intra(CLN2,SVS1)=1;
    intra(CLN2,RNR3)=1;
    intra(CDC5,SVS1)=1;
    intra(CDC5,ALK1)=1;
    intra(CDC5,CLB2)=1;
    intra(CDC5,MY01)=1;

    % Add some arbitrarily chosen inter-slice edges
    inter = zeros(9);
    inter(CLN2,CLN2)=1;
    inter(CDC5,CDC5)=1;

    dnodes = 1:9;
    bnet = mk_dbn(intra, inter, 2*ones(1,9), 'discrete', dnodes);

    % sample parameters for the network
    bnet = myMkBnet( length(bnet.CPD), bnet.node_sizes(:), 'bnet', bnet, 'method', 'meek');

    maxFanIn = 4;
    nData = 5000;
end

Sample data from the DBN

data = cell2mat(sample_dbn(bnet, 'length', nData )); % sample some data from the DBN

Convert the data representation to one suitable for direct input into the BNSL routines

Transform the sampled data into a format suitable for BNSL (from time-series to time-slices) Returns a struct holding the new (doubled) # of nodes, a layering that encodes the flow of time, and the transformed data.

dataDbn = transformDbnData(data, 'maxFanIn', maxFanIn); % *** Important function

Learn the DBN back with the DP algorithm

Since all the parameters are randomly sampled, they could lead to a non-faithful distrubtion (ie. one that encodes conditionally independences not encoded by the DBN).

% These are the same functions used in simpleDemo.m (they are used throughout the BNSL package)
aflp = mkAllFamilyLogPrior( dataDbn.nNodes, 'nodeLayering', dataDbn.nodeLayering, 'maxFanIn', dataDbn.maxFanIn);
aflml = mkAllFamilyLogMargLik( dataDbn.data, 'nodeArity', bnet.node_sizes(:), 'impossibleFamilyMask', aflp~=-Inf, 'verbose', 1);
ep = computeAllEdgeProb( aflp, aflml);

% There is no point in doing structure learning on the first time step, since
% we cannot learn anything from a single data point (in terms of posterior features, such as edges)
% aflp = mkAllFamilyLogPrior( dataDbn.nNodes );
% aflml = mkAllFamilyLogMargLik( data(:,1), 'nodeArity', 2*ones(1,2));
% ep0 = computeAllEdgeProb( aflp, aflml);
Node 1/18
Node 2/18
Node 3/18
Node 4/18
Node 5/18
Node 6/18
Node 7/18
Node 8/18
Node 9/18
Node 10/18
Node 11/18
Node 12/18
Node 13/18
Node 14/18
Node 15/18
Node 16/18
Node 17/18
Node 18/18

Plot the result

nNodes = length(bnet.intra);
figure;
subplot(1,2,1); imagesc(bnet.dag(:,nNodes+1:end)); title('Generating structure'); axis('equal'); axis('tight');
set(gca,'ytick',1:dataDbn.nNodes); set(gca,'xtick',1:nNodes); set(gca,'xticklabel',nNodes+1:size(bnet.dag,2));

subplot(1,2,2); imagesc(ep(:,nNodes+1:end),[0 1]); title('Learned edge features'); axis('equal'); axis('tight');
set(gca,'ytick',1:dataDbn.nNodes); set(gca,'xtick',1:nNodes); set(gca,'xticklabel',nNodes+1:size(bnet.dag,2));