A Markov chain is a belief network with random
variables in a sequence, where each variable only directly depends on
its predecessor in the sequence. Markov chains are used to represent
sequences of values, such as the sequence of states in a dynamic
system or the sequence of words in a sentence. Each point in the
sequence is called a stage.
Figure 8.11 shows a generic Markov chain as a
belief network. The network has five stages, but does not have to
stop at stage $4$; it can extend indefinitely. The
belief network conveys the independence assumption
$$P({S}_{i+1}\mid {S}_{0},\ldots,{S}_{i})=P({S}_{i+1}\mid {S}_{i}).$$
Often the sequences are in time, and ${S}_{t}$ represents the state at time $t$. Intuitively, ${S}_{t}$
conveys all of the information about the history that could affect the
future states. The independence assumption of the Markov chain can be seen as
“the future is conditionally independent of the past given the present.”
A Markov chain is a stationary model,
or time-homogeneous model, if the variables all have the
same domain and
the transition probabilities are the same for each stage, i.e., for all $i\ge 0$,
$$P({S}_{i+1}\mid {S}_{i})=P({S}_{1}\mid {S}_{0}).$$
To specify a stationary Markov chain, two conditional
probabilities are provided:
•
$P({S}_{0})$ specifies the initial conditions
•
$P({S}_{i+1}\mid {S}_{i})$ specifies the dynamics, which is the same for
each $i\ge 0$.
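These two conditional probabilities are enough to sample the chain forward. As a minimal sketch (the two weather-like states and all probabilities below are invented for illustration):

```python
import random

states = ["sunny", "rainy"]
init = {"sunny": 0.5, "rainy": 0.5}              # P(S_0)
trans = {"sunny": {"sunny": 0.9, "rainy": 0.1},  # P(S_{i+1} | S_i), same for every i
         "rainy": {"sunny": 0.3, "rainy": 0.7}}

def sample_chain(n, rng=random):
    """Sample S_0, ..., S_{n-1} from the stationary Markov chain."""
    s = rng.choices(states, weights=[init[x] for x in states])[0]
    seq = [s]
    for _ in range(n - 1):
        s = rng.choices(states, weights=[trans[s][x] for x in states])[0]
        seq.append(s)
    return seq
```

Because the dynamics is the same at every stage, the same `trans` table is reused no matter how far the chain is extended.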
Stationary Markov chains are of interest for the following reasons:
•
They provide a simple model that is easy to specify.
•
The assumption of stationarity is often the natural model, because
the dynamics of the world typically does not
change in time. If the dynamics does change in time, it is usually
because of some other feature that could also be modeled.
•
The network extends indefinitely. Specifying a small number
of parameters gives an infinite network. You can ask queries or
make observations
about arbitrary points in the future or the past.
Pagerank
Google’s initial search engine [Brin and Page, 1998] was based on
Pagerank. Pagerank [Page et al., 1999] is a probability measure over web pages
where the most influential web pages have the highest
probability. It is based on a Markov chain of a random web surfer who
starts on a random page, and with some probability $d$ picks a
random page that is linked from the current page, and otherwise (if
the current page has no outgoing links, or with probability $1-d$)
picks a page at random.
The Markov chain is defined as follows:
•
The domain of ${S}_{i}$ is the set of all web pages.
•
$P({S}_{0})$ is the uniform distribution over web pages: $P({S}_{0}={p}_{j})=1/N$ for each web page ${p}_{j}$,
where $N$ is the number of web pages.
•
The transition is defined as follows:
$$P({S}_{i+1}={p}_{j}\mid {S}_{i}={p}_{k})=(1-d)/N+d*\begin{cases}1/{n}_{k}&\text{if }{p}_{k}\text{ links to }{p}_{j}\\ 1/N&\text{if }{p}_{k}\text{ has no links}\\ 0&\text{otherwise}\end{cases}$$
where there are $N$ web pages and ${n}_{k}$ links on page ${p}_{k}$. The
way to think about this is that ${p}_{k}$ is the current web page, and
${p}_{j}$ is the next web page. If ${p}_{k}$ has no outgoing links, then ${p}_{j}$
is a page chosen at random, which is the effect of the middle case. If ${p}_{k}$
has outgoing links, with probability $d$ the surfer picks a random
page linked from ${p}_{k}$, and otherwise picks a page at random.
•
$d\approx 0.85$ is the probability someone picks a link on the
current page.
This Markov chain converges to a distribution over web pages.
Page et al. [1999] reported the search engine had converged
to “a reasonable tolerance” for $i=52$ with 322 million links.
Pagerank provides a measure of influence. To get a high Pagerank, a
web page should be linked from other pages with a high Pagerank. It is
difficult, yet not impossible, to manipulate Pagerank for selfish
reasons. One could try to artificially boost the Pagerank of a specific page by creating many pages
that point to that page, but it is difficult for those referring pages
to also have a high Pagerank.
In the initial reported version, Brin and Page [1998] used 24 million web pages and 76 million links. The web is more complex now, with many pages being dynamically
generated, and search engines use much more sophisticated algorithms.
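The Pagerank transition above can be sketched as power iteration: start from the uniform $P(S_0)$ and repeatedly apply the transition until the distribution settles. The three-page link structure below is invented for illustration:

```python
def pagerank(links, d=0.85, iters=100):
    """Power iteration for the random-surfer chain: each step applies
    P(S_{i+1}=p_j | S_i=p_k) = (1-d)/N + d * (1/n_k, 1/N, or 0)."""
    pages = sorted(links)
    N = len(pages)
    rank = {p: 1.0 / N for p in pages}      # uniform P(S_0)
    for _ in range(iters):
        new = {p: (1 - d) / N for p in pages}
        for k in pages:
            if links[k]:                    # p_k has outgoing links:
                share = d * rank[k] / len(links[k])
                for j in links[k]:          # spread d * rank[k] over them
                    new[j] += share
            else:                           # no outgoing links: jump uniformly
                for j in pages:
                    new[j] += d * rank[k] / N
        rank = new
    return rank

# Tiny invented web: a -> b, b -> c, c -> a and b.
ranks = pagerank({"a": ["b"], "b": ["c"], "c": ["a", "b"]})
```

Page `b`, linked from both of the other pages, ends up with the highest rank, illustrating that Pagerank flows to pages that are linked from influential pages.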
To determine the probability distribution of state ${S}_{i}$, variable elimination can be used to sum out the preceding variables. The
variables after ${S}_{i}$ are irrelevant to the probability of ${S}_{i}$ and
need not be considered.
To compute $P({S}_{i}\mid {S}_{k})$, where $i>k$, only the variables between
${S}_{i}$ and ${S}_{k}$ need to be considered, and if $i<k$, only the variables
less than $k$ need to be considered.
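Summing out the preceding variables one at a time amounts to repeatedly pushing the current distribution through the transition. A minimal sketch with an invented two-state chain:

```python
init = [0.5, 0.5]            # P(S_0) over two invented states
trans = [[0.9, 0.1],         # trans[sp][s] = P(S_{i+1}=s | S_i=sp)
         [0.3, 0.7]]

def state_distribution(i):
    """P(S_i), obtained by summing out S_0, ..., S_{i-1} in order."""
    dist = list(init)
    for _ in range(i):
        # P(S_{t+1}=s) = sum over s' of P(S_{t+1}=s | S_t=s') * P(S_t=s')
        dist = [sum(dist[sp] * trans[sp][s] for sp in range(len(dist)))
                for s in range(len(dist))]
    return dist
```

Each elimination step costs $O(k^2)$ for $k$ states, so $P(S_i)$ is computed in time linear in $i$; the variables after $S_i$ never enter the computation.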
A stationary distribution of a Markov chain is a
distribution of the states such that
if it holds at one time, it holds at the next time. Thus $P$ is a
stationary distribution if for each state $s$,
$P({S}_{i+1}=s)=P({S}_{i}=s)$.
Thus, for each state $s$,
$$P({S}_{i}=s)=\sum_{s'}P({S}_{i+1}=s\mid {S}_{i}=s')\,P({S}_{i}=s').$$
A Markov chain is ergodic if,
for any two states ${s}_{1}$ and ${s}_{2}$ in the domain of ${S}_{i}$, there is a
non-zero probability of eventually reaching ${s}_{2}$ from
${s}_{1}$. A Markov chain is periodic with period $p$ if the difference
between the times when it visits the same state is always divisible by
$p$. For example, consider the Markov
chain with states
the integers 0 to 9, and at each time it either adds 1 or adds 9 (modulo
10), each with probability 0.5. This Markov chain is periodic with
period 2; if it starts in an even state at time 0, it will be in an
even state at even times, and in an odd state at odd times. If the
only period of a Markov chain is a period of 1, then the Markov
chain is aperiodic.
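The period-2 behavior of the add-1-or-9 chain can be checked by simulation; whichever of 1 or 9 is added, the parity of the state flips at every step:

```python
import random

rng = random.Random(0)
s = 0                                    # start in an even state at time 0
for t in range(1, 1000):
    s = (s + rng.choice([1, 9])) % 10    # add 1 or 9 (mod 10), each w.p. 0.5
    assert s % 2 == t % 2                # even states at even times, odd at odd
```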
If a Markov chain is ergodic and aperiodic, then there is a unique stationary
distribution, and this is the equilibrium
distribution that will be approached from any starting state.
Thus for any distribution over ${S}_{0}$, the distribution over ${S}_{i}$ will
get closer and closer to the equilibrium distribution, as $i$ gets larger.
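This convergence can be illustrated with a small ergodic, aperiodic chain (the numbers are invented): two very different distributions over $S_0$ are pushed forward and approach the same equilibrium.

```python
trans = [[0.9, 0.1],   # trans[sp][s] = P(S_{i+1}=s | S_i=sp); ergodic, aperiodic
         [0.3, 0.7]]

def step(dist):
    """One stage: P(S_{i+1}=s) = sum over s' of P(s | s') * P(S_i=s')."""
    return [sum(dist[sp] * trans[sp][s] for sp in range(len(dist)))
            for s in range(len(dist))]

d1, d2 = [1.0, 0.0], [0.0, 1.0]   # two extreme distributions over S_0
for _ in range(200):
    d1, d2 = step(d1), step(d2)
# Both approach the unique stationary distribution, here [0.75, 0.25].
```

For this chain the equilibrium $[0.75, 0.25]$ can be verified against the stationary-distribution equation: $0.75 = 0.9 \cdot 0.75 + 0.3 \cdot 0.25$.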