I'm taking Daphne Koller's class on Probabilistic Graphical Models. Wish me luck - it looks tough. So, first off, why graphical models?
The first chapter of the book lays out the rationale. PGMs are a general framework that can be used to allow a computer to take the available information about an uncertain situation and reach conclusions, both about what might be true in the world and about how to act. Uncertainty arises because of limitations in our ability to observe the world, limitations in our ability to model it, and possibly even because of innate nondeterminism. We can only rarely (if ever) provide a deterministic specification of a complex system. Probabilistic models make this fact explicit, and therefore often provide a model which is more faithful to reality.
More concretely, our knowledge about the system is encoded in the graphical model, which helps us exploit efficiencies arising from the structure of the system.
Distributions over many variables can be expensive to represent naively. For example, a table of joint probabilities of n binary variables requires storing O(2^n) floating-point numbers. The insight of the graphical modeling perspective is that a distribution over very many variables can often be represented as a product of local functions that each depend on a much smaller subset of variables. This factorization turns out to have a close connection to certain conditional independence relationships among the variables - both types of information being easily summarized by a graph. Indeed, this relationship between factorization, conditional independence, and graph structure comprises much of the power of the graphical modeling framework: the conditional independence viewpoint is most useful for designing models, and the factorization viewpoint is most useful for designing inference algorithms.
An Introduction to Conditional Random Fields by Charles Sutton and Andrew McCallum
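To get a feel for the savings the quote describes, here is a minimal sketch that counts table entries. It assumes, purely for illustration, that the distribution factorizes as a chain, p(x1) p(x2 | x1) ... p(xn | x(n-1)); the function names are my own and don't come from the book or the class.

```python
def joint_table_size(n):
    """Entries in a full joint probability table over n binary variables."""
    return 2 ** n


def chain_factor_size(n):
    """Entries needed if the distribution factorizes as a chain:
    p(x1) * p(x2 | x1) * ... * p(xn | x(n-1)).
    The chain structure is an assumption made for this illustration."""
    # p(x1) is a table of 2 entries; each p(xi | x(i-1)) is a 2 x 2 table.
    return 2 + 4 * (n - 1)


for n in (5, 10, 20, 30):
    print(n, joint_table_size(n), chain_factor_size(n))
```

The full table grows exponentially in n, while the factorized form grows linearly. Real models won't usually be simple chains, but the same idea applies whenever each local function touches only a few variables.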
Bayesian networks and Markov networks (aka Markov random fields) are the two basic models used in the class. The key difference is in the edges: they are directed in a Bayesian network and undirected in a Markov network.
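As a rough sketch of what that difference means in practice, here is the smallest possible case: two binary variables A and B. In the directed version the joint is a product of conditional probability tables; in the undirected version it is a product of potentials divided by a normalizing constant. All the numbers below are made up for illustration.

```python
from itertools import product

# Bayesian network A -> B: the joint is p(a) * p(b | a).
# These probabilities are invented for the example.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # indexed [a][b]

bayes_joint = {
    (a, b): p_a[a] * p_b_given_a[a][b]
    for a, b in product((0, 1), repeat=2)
}

# Markov network A - B: the joint is a potential phi(a, b) divided by Z.
# The potential values are arbitrary non-negative scores, not probabilities.
phi = {(0, 0): 5.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 5.0}
Z = sum(phi.values())  # normalizing constant (partition function)
markov_joint = {ab: value / Z for ab, value in phi.items()}

print(bayes_joint)
print(markov_joint)
```

The conditional tables in the directed model are already normalized, so their product sums to one on its own. The potentials in the undirected model are not, which is why the explicit division by Z is needed.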
How are these different kinds of graphical models related? Let's hope we'll find out.
There's a study group meetup here at the Institute for Systems Biology (and maybe other locations) on Thursday night. Come join us, if you're in Seattle and you're doing the class.
Supplemental reading
- Inference in Bayesian networks
- What is a hidden Markov model?, a primer by Sean Eddy
- Rabiner's Tutorial on Hidden Markov Models
- Octave Cheat Sheet. Like Andrew Ng's Machine Learning class, the PGM class uses Octave.