Efforts over the past 60 years to use computers to implement human-like reasoning have favored the interpretation of probabilities as reflecting degrees of belief, fueling the rapid growth of Bayesian formalisms. While theoretically attractive, these formalisms present formidable challenges for learning and inference due to the very large probability spaces they must manipulate. In practice, not all variables depend on one another, and restrictions on their interactions are naturally formalized as a graph, in which nodes represent random variables and edges capture dependence constraints among their distributions. Different intuitions have realized this simple concept in many different ways. Bayesian belief networks, Markov networks, hidden Markov models, Kalman filters, plate models, influence diagrams, and Markov decision processes are only a few examples of this fruitful marriage between graphical structure and Bayesian probability.
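The payoff of this graph-as-constraint idea can be made concrete with a minimal sketch (all numbers invented): for a binary chain A -> B -> C, the joint distribution factorizes into small conditional tables, so 5 free parameters replace the 7 an unrestricted three-variable joint would need.

```python
from itertools import product

# Invented local factors for a binary chain A -> B -> C, where the joint
# factorizes as P(a, b, c) = P(a) * P(b | a) * P(c | b):
# 1 + 2 + 2 = 5 free parameters instead of 2**3 - 1 = 7.
P_a = {0: 0.6, 1: 0.4}
P_b_given_a = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}  # key: (b, a)
P_c_given_b = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.5, (1, 1): 0.5}  # key: (c, b)

def joint(a, b, c):
    # The full joint is recovered as a product of the local factors.
    return P_a[a] * P_b_given_a[(b, a)] * P_c_given_b[(c, b)]

# Any query, e.g. the marginal P(C = 1), then follows by enumeration.
p_c1 = sum(joint(a, b, 1) for a, b in product([0, 1], repeat=2))
```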
In this weighty tome, Koller and Friedman offer a coherent, unified presentation of this approach to machine reasoning that treats most of the structures that have been proposed. After two chapters reviewing the main ideas of probability theory and introducing their notation, they divide their presentation into four sections.
The first section describes different graphical representations of joint probability distributions, including both directed (Bayesian network) and undirected (Markov network) models. They introduce the notion of a template model, which replicates a smaller structure many times, to unify temporal models (such as hidden Markov models and Kalman filters) and plate models. Other chapters in this section focus on the details of the local representation of conditional dependencies that give graphical models their power, Gaussian models to deal with continuous variables, and the theoretical undergirdings of the exponential family of distributions.
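The template-model idea, one small structure replicated many times, can be sketched for the temporal case (transition numbers invented): a single shared transition table is applied at every time step, just as an unrolled hidden Markov chain reuses one conditional factor.

```python
# One shared transition table, replicated at every time step (numbers
# invented): T[i][j] = P(X_{t+1} = j | X_t = i), the same factor for all t.
T = [[0.9, 0.1],
     [0.2, 0.8]]

def step(dist):
    # Push a state distribution through one copy of the template factor.
    return [sum(dist[i] * T[i][j] for i in range(2)) for j in range(2)]

dist = [1.0, 0.0]   # X_0 is state 0 with certainty
for _ in range(3):  # unroll the template for three time steps
    dist = step(dist)
```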
The second and longest section of the book deals with inference, the process of deriving information from a probabilistic graphical structure. Starting with an explanation of the basic process of variable elimination, the section shows how clique trees can reduce the complexity of that process, and then goes on to optimization-based and particle-based approaches to approximate inference, MAP inference (which seeks the most likely joint assignment of the variables rather than the marginal probability of a single variable), and inference in hybrid (discrete-continuous) and temporal models.
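Variable elimination itself fits in a few lines; a toy sketch (factors invented) computes P(C) in a chain A -> B -> C by summing out one variable at a time, so that no intermediate table ever grows beyond the largest local factor.

```python
# Invented local factors for a binary chain A -> B -> C.
P_a   = [0.6, 0.4]                # P_a[a]      = P(A = a)
P_b_a = [[0.7, 0.2], [0.3, 0.8]]  # P_b_a[b][a] = P(B = b | A = a)
P_c_b = [[0.9, 0.5], [0.1, 0.5]]  # P_c_b[c][b] = P(C = c | B = b)

# Eliminate A: tau1(b) = sum_a P(a) * P(b | a) -- a table over B only.
tau1 = [sum(P_a[a] * P_b_a[b][a] for a in range(2)) for b in range(2)]

# Eliminate B: P(c) = sum_b tau1(b) * P(c | b) -- the desired marginal.
P_c = [sum(tau1[b] * P_c_b[c][b] for b in range(2)) for c in range(2)]
```

Summing in this order touches only 2-entry tables; brute-force enumeration over the full joint would touch all 8.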
The third section deals with learning, including both parameter estimation in a model whose structure is known and learning the structure itself. Starting with the assumption of complete data, the section moves on to discuss partially observed data.
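With complete data and a known structure, maximum-likelihood parameter estimation decomposes into simple counting; a toy sketch with invented, fully observed (a, b) samples for a single edge A -> B:

```python
from collections import Counter

# Invented complete-data samples for an edge A -> B: each pair is (a, b).
samples = [(0, 0), (0, 1), (1, 1), (1, 1), (0, 0), (1, 0)]

pair_counts = Counter(samples)             # counts of each (a, b) pair
a_counts = Counter(a for a, _ in samples)  # counts of each parent value a

def mle_b_given_a(b, a):
    # Maximum-likelihood CPT entry: P(B = b | A = a) = N(a, b) / N(a).
    return pair_counts[(a, b)] / a_counts[a]
```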
The fourth section introduces the ideas of action and decision, based on Pearl’s notion of conditioning a distribution on a Do() operator instead of the See() operator implicit in the usual conditional formalism. After formalizing the notion of utility, the book discusses a range of structured decision problems.
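The flavor of these decision problems can be sketched with a toy maximum-expected-utility calculation (actions, interventional probabilities, and utilities all invented): an action is evaluated under the outcome distribution its do()-intervention induces, not under passive observation.

```python
# Invented interventional model: P(outcome = "good" | do(action)).
p_good_given_do = {"treat": 0.7, "wait": 0.4}
utility = {"good": 100.0, "bad": -20.0}

def expected_utility(action):
    # Average the utility over the outcome distribution induced by do(action).
    p = p_good_given_do[action]
    return p * utility["good"] + (1 - p) * utility["bad"]

# The principle of maximum expected utility picks the best intervention.
best = max(p_good_given_do, key=expected_utility)
```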
Graphical models have made major contributions to machine reasoning. Their very diversity bears testimony to their potential, but at the same time makes it difficult for students to understand them, or for users of one variety to take advantage of alternative structures. By providing a common theoretical foundation, notation, and pedagogical perspective across the range of graphical models, this volume unifies the field, and will find a welcome home on the reference shelves of many practitioners and educators.