The intermingling of rather different domains can, at times, produce rather interesting results. Here the author explores the intersection of probability theory (in the guise of Markov chains and Markov decision processes) and formal proof (through the interactive theorem prover Isabelle/HOL), and even throws in some programming language semantics as an application. Given the explosion of interest in machine learning, the largest application of probability theory, and the current worries that parts of it are closer to alchemy than to modern science, it is rather important that certain parts of it be formally understood.

The task that the author has set for himself is rather clear, if daunting: based on some pre-existing foundations in Isabelle/HOL, formalize the theory around Markov chains and Markov decision processes. And, not content with just doing that, ensure that the resulting theories can be applied to work on both the operational and denotational semantics of a small probabilistic programming language. And, as if that weren’t enough, throw in a healthy measure of aesthetics, where the author chooses a nontraditional route to this formalization, namely via co-algebras, order theory, and fixed points.

The latter choice might seem like complete insanity to a classically minded mathematician. However, to a computer scientist steeped in the current best practices of formalization of the semantics of programming languages (like me), the thought is rather the opposite: that’s the only reasonable way to attack the problem! Satisfyingly, it is the latter opinion that proves to produce a “most efficient” formalization, as the author demonstrates.

This paper, which is fairly dense and fairly long, actually represents an even larger piece of work (278 pages!) of formalized mathematics, available online [1]. The author does an excellent job of providing an overview of the work and some of the background material before diving in to the actual contribution. The previously existing formalizations that could be reused are well documented, and the reason some were not reused but instead redone in a different way is also documented. A survey of related work is also given. The bulk of the paper and its contributions are in the three long sections: 3, 4, and 5.

Work such as this presents a conundrum: it should be done, it should be of this quality, and it should be published because it represents a significant scientific advance. The few people who are working on formalizing these topics in other systems should definitely read this paper. Anyone else? That is harder to say. It relies on quite a lot of background knowledge from different topics, which few will possess. I have only learned the appropriate material in the last two years. It is a research paper, not an introduction or a survey. I enjoyed it, even though I have no intention of formalizing this part of mathematics myself.