Even for specialists well acquainted with more classical machine learning topics, for example, plain supervised and unsupervised tasks (regression, classification, clustering, and the like), reinforcement learning (RL) can be quite challenging to grasp at first. The fact that training examples are generated by the very decisions the agent takes can seem counterintuitive against the background of those approaches. Yet if we want to use machine learning to control various environments, RL must be addressed. Fortunately, Graesser and Keng’s book provides a smooth introduction to the topic.
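To see why the data distribution depends on the agent itself, consider a minimal sketch of the agent-environment loop. The toy `Corridor` environment and the random policy below are illustrative assumptions, not the book's own code; they only show that the transitions collected in `trajectory` are shaped by the actions chosen.

```python
import random

# Toy 1-D corridor: the agent starts at position 0 and receives a
# reward of +1 only upon reaching position 3 (a hypothetical example,
# not taken from the book under review).
class Corridor:
    def __init__(self, length=3):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action: +1 (move right) or -1 (move left)
        self.pos = max(0, min(self.length, self.pos + action))
        reward = 1 if self.pos == self.length else 0
        done = self.pos == self.length
        return self.pos, reward, done

# Unlike supervised learning, the "training set" (the transitions in
# trajectory) is produced by the agent's own decisions -- here, a
# uniformly random policy.
env = Corridor()
state, done, trajectory = env.reset(), False, []
while not done:
    action = random.choice([-1, 1])             # the agent's decision...
    next_state, reward, done = env.step(action)
    trajectory.append((state, action, reward))  # ...shapes the data it sees
    state = next_state
```

A different policy would visit different states and thus collect a different data set, which is exactly the property that surprises newcomers from supervised learning.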

Unlike many books on RL, the authors take a practitioner’s perspective and lead readers through the field with plenty of code examples, letting them start their own adventures with this type of model very quickly. This makes the book a good first read on the topic, allowing readers to get acquainted with the basic ideas without having to wade through a jungle of tough theory. Because a solid grasp of the mathematical background remains important, more advanced books should follow, as the authors keep the formal material that experts in the field should know to a minimum.

The authors start with a basic presentation of the topic intertwined with an introduction to the software on which the coding examples are based. The entire codebase can be downloaded and run locally. The fundamental ideas are presented along with the basic mathematical modeling behind them. Even though the authors do not delve into very advanced modeling, what they do present genuinely deepens one’s understanding of the issues at hand. The book is divided into four parts, the first two of which present the most important algorithms cumulatively, in a way that makes learning quite fast and relatively easy.

Part 1 covers the most classical policy- and value-based algorithms (REINFORCE, SARSA, deep Q-networks), whereas Part 2 looks at more advanced algorithms (actor-critic, proximal policy optimization, and approaches based on parallelization). Part 3 focuses mainly on implementation issues: software engineering aspects of RL, a practice code suite prepared by the authors, an introduction to basic neural network concepts, and the interaction between computational infrastructure and software. The last part is an overview of the core components of any RL problem, that is, states, actions, rewards, and transitions. I must say that the placement of these topics is quite surprising compared to other books, which typically elaborate on such examples at the beginning, but I do like the approach taken here. Apparently, the authors focus first on giving readers a flavor of what it’s like to work with various approaches, and only then encourage going deeper into some of the ideas. This consistently follows the aforementioned “practitioner’s perspective.”
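To give a flavor of the Part 1 material, here is a sketch of tabular SARSA on a toy four-state corridor. This is an illustrative assumption on my part, not code from the book (whose examples are built on the authors' own library); the environment, hyperparameters, and helper names are all hypothetical.

```python
import random

# Tabular SARSA on a toy corridor: positions 0..3, reaching 3 yields +1.
# Hypothetical example for illustration; not the book's implementation.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
ACTIONS = [-1, 1]  # move left / move right
Q = {(s, a): 0.0 for s in range(4) for a in ACTIONS}

def step(pos, action):
    new_pos = max(0, min(3, pos + action))
    return new_pos, (1 if new_pos == 3 else 0), new_pos == 3

def eps_greedy(s):
    # Explore with probability EPSILON, otherwise act greedily w.r.t. Q.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

for _ in range(500):  # episodes
    s, a, done = 0, eps_greedy(0), False
    while not done:
        s2, r, done = step(s, a)
        a2 = eps_greedy(s2)
        # On-policy SARSA update: bootstraps from the action actually
        # taken next (a2), not from the greedy maximum as Q-learning does.
        target = r + GAMMA * (0.0 if done else Q[(s2, a2)])
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s, a = s2, a2
```

After training, the learned values should prefer moving right toward the reward, which is the kind of hands-on result the book lets readers reproduce early on.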

This is certainly a book well suited to self-study. Beyond the fact that the presented ideas are exemplified with well-explained code illustrations, all of the mathematical concepts are shown in a way that helps readers understand them easily. Thus, the work can be recommended not only to specialists who enter the field of RL with some background knowledge of machine learning, but also to undergraduate students who are keen to jump to code examples after reading just 30-some introductory pages. After working through the whole book, readers are equipped with a basic toolbox for RL.

More reviews about this item: Amazon