A few years ago, while reading Stuart Russell’s Human Compatible (Penguin Books, 2019), a book about the perils and opportunities of AI, I came across a very intriguing term: “probabilistic programming”. These two words were so alluring that I couldn’t help but dive in to learn as much as I could about the subject.
It didn’t take long to discover Statistical Rethinking by Richard McElreath (second edition, 2020), a book relatively unknown in traditional machine learning circles (although it is cited in Murphy’s book), but a cult classic with a devoted following in the Bayesian camp.
The book is simply superb, a model of how to teach and present ideas. If I ever write a technical book, this is how I would like to do it. It is structured around four main pillars: Bayesian analysis, model comparison, multi-level models, and causality (à la Judea Pearl). It also includes a fairly in-depth treatment of supporting topics such as Monte Carlo methods and information theory.
The author is very clear, authoritative, and an amazing storyteller, which kept me reading all the way through the chapter notes (packed with references, interesting stories, and curious facts). I personally find it very refreshing that most of the examples come from the biological or social sciences.
The examples in the book are written in R and Stan, but fans have ported them to many frameworks. I personally followed along in NumPyro, but there are also ports to PyMC, Turing.jl (if you prefer Julia), and more.
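To give a flavor of the book's hands-on style, here is a minimal sketch in plain Python of grid approximation, the technique the book uses in its opening chapters (the classic globe-tossing setup: count how often a tossed globe lands on "water"). The data and grid size here are illustrative choices, not taken from any particular exercise:

```python
from math import comb

# Grid approximation of a posterior for a binomial proportion p,
# in the spirit of the book's globe-tossing example.
k, n = 6, 9                              # illustrative data: 6 "water" in 9 tosses
grid = [i / 200 for i in range(201)]     # candidate values of p in [0, 1]
prior = [1.0] * len(grid)                # flat prior over the grid

# Binomial likelihood of the data at each candidate p
like = [comb(n, k) * p**k * (1 - p) ** (n - k) for p in grid]

# Posterior ∝ likelihood × prior, normalized to sum to 1
unnorm = [l * pr for l, pr in zip(like, prior)]
total = sum(unnorm)
posterior = [u / total for u in unnorm]

# Posterior mean: with a flat prior this should be close to the
# analytic Beta(7, 4) mean, 7/11 ≈ 0.636
mean = sum(p * w for p, w in zip(grid, posterior))
print(round(mean, 3))
```

The same three-line recipe (prior, likelihood, normalize) is what the probabilistic programming frameworks automate and scale up with samplers like MCMC.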