Have you seen the Scooby-Doo meme claiming that machine learning is really statistics? Well, guess what? Machine learning is actually optimization (if you dig a little deeper, you'll find that everything is actually optimization).
Algorithms for Optimization (MIT Press, 2019) by Mykel J. Kochenderfer and Tim A. Wheeler is a modern and very nice-looking introduction to optimization algorithms.
The book starts with the basics of numerical and automatic differentiation and goes on to cover first-order methods, including all the variants used in modern machine learning, second-order and Monte Carlo methods, black-box optimization, and more. A good part of the book is devoted to discrete optimization (including methods that were new to me, such as ant colony optimization), as well as more advanced topics such as surrogate methods and multidisciplinary optimization.
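To give a flavor of the first-order methods the book covers, here is a minimal gradient descent sketch in Julia; the test function, step size, and iteration count are my own choices for illustration, not listings from the book:

```julia
# A minimal sketch of gradient descent, the prototypical first-order method.
function gradient_descent(∇f, x, α; iterations=100)
    for _ in 1:iterations
        x = x - α * ∇f(x)   # take a step against the gradient
    end
    return x
end

# Example: minimize f(x, y) = (x - 1)^2 + 2y^2, whose gradient is analytic.
∇f(x) = [2 * (x[1] - 1), 4 * x[2]]
x_star = gradient_descent(∇f, [5.0, 5.0], 0.1)  # converges toward [1.0, 0.0]
```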
The book has a nice design: it is typeset in LaTeX (Tufte style) and has tons of figures and, of course, algorithms. The algorithms are all written in Julia.
I am not going to switch from Python to Julia at this point in my life, but I have always found it a very, very charming language. I have extensively used JuMP, a domain-specific algebraic modeling language for mathematical optimization, and I open the Julia REPL whenever I need to do a multiplication. The mere fact that you can name variables with Greek (or any other Unicode) symbols makes it definitely the right choice for this kind of book. That's probably why Kevin P. Murphy, in the introduction to his massive Probabilistic Machine Learning (MIT Press, 2022), toyed with the idea of writing the next iteration of the book in Julia.
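For the uninitiated, here is roughly what a JuMP model looks like; the tiny linear program and the choice of the open-source HiGHS solver are mine, for illustration only (and note the Greek identifier, which is plain Julia):

```julia
using JuMP, HiGHS  # assumes the HiGHS solver package is installed

# Maximize 3x + 5λ subject to x + 2λ ≤ 14, with x, λ ≥ 0.
model = Model(HiGHS.Optimizer)
@variable(model, x >= 0)
@variable(model, λ >= 0)          # Unicode variable names, as advertised
@constraint(model, x + 2λ <= 14)
@objective(model, Max, 3x + 5λ)
optimize!(model)

println(value(x), ", ", value(λ), ", ", objective_value(model))
```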