Three amazing data science books to read in 2023 (if you don’t manage to in 2022)

Three amazing data science books to read in 2023 (if you don’t manage to in 2022). Image by yours truly.

Introduction

2022 is a truly amazing year for the machine learning community worldwide! Many long-awaited titles have been released or will be released soon, including new editions of all-time classics.

In this post I want to share with you three 2022 titles that I believe are especially worth reading (not only) this year.

At the end of each section, you’ll find a series of links related to the book discussed, including links to an e-book, a hard copy, a free copy and a code repository (if available).

Ready? Let’s dive in! 🌊

1. Probabilistic Machine Learning: An Introduction (2022 edition)

The title page and the cover of “Probabilistic Machine Learning: An Introduction” by Kevin Murphy (2022). Image by yours truly.

A new edition of a true classic from Kevin P. Murphy published by The MIT Press.

The brand new edition contains Python code (in the accompanying repository) and covers countless topics from basic probability to graph neural networks. And… all of the topics are presented from the probabilistic point of view! The book is over 750 pages long (excluding appendices and references), contains rich mathematical explanations, helpful graphs and plots and inspiring exercises.

I love Murphy’s style of writing and I find it clear and appealing even when he discusses complex topics. This book can be challenging, but it’s also quite self-contained. Wherever more background is needed, the author provides us with helpful references. The book comes with an extremely rich bibliography that spans almost 33 pages.

The sequel to this book — “Probabilistic Machine Learning: Advanced Topics” — will contain deeper dives on topics like Bayesian inference, generative models, causality and structure discovery. Personally — I cannot wait to get it! If you feel similarly, check this link for the most recent updates!

“Probabilistic Machine Learning: An Introduction” is a great book if you want to broaden, deepen or organize your statistical and machine learning knowledge. It’s an excellent resource if you need a refresher on some topics or you’re striving to gain a deeper mathematical understanding of the concepts that you use in your everyday work. It’s also a staggeringly rich source of references and inspiring code.

Resources:

Interested in how to integrate probabilistic modeling and neural networks? Check the series of articles on probabilistic neural networks in Python:

2. Bayesian Modeling and Computation in Python

“Bayesian Modeling and Computation in Python” by Martin et al. (2022). Image by yours truly.

“Bayesian Modeling and Computation in Python” by Osvaldo A. Martin, Ravin Kumar and Junpeng Lao was published by CRC Press in early 2022. The book provides you with over 380 pages of brilliant content, including ample appendices and a bibliography.

It’s a great resource to help you solidify your knowledge of Bayesian inference and workflows. Each chapter comes with practical examples and a set of exercises at the end. The book covers the basics of Bayesian inference, exploratory analysis of models, linear models (incl. hierarchical and mixed-effects models), splines, Bayesian time series & regression trees, end-to-end Bayesian workflows and more.
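To give a flavor of what “basics of Bayesian inference” means in practice, here is a minimal sketch of a grid approximation of a posterior. It is not taken from the book and does not use any of the book’s libraries; the function name `grid_posterior`, the uniform prior and the toy data (6 successes in 9 trials) are assumptions made purely for illustration.

```python
from math import comb


def grid_posterior(successes, trials, grid_size=101):
    """Approximate the posterior of a success probability on a grid.

    Assumes a uniform prior and a binomial likelihood (illustrative choices).
    """
    # Candidate values of the unknown probability, evenly spaced in [0, 1].
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Uniform prior: every candidate value is equally plausible a priori.
    prior = [1.0] * grid_size
    # Binomial likelihood of the observed data for each candidate value.
    likelihood = [
        comb(trials, successes) * t**successes * (1 - t) ** (trials - successes)
        for t in grid
    ]
    # Bayes' rule: posterior is proportional to prior times likelihood.
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return grid, posterior


# Hypothetical toy data: 6 successes observed in 9 trials.
grid, posterior = grid_posterior(successes=6, trials=9)
posterior_mean = sum(t * p for t, p in zip(grid, posterior))
map_estimate = grid[posterior.index(max(posterior))]
print(f"posterior mean: {posterior_mean:.3f}, MAP estimate: {map_estimate:.2f}")
```

With a uniform prior the exact posterior is a Beta distribution, so the grid result can be checked against the closed-form mean and mode. Probabilistic programming libraries replace this brute-force grid with samplers that scale to many parameters.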

Practical aspects of modeling are at the heart of the book. Rich visual content helps to build intuitive understanding of inner workings of the models, which is incredibly helpful in the context of debugging complex architectures.

The authors use PyMC3 and TensorFlow Probability as the main probabilistic frameworks in the book and briefly discuss other probabilistic languages like Stan and NumPyro. The content relies heavily on ArviZ, a phenomenal Python library for exploratory analysis of Bayesian models. The code is available within the book and in the accompanying repository.

Ample appendices provide us with a solid overview of the theoretical basics, but if something’s missing, you can always refer to Kevin Murphy’s “Probabilistic Machine Learning: An Introduction”. Interestingly, Kevin Murphy wrote the foreword for “Bayesian Modeling and Computation in Python”. The two books complement each other very well and reading them in parallel is a joyful experience!

Resources:

3. Deep Learning on Graphs

“Deep Learning on Graphs” by Yao Ma & Jiliang Tang (2021). Image by yours truly.

Written by Yao Ma and Jiliang Tang and published by Cambridge University Press in September 2021, the book is a comprehensive guide to using deep learning techniques on graphs. A Chinese version was prepared by Yiqi Wang, Wei Jin, Yao Ma and Jiliang Tang.

The book covers everything from graph and deep learning foundations to advanced topics in graph neural networks (GNNs). The authors provide solid and clear explanations, both mathematical and intuitive, for the concepts they discuss. You’ll find popular architectures like GCN, GAT or GraphSAGE discussed here, as well as less popular (but definitely not less interesting) topics like variational autoencoders on graphs.

The book is neatly divided into four main sections: (1) Foundations, (2) Methods, (3) Applications and (4) Advances. You might think that the best way to read it is to follow this structure, but the authors also make less linear recommendations depending on your background and your goals. These recommendations take the form of a… graph:
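For readers who have not met a GCN before, here is a minimal sketch of a single graph convolution step in plain Python. It is not code from the book; it follows the commonly cited propagation rule ReLU(D^(-1/2) (A + I) D^(-1/2) H W), and the helper names `matmul` and `gcn_layer`, the three-node path graph, the features and the weights are all hypothetical choices for illustration.

```python
from math import sqrt


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adjacency, features, weights):
    """One graph convolution step: normalize, aggregate, transform, ReLU."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features when aggregating.
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    # Node degrees computed on the graph with self-loops.
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization: divide each entry by sqrt(deg_i * deg_j).
    normalized = [
        [a_hat[i][j] / sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]
    # Aggregate neighbor features, then apply the weight matrix.
    transformed = matmul(matmul(normalized, features), weights)
    # ReLU non-linearity.
    return [[max(0.0, value) for value in row] for row in transformed]


# Hypothetical toy input: a path graph 0 - 1 - 2 with one feature per node.
adjacency = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
features = [[1.0], [2.0], [3.0]]
weights = [[1.0]]  # untrained, illustrative weights
hidden = gcn_layer(adjacency, features, weights)
print(hidden)
```

In a real model the weights are learned and several such layers are stacked, so that each node’s representation depends on a progressively larger neighborhood.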

How to read “Deep Learning on Graphs” by Yao Ma & Jiliang Tang (2021). Image by yours truly.

The book provides solid foundations of GNNs. If you want to delve deeper into a given topic, there’s a very useful Further Reading list at the end of each section.

If you’d like to triangulate your GNN learning experience, I think reading this book in parallel with this great series of lectures by Jure Leskovec is a brilliant idea:

A great playlist on GNNs by Jure Leskovec @ Stanford University.

Resources:

Thank you

Thank you for reading the article. Feel free to let me know your thoughts in the comments.

Let’s stay in touch and connect on LinkedIn 👋🏼

________________

❤️ If you want to support my writing, you can consider becoming a Medium member using this link:

_______________

Amazon links in this article are affiliate links. This means that if you decide to buy a book using them, you’ll support the authors of the books and also the author of this post (yours truly)!

Thank you! ❤️

_______________

Innovation Lead & Machine Learning Researcher. Author of #SundayAiPapers — a Linked-In-based microblog on NLP, causality & probabilistic modeling || alxndr.io

Aleksander Molak
