Computing Reviews

Foundational Python for data science
Behrman K., Addison-Wesley Professional, Boston, MA, 2021. 256 pp. Type: Book
Date Reviewed: 04/20/23

The collection of books on data science is becoming so large that it is itself an interesting subject for data science analysis [1,2]. This is amply justified by the galloping success of data science among students, researchers, and practitioners in both academia and industry. For such a large readership, the supply of books is very diverse. Behrman’s Foundational Python for data science fits right in with this collection, though of course with its own distinctive features.

Data science is a relatively new, strongly interdisciplinary field at the intersection of math, statistics, and computer science [3], but it also requires economic and legal skills, among others. The increasing availability of data from several sources (social networks, Internet of Things devices, user interactions, and so on), together with affordable computational resources, has attracted the interest of many companies that want to extract value from data. Managing such data, often characterized by large volume, extreme variety, or high production velocity, requires a new profession, that is, the data scientist, which Harvard Business Review called “the sexiest job of the 21st century” [4]. In turn, the potential for a career in data science attracts people with heterogeneous competences, who need to be aligned on some basic skills like programming, especially in Python, which is the language of choice for data science.

Many books on Python programming focus on the use of some libraries specifically suited for data science tasks, while giving little space to the fundamentals of the Python language. Other books, like Behrman’s, focus on the language (and a few libraries devoted to data science), leaving the detailed study of libraries to other books. (There is also a third category of book: one that covers both the language and its libraries in depth.) Books like Behrman’s are especially suited to study programs, for example, undergraduate programs in data science, where the study of programming is separate from the study of other topics that require specialized libraries (such as machine learning, statistics, natural language processing, and so on). Books of this category can be very succinct [5] or quite large [6]; Behrman’s is quite slim and may suit a short crash course on Python programming.

Learning programming for data science is different from learning programming for computer science. While the latter puts great emphasis on problem solving, abstraction, programming in the large, programming paradigms, and so on, in data science the learning approach is usually more small-scale, mainly based on scripting, and more pragmatic. (A computer scientist knows how to implement a method; a data scientist knows how to use the implementation to pursue a goal.) With this distinction in mind, Behrman’s book unfolds the basic concepts of Python programming in a classical fashion, but puts greater emphasis on the native Python data structures, which are presented before execution control and functional abstraction. (Books that follow the standard approach adopted in CS courses are usually organized around a different order of topics; see, for example, Deitel and Deitel [6].) Each chapter ends with a handful of questions, with corresponding answers in a separate appendix. In my opinion, the provided questions are too few for learning a language that, like any programming language, requires a lot of practice.

Furthermore, some of the author’s personal choices may not find general agreement. For example, Behrman discourages the use of lambda functions (despite Python’s orientation to the functional paradigm that is so helpful in data science programming), while nothing is said about the readability issues of the reduce() function (in fact, because of readability issues, Python developers have confined it to a specialized module, functools). Moreover, some key Python features like lazy evaluation are overlooked (with only a brief mention when generators are introduced); in fact, lazy evaluation motivates the use of the map() and filter() functions, which otherwise remain inexplicably indistinguishable from comprehensions. Finally, errors in some listings could confuse the novice reader.
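To make the point about lazy evaluation concrete, here is a minimal sketch of my own (it is not taken from the book, and the names noisy_square and calls are hypothetical, introduced only to make the order of evaluation visible). It contrasts an eager list comprehension with the lazy iterators returned by map() and filter(), and shows reduce() imported from functools.

```python
from functools import reduce  # reduce() is not a builtin; it lives in functools

calls = []  # hypothetical log that records when the function actually runs


def noisy_square(x):
    """Square x and record the call, so that evaluation order is visible."""
    calls.append(x)
    return x * x


data = [1, 2, 3, 4]

# A list comprehension is eager: every element is processed immediately.
eager = [noisy_square(x) for x in data]
calls_after_comprehension = list(calls)

calls.clear()

# map() is lazy: it returns an iterator, and nothing has been computed yet.
lazy = map(noisy_square, data)
calls_after_map = list(calls)

# Work happens only on demand, one element at a time.
first = next(lazy)
calls_after_next = list(calls)

# filter() is lazy as well, so the two chain without intermediate lists.
even_squares = list(filter(lambda v: v % 2 == 0, map(lambda x: x * x, data)))

# reduce() folds a sequence into a single value; for a plain sum,
# the builtin sum(data) is the more readable spelling.
total = reduce(lambda acc, x: acc + x, data, 0)
```

After the comprehension, calls already holds all four inputs; after map(), it is still empty, and it gains a single entry only when next() is called. A generator expression such as (noisy_square(x) for x in data) gives the same on-demand behavior with comprehension syntax, which is arguably the connection a novice most needs to see.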

Overall, this book is best used in a short course on programming fundamentals for data science, with tight supervision from a teacher who can help students by integrating materials and providing many exercises. For self-study, insights into Python, and self-paced practice, other resources may be recommended.



[1] Amazon Data Science Books Dataset. (accessed 4/19/23).

[2] Thu Vu data analytics. “I Analyzed 1000 Data Science Books on Amazon: Here's What I Found.” YouTube video, 23:14, Nov. 9, 2022.

[3] ACM Data Science Task Force. Computing competencies for undergraduate data science curricula. ACM, Jan. 2021.

[4] Davenport, T. H.; Patil, DJ. Data scientist: the sexiest job of the 21st century. Harvard Business Review, (Oct. 2012).

[5] Needham, T. C. Python for beginners: a crash course guide to learn Python in 1 week. Independently published, 2017.

[6] Deitel, P.; Deitel, H. Intro to Python for computer science and data science: learning to program with AI, big data and the cloud. Pearson, Harlow, UK, 2022.

Reviewer:  Corrado Mencar Review #: CR147578 (2306-0068)

Reproduction in whole or in part without permission is prohibited.   Copyright 2023™