Efficient Parallel Computing with Dask
ebook ∣ Definitive Reference for Developers and Engineers
By Richard Johnson
"Efficient Parallel Computing with Dask" is a comprehensive guide to the principles and practice of parallel and distributed computation with the Dask framework. It begins with foundational concepts (parallelism, scalability, memory models, and task scheduling) and lays the groundwork for understanding the theoretical and practical challenges of large-scale data analytics and high-performance computing. It then surveys Python's parallel ecosystem to explain why Dask emerged as a leading solution for scalable analytics.
Building on this foundation, the book explores Dask's core architecture in detail: the construction of task graphs, the nuances of scheduling, worker and cluster topologies, fault tolerance, and extensibility through plugin interfaces. It then covers Dask's collections for large-scale arrays, distributed dataframes, and unstructured data, with practical guidance for scalable analytics across diverse data formats. Extensive coverage is devoted to deploying and orchestrating Dask clusters in both on-premises and cloud environments, with detailed strategies for security, elasticity, monitoring, and upgrades.
Finally, the book addresses real-world performance tuning, resource optimization, and the integration of Dask into end-to-end data science and machine learning pipelines. It also covers deployment in hybrid and multi-cloud environments, with attention to observability, security, and efficient production operations. With a forward-looking perspective on emerging hardware, edge computing, and the evolving Dask ecosystem, the book is aimed at engineers, data scientists, and practitioners who need robust, scalable parallel computing with Python.