Efficient Parallel Computing with Dask

ebook: Definitive Reference for Developers and Engineers

By Richard Johnson

"Efficient Parallel Computing with Dask" offers a comprehensive and authoritative guide to mastering the principles and practice of modern parallel and distributed computation using the Dask framework. Beginning with foundational concepts such as parallelism, scalability, memory models, and task scheduling, the book lays the groundwork for understanding the theoretical and practical challenges of large-scale data analytics and high-performance computing. Readers gain a systematic view of Python's parallel ecosystem and of the motivation behind Dask's rise as a leading solution for scalable analytics.
Building on this foundation, the book explores Dask's core architecture in detail, from the construction of task graphs and the nuances of scheduling, to worker and cluster topologies, robust fault tolerance, and extensibility via powerful plugin interfaces. Advanced Dask collections are demystified, including large-scale arrays, distributed dataframes, and unstructured data, providing practical guidance for scalable analytics across diverse data formats. Extensive coverage is devoted to deploying and orchestrating Dask clusters in both on-premises and cloud environments, with detailed strategies for security, elasticity, monitoring, and seamless upgrades.
Finally, the book addresses real-world performance tuning, resource optimization, and the integration of Dask into end-to-end data science and machine learning pipelines. It covers deploying in hybrid and multi-cloud environments, ensuring observability, security, and efficient production operations. With a forward-looking perspective on emerging hardware, edge computing, and the evolving Dask ecosystem, this book is an indispensable resource for engineers, data scientists, and practitioners focused on robust, scalable, and future-proof parallel computing with Python.
