Submission Number: 95
Submission ID: 3441
Submission UUID: beb19e4c-3bb9-4635-acf3-6f805b8243d1
Submission URI: /form/resource

Created: Wed, 03/15/2023 - 13:56
Completed: Wed, 03/15/2023 - 13:58
Changed: Fri, 03/14/2025 - 11:43

Remote IP address: 73.229.137.18
Submitted by: Daniel Howard
Language: English

Is draft: No
Approved: Yes
Title: Using Dask on HPC Systems
Category: Learning
Tags:
training (381), jupyterhub (214), python (69)

Skill Level:
Beginner (304), Intermediate (305)

Description:
A tutorial on the effective use of Dask on HPC resources. The four-hour tutorial is split into two sections: early topics are aimed at novice Dask users, and later topics cover intermediate usage on HPC systems and associated best practices. The knowledge areas covered include (but are not limited to):

Beginner section
- High-level collections including dask.array and dask.dataframe
- Distributed Dask clusters using HPC job schedulers
- Earth Science data analysis using Dask with Xarray
- Using the Dask dashboard to understand your computation

Intermediate section
- Optimizing the number of workers and memory allocation
- Choosing appropriate chunk shapes and sizes for Dask collections
- Querying resource usage and debugging errors

Link to Resource:
- Dask Tutorial GitHub Page (https://github.com/NCAR/dask-tutorial)
- Video Recording of Tutorial - Part 1 (https://youtu.be/wJHosuzqLaU)
- Video Recording of Tutorial - Part 2 (https://youtu.be/E4utSzWgJEo)

Domain:
ACCESS CSSN (780), Campus Champions (572), CAREERS (323), CCMNet (835), Great Plains (311), Kentucky (322), Northeast (308)