Course Summary
Learn how to perform multiple analysis tasks on large datasets using NVIDIA RAPIDS™, a collection of data science libraries that allows end-to-end GPU acceleration for data science workflows.
Please note that confirmed bookings are non-refundable: once you have confirmed your seat for an event, it cannot be cancelled and no refund will be issued, regardless of attendance.
Assessment methods:
- Pre-training knowledge-check quiz (if applicable)
- Formative assessments during the training, through hands-on lab exercises completed at the end of each module, multiple-choice quizzes, scenario-based exercises, and similar activities
- A questionnaire and/or placement questionnaire completed by each participant before and after the training to validate the skills acquired
Prerequisites
Experience with Python, ideally including pandas and NumPy.
Suggested resources to satisfy prerequisites: Kaggle's pandas Tutorials, Kaggle's Intro to Machine Learning, Accelerating Data Science Workflows with RAPIDS
Objectives
- Implement GPU-accelerated data preparation and feature extraction using cuDF and Apache Arrow data frames
- Apply a broad spectrum of GPU-accelerated machine learning tasks using XGBoost and a variety of cuML algorithms
- Execute GPU-accelerated graph analysis with cuGraph routines, rapidly achieving massive-scale analytics
Follow-On Courses
Content
Introduction
- Meet the instructor.
- Create an account at courses.nvidia.com/join
GPU-Accelerated Data Manipulation
- Ingest and prepare several datasets (some larger than memory) for use in multiple machine learning exercises later in the workshop:
  - Read data directly to single and multiple GPUs with cuDF and Dask cuDF.
  - Prepare population, road network, and clinic information for machine learning tasks on the GPU with cuDF.
GPU-Accelerated Machine Learning
- Apply several essential machine learning techniques to the data that was prepared in the first section:
  - Use supervised and unsupervised GPU-accelerated algorithms with cuML.
  - Train XGBoost models with Dask on multiple GPUs.
  - Create and analyze graph data on the GPU with cuGraph.
Project: Data Analysis to Save the UK
- Apply new GPU-accelerated data manipulation and analysis skills with population-scale data to help stave off a simulated epidemic affecting the entire UK population:
  - Use RAPIDS to integrate multiple massive datasets and perform real-world analysis.
  - Pivot and iterate on your analysis as the simulated epidemic provides new data for each simulated day.
Assessment and Q&A
Teaching Methods: