Course Summary
Learn how to create an end-to-end, hardware-accelerated machine learning pipeline for large datasets. Throughout the development process, you’ll use diagnostic tools to identify performance bottlenecks and learn how to mitigate common pitfalls.
Please note that once a booking has been confirmed, it is non-refundable: your seat cannot be cancelled and no refund will be issued, regardless of attendance.
Assessment Methods:
- Pre-training knowledge check quiz (if applicable)
- Formative assessments during the training, through hands-on lab exercises at the end of each module, multiple-choice quizzes, scenario-based exercises, etc.
- Completion by each participant of a questionnaire and/or a placement questionnaire before and after the training to validate that the skills have been acquired
Prerequisites
- Basic knowledge of a standard data science workflow on tabular data. To gain an adequate understanding, we recommend this article.
- Knowledge of distributed computing using Dask. To gain an adequate understanding, we recommend the “Get Started” guide from Dask.
- Completion of the DLI’s Fundamentals of Accelerated Data Science course, or the ability to manipulate data using cuDF and some experience building machine learning models using cuML.
Objectives
- Develop and deploy an accelerated end-to-end data processing pipeline for large datasets
- Scale data science workflows using distributed computing
- Perform DataFrame transformations that take advantage of hardware acceleration and avoid hidden slowdowns
- Enhance machine learning solutions through feature engineering and rapid experimentation
- Improve data processing pipeline performance by optimizing memory management and hardware utilization
Next Steps
Content
Introduction
- Meet the instructor.
- Create an account at courses.nvidia.com/join.
Advanced Extract, Transform, and Load (ETL)
- Learn to process large volumes of data efficiently for downstream analysis:
- Discuss current challenges of growing data sizes.
- Perform ETL efficiently on large datasets.
- Discuss hidden slowdowns and perform DataFrame transformations properly.
- Discuss diagnostic tools to monitor and optimize hardware utilization.
- Persist data in a way that’s conducive to downstream analytics (a brief sketch of this style of pipeline follows this list).
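For a concrete picture of the kind of work this module covers, here is a minimal sketch of GPU-accelerated ETL with Dask-cuDF. It is illustrative only, not course material: the cluster setup, file paths, and column names are assumptions.

```python
# Minimal illustrative sketch (not course material). Assumes a machine
# with one or more NVIDIA GPUs; paths and column names are hypothetical.
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
import dask_cudf

if __name__ == "__main__":
    # One worker per GPU. The Dask dashboard that comes up with the
    # cluster is one diagnostic tool for spotting stragglers and spills.
    cluster = LocalCUDACluster()
    client = Client(cluster)

    # Lazily read a large Parquet dataset, partitioned across the GPUs.
    ddf = dask_cudf.read_parquet("data/transactions/*.parquet")

    # Vectorized, columnar transformations stay on the GPU fast path;
    # row-wise Python apply() is a classic hidden slowdown.
    ddf = ddf[ddf["amount"] > 0]
    ddf["amount_usd"] = ddf["amount"] / 100.0

    # Aggregate, then persist in a columnar format that downstream
    # analytics can read efficiently.
    daily = ddf.groupby("day")["amount_usd"].sum().to_frame()
    daily.to_parquet("data/daily_totals/")
```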
Training on Multiple GPUs With PyTorch Distributed Data Parallel (DDP)
- Learn how to improve data analysis on large datasets:
- Build and compare classification models.
- Perform feature selection based on the predictive power of new and existing features.
- Perform hyperparameter tuning (a brief sketch follows this list).
- Create embeddings using deep learning and perform clustering on those embeddings.
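To illustrate the rapid-experimentation loop described in this module, here is a minimal sketch that trains and compares GPU-accelerated classifiers with cuML across one hyperparameter. The synthetic dataset, feature names, and parameter values are hypothetical stand-ins for the workshop data.

```python
# Minimal illustrative sketch (not course material): model comparison
# and a small hyperparameter sweep on the GPU. Data is synthetic.
import cupy as cp
import cudf
from cuml.ensemble import RandomForestClassifier
from cuml.model_selection import train_test_split
from cuml.metrics import accuracy_score

# Synthetic tabular data standing in for the workshop dataset.
n = 100_000
X = cudf.DataFrame({f"f{i}": cp.random.rand(n).astype(cp.float32)
                    for i in range(8)})
y = ((X["f0"] + X["f1"]) > 1.0).astype("int32")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Rapid experimentation: sweep one hyperparameter and compare accuracy.
for depth in (4, 8, 16):
    model = RandomForestClassifier(max_depth=depth, n_estimators=50)
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"max_depth={depth}: accuracy={acc:.3f}")
```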
Deployment
- Learn how to deploy and measure the performance of an accelerated data processing pipeline:
- Deploy a data processing pipeline with Triton Inference Server (a client-side sketch follows this list).
- Discuss various tuning parameters to optimize performance.
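For context on what the deployed pipeline looks like from the client side, here is a minimal sketch of a request to a model hosted on Triton Inference Server, using the tritonclient Python package. The server URL, model name, and tensor names are placeholders; in practice they come from your own model configuration.

```python
# Minimal illustrative sketch (not course material): a client request to
# Triton Inference Server. Model and tensor names are hypothetical.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# One batch of feature vectors; shape and dtype must match the model config.
batch = np.random.rand(4, 8).astype(np.float32)

inp = httpclient.InferInput("input__0", list(batch.shape), "FP32")
inp.set_data_from_numpy(batch)
out = httpclient.InferRequestedOutput("output__0")

result = client.infer(model_name="fraud_classifier",
                      inputs=[inp], outputs=[out])
print(result.as_numpy("output__0"))
```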
Assessment and Q&A
Teaching Methods: