Data Science and Big Data Analytics (MR-1CP-DSBDA)

 

Course Overview

This course provides practical, foundation-level training that enables immediate and effective participation in Big Data and other analytics projects. It includes an introduction to Big Data and to the data analytics lifecycle as a way to address business challenges that leverage Big Data. The course provides grounding in basic and advanced analytic methods and an introduction to Big Data analytics technology and tools, including MapReduce and Hadoop. Labs offer opportunities for students to understand how a practicing data scientist may apply these methods and tools to real-world business challenges. The course takes an “open,” or technology-neutral, approach and includes a final lab that addresses a Big Data analytics challenge by applying the concepts taught in the course in the context of the data analytics lifecycle. The course prepares the student for the Dell EMC Proven™ Professional Data Scientist Associate (EMCDSA) certification exam.

Teaching methods:
  • Pre-course knowledge-check quiz (if applicable)
  • Training delivered by an instructor certified by the vendor
  • Training available in person or remotely
  • Remote labs/lab platform provided to each participant (if applicable to the course)
  • Official course materials in English distributed to each participant
    • A working knowledge of written technical English is required to understand the course materials
Assessment methods:
  • Pre-course knowledge-check quiz (if applicable)
  • Formative assessments during the course, through hands-on exercises carried out in the labs at the end of each module, multiple-choice quizzes, simulated scenarios…
  • Each participant completes a questionnaire and/or self-assessment questionnaire before and after the course to validate the skills acquired

Who should attend

This course is intended for individuals seeking to develop an understanding of Data Science from the perspective of a practicing Data Scientist, including:

  • Managers of teams of business intelligence, analytics, and big data professionals
  • Current business and data analysts looking to add big data analytics to their skills
  • Data and database professionals looking to exploit their analytic skills in a big data environment
  • Recent college graduates and graduate students with academic experience in a related discipline looking to move into the world of data science and big data
  • Individuals seeking to take advantage of the EMC Proven™ Professional Data Scientist Associate (EMCDSA) certification

Prerequisites

To complete this course successfully and gain the maximum benefits from it, a student should have the following knowledge and skill sets:

  • A strong quantitative background with a solid understanding of basic statistics, as would be found in a statistics 101 level course
  • Experience with a scripting language, such as Java, Perl, or Python (or R). Many of the lab examples taught in the course use R (with an RStudio GUI), which is an open-source statistical tool and programming language
  • Experience with SQL

Course Objectives

Upon successful completion of this course, participants should be able to:

  • Immediately participate as a data science team member
  • Work with large data sets and generate insights
  • Build predictive and classification models
  • Manage a data analytics project through the entire lifecycle

Course Content

Module 1 – Introduction to Big Data analytics
  • Big Data and its characteristics
  • Business value from Big Data
  • Data scientist
Module 2 – Data Analytics Lifecycle
  • Data analytics lifecycle overview
  • Discovery phase
  • Data preparation phase
  • Model planning phase
  • Model building phase
  • Communicate results phase
  • Operationalize phase
Module 3 – Basic data analytics methods using R
  • Introduction to the R programming language
  • Analyzing and exploring data
  • Statistics for model building and evaluation
Module 4 – Advanced analytics theory and methods
  • Introduction to advanced analytics—theory and methods
  • K-means clustering
  • Association rules
  • Linear regression
  • Logistic regression
  • Text analysis
  • Naïve Bayes
  • Decision trees
  • Time series analysis
Module 5 – Advanced analytics technology and tools
  • Introduction to advanced analytics—technology and tools
  • Hadoop ecosystem
  • In-database analytics: SQL essentials
  • Advanced SQL and MADlib
Module 6 – Putting it all together
  • Preparing to operationalize
  • Preparing project presentations
  • Data visualization techniques

Price & Delivery methods

Online training

Duration
5 days

Price
  • €2,824
 

Schedule

Enrollment deadline: registration is possible up to the course date
Instructor-led Online Training

Language: English

Time zone: Central European Time (CET) ±1 hour

Online training. Time zone: Central European Time (CET). Language: English
Online training. Time zone: Central European Summer Time (CEST). Language: English

7-hour time difference

Online training. Time zone: Central Standard Time (CST). Language: English