Exploring and Analyzing Data with Splunk (EADS) – Outline

Detailed Course Outline

Topic 1 – What is Data Science

  • Define terms related to analytics and data science
  • Describe the analytics workflow
  • Describe Artificial Intelligence and Machine Learning
  • Examine common Machine Learning myths
  • Describe Splunk’s Machine Learning tools

Topic 2 – Exploratory Data Analysis

  • Use bin and makecontinuous to restructure and visualize data
  • Examine field statistics with fieldsummary
  • Transform fields with eval and fillnull
  • Clean text with the rex and cleantext commands
  • Solve Anscombe’s Quartet
  • Apply boxplots and 3D scatterplots to visualize data

Topic 3 – Event Clustering

  • Take a behavior-based approach to clustering data
  • Cluster numerical fields using the kmeans command
  • Cluster based on string similarity with the cluster command
  • Find patterns in clusters

Topic 4 – Correlations and Transactions

  • Define correlation and co-occurrence
  • Use SPL correlation commands
  • Use the statistical tests from the Machine Learning Toolkit to correlate fields
  • Use streamstats and chart commands to correlate data

Topic 5 – Anomaly Detection

  • Define statistical outliers
  • Use ad hoc methods of numerical anomaly detection
  • Find numerical or categorical anomalies with the anomalydetection command

Topic 6 – Forecasting

  • Define forecasting use cases
  • Use the predict command to forecast future time series values