Detailed Course Outline
Topic 1 – What is Data Science
- Define terms related to analytics and data science
- Describe the analytics workflow
- Describe Artificial Intelligence and Machine Learning
- Examine common Machine Learning myths
- Describe Splunk’s Machine Learning tools
Topic 2 – Exploratory Data Analysis
- Use bin and makecontinuous to restructure and visualize data
- Examine field statistics with fieldsummary
- Transform fields with eval and fillnull
- Clean text with the rex and cleantext commands
- Solve Anscombe’s Quartet
- Use boxplots and 3D scatterplots to visualize data
Topic 3 – Event Clustering
- Take a behavior-based approach to clustering data
- Cluster numerical fields using the kmeans command
- Cluster based on string similarity with the cluster command
- Find patterns in clusters
Topic 4 – Correlations and Transactions
- Define correlation and co-occurrence
- Use SPL correlation commands
- Use the statistical tests from the Machine Learning Toolkit to correlate fields
- Use streamstats and chart commands to correlate data
Topic 5 – Anomaly Detection
- Define statistical outliers
- Use ad hoc methods of numerical anomaly detection
- Find numerical or categorical anomalies with the anomalydetection command
Topic 6 – Forecasting
- Define forecasting use cases
- Use the predict command to forecast time series data