Splunk Data Science Analyst Fast Track (SDSA-FT) – Outline

Detailed Course Outline

Topic 1 – Using Lookup Commands
  • Understand lookups
  • Use the inputlookup command to search lookup files
  • Use the lookup command to invoke field value lookups
  • Use the outputlookup command to create lookups
  • Invoke geospatial lookups in search
Topic 2 – Adding a Subsearch
  • Define subsearch
  • Use subsearch to filter results
  • Identify when to use subsearch
  • Understand subsearch limitations and alternatives
Topic 3 – Using the return Command
  • Use the return command to pass values from a subsearch
  • Compare the return and fields commands
Topic 4 – Optimize Search
  • Understand how search modes affect performance
  • Examine the role of the Splunk Search Scheduler
  • Review general search practices
Topic 5 – Report Acceleration
  • Define acceleration and acceleration types
  • Understand report acceleration and create an accelerated report
  • Reveal when and how report acceleration summaries are created
  • Search against acceleration summaries
Topic 6 – Data Model Acceleration
  • Understand data model acceleration
  • Accelerate a data model
  • Use the datamodel command to search data models
Topic 7 – Using the tstats Command
  • Explore the tstats command
  • Search acceleration summaries with tstats
  • Search data models with tstats
  • Compare tstats and stats
Topic 8 – What is Data Science
  • Define terms related to analytics and data science
  • Describe the analytics workflow
  • Describe Artificial Intelligence and Machine Learning
  • Examine common Machine Learning myths
  • Describe Splunk’s Machine Learning tools
Topic 9 – Exploratory Data Analysis
  • Use bin and makecontinuous to restructure and visualize data
  • Examine field statistics with fieldsummary
  • Transform fields with eval and fillnull
  • Clean text with the rex and cleantext commands
  • Solve Anscombe’s Quartet
  • Apply boxplots and 3D scatterplots to visualize data
Topic 10 – Event Clustering
  • Take a behavior-based approach to clustering data
  • Cluster numerical fields using the kmeans command
  • Cluster based on string similarity with the cluster command
  • Find patterns in clusters
Topic 11 – Correlations and Transactions
  • Define correlation and co-occurrence
  • Use SPL correlation commands
  • Use the statistical tests from the Machine Learning Toolkit to correlate fields
  • Use streamstats and chart commands to correlate data
Topic 12 – Anomaly Detection
  • Define statistical outliers
  • Use ad hoc methods of numerical anomaly detection
  • Find numerical or categorical anomalies with the AnomalyDetection command
Topic 13 – Forecasting
  • Define forecasting use cases
  • Use the predict command to forecast future time series values