Detailed Course Outline
Module 1 - Data Warehouse Solutions on Google Cloud
Topics:
- Implementing Big Data Solutions on Google Cloud
- Customer Needs
- Sample Architectures
- Migration Strategies and Planning
- Working with PSO
Objectives:
- Describe the Google portfolio of Data Warehouse and Data Processing services
- Identify the Google strategy for Data Warehouse products and services
- Locate technical resources for Data Warehouse partners
Module 2 - BigQuery for Data Warehousing Professionals
Topics:
- BigQuery Concepts
- BigQuery Permissions and Security
- Monitoring and Auditing
- Schema Design
- Partitioning and Clustering
- Data Capture and Load Jobs
- Handling Change and Slowly Changing Dimensions
- Querying Data
- Managing Workloads and Concurrency
- Analyzing Data
- Sizing and Cost Management
- Query Optimization
- Storage Optimization
Objectives:
- Describe the key components of a successful Data Warehouse implementation on BigQuery
- Identify best practices for implementing a Data Warehouse with BigQuery
- Use the Google Cloud console to access public datasets
- Perform queries using the console and analyze query results using client libraries
- Combine ecommerce datasets to create enhanced datasets using BigQuery joins and unions
Module 3 - Migrating to BigQuery
Topics:
- Migration Phases
- Security
- Google Cloud Data Warehouse Architecture
- Post-Migration
- User Adoption
Objectives:
- Assess an existing data warehouse and develop a strategy to migrate it to BigQuery
- Describe best practices for migrating existing data warehouses to BigQuery
- Identify key resources, tools, and partner assets for migrating to BigQuery
- Migrate sample SQL Server data to BigQuery using Striim
- Identify resources to translate product-specific SQL queries to BigQuery Standard SQL
Module 4 - ETL Tools and Positioning
Topics:
- Dataproc
- Cloud Data Fusion
- Dataflow
Objectives:
- Describe the key features of Dataproc, Cloud Data Fusion, and Dataflow
- Migrate Apache Spark jobs to Dataproc
- Identify best practices for creating Dataflow workflows using Dataflow templates
- Configure Cloud Data Fusion to create a data transformation pipeline joining multiple sources with BigQuery as an output data sink
- Build data pipelines that will ingest data from Cloud Storage into BigQuery using Dataflow
Module 5 - Streaming Analytics
Topics:
- Why Streaming Analytics?
- The Pub/Sub Service
- Dataflow Windows and Triggers
- Dataflow Sources and Sinks
- Migration and Adoption Challenges
Objectives:
- Identify the components of a streaming analytics solution on Google Cloud
- Create a streaming IoT pipeline using Pub/Sub and Kafka
- Explore design patterns and optimization considerations for streaming analytics solutions
- Create and run a streaming Dataflow pipeline that ingests data from Pub/Sub to BigQuery using a Dataflow template
Module 6 - Introduction to Looker as a Data Platform
Topics:
- Looker Platform Overview
- Looker Platform Architecture
- Paradigm Shift: Modeling Language versus Hardcoded SQL
- Core Analytical Concepts
Objectives:
- Navigate the Looker platform
- Describe the Looker platform architecture
- Discover the advantages of Looker Modeling Language (LookML) over hardcoded SQL
- Describe the four core analytical concepts in Looker
- Analyze and visualize data using Explores in Looker
Module 7 - BigQuery Extended Capabilities
Topics:
- BigQuery GIS
- BigQuery ML
Objectives:
- Describe the key features of BigQuery GIS and BigQuery ML
- Analyze data using BigQuery GIS functions and visualize results using BigQuery Geo Viz
- Train and evaluate an ML model with BigQuery ML to predict taxi fares