Google Cloud Data Engineer

Live Online (VILT) & Classroom Corporate Training Course

This GCP course covers structured, unstructured, and streaming data.

  • CloudLabs

  • Projects

  • Assignments

  • 24x7 Support

  • Lifetime Access

Overview

This Data Engineering on Google Cloud Platform training course teaches attendees how to design data processing systems, build end-to-end data pipelines, analyze data, and carry out machine learning.

Objectives

At the end of the Google Data Engineer training course, participants will be able to:

  • Design and build data processing systems on Google Cloud
  • Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
  • Derive business insights from extremely large datasets using Google BigQuery
  • Train, evaluate, and predict with machine learning models using TensorFlow and Cloud ML
  • Leverage unstructured data using Spark and ML APIs on Cloud Dataproc
  • Enable instant insights from streaming data

Prerequisites

  • Basic proficiency with a common query language such as SQL
  • Experience with data modeling and extract, transform, load (ETL) activities
  • Experience developing applications using a common programming language such as Python
  • Familiarity with Machine Learning and/or statistics

Course Outline

Google Cloud Dataproc Overview
  • Creating and managing clusters
  • Leveraging custom machine types and preemptible worker nodes
  • Scaling and deleting clusters
Running Dataproc Jobs
  • Running Pig and Hive jobs
  • Separation of storage and compute
Integrating Dataproc with Google Cloud Platform
  • Customizing clusters with initialization actions
  • BigQuery support
Making Sense of Unstructured Data with Google’s Machine Learning APIs
  • Google’s Machine Learning APIs
  • Common ML Use Cases
  • Invoking ML APIs
Serverless Data Analysis with Google BigQuery and Cloud Dataflow
Serverless Data Analysis with BigQuery
  • What is BigQuery
  • Queries and Functions
  • Loading data into BigQuery
  • Exporting data from BigQuery
  • Nested and repeated fields
  • Querying multiple tables
  • Performance and pricing
Serverless, Autoscaling Data Pipelines with Dataflow
  • The Beam programming model
  • Data pipelines in Beam Python
  • Data pipelines in Beam Java
  • Scalable Big Data processing using Beam
  • Incorporating additional data
  • Handling stream data
  • GCP Reference architecture
Serverless Machine Learning with TensorFlow on Google Cloud Platform
Getting Started with Machine Learning
  • What is machine learning (ML)
  • Effective ML: concepts, types
  • ML datasets: generalization
Building ML Models with TensorFlow
  • Getting started with TensorFlow
  • TensorFlow graphs and loops + lab
  • Monitoring ML training
Scaling ML Models with Cloud ML
  • Why Cloud ML?
  • Packaging up a TensorFlow model
  • End-to-end training
Feature Engineering
  • Creating good features
  • Transforming inputs
  • Synthetic features
  • Preprocessing with Cloud ML
Building Resilient Streaming Systems on Google Cloud Platform
Architecture of Streaming Analytics Pipelines
  • Stream data processing: Challenges
  • Handling variable data volumes
  • Dealing with unordered/late data
Ingesting Variable Volumes
  • What is Cloud Pub/Sub?
  • How it works: Topics and Subscriptions
Implementing Streaming Pipelines
  • Challenges in stream processing
  • Handling late data: watermarks, triggers, accumulation
Streaming Analytics and Dashboards
  • Streaming analytics: from data to decisions
  • Querying streaming data with BigQuery
  • What is Google Data Studio?
High Throughput and Low-Latency with Bigtable
  • What is Cloud Spanner?
  • Designing Bigtable schema
  • Ingesting into Bigtable