Overview
This course explores how to use the iterative machine learning (ML) process pipeline to solve a real business problem in a project-based learning environment. Students will learn about each phase of the pipeline from instructor presentations and demonstrations, and then apply that knowledge to complete a project solving one of three business problems: fraud detection, recommendation engines, or flight delays. By the end of the course, students will have built, trained, evaluated, tuned, and deployed an ML model using Amazon SageMaker that solves their selected business problem. Learners with little to no machine learning experience or knowledge will benefit from this course. Basic knowledge of statistics will be helpful.
Activities
This course includes presentations, group exercises, demonstrations, and hands-on labs.
Objectives
In this course, participants will be able to:
Prerequisites
We recommend that attendees of The Machine Learning Pipeline on AWS course have:
- Basic knowledge of the Python programming language
- Basic understanding of AWS Cloud infrastructure (Amazon S3 and Amazon CloudWatch)
- Basic experience working in a Jupyter notebook environment
Course Outline
Introduction to Machine Learning and the ML Pipeline
- Overview of machine learning, including use cases, types of machine learning, and key concepts
- Overview of the ML pipeline
- Introduction to course projects and approach
Introduction to Amazon SageMaker
- Introduction to Amazon SageMaker
- Demo: Amazon SageMaker and Jupyter notebooks
- Hands-on: Amazon SageMaker and Jupyter notebooks
Problem Formulation
- Overview of problem formulation and deciding if ML is the right solution
- Converting a business problem into an ML problem
- Demo: Amazon SageMaker Ground Truth
- Hands-on: Amazon SageMaker Ground Truth
- Practice problem formulation
- Formulate problems for projects
Preprocessing
- Overview of data collection and integration, and techniques for data preprocessing and visualization
- Practice preprocessing
- Preprocess project data
- Class discussion about projects
Model Training
- Choosing the right algorithm
- Formatting and splitting your data for training
- Loss functions and gradient descent for improving your model
- Demo: Create a training job in Amazon SageMaker
Model Evaluation
- How to evaluate classification models
- How to evaluate regression models
- Practice model training and evaluation
- Train and evaluate project models
- Initial project presentations
Feature Engineering and Model Tuning
- Feature extraction, selection, creation, and transformation
- Hyperparameter tuning
- Demo: SageMaker hyperparameter optimization
- Practice feature engineering and model tuning
- Apply feature engineering and model tuning to projects
- Final project presentations