
For the Developer Who Wants to Get Introduced to AI on AWS

Many AI classes cover the theoretical side of algorithms. In this curriculum, we will train you to become a Full Stack Data Scientist, capable of not just training models but also deploying and managing them in production for business value. You will build your Full Stack DS skills by building 7 production AI microservices on AWS.

 

This curriculum is designed around 8 hands-on sessions in which you will build, step by step, production-grade AI services for business applications. You will learn how to build production AI on AWS, how to manage an AI through its lifecycle, and how to select the right AI algorithms for different business use cases.

Each session will be 1.5 hours and will cover:

  • A business use case and how AI maps to that use case

  • An AI algorithm and an AI-internals technical topic. For example, how and when to use Regression and Classification, and how to map the right AI algorithms to each data type.

  • How to build a production AI (Feature Engineering, Model Training, Model Validation, Endpoints, Gateway/Lambda Integration, Application Integration).

  • One or more AWS tools. Across the 8 sessions we will cover AWS SageMaker, SageMaker built-in algorithms, SageMaker with custom Docker containers, SageMaker Endpoints, AWS Lambda, AWS API Gateway, AWS roles and authentication, AWS CloudWatch, AWS S3, API testing with Postman and cURL, production AI best practices, and the microservice design pattern for AI deployment. We will also cover Navigator, a lifecycle overlay tool that makes AWS tools easier to configure and use.

  • You will build 7 working production AI services. 

 

We will cover the following business use cases:

  • Churn: Detect whether your customers are about to leave your business or your product. Predict which customers are at risk of churning.

  • Pricing Analysis: Understand which factors most affect price. Predict how price changes with product features. Assess the viability of pricing for future products.

  • Customer approvals: Should you approve a loan for a particular customer? Predict whether a customer is likely to be delinquent on a bill.

  • Appointment planning: Is a customer likely to miss an appointment? Can you plan your appointment schedule more effectively? 

  • Removing bias in Business Analysis: Bias in your AI data can lead to poor outcomes, unhappy users, or even legal problems. Learn how to detect and remove sources of bias.

  • Sentiment Analysis: Are your customers happy with your product? What do their comments, tweets, and other writings say about their feelings?

  • Making recommendations: What can you learn about your customers or users? Can you analyze their usage and see what else you can upsell to them?

 

You will learn how to use these Production AI Cloud Tools:

  • How to configure and use S3 for your data (a short boto3 sketch follows this list)

  • How to feature engineer your data with Python code

  • How to configure and use AWS SageMaker. Deep dive into the built-in SageMaker algorithms KNN and XGBoost. How to hyper-parameter tune SageMaker algorithms.

  • How to bring custom code into AWS SageMaker as a Docker container

  • Configuring and using a SageMaker Endpoint.

  • Connecting a SageMaker Endpoint to a public URL via AWS API Gateway and Lambda.

  • Integrating REST microservices with applications. Using cURL/Postman for API testing.

  • AWS cloud best practices. CloudWatch for logs, managing endpoints.

  • Navigator for ease of use. How to use Navigator and AWS together.
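
To make the S3 item above concrete, here is a minimal sketch of that workflow using boto3; the bucket name and file paths are placeholders, and it assumes your AWS credentials are already configured.

    # Minimal sketch: upload a training dataset to S3 and read it back with boto3.
    # "my-ai-course-bucket" and the file names below are placeholders.
    import boto3

    s3 = boto3.client("s3")

    # Upload a local CSV so SageMaker can read it from S3 later
    s3.upload_file("churn_train.csv", "my-ai-course-bucket", "churn/train/churn_train.csv")

    # Download it again (for example, to inspect it locally)
    s3.download_file("my-ai-course-bucket", "churn/train/churn_train.csv", "churn_train_copy.csv")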

 

AI Algorithms, Algorithm Internals and ML Technical Concepts

  • How to build and use production-grade AI. How to select an algorithm for a use case; train, deploy, and use it in production; and measure how well it is doing.

  • How to map each use case to AI - what type of AI can be used, for what data type. How to measure the AI. 

  • Powerful general-purpose algorithms such as XGBoost, Linear Learner, and KNN: how they work internally and how to hyper-parameter tune them for best performance.

  • Metrics and practices for algorithmic evaluation, and how to map them to each use case.

  • AI Trust, Fairness, and Bias. Managing AI-related risk in business use cases.

  • Explainability

  • An intro to advanced aspects of Production AI - live monitoring and diagnosis, model versioning, retraining, and more.

 

Prerequisites

  • Bring an AWS account (you can sign up for one free at AWS). You will use your own AWS account to run the hands-on labs, and all artifacts (models, datasets) stay in your account for further use.

  • We assume that you have some coding experience in at least one programming language. We will provide examples primarily in Python.

Lesson Plan


Session 01: Build and run a production ML service in the cloud - Customer Churn

 

In this first session, we will show the steps needed to build and run a production ML service in the cloud. We will illustrate these steps with our first business use case, Customer Churn. A short training code sketch follows the session outline below.

  • Overview of the production ML lifecycle and all of its steps. How to go from data to a running production ML service

  • Description of cloud services that can be used for each stage

  • Overview of a customer churn problem and how to build an ML service for it on AWS using Binary Classification algorithms

  • Lab 1: Build an ML service for Customer Churn and test it.
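
To give a feel for Lab 1, here is a hedged sketch of training the SageMaker built-in XGBoost algorithm for binary churn classification with the SageMaker Python SDK. The bucket, S3 prefixes, IAM role ARN, and hyper-parameter values are placeholders, not the exact settings used in the lab.

    # Sketch: train a SageMaker built-in XGBoost model for churn (binary classification).
    # Assumes the training CSV (label in the first column, no header) is already in S3,
    # and that ROLE_ARN is an IAM role with SageMaker and S3 permissions. Names are placeholders.
    import sagemaker
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput

    session = sagemaker.Session()
    region = session.boto_region_name
    ROLE_ARN = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

    # Resolve the image URI for the built-in XGBoost algorithm
    image_uri = sagemaker.image_uris.retrieve("xgboost", region, version="1.5-1")

    estimator = Estimator(
        image_uri=image_uri,
        role=ROLE_ARN,
        instance_count=1,
        instance_type="ml.m5.large",
        output_path="s3://my-ai-course-bucket/churn/output",
        sagemaker_session=session,
    )
    estimator.set_hyperparameters(objective="binary:logistic", num_round=100)

    train_input = TrainingInput("s3://my-ai-course-bucket/churn/train/", content_type="text/csv")
    estimator.fit({"train": train_input})

Deploying the resulting model behind an endpoint, Lambda, and API Gateway is covered in Session 04.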

Session 02: Feature Engineering and KNN

This session builds on the first. We cover an important aspect of the ML pipeline, Feature Engineering: the transformations that are required before a dataset can be used by an AWS ML algorithm to build a model. We also examine the Pricing Analysis use case and how a production AI based on regression techniques can address it. We follow the lifecycle steps shown in the first session, but go deeper into how to perform appropriate feature engineering for different scenarios. We also introduce an ML algorithm called KNN and show how it can be used for both regression and classification problems. In the code lab, attendees will perform feature engineering transforms, save the transformed data back to S3, and get started with configuring a KNN algorithm in AWS SageMaker. A short pandas sketch of these transforms follows the session outline below.

  • How to perform feature engineering as a part of the AI life cycle in production

  • Overview of a business Pricing Problem and how to build an AI for it using Regression Algorithms

  • Common approaches to Feature Engineering - One Hot Encoding and Missing Value handling

  • Lab 2: Perform feature engineering on two datasets, Pricing and Churn. Build an AI in the cloud using AWS SageMaker (KNN)
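
As a preview of the Lab 2 transforms, below is a minimal pandas sketch of one-hot encoding and missing-value handling; the file names and the "region" column are hypothetical.

    # Sketch: basic feature engineering before training (placeholder file and column names).
    import pandas as pd

    df = pd.read_csv("pricing_raw.csv")

    # Handle missing values: fill numeric gaps with the column median
    numeric_cols = df.select_dtypes(include="number").columns
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

    # One-hot encode categorical columns (e.g., a hypothetical "region" column)
    df = pd.get_dummies(df, columns=["region"], drop_first=True)

    # Save the transformed data; this file would then be uploaded back to S3
    df.to_csv("pricing_features.csv", index=False)

The transformed CSV would then be uploaded back to S3 (as in the boto3 sketch earlier) before training.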

Session 03: Evaluating the effectiveness of production AI, and XGBoost

 

In this session, we will explore a new algorithm called XGBoost and configure its hyper-parameters in AWS SageMaker. We will also learn how to evaluate the performance of an AI algorithm and use these criteria to tune the hyper-parameters and pick the best model. In particular, we delve into evaluation metrics beyond accuracy and cover both ML metrics (such as the Confusion Matrix) and production service metrics (such as latency and scale). As a case study, we will use the Pricing and Churn datasets introduced in the previous sessions. A short evaluation sketch follows the session outline below.

  • How to build and configure hyper-parameters for both regression and classification type of problems

  • Understand the hyper-parameters exposed by AWS SageMaker and use KNN and XGBoost for performing training on the two types of problems: Classification and Regression

  • How hyper-parameters affect solution quality and performance

  • Lab 3: Train and evaluate several ML algorithms in the cloud using AWS SageMaker and AWS CloudWatch
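
Below is a small, self-contained sketch of the kind of evaluation discussed in this session, computing a confusion matrix and related metrics with scikit-learn; the label and prediction arrays are toy placeholders for your validation labels and model outputs.

    # Sketch: evaluate classification predictions beyond plain accuracy.
    # y_true and y_pred are placeholders for your validation labels and model predictions.
    from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score

    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

    print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
    print("Accuracy:", accuracy_score(y_true, y_pred))
    print("Precision:", precision_score(y_true, y_pred))
    print("Recall:", recall_score(y_true, y_pred))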

Session 04: AI deployment as a microservice in the Cloud

 

In this session, we will focus on deploying the trained ML model into production as a microservice. This involves several aspects, such as endpoint creation, IAM role creation, and configuration of a Lambda function and API Gateway. The session will focus on the application integration of production AI and how to test and evaluate that integration. In the code lab, attendees will build an end-to-end working AI and perform external evaluation of the AI to assess its performance. A sketch of a Lambda handler that calls a SageMaker endpoint follows the session outline below.

  • How to deploy an AI in the cloud as a microservice

  • Evaluate a production AI with both ML metrics (Accuracy, Confusion Matrix, True/False Positives/Negatives, etc.) and service metrics (throughput, latency, etc.)

  • How to test your AI service using cURL and Postman

  • How to integrate your AI service into Python, Java, or JavaScript applications

  • Lab 4: Build an end-to-end production pipeline in the cloud
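
As an illustration of the Lambda/API Gateway integration in Lab 4, here is a hedged sketch of a Lambda handler that forwards a CSV payload to a SageMaker endpoint; the endpoint name is a placeholder and the event parsing assumes a simple API Gateway proxy integration.

    # Sketch: a Lambda function that calls a SageMaker endpoint (placeholder endpoint name).
    # Assumes API Gateway passes the CSV feature row in the request body.
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    ENDPOINT_NAME = "churn-xgboost-endpoint"  # placeholder

    def lambda_handler(event, context):
        payload = event["body"]  # e.g., "0.1,42,3,1,0"
        response = runtime.invoke_endpoint(
            EndpointName=ENDPOINT_NAME,
            ContentType="text/csv",
            Body=payload,
        )
        prediction = response["Body"].read().decode("utf-8")
        return {"statusCode": 200, "body": prediction}

The resulting public URL can then be exercised with cURL or Postman, as described above.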

Session 05: Custom AI algorithms in the Cloud: Use Case - Sentiment Analysis

 

In this session, we will focus on creating custom algorithms and using them in the cloud. We dive deep into algorithms for Text Classification, particularly the Bag of Words approach: how it works, its advantages, and its limitations. We will look at how AWS SageMaker allows the deployment of custom algorithms, such as text classification using bag of words. This involves building a custom Docker container and configuring an AWS client in the local environment to push the container to AWS ECR. The session focuses on building custom algorithms and deploying them as microservices. In the code lab, you will build a custom algorithm (Sentiment Analysis use case), create a Docker container that conforms to AWS SageMaker requirements for custom algorithms, and build an end-to-end working AI for the custom algorithm. A minimal bag-of-words sketch follows the session outline below.

  • How to build and evaluate production AIs that use Text Classification algorithms

  • Overview of a business Sentiment Analysis problem and how to build an AI for it using Text Classification in either Binary or Multiclass form

  • How to build custom AI algorithms

  • How to build Docker containers for the custom algorithms that conform to AWS requirements.

  • How to push the Docker container to AWS ECR so that it can be used by SageMaker.

  • Lab 5: Build an end-to-end production pipeline in the cloud using custom algorithms.
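
To preview the custom algorithm in Lab 5, here is a minimal bag-of-words sentiment classifier built with scikit-learn; the tiny in-line dataset is purely illustrative. In the lab, logic like this is packaged into a Docker container that meets SageMaker's contract for custom algorithms.

    # Sketch: bag-of-words sentiment classification with scikit-learn (toy data for illustration).
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    texts = ["love this product", "terrible support", "works great", "very disappointed"]
    labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(texts)  # bag-of-words feature matrix

    model = LogisticRegression()
    model.fit(X, labels)

    print(model.predict(vectorizer.transform(["this product is great"])))  # expect [1]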

Session 06: Skew and Bias in AI - How to Detect and Remove Them in Production AI

 

In this session, we will spend part of the time finishing the end-to-end lifecycle with the custom algorithm described in the previous session, as it typically takes more than one session to cover. After that, we focus on the critical issues of data skew and AI bias. We cover how skew in data impacts the performance of AI and the steps that can be taken to eliminate it. We will also cover the common ways that bias can enter a production AI implementation and how to detect and remove it. As case studies, we use the publicly available Appointment Planning and COMPAS datasets. In the code lab, attendees will build several AIs using these datasets. We will also introduce the overlay tool Pyxeda Navigator and provide resources for attendees to use it. A short skew/bias checking sketch follows the session outline below.

  • What is data skew and how can it be overcome?

  • What is AI bias and why should it be avoided?

  • How can AI Bias enter? How can it be detected and removed?

  • How to test for Bias and apply Feature Engineering to reduce it.

  • Lab 6: Build several AIs and perform analysis to detect skew and bias
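
As a flavor of the Lab 6 analysis, here is a small pandas sketch that checks for label skew and for a simple bias signal (different positive-prediction rates across a sensitive attribute); the file and column names are hypothetical.

    # Sketch: quick checks for data skew and a simple bias signal (placeholder column names).
    import pandas as pd

    df = pd.read_csv("appointments.csv")

    # Skew check: how imbalanced is the label?
    print(df["no_show"].value_counts(normalize=True))

    # Bias check: does the positive prediction rate differ across a sensitive attribute?
    rates = df.groupby("gender")["predicted_no_show"].mean()
    print(rates)
    print("Demographic parity gap:", rates.max() - rates.min())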

Session 07: Mapping AI problems to Techniques, and Drift in Production: Case Study - Making Recommendations

 

In the first 6 sessions, we covered different types of use cases, how to map AI to each use case, and different aspects of production AI (hyper-parameter tuning, REST API testing and integration, etc.). In this session we combine these into a methodology for mapping problems to AI methods for numerical, categorical, and free-form text data. We also cover in depth another important topic, drift, and its impact on real-world production deployments. In the code lab, attendees will build a production recommendation system using publicly available data, illustrating the concepts covered so far. A short drift-detection sketch follows the session outline below.

  • A methodology for mapping use cases (with different data types) into AI algorithms and lifecycle steps

  • The concept of drift in production and how it impacts a Business

  • Techniques to detect drift in production

  • Lab 7: Build and use a production AI for recommendations
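
Below is a hedged sketch of one common drift check discussed in this session: comparing a feature's training distribution against its recent production distribution with a two-sample Kolmogorov-Smirnov test from SciPy. The file names and the "price" column are placeholders.

    # Sketch: detect drift in a single numeric feature with a two-sample KS test.
    # The CSVs are placeholders for training data and recent production inputs.
    import pandas as pd
    from scipy.stats import ks_2samp

    train_df = pd.read_csv("train_features.csv")
    prod_df = pd.read_csv("recent_production_features.csv")

    stat, p_value = ks_2samp(train_df["price"], prod_df["price"])
    print(f"KS statistic={stat:.3f}, p-value={p_value:.3f}")
    if p_value < 0.05:
        print("Distribution shift detected for 'price' - consider retraining.")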

Session 08: Advanced topics in Production Cloud AI - Complete and showcase your project

​About AIClub’s Professional Development Program (AIClubPro)

AIClub is an education technology company focused on Artificial Intelligence Literacy. We educate individuals from students to professionals, covering all aspects of Artificial Intelligence and related technologies at appropriate depth from introductory to advanced, depending on the individual's prior knowledge and future interests. Our AIClubPro programs have educated professionals from many industries, with roles ranging from Engineering and Operations to Product Management, Marketing and Executive Leadership. 

 

The AIClubPro program is led by three founders with exceptional depth and expertise in both industry and academia. Together, our leadership team has over 40 years of professional experience in technology, has served in executive roles at public companies and startups, has founded four companies, and holds over 200 patents and over 100 research publications.

Ready to take your learning further?

Book a FREE consultation for a personalized AI roadmap.
