Description
In this course, you will learn how to:
- Apply best practices for storing and ingesting large volumes of text data to support distributed training.
- Investigate data parallelism and model parallelism libraries that facilitate distributed training in SageMaker.
- Explain SageMaker's options for improving training performance, such as Amazon SageMaker Training Compiler and Elastic Fabric Adapter (EFA).
- Investigate large language model (LLM) optimization techniques for efficient model deployment.
- Fine-tune foundation models available in SageMaker JumpStart.
Syllabus:
1. Addressing the Challenges of Building Language Models
- Section 1: Common Challenges
- Common LLM Practitioner Challenges
- Section 2: Multi-Machine Training Solutions
- Scaling LLMs with Distributed Training
- Applying Data Parallelism Techniques
- Applying Model Parallelism Techniques
- Section 3: Performance Optimization Solutions
- Performance Optimization Techniques
- Using Purpose-Built Infrastructure
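
Before the course turns to SageMaker specifics, a minimal framework-level sketch of the data parallelism idea covered in Section 2 may help, using plain PyTorch DistributedDataParallel. The model, dataset, and hyperparameters below are hypothetical stand-ins; SageMaker's own distributed libraries are covered in Module 4.

```python
# A minimal, framework-level sketch of data parallelism with PyTorch
# DistributedDataParallel (DDP). Illustrative only: the model, dataset,
# and hyperparameters are hypothetical placeholders.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

def train():
    # Each process drives one GPU; a launcher such as torchrun sets the
    # environment variables that init_process_group reads.
    dist.init_process_group(backend="nccl")
    local_rank = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(128, 2).cuda()   # stand-in for a real LLM
    model = DDP(model, device_ids=[local_rank])

    dataset = TensorDataset(torch.randn(1024, 128), torch.randint(0, 2, (1024,)))
    # DistributedSampler gives each process a distinct shard of the data.
    loader = DataLoader(dataset, batch_size=32, sampler=DistributedSampler(dataset))

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for features, labels in loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(features.cuda()), labels.cuda())
        loss.backward()   # DDP all-reduces gradients across processes here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    train()
```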
2. Using Amazon SageMaker for Training Language Models
- Section 1: Configuring SageMaker Studio
- SageMaker Basics
- Setting up a SageMaker Studio Domain
- Section 2: SageMaker Infrastructure
- Choosing Compute Instance Types
- Section 3: Working with the SageMaker Python SDK
- SageMaker Python SDK Basics
- Training and Deploying Language Models with the SageMaker Python SDK
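
To preview the workflow this module builds toward, here is a minimal sketch of training and deploying a model with the SageMaker Python SDK. The entry-point script, role ARN, S3 bucket, instance types, and hyperparameters are hypothetical placeholders, not course-supplied values.

```python
# A minimal sketch of the SageMaker Python SDK workflow: configure an
# estimator, launch a training job, and deploy the result.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",            # your training script, run in script mode
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # hypothetical
    instance_type="ml.p4d.24xlarge",   # GPU instance suited to LLM training
    instance_count=1,
    framework_version="2.0.1",
    py_version="py310",
    hyperparameters={"epochs": 3, "lr": 2e-5},
)

# fit() provisions the instance, runs train.py, and writes model artifacts to S3.
estimator.fit({"train": "s3://example-bucket/data/train/"})  # hypothetical bucket

# deploy() hosts the trained model behind a real-time endpoint.
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.g5.2xlarge")
```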
3. Ingesting Language Model Data
- Section 1: Preparing Data
- Data Management Overview
- Preparing Data for Ingestion
- Section 2: Analyzing Data Ingestion Options
- Loading Data with the SageMaker Python SDK
- Ingesting Data from Amazon S3
- Ingesting Data with FSx for Lustre
- Additional Data Ingestion Options
- Data Ingestion and Storage Considerations
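
A short sketch contrasting two of the ingestion options this module analyzes, Amazon S3 and Amazon FSx for Lustre. The input classes are real SageMaker Python SDK classes; the bucket name, file-system ID, and paths are hypothetical.

```python
# A sketch of two ingestion paths, using SageMaker SDK input classes with
# hypothetical bucket names, file-system IDs, and paths.
from sagemaker.inputs import FileSystemInput, TrainingInput

# Option 1: stream from Amazon S3. FastFile mode reads objects on demand
# rather than copying the full dataset before training starts, and
# ShardedByS3Key hands each training instance a distinct subset of keys.
s3_train = TrainingInput(
    s3_data="s3://example-bucket/corpus/train/",   # hypothetical
    input_mode="FastFile",
    distribution="ShardedByS3Key",
)

# Option 2: mount Amazon FSx for Lustre for high-throughput file access to
# very large corpora. The training job must run in the file system's VPC.
fsx_train = FileSystemInput(
    file_system_id="fs-0123456789abcdef0",   # hypothetical
    file_system_type="FSxLustre",
    directory_path="/fsx/corpus/train",      # hypothetical mount path
    file_system_access_mode="ro",
)

# Either object is passed as a channel, e.g. estimator.fit({"train": s3_train}).
```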
4. Training Large Language Models
- Section 1: Creating a SageMaker Training Job
- Launching SageMaker Training Jobs
- Modifying Scripts for Script Mode
- Section 2: Optimizing Your SageMaker Training Job
- Monitoring and Troubleshooting
- Optimizing Computational Performance
- SageMaker Training Features for Language Model Training
- Section 3: Using Distributed Training on SageMaker
- SageMaker Distributed Training Support
- Using the SageMaker Distributed Data Parallel Library
- Using the SageMaker Model Parallel Library
- Using the SageMaker Model Parallel Library and Sharded Data Parallelism
- Training with EFA
- Section 4: Compiling Your Training Code
- Using the SageMaker Training Compiler
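
A sketch of how the estimator's distribution argument enables SageMaker's distributed training libraries from Section 3, assuming a hypothetical script, role, bucket, and instance count.

```python
# A sketch of enabling the SageMaker distributed data parallel (SMDDP)
# library through the estimator's distribution argument.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # hypothetical
    instance_type="ml.p4d.24xlarge",   # EFA-capable instances for fast inter-node traffic
    instance_count=2,
    framework_version="2.0.1",
    py_version="py310",
    # Turn on SMDDP; a {"modelparallel": {...}} entry activates the model
    # parallel library (with sharded data parallelism) instead.
    distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
)
estimator.fit({"train": "s3://example-bucket/corpus/train/"})  # hypothetical
```

For Section 4's topic, the Hugging Face estimator variant also accepts a compiler_config=TrainingCompilerConfig() argument (from sagemaker.huggingface) to enable the SageMaker Training Compiler; supported framework versions vary, so check the SDK documentation.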
5. Deploying Language Models
- Section 1: Deploying a Model in SageMaker
- Introduction to SageMaker Deployment
- Choosing a SageMaker Deployment Option
- Section 2: Deploying Models for Inference
- Real-Time Inference Overview
- Using the SageMaker Python SDK for Model Deployment
- Using the SageMaker Inference Recommender
- Section 3: Deploying Large Language Models for Inference
- Optimization Techniques
- Model Compression Techniques
- Model Partitioning
- Optimized Kernels and Compilation
- Deploying with SageMaker LMI Containers
- Section 4: Additional Considerations
- Other Considerations When Deploying Models on SageMaker
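
A minimal sketch of the real-time deployment path described in Section 2, using the SDK's Model class. The container image URI, artifact location, role, and request payload are hypothetical placeholders; deploying with an LMI container (Section 3) follows the same pattern with a purpose-built image.

```python
# A sketch of real-time deployment with the SageMaker Model class.
from sagemaker.deserializers import JSONDeserializer
from sagemaker.model import Model
from sagemaker.serializers import JSONSerializer

model = Model(
    image_uri="<inference-container-image-uri>",              # placeholder
    model_data="s3://example-bucket/artifacts/model.tar.gz",  # hypothetical
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",   # size the instance to the model's memory needs
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
)

# Invoke the endpoint; the request schema depends on the serving container.
print(predictor.predict({"inputs": "Summarize the deployment options in SageMaker."}))
```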
6. Customizing Foundation Language Models for Generative AI Tasks
- Section 1: Introduction
- Introduction to Foundation Models
- Section 2: Using SageMaker JumpStart
- Getting Started with SageMaker JumpStart
- Deploying SageMaker JumpStart Models with the SageMaker Python SDK
- Selecting an FM
- Section 3: Customizing FMs
- Prompt Engineering
- Fine-tuning JumpStart Models with the SageMaker Python SDK
- Section 4: Retrieval Augmented Generation (RAG)
- Using Retrieval Augmented Generation (RAG)
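
Finally, a sketch of the JumpStart fine-tune-and-deploy flow from Sections 2 and 3, using the SDK's JumpStartEstimator. The model ID, hyperparameters, and S3 path are hypothetical; browse the JumpStart catalog for current model IDs and expected dataset formats.

```python
# A sketch of fine-tuning and deploying a JumpStart foundation model.
from sagemaker.jumpstart.estimator import JumpStartEstimator

estimator = JumpStartEstimator(
    model_id="huggingface-llm-falcon-7b-bf16",   # hypothetical model ID
    hyperparameters={"epochs": "1"},             # hypothetical settings
)

# Fine-tune on your own dataset; the expected format depends on the model.
estimator.fit({"training": "s3://example-bucket/fine-tune-data/"})

# Deploy the fine-tuned model to a real-time endpoint with model defaults.
predictor = estimator.deploy()
print(predictor.predict({"inputs": "What is retrieval augmented generation?"}))
```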