AWS-Online-Tech-Talks
The AWS virtual workshop showcases the use of Amazon SageMaker to build generative AI applications. The speakers explain prompt engineering as a way to obtain specific responses from a generative model and demonstrate the SageMaker console with its proprietary and publicly available models. They also cover training custom models, installing additional packages via containers, data parallelism, and accelerators such as Trainium and Inferentia. The speakers highlight the optimization options available in inference containers for parallelizing models across multiple GPUs and reducing memory requirements. Overall, the workshop emphasizes experimenting with different parameters and options to select the right model for each use case.
In this section of the video, Emily Webber and Mani Khanuja from Amazon Web Services discuss generative AI and its use cases on Amazon SageMaker. They define a foundation model as a way to combine vast amounts of data with neural networks to create a model that can be used in many downstream cases. Generative AI involves generating artifacts such as text, code, images from text, and image-to-image transformations. For instance, one user has created a generative AI assistant that summarizes book reviews. They also discuss how billions of image-text pairs can be combined into a model that generates new images, with Stability AI, the company behind Stable Diffusion, as a prime example of a SageMaker customer.
In this section of the AWS virtual workshop, the speakers explain how generative AI models can generate new and unique images based on provided prompts. Prompt engineering is emphasized as the process of writing a good prompt so that a generative AI model returns the best response. The model they show on the left-hand side generates a new interior based on the prompt provided. They discuss few-shot learning, where the model is given a few examples to learn from, and zero-shot learning, where only a basic text prompt is given. The speakers also point to models from Cohere and AI21 Labs as examples of how generative AI models are used in industry. They then touch on proprietary models hosted on AWS and how to use them.
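The zero-shot vs. few-shot distinction above can be sketched as two prompt strings plus a request payload. The field names below (such as `text_inputs`) follow the convention of JumpStart text-model containers and are an assumption, not taken from the talk:

```python
import json

# Zero-shot: the model receives only an instruction, no worked examples.
zero_shot = "Classify the sentiment of this review: 'The battery life is terrible.'"

# Few-shot: a handful of labeled examples precede the real query.
few_shot = (
    "Review: 'Great camera, love it.' Sentiment: positive\n"
    "Review: 'Screen cracked after a week.' Sentiment: negative\n"
    "Review: 'The battery life is terrible.' Sentiment:"
)

# Hypothetical request body; field names vary by model container.
payload = json.dumps({"text_inputs": few_shot, "max_length": 10})
```

The few-shot prompt ends at the point where the model is expected to continue, so the completion itself becomes the answer.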
In this section, the speakers note that the easiest entry point is prompt engineering with existing models in SageMaker. However, those who want to go deeper can customize and train their own models using SageMaker. This was the case for LG's AI designer Tilda, which was able to generate prints and design clothing in just one month, compared to the nine months it would take a human fashion designer. Generative AI models like Tilda not only speed up production but also enable sustainable practices by minimizing waste. Ultimately, the first step for anyone starting a generative AI journey is prompt engineering with existing models, but training custom models is also possible in SageMaker.
In this section, the speaker discusses the different stages of building artificial intelligence (AI) applications with Amazon SageMaker. The first phase involves using existing models with custom data sets to adapt them to a particular domain. The second stage is fine-tuning, which involves modifying the model parameters with specific training data. The third stage is pre-training, which entails creating a new foundation model using large-scale GPU optimization and data sets specific to the industry. The speaker demonstrates the SageMaker console, which features proprietary and publicly available models suited to different benchmarks. The Holistic Evaluation of Language Models (HELM) benchmark is also available to help users select the appropriate model for their use case.
In this section of the video, the speakers discuss Stanford's independent model evaluation, HELM, which gives a sense of which generative AI model to pick for a given scenario; however, what works for one use case may not work for another. After consulting HELM, the best approach is to test candidate models directly in the playground. The speakers then experiment with generative AI image models and walk through the parameters that shape the output, such as increasing the number of denoising steps and adjusting the guidance scale to control model creativity. The seed is also noted as an interesting parameter for changing the overall style of an image.
In this section, Mani and Emily discuss how to deploy and use the trained AI model on Amazon SageMaker. The proprietary models run in an escrow account, ensuring that the providers retain control and ownership. Customers can send confidential data to the proprietary models without sharing it with the providers. Mani shows how to subscribe to and deploy the model, and how to prompt it to generate text. They also highlight the Quick Start solutions available for various models, including FLAN-T5 XXL for chain-of-thought reasoning data, and the importance of selecting the right model for each use case.
In this section, the speaker discusses how to use Amazon SageMaker to build generative AI applications. They highlight that once a model has been subscribed to, users receive a GitHub repository and code to deploy the model. The speaker also explains that there are different ways to test out the model, such as using the notebook experience or a small utility within the studio. Users can run the utility by installing Streamlit and then running a command in a terminal. They can then use the utility to select endpoints and invoke models from a single playground. The speaker provides an example of adding inference parameters or a JSON template for the FLAN-T5 model and using prompts for summarization and few-shot tasks.
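A minimal sketch of invoking such an endpoint from a notebook, assuming the JumpStart FLAN-T5 request schema (`text_inputs` plus generation parameters); the field names and the separation into a payload builder are assumptions for illustration:

```python
import json

def build_request(prompt, max_length=100, temperature=0.7):
    """Build the JSON body for a JumpStart FLAN-T5 style endpoint (field names assumed)."""
    return json.dumps({
        "text_inputs": prompt,
        "max_length": max_length,
        "temperature": temperature,
    })

def query_endpoint(endpoint_name, prompt):
    """Sketch of the actual invocation; requires boto3 and AWS credentials to run."""
    import boto3
    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_request(prompt),
    )
    return json.loads(response["Body"].read())
```

Keeping payload construction separate from the `invoke_endpoint` call makes it easy to swap in a different model's JSON template later.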
In this section, Mani and Emily demonstrate the capabilities of Amazon SageMaker in prompt engineering. They give an example of entity extraction, where the model extracts entities from a given prompt. They also explore zero-shot chain-of-thought prompting, where the model provides the reasoning behind its conclusion. Additionally, they demonstrate mathematical reasoning using prompt engineering and finally give a real-life example of calculating personal inflation using Amazon SageMaker.
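The patterns above boil down to how the prompt string is phrased. The wording below is a generic illustration of each pattern, not the exact prompts from the demo:

```python
# Plain zero-shot question.
question = "A store sold 14 apples in the morning and 23 in the afternoon. How many apples in total?"

# Zero-shot chain of thought: an appended cue nudges the model to reason before answering.
cot_prompt = question + "\nLet's think step by step."

# Entity extraction phrased as a plain instruction in the prompt.
ner_prompt = "Extract all company names from: 'Amazon and Stability AI partnered on SageMaker.'"
```

The same model serves all three tasks; only the prompt changes, which is the core idea of prompt engineering.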
In this section, the video covers fine-tuning machine learning models with Amazon SageMaker. The speakers explain how to fine-tune models using scripts or pre-built examples, noting that the first step is always installing the necessary software. They also stress the importance of having a dataset, which can be inspected and plotted before launching a distributed training job on SageMaker. Finally, they mention that various examples are available in the open-source repository to help developers fine-tune their models.
In this section of the video, the speakers discuss the different pieces needed to use Amazon SageMaker to build generative AI applications. These include specifying the instance the job will run on, setting up the environment with the necessary packages through containers, and providing training scripts written in Python. They mention three ways to install additional packages: bringing your own container, listing requirements in a requirements.txt file in the source directory, or using containers with pre-installed packages optimized for SageMaker. The speakers also discuss data parallelism and choosing between the SageMaker distributed data parallel library (SMDDP) and PyTorch DDP for distributed training, depending on the instance choice and the scale of the job.
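The pieces above (instance choice, packages via requirements.txt, entry script, data parallelism) map onto the keyword arguments of a SageMaker estimator. The sketch below shows them as a plain dict, since the real call (roughly `sagemaker.pytorch.PyTorch(**estimator_config, role=...)`) needs the SageMaker SDK and an IAM role; names like `train.py` are placeholders:

```python
# Placeholder arguments one would pass to a SageMaker PyTorch estimator in script mode.
estimator_config = {
    "entry_point": "train.py",           # training script written in Python (placeholder name)
    "source_dir": "src",                 # directory that can also hold requirements.txt for extra packages
    "instance_type": "ml.p4d.24xlarge",  # GPU instance the job runs on
    "instance_count": 2,                 # more than one instance enables data parallelism
    # Enable the SageMaker distributed data parallel (SMDDP) library:
    "distribution": {"smdistributed": {"dataparallel": {"enabled": True}}},
}
```

Swapping the `distribution` entry for a PyTorch DDP configuration is the other option the speakers mention, chosen based on instance type and job scale.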
In this section, the speakers discuss the different types of accelerators that can be used with Amazon SageMaker for faster and more efficient generative AI applications. They recommend Trainium and Inferentia as great options for users who want to deploy pre-trained models or fine-tune models. Trainium is significantly less expensive and faster than comparable accelerators such as the P4 instance series, and it uses PyTorch/XLA compilation to integrate users' scripts easily. Inferentia, meanwhile, lets users compile their models for large cost savings, speed boosts, and energy reductions. Finally, they introduce the large model inference container and how it enables deploying models that don't fit into memory, such as those with 100 billion parameters.
In this section, the speakers discuss large model inference containers and how they parallelize a model by sharding it across multiple GPUs to achieve low latency. They also cover the available optimization options, such as compilation and quantization, which reduce the memory footprint and the need for a very large instance. They recommend that users first define their use case and then try models via SageMaker Studio's JumpStart option before fine-tuning and deploying, making use of the available optimization options for both training and deployment.
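Sharding and related options are typically configured through a `serving.properties` file for the DJL-based large model inference container. The sketch below writes one with assumed values (`tensor_parallel_degree=4`, fp16) rather than settings taken from the demo:

```python
# Write a minimal serving.properties for a large-model-inference (DJL Serving) container.
lines = [
    "engine=DeepSpeed",                 # inference engine that performs the sharding
    "option.tensor_parallel_degree=4",  # split the model across 4 GPUs
    "option.dtype=fp16",                # half precision to shrink the memory footprint
]
with open("serving.properties", "w") as f:
    f.write("\n".join(lines) + "\n")
```

The file is packaged alongside the model artifacts so the container picks up the parallelism degree and precision at load time.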
In this section, the presenters discuss how SageMaker Ground Truth can be used to label data quickly, evaluate prompts and model responses, and incorporate feedback from teams to improve the model. They also mention that reinforcement learning from human feedback is possible by combining SageMaker training and SageMaker labeling. The process can be further improved with instruction fine-tuning, parameter-efficient fine-tuning techniques, and chain-of-thought tuning.