Deploy Llama 3 on Amazon SageMaker
Earlier today Meta released Llama 3, the next iteration of the open-access Llama family. Llama 3 comes in two sizes: 8B for efficient deployment and development on consumer-size GPUs, and 70B for large-scale AI-native applications. Both come in base and instruction-tuned variants. In addition to these four models, a new version of Llama Guard was fine-tuned on Llama 3 8B and released as Llama Guard 2 (safety fine-tune).
In this blog you will learn how to deploy the meta-llama/Meta-Llama-3-70B-Instruct model to Amazon SageMaker. We are going to use the Hugging Face LLM DLC, a purpose-built Inference Container that makes it easy to deploy LLMs in a secure and managed environment. The DLC is powered by Text Generation Inference (TGI), a scalable, optimized solution for deploying and serving Large Language Models (LLMs). The blog post also includes hardware requirements for the different model sizes.
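As a preview of what the deployment looks like, here is a minimal sketch using the `sagemaker` SDK. It builds the TGI configuration that the container reads from environment variables; the GPU count, token limits, and instance type shown are illustrative placeholders, and the deploy call (commented out) assumes an AWS account with a SageMaker execution role:

```python
# Illustrative sketch: configure the Hugging Face LLM DLC (TGI) for Llama 3 70B.
# Values such as SM_NUM_GPUS and the token limits are placeholders, not tuned settings.
import json

config = {
    "HF_MODEL_ID": "meta-llama/Meta-Llama-3-70B-Instruct",  # model id on the Hugging Face Hub
    "SM_NUM_GPUS": json.dumps(8),            # number of GPUs on the instance
    "MAX_INPUT_LENGTH": json.dumps(2048),    # max input tokens per request
    "MAX_TOTAL_TOKENS": json.dumps(4096),    # max input + output tokens per request
    "HUGGING_FACE_HUB_TOKEN": "<REPLACE WITH YOUR TOKEN>",  # required for the gated model
}

# The actual deployment (requires an AWS account and a SageMaker execution role):
# from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri
# llm_model = HuggingFaceModel(
#     role=role,  # your SageMaker execution role
#     image_uri=get_huggingface_llm_image_uri("huggingface"),
#     env=config,
# )
# llm = llm_model.deploy(initial_instance_count=1, instance_type="ml.p4d.24xlarge")
```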
In this blog we will cover how to:
- Setup development environment
- Hardware requirements
- Deploy Llama 3 70b to Amazon SageMaker
- Test and chat with the model
- Benchmark llama 3 70B
- Clean up
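Once the endpoint is up, the "test and chat" step boils down to sending an OpenAI-style chat payload, which recent TGI versions accept via the Messages API. A minimal sketch (the `llm` predictor is assumed to come from the deploy step above; the prompt and generation parameters are illustrative):

```python
# Illustrative sketch: chat with the deployed endpoint via the Messages API.
# Assumes `llm` is the predictor returned by `llm_model.deploy(...)`.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Amazon SageMaker?"},
    ],
    "max_tokens": 256,     # cap on generated tokens
    "temperature": 0.6,    # sampling temperature
}

# response = llm.predict(payload)
# print(response["choices"][0]["message"]["content"])
```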
Let's get started!