diff --git a/src/patterns/gen-ai/aws-model-deployment-sagemaker/README.md b/src/patterns/gen-ai/aws-model-deployment-sagemaker/README.md index aed1e9fe..d6494463 100644 --- a/src/patterns/gen-ai/aws-model-deployment-sagemaker/README.md +++ b/src/patterns/gen-ai/aws-model-deployment-sagemaker/README.md @@ -103,7 +103,7 @@ Parameters |:-------------|:----------------|-----------------|-----------------| | model | JumpStartModel | ![Required](https://img.shields.io/badge/required-ff0000) | The model to deploy | | endpointName | string| ![Optional](https://img.shields.io/badge/optional-4169E1) | Name of the SageMaker endpoint created by the construct | -| instanceType | SageMakerInstanceType | ![Optional](https://img.shields.io/badge/optional-4169E1) | The ML compute instance type | +| instanceType | SageMakerInstanceType | ![Optional](https://img.shields.io/badge/optional-4169E1) | The ML compute instance type. If not provided, the default instance type will be used. This value is available in the [models spec](./jumpstart-model.ts) | | instanceCount | Integer | ![Optional](https://img.shields.io/badge/optional-4169E1) | Number of instances to launch initially | | role | [iam.Role](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_iam.Role.html) | ![Optional](https://img.shields.io/badge/optional-4169E1) | The IAM role that SageMaker can assume to access model artifacts and docker image for deployment on ML compute instances or for batch transform jobs. If not provided, this construct will create a new role with Full access to SageMaker.| | environment | [key: string]: string | ![Optional](https://img.shields.io/badge/optional-4169E1) | Custom environment map that the inference code uses when the model is deployed for predictions | @@ -134,7 +134,7 @@ Parameters |cfnEndpoint| [sagemaker.CfnEndpoint](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_sagemaker.CfnEndpoint.html) |cfnEndpoint created by the construct | |cfnEndpointConfig| [sagemaker.CfnEndpointConfig](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_sagemaker.CfnEndpointConfig.html) | cfnEndpointConfig created by the construct | |model| JumpStartModel | The model deployed | -|instanceType| SageMakerInstanceType | The ML compute instance type | +|instanceType| SageMakerInstanceType | The ML compute instance type. | |instanceCount| number | Number of instances to launch initially| |role| [iam.Role](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_iam.Role.html) |The IAM role that SageMaker can assume to access model artifacts and docker image for deployment on ML compute instances or for batch transform jobs | @@ -156,6 +156,8 @@ Parameters - iam.Role: if not provided, an iam.Role will be created by the construct with a managed policy providing AmazonSageMakerFullAccess permissions. +- Default ML instance type: If not provided, the default instance type will be used. This value is available in the [models spec](./jumpstart-model.ts) + ## Troubleshooting @@ -167,7 +169,7 @@ Parameters You are responsible for the cost of the AWS services used while running this construct. As of this revision, the cost for running this construct with the default settings in the US East (N. Virginia) Region is approximately $344.16 per month. -We recommend creating a budget through [AWS Cost Explorer](http://aws.amazon.com/aws-cost-management/aws-cost-explorer/) to help manage costs. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this solution. +We recommend creating a budget through [AWS Cost Explorer](http://aws.amazon.com/aws-cost-management/aws-cost-explorer/) to help manage costs. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this solution. The [official documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/inference-cost-optimization.html) provides best practices for inference cost optimization. The following table provides a sample cost breakdown for deploying this solution with the default parameters in the **US East (N. Virginia)** Region for **one month**. @@ -177,10 +179,15 @@ The following table provides a sample cost breakdown for deploying this solution | Amazon SageMaker endpoint | 1 model deployed to 1 endpoint on 1 instance, running 24 hours per day for 30 days, on an ml.c4.2xlarge | 344.16 | | Total monthly cost | | 344.16 | +> Warning +> Cost Management with self hosted models: Be mindful of the costs associated with AWS resources, especially with SageMaker models which are billed by the hour. Leaving serverful resources running for extended periods or deploying numerous LLMs can quickly lead to increased costs. + ## Security When you build systems on AWS infrastructure, security responsibilities are shared between you and AWS. This [shared responsibility](http://aws.amazon.com/compliance/shared-responsibility-model/) model reduces your operational burden because AWS operates, manages, and controls the components including the host operating system, virtualization layer, and physical security of the facilities in which the services operate. For more information about AWS security, visit [AWS Cloud Security](http://aws.amazon.com/security/). +You can visit the [official documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/best-practice-endpoint-security.html) for security best practices related to Amazon SageMaker endpoints. + > **Warning** > This construct allows you to interact with models from third party providers that you can choose to use that AWS does not own or otherwise have any control over. Your use of the third-party Generative AI models is governed by the terms provided to you by the third-party GAI model providers when you acquired your license to use them (for example, their terms of service, license agreement, acceptable use policy, and privacy policy). >