Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(stepfunctions-tasks): BedrockInvokeModel cannot use JsonPath to specify input/output S3 URIs #29229

Closed
Assignees
Labels
@aws-cdk/aws-stepfunctions-tasks bug This issue is a bug. effort/medium Medium work item – several days of effort p2

Comments

@SamStephens
Copy link
Contributor

SamStephens commented Feb 23, 2024

Describe the bug

If you try and use JsonPath to specify the S3 URIs that BedrockInvokeModel will read from and write from, you get an error.

Expected Behavior

I expect to be able to provide S3 URIs to BedrockInvokeModel using JsonPath, so I can have dynamic paths for different state machine executions.

Current Behavior

I get the error

jsii.errors.JavaScriptError:
  Error: Field references must be the entire string, cannot concatenate them (found 's3://${Token[prompt_bucket.348]}/${Token[prompt_key.349]}')

Reproduction Steps

tasks.BedrockInvokeModel(
    scope=self,
    id="Call LLM",
    model=aws_bedrock.FoundationModel.from_foundation_model_id(
        scope=self,
        _id="Model",
        foundation_model_id=aws_bedrock.FoundationModelIdentifier.ANTHROPIC_CLAUDE_V2_1,
    ),
    input=tasks.BedrockInvokeModelInputProps(
        s3_location=aws_s3.Location(
            bucket_name=aws_stepfunctions.JsonPath.string_at("$.prompt_bucket"),
            object_key=aws_stepfunctions.JsonPath.string_at("$.prompt_key"),
        ),
    ),
    output=tasks.BedrockInvokeModelOutputProps(
        s3_location=aws_s3.Location(
            bucket_name=aws_stepfunctions.JsonPath.string_at("$.response_bucket"),
            object_key=aws_stepfunctions.JsonPath.string_at("$.response_key"),
        ),
    ),
)

Possible Solution

The issue here is that the S3 locations are specified as an aws_s3.Location. This object expects a separate bucket_name and object_key. However, BedrockInvokeModel expects an S3 URI. This mismatch is handled in the code by building an S3 URI from the S3 Location

Input: this.props.input?.s3Location ? {
S3Uri: `s3://${this.props.input.s3Location.bucketName}/${this.props.input.s3Location.objectKey}`,
} : undefined,
Output: this.props.output?.s3Location ? {
S3Uri: `s3://${this.props.output.s3Location.bucketName}/${this.props.output.s3Location.objectKey}`,
} : undefined,

The problem is that field references cannot be concatenated in this way, hence the error I see.

This makes the S3 integration basically unusable from the CDK, because the S3 locations must be hardcoded, which cannot work for any real world workload.

My suggestion is to extend BedrockInvokeModelInputProps and BedrockInvokeModelOutputProps to allow either the existing s3Location field to be used (for backwards compatibility), or a string field called s3Uri. If s3Uri were provided, it would be used directly, allowing for the use of JsonPaths for the URI.

Additional Information/Context

No response

CDK CLI Version

2.129.0 (build d5ab0df)

Framework Version

2.129.0

Node.js Version

v18.17.1

OS

Ubuntu (Windows Subsystem for Linux)

Language

TypeScript, Python, .NET, Java, Go

Language Version

Python 3.11.6

Other information

No response

@SamStephens SamStephens added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Feb 23, 2024
@pahud
Copy link
Contributor

pahud commented Feb 23, 2024

Thank you for the detailed feedback. Making it a p1 and we welcome any pull requests.

@pahud pahud added p1 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Feb 23, 2024
@SamStephens
Copy link
Contributor Author

Thanks @pahud . Unfortunately I'm unlikely to be in a position to do this myself, because we're not going to end up using this integration.

@godwingrs22
Copy link
Member

Hi @gm-al, Thank you for your interest to work on this issue. Please feel free to submit a PR.

@Celine999999
Copy link

Hello, is there any update on this ? Thank you very much

@shikha372
Copy link
Contributor

Hey @Celine999999 ,
Thank you for reaching out, i'll raise a PR shortly for this issue.

Copy link

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

1 similar comment
Copy link

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

obraafo pushed a commit to obraafo/aws-cdk that referenced this issue Jul 25, 2024
…ws#30298)

### Issue # (if applicable)

Closes aws#29229.

### Reason for this change

When trying to use JsonPath to specify the S3 URIs that BedrockInvokeModel will read from and write from, you get an error.

Example of the Error message:

`jsii.errors.JavaScriptError: Error: Field references must be the entire string, cannot concatenate them (found 's3://${Token[prompt_bucket.348]}/${Token[prompt_key.349]}')`

### Description of changes

Extended the inputPath property to be allowed as an input value for the task state.
Instead of adding a new S3Uri props in current `BedrockInvokeModelProps` as proposed in the original issue, leveraged the `inputPath` property that is already defined in   `sfn.TaskStateBaseProps` and being extended by `BedrockInvokeModelInputProps` and `BedrockInvokeModelOutputProps`

**Limitation:**  We cannot limit the resource policy to specific input token for which the value might be coming from the prompt, so had to keep it as [*] here. 
### Description of how you validated changes

Added unit tests.
Successful deployment of integration tests in the account.

### Checklist
- [x] Unit Tests
- [x] Integration Tests
- [x] Updated ReadMe
- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
mergify bot pushed a commit that referenced this issue Sep 5, 2024
#31305)

### Issue # (if applicable)

Closes #31302.

### Reason for this change

PR#30298 introduced a breaking change in the stepfunctions-tasks which was encoding the input/output path from the `TaskStateBaseProps` to be used to input a S3 URI, which is not always the case and customer might be using the JSON string as it is.

### Description of changes

To keep the functionality from the original ask of issue #29229 ,added another props to specify S3 URI as an input to the stepfunctions-tasks.

Introduced a feature flag to keep the existing behaviour for the customers and not introduce a breaking change.

### Description of how you validated changes

Updated integ test and deployed in account

Integ Test Results:
`aws stepfunctions start-execution --state-machine-arn <deployed state machine arn> `: should return execution arn

A5-zEnXPmmPWTJx
{
    "executionArn": "arn:aws:states:us-east-1:XXXX:execution:StateMachine2E01A3A5-zEnXPmmPWTJx:516f2e60-9507-46cb-95e6-4d9453429b08",
    "startDate": "2024-09-04T15:43:33.200000-07:00"
}

`aws stepfunctions describe-execution --execution-arn <exection-arn generated before> `: should return status as SUCCEEDED



### Checklist
- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
pahud pushed a commit to pahud/aws-cdk that referenced this issue Sep 9, 2024
aws#31305)

### Issue # (if applicable)

Closes aws#31302.

### Reason for this change

PR#30298 introduced a breaking change in the stepfunctions-tasks which was encoding the input/output path from the `TaskStateBaseProps` to be used to input a S3 URI, which is not always the case and customer might be using the JSON string as it is.

### Description of changes

To keep the functionality from the original ask of issue aws#29229 ,added another props to specify S3 URI as an input to the stepfunctions-tasks.

Introduced a feature flag to keep the existing behaviour for the customers and not introduce a breaking change.

### Description of how you validated changes

Updated integ test and deployed in account

Integ Test Results:
`aws stepfunctions start-execution --state-machine-arn <deployed state machine arn> `: should return execution arn

A5-zEnXPmmPWTJx
{
    "executionArn": "arn:aws:states:us-east-1:XXXX:execution:StateMachine2E01A3A5-zEnXPmmPWTJx:516f2e60-9507-46cb-95e6-4d9453429b08",
    "startDate": "2024-09-04T15:43:33.200000-07:00"
}

`aws stepfunctions describe-execution --execution-arn <exection-arn generated before> `: should return status as SUCCEEDED



### Checklist
- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
xazhao pushed a commit to xazhao/aws-cdk that referenced this issue Sep 12, 2024
aws#31305)

### Issue # (if applicable)

Closes aws#31302.

### Reason for this change

PR#30298 introduced a breaking change in the stepfunctions-tasks which was encoding the input/output path from the `TaskStateBaseProps` to be used to input a S3 URI, which is not always the case and customer might be using the JSON string as it is.

### Description of changes

To keep the functionality from the original ask of issue aws#29229 ,added another props to specify S3 URI as an input to the stepfunctions-tasks.

Introduced a feature flag to keep the existing behaviour for the customers and not introduce a breaking change.

### Description of how you validated changes

Updated integ test and deployed in account

Integ Test Results:
`aws stepfunctions start-execution --state-machine-arn <deployed state machine arn> `: should return execution arn

A5-zEnXPmmPWTJx
{
    "executionArn": "arn:aws:states:us-east-1:XXXX:execution:StateMachine2E01A3A5-zEnXPmmPWTJx:516f2e60-9507-46cb-95e6-4d9453429b08",
    "startDate": "2024-09-04T15:43:33.200000-07:00"
}

`aws stepfunctions describe-execution --execution-arn <exection-arn generated before> `: should return status as SUCCEEDED



### Checklist
- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
GavinZZ pushed a commit that referenced this issue Sep 12, 2024
#31305)

### Issue # (if applicable)

Closes #31302.

### Reason for this change

PR#30298 introduced a breaking change in the stepfunctions-tasks which was encoding the input/output path from the `TaskStateBaseProps` to be used to input a S3 URI, which is not always the case and customer might be using the JSON string as it is.

### Description of changes

To keep the functionality from the original ask of issue #29229 ,added another props to specify S3 URI as an input to the stepfunctions-tasks.

Introduced a feature flag to keep the existing behaviour for the customers and not introduce a breaking change.

### Description of how you validated changes

Updated integ test and deployed in account

Integ Test Results:
`aws stepfunctions start-execution --state-machine-arn <deployed state machine arn> `: should return execution arn

A5-zEnXPmmPWTJx
{
    "executionArn": "arn:aws:states:us-east-1:XXXX:execution:StateMachine2E01A3A5-zEnXPmmPWTJx:516f2e60-9507-46cb-95e6-4d9453429b08",
    "startDate": "2024-09-04T15:43:33.200000-07:00"
}

`aws stepfunctions describe-execution --execution-arn <exection-arn generated before> `: should return status as SUCCEEDED



### Checklist
- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment