Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(core): Allow passing platform via BundlingOptions #25759

Closed
2 tasks
blimmer opened this issue May 26, 2023 · 2 comments · Fixed by #26368
Closed
2 tasks

(core): Allow passing platform via BundlingOptions #25759

blimmer opened this issue May 26, 2023 · 2 comments · Fixed by #26368
Labels
@aws-cdk/core Related to core CDK functionality effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p1

Comments

@blimmer
Copy link
Contributor

blimmer commented May 26, 2023

Describe the feature

It would be nice to be able to enforce the platform that any arbitrary docker bundling command uses via the BundlingOptions property.

Currently, this is not possible when using an existing Docker image via the BundlingOptions. You can specify this when building an image, but not when using an image as a builder:

/**
* Platform to build for. _Requires Docker Buildx_.
*
* @default - no platform specified (the current machine architecture will be used)
*/
readonly platform?: Platform;

Use Case

We make use of the S3 Deployment BucketDeployment construct to package a Virtual Environment dependency zipfile for EMR Serverless (docs).

The process to produce the asset looks like this:

pyspark_env_asset = Source.asset(
    emr_job_root,
    asset_hash=FileSystem.fingerprint(
        str(Path(emr_job_root) / "requirements.txt")
    ),
    bundling=BundlingOptions(
        image=DockerImage.from_registry(
            f"REDACTED.dkr.ecr.us-west-2.amazonaws.com/data-services:emr-serverless-base-python-3-9-16"
        ),
        environment={
            "PIP_INDEX_URL": f"https://aws:{extract_code_artifact_token()}@REDACTED.d.codeartifact.us-west-2.amazonaws.com/pypi/REDACTED/simple/",
        },
        user="root",
        command=[
            "/bin/bash",
            "-c",
            " && ".join(
                [
                    "yum install zip -y",
                    "python3.9 -m venv venv --clear --copies",
                    "source venv/bin/activate",
                    "cp -r /usr/local/lib/python3.9/* ./venv/lib/python3.9/",
                    "python3.9 -m pip install --upgrade pip",
                    "pip install -r requirements.txt",
                    # package venv to archive.
                    # **Note** that you have to supply --python-prefix option
                    # to make sure python starts with the path where your
                    # copied libraries are present.
                    # Copying the python binary to the "environment" directory.
                    "venv-pack --quiet -f -o /tmp/pyspark_env.tar.gz --python-prefix /home/hadoop/environment",
                    # CDK assets are a bit weird - we need to zip up the tarball to make sure our desired name is retained in the
                    # final S3 Bucket.
                    "cd /tmp",
                    "zip /asset-output/venv-output.zip ./pyspark_env.tar.gz",
                ]
            ),
        ],
    ),
)

The issue is that our emr-serverless-base-python-3-9-16 is a Multi-Platform image. The EMR serverless runtime environment is linux/amd64, and our developer machines use linux/arm64 (Apple Silicon) processors. When we run the build from our local machines, Docker will automatically select the linux/arm64 variant. Therefore, as it stands, building the image from our local machines produces incompatible venvs.

Proposed Solution

Expose a platform option in the BundlingOptions property. When specified, the underlying docker run should use the platform flag as specified in the BundlingOptions.platform property.

Other Information

For now, I can work around this by using the DockerImage.from_build function which accepts a platform parameter.

So the image specification changes to look like this:

bundling=BundlingOptions(
    image=DockerImage.from_build(
        str(Path(__file__).parent / "emr_serverless_docker"),
        platform="linux/amd64"
    ),
	# everything else from the snippet above remains the same
)

The emr_serverless_docker folder contains one file (Dockerfile):

FROM --platform=linux/amd64 REDACTED.dkr.ecr.us-west-2.amazonaws.com/data-services:emr-serverless-base-python-3-9-16

(I'm not 100% sure I need to have the --platform specified here again, but I did it just to be safe).

This works and ensures that the linux/amd64 version is always selected no matter the host p

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

2.80.0

Environment details (OS name and version, etc.)

MacOS

@blimmer blimmer added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels May 26, 2023
@github-actions github-actions bot added the @aws-cdk/core Related to core CDK functionality label May 26, 2023
@pahud
Copy link
Contributor

pahud commented May 30, 2023

related to #12472

Yes we should add this support. In fact I was trying to implement that but unfortunately didn't make it.

Making this as a p1 feature request. Any community PRs would be highly welcome and appreciated.

@pahud pahud added p1 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels May 30, 2023
@mergify mergify bot closed this as completed in #26368 Jul 14, 2023
mergify bot pushed a commit that referenced this issue Jul 14, 2023
Allowing user to provide a `platform` property when bundling docker assets ([ref](https://docs.docker.com/build/building/multi-platform/)).

Currently, this is not possible when using an existing Docker image via the [BundlingOptions](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.BundlingOptions.html). You can specify this when building a  [DockerImageAsset](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_ecr_assets.DockerImageAsset.html#platform), but not when using an image as a builder:

Closes #25759.


----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
@github-actions
Copy link

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

colifran pushed a commit that referenced this issue Jul 17, 2023
Allowing user to provide a `platform` property when bundling docker assets ([ref](https://docs.docker.com/build/building/multi-platform/)).

Currently, this is not possible when using an existing Docker image via the [BundlingOptions](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.BundlingOptions.html). You can specify this when building a  [DockerImageAsset](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_ecr_assets.DockerImageAsset.html#platform), but not when using an image as a builder:

Closes #25759.


----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
bmoffatt pushed a commit to bmoffatt/aws-cdk that referenced this issue Jul 29, 2023
Allowing user to provide a `platform` property when bundling docker assets ([ref](https://docs.docker.com/build/building/multi-platform/)).

Currently, this is not possible when using an existing Docker image via the [BundlingOptions](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.BundlingOptions.html). You can specify this when building a  [DockerImageAsset](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_ecr_assets.DockerImageAsset.html#platform), but not when using an image as a builder:

Closes aws#25759.


----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/core Related to core CDK functionality effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p1
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants