@aws-cdk/aws-lambda-python-alpha: Bundling lambdas with Poetry is broken #21867

Closed
darylweir opened this issue Sep 1, 2022 · 32 comments · Fixed by #21945 · May be fixed by aws-samples/amazon-sagemaker-automatic-deploy-mlflow-model#6
Assignees
Labels
@aws-cdk/aws-lambda-python bug This issue is a bug. effort/small Small work item – less than a day of effort p1

Comments

darylweir commented Sep 1, 2022

Describe the bug

Our CI started breaking yesterday when trying to synthesise stacks using the PythonFunction construct. I was able to reproduce the failure locally using both aws-lambda-python-alpha in v2 and aws-lambda-python in v1.

The failure occurs when trying to set up the virtual env inside the bundling docker image. I assume something has changed in the latest aws/sam/build-python3.9 docker image that breaks the bundling command.

EDIT: we were using Python 3.9 lambdas, but I tested and builds also fail for 3.7 and 3.8 runtimes.

Expected Behavior

cdk synth should succeed in bundling the lambda function.

Current Behavior

The lambda bundling fails with an error from virtualenv: virtualenv: error: argument dest: the destination . is not write-able at /

Full error output

❯ cdk synth
[+] Building 1.1s (7/7) FINISHED
 => [internal] load build definition from Dockerfile                                                                                      0.0s
 => => transferring dockerfile: 559B                                                                                                      0.0s
 => [internal] load .dockerignore                                                                                                         0.0s
 => => transferring context: 2B                                                                                                           0.0s
 => [internal] load metadata for public.ecr.aws/sam/build-python3.9:latest                                                                1.0s
 => [1/3] FROM public.ecr.aws/sam/build-python3.9@sha256:aa96722319b3838e27f79b8c90a8e14352695669454172724b8619b1718d9b25                 0.0s
 => CACHED [2/3] RUN pip install --upgrade pip                                                                                            0.0s
 => CACHED [3/3] RUN pip install pipenv==2022.4.8 poetry                                                                                  0.0s
 => exporting to image                                                                                                                    0.0s
 => => exporting layers                                                                                                                   0.0s
 => => writing image sha256:3b54578b8b1e23782a5e30c7ef1c9fd2b69edecca9a2eed9a82ec87b09291b64                                              0.0s
 => => naming to docker.io/library/cdk-3610f113cfbf35f4b4e4218bc72d3b9bac4c71d7137512ba8d0302db2ba09a5b                                   0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
Bundling asset Cdktestv2Stack/Lambda/Code/Stage...
Creating virtualenv dummy-Plo1f0Cp-py3.9 in /.cache/pypoetry/virtualenvs
usage: virtualenv [--version] [--with-traceback] [-v | -q] [--read-only-app-data] [--app-data APP_DATA] [--reset-app-data] [--upgrade-embed-wheels] [--discovery {builtin}] [-p py] [--try-first-with py_exe]
                  [--creator {builtin,cpython3-posix,venv}] [--seeder {app-data,pip}] [--no-seed] [--activators comma_sep_list] [--clear] [--no-vcs-ignore] [--system-site-packages] [--symlinks | --copies] [--no-download | --download]
                  [--extra-search-dir d [d ...]] [--pip version] [--setuptools version] [--wheel version] [--no-pip] [--no-setuptools] [--no-wheel] [--no-periodic-update] [--symlink-app-data] [--prompt prompt] [-h]
                  dest
virtualenv: error: argument dest: the destination . is not write-able at /
/Users/weird/scratch/cdktestv2/node_modules/aws-cdk-lib/core/lib/asset-staging.js:2
`),localBundling=options.local?.tryBundle(bundleDir,options),!localBundling){let user;if(options.user)user=options.user;else{const userInfo=os.userInfo();user=userInfo.uid!==-1?`${userInfo.uid}:${userInfo.gid}`:"1000:1000"}options.image.run({command:options.command,user,volumes,environment:options.environment,entrypoint:options.entrypoint,workingDirectory:options.workingDirectory??AssetStaging.BUNDLING_INPUT_DIR,securityOpt:options.securityOpt??""})}}catch(err){const bundleErrorDir=bundleDir+"-error";throw fs.existsSync(bundleErrorDir)&&fs.removeSync(bundleErrorDir),fs.renameSync(bundleDir,bundleErrorDir),new Error(`Failed to bundle asset ${this.node.path}, bundle output is located at ${bundleErrorDir}: ${err}`)}if(fs_1.FileSystem.isEmpty(bundleDir)){const outputDir=localBundling?bundleDir:AssetStaging.BUNDLING_OUTPUT_DIR;throw new Error(`Bundling did not produce any output. Check that content is written to ${outputDir}.`)}}calculateHash(hashType,bundling,outputDir){if(hashType==assets_1.AssetHashType.CUSTOM||hashType==assets_1.AssetHashType.SOURCE&&bundling){const hash=crypto.createHash("sha256");return hash.update(this.customSourceFingerprint??fs_1.FileSystem.fingerprint(this.sourcePath,this.fingerprintOptions)),bundling&&hash.update(JSON.stringify(bundling)),hash.digest("hex")}switch(hashType){case assets_1.AssetHashType.SOURCE:return fs_1.FileSystem.fingerprint(this.sourcePath,this.fingerprintOptions);case assets_1.AssetHashType.BUNDLE:case assets_1.AssetHashType.OUTPUT:if(!outputDir)throw new Error(`Cannot use \`${hashType}\` hash type when \`bundling\` is not specified.`);return fs_1.FileSystem.fingerprint(outputDir,this.fingerprintOptions);default:throw new Error("Unknown asset hash type.")}}}exports.AssetStaging=AssetStaging,_a=JSII_RTTI_SYMBOL_1,AssetStaging[_a]={fqn:"aws-cdk-lib.AssetStaging",version:"2.39.0"},AssetStaging.BUNDLING_INPUT_DIR="/asset-input",AssetStaging.BUNDLING_OUTPUT_DIR="/asset-output",AssetStaging.assetCache=new cache_1.Cache;function 
renderAssetFilename(assetHash,extension=""){return`asset.${assetHash}${extension}`}function determineHashType(assetHashType,customSourceFingerprint){const hashType=customSourceFingerprint?assetHashType??assets_1.AssetHashType.CUSTOM:assetHashType??assets_1.AssetHashType.SOURCE;if(customSourceFingerprint&&hashType!==assets_1.AssetHashType.CUSTOM)throw new Error(`Cannot specify \`${assetHashType}\` for \`assetHashType\` when \`assetHash\` is specified. Use \`CUSTOM\` or leave \`undefined\`.`);if(hashType===assets_1.AssetHashType.CUSTOM&&!customSourceFingerprint)throw new Error("`assetHash` must be specified when `assetHashType` is set to `AssetHashType.CUSTOM`.");return hashType}function calculateCacheKey(props){return crypto.createHash("sha256").update(JSON.stringify(sortObject(props))).digest("hex")}function sortObject(object){if(typeof object!="object"||object instanceof Array)return object;const ret={};for(const key of Object.keys(object).sort())ret[key]=sortObject(object[key]);return ret}function singleArchiveFile(directory){if(!fs.existsSync(directory))throw new Error(`Directory ${directory} does not exist.`);if(!fs.statSync(directory).isDirectory())throw new Error(`${directory} is not a directory.`);const content=fs.readdirSync(directory);if(content.length===1){const file=path.join(directory,content[0]),extension=getExtension(content[0]).toLowerCase();if(fs.statSync(file).isFile()&&ARCHIVE_EXTENSIONS.includes(extension))return file}}function determineBundledAsset(bundleDir,outputType){const archiveFile=singleArchiveFile(bundleDir);switch(outputType===bundling_1.BundlingOutput.AUTO_DISCOVER&&(outputType=archiveFile?bundling_1.BundlingOutput.ARCHIVED:bundling_1.BundlingOutput.NOT_ARCHIVED),outputType){case bundling_1.BundlingOutput.NOT_ARCHIVED:return{path:bundleDir,packaging:assets_1.FileAssetPackaging.ZIP_DIRECTORY};case bundling_1.BundlingOutput.ARCHIVED:if(!archiveFile)throw new Error("Bundling output directory is expected to include only a single archive 
file when `output` is set to `ARCHIVED`");return{path:archiveFile,packaging:assets_1.FileAssetPackaging.FILE,extension:getExtension(archiveFile)}}}function getExtension(source){for(const ext of ARCHIVE_EXTENSIONS)if(source.toLowerCase().endsWith(ext))return ext;return path.extname(source)}
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     ^
Error: Failed to bundle asset Cdktestv2Stack/Lambda/Code/Stage, bundle output is located at /Users/weird/scratch/cdktestv2/cdk.out/asset.52019e0cc4e1d6c178c5252bcb442b7407ccbc065d9f9470adcfc1fd279d930c-error: Error: docker exited with status 2
    at AssetStaging.bundle (/Users/weird/scratch/cdktestv2/node_modules/aws-cdk-lib/core/lib/asset-staging.js:2:614)
    at AssetStaging.stageByBundling (/Users/weird/scratch/cdktestv2/node_modules/aws-cdk-lib/core/lib/asset-staging.js:1:4314)
    at stageThisAsset (/Users/weird/scratch/cdktestv2/node_modules/aws-cdk-lib/core/lib/asset-staging.js:1:1675)
    at Cache.obtain (/Users/weird/scratch/cdktestv2/node_modules/aws-cdk-lib/core/lib/private/cache.js:1:242)
    at new AssetStaging (/Users/weird/scratch/cdktestv2/node_modules/aws-cdk-lib/core/lib/asset-staging.js:1:2070)
    at new Asset (/Users/weird/scratch/cdktestv2/node_modules/aws-cdk-lib/aws-s3-assets/lib/asset.js:1:736)
    at AssetCode.bind (/Users/weird/scratch/cdktestv2/node_modules/aws-cdk-lib/aws-lambda/lib/code.js:1:4628)
    at new Function (/Users/weird/scratch/cdktestv2/node_modules/aws-cdk-lib/aws-lambda/lib/function.js:1:2803)
    at new PythonFunction (/Users/weird/scratch/cdktestv2/node_modules/@aws-cdk/aws-lambda-python-alpha/lib/function.ts:73:5)
    at new Cdktestv2Stack (/Users/weird/scratch/cdktestv2/lib/cdktestv2-stack.ts:10:15)

Subprocess exited with error 1

Reproduction Steps

lib/cdktestv2-stack.ts:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { PythonFunction } from '@aws-cdk/aws-lambda-python-alpha';
import { Runtime } from 'aws-cdk-lib/aws-lambda';

export class Cdktestv2Stack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const f = new PythonFunction(this, 'Lambda', {
      runtime: Runtime.PYTHON_3_9,
      entry: './lambda',
    })
  }
}

bin/cdktestv2.ts:

#!/usr/bin/env node
import 'source-map-support/register';
import * as cdk from 'aws-cdk-lib';
import { Cdktestv2Stack } from '../lib/cdktestv2-stack';

const app = new cdk.App();
new Cdktestv2Stack(app, 'Cdktestv2Stack');

lambda/pyproject.toml:

[tool.poetry]
name = "dummy"
version = "1.0.0"
description = "Some lambda function"
authors = ["Daryl"]

[tool.poetry.dependencies]
python = "~3.9"
httpx = "==0.23.0"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

lambda/index.py:

import httpx

def handler(event, context):
    r = httpx.get("https://www.example.org")
    print(r.status_code)

Then run poetry install in the lambda directory to create a lock file.

Then run cdk synth and watch it fall over.

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.39.0

Framework Version

No response

Node.js Version

16.14.0

OS

MacOS 10.15.7

Language

Typescript, Python

Language Version

Typescript 3.9.7, Python 3.9.1

Other information

No response

@darylweir darylweir added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Sep 1, 2022
@corymhall
Contributor

It looks like the image doesn't have write access to /tmp anymore, probably due to some change in the base image.

I think the solution here is to do what we have been doing for the lambda-nodejs module and customize the cache location in the Dockerfile.

The new Dockerfile should probably look something like:

# The correct AWS SAM build image based on the runtime of the function will be
# passed as build arg. The default allows to do `docker build .` when testing.
ARG IMAGE=public.ecr.aws/sam/build-python3.7
FROM $IMAGE

ARG PIP_INDEX_URL
ARG PIP_EXTRA_INDEX_URL
ARG HTTPS_PROXY

# Ensure all users can write to pip cache
RUN mkdir /tmp/pip-cache && \
    chmod -R 777 /tmp/pip-cache

ENV PIP_CACHE_DIR=/tmp/pip-cache

# Upgrade pip (required by cryptography v3.4 and above, which is a dependency of poetry)
RUN pip install --upgrade pip

# pipenv 2022.4.8 is the last version with Python 3.6 support
RUN pip install pipenv==2022.4.8 poetry

# Ensure all users can write to poetry cache
RUN mkdir /tmp/poetry-cache && \
    chmod -R 777 /tmp/poetry-cache && \
    poetry config cache-dir /tmp/poetry-cache


# create non root user and change allow execute command for non root user
RUN /sbin/useradd -u 1000 user && chmod 711 /

CMD [ "python" ]

@corymhall corymhall added p1 effort/small Small work item – less than a day of effort and removed needs-triage This issue or PR still needs to be triaged. labels Sep 1, 2022
@corymhall
Contributor

Until this is fixed, the workaround is to change the cache directory.

const f = new PythonFunction(this, 'Lambda', {
  runtime: Runtime.PYTHON_3_9,
  entry: './lambda',
  bundling: {
    environment: { POETRY_VIRTUALENVS_IN_PROJECT: 'true' },
  },
});

@darylweir
Author

That workaround seems to work, thanks for the quick response!

@ryanandonian
Contributor

ryanandonian commented Sep 2, 2022

When I try to use that change-cache-location workaround, I get an error that looks like it's seeking a python binary which is not being sent to the built assets directory

Here's my slightly cleaned up verbose output:

Reading existing template for stack my-existing-stack.
my-existing-stack: deploying...
Waiting for stack CDKToolkit to finish creating or updating...
Preparing asset [ASSET_HASH]: {"path":"asset.[ASSET_HASH]","id":"[ASSET_HASH]","packaging":"zip","sourceHash":"[ASSET_HASH]","s3BucketParameter":"[AssetParametersASSET_HASH]S3Bucket81578626","s3KeyParameter":"[AssetParametersASSET_HASH]S3VersionKey3ED99592","artifactHashParameter":"[AssetParametersASSET_HASH]ArtifactHash7C848A52"}
Storing asset asset.[ASSET_HASH] at s3://[CDK_S3_STORAGE]/assets/[ASSET_HASH].zip
my-existing-stack: checking if we can skip deploy
my-existing-stack: template has changed
my-existing-stack: deploying...
[0%] start: Publishing [ASSET_HASH]:current
[0%] check: Check s3://[CDK_S3_STORAGE]/assets/[ASSET_HASH].zip
[0%] build: Zip /opt/my-app/cdk/cdk.out/asset.[ASSET_HASH] -> cdk.out/.cache/[ASSET_HASH].zip
Error: ENOENT: no such file or directory, stat '/opt/my-app/cdk/cdk.out/asset.[ASSET_HASH]/.venv/bin/python'

I tried using your Dockerfile example above with multiple versions of the base image, but oddly it still hits some specific permission-denied errors. I believe the cause is that pip install is run as the image's root user, which creates a root-owned wheel directory with rwxr-xr-x permissions.

Modifying the Dockerfile to wait until after pip runs to then chmod -R 755 /tmp/pip-cache fixed that wheel dir permissions error, however, this only helped me build the image. I still get this weird ENOENT /.venv/bin/python error. I believe it's due to the output just copying over a symlink, and not the actual python binary

❯ ls -lah cdk.out/asset.[SHA]/.venv/bin/python
lrwxr-xr-x  1 me  wheel    23B Sep  2 16:32 cdk.out/asset.[SHA]/.venv/bin/python -> /var/lang/bin/python3.9

As I'm on OSX, the path /var/lang/bin/python3.9 does not exist on my machine. I did get it to build by removing the build arguments for platform=linux/amd64, but that generated a corrupted zip file and failed to deploy.
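
To check whether an existing cdk.out directory contains links like the one above, a scan for symlinks whose targets do not resolve on the host can help. The following is a minimal sketch using only the Node.js standard library; the function name findDanglingSymlinks is made up for illustration and is not part of the CDK.

```typescript
import * as fs from "fs";
import * as path from "path";

// Recursively collect symlinks under `root` whose targets cannot be resolved
// on this machine (for example .venv/bin/python -> /var/lang/bin/python3.9).
// Hypothetical helper for illustration, not a CDK API.
function findDanglingSymlinks(root: string): string[] {
  const dangling: string[] = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isSymbolicLink()) {
      // existsSync follows the link, so it is false when the target is missing.
      if (!fs.existsSync(full)) {
        dangling.push(full);
      }
    } else if (entry.isDirectory()) {
      dangling.push(...findDanglingSymlinks(full));
    }
  }
  return dangling;
}
```

Calling findDanglingSymlinks("cdk.out") on an affected project would be expected to list the .venv/bin/python entry, assuming the asset directory looks like the listing above.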

l0b0 added a commit to linz/geostore that referenced this issue Sep 6, 2022
@l0b0
Contributor

l0b0 commented Sep 6, 2022

PR with @corymhall's workaround & run with the issue mentioned by @ryanandonian.

@lrav35

lrav35 commented Sep 6, 2022

@ryanandonian We are experiencing the same issue that you have detailed. The symlink is being copied instead of the executable and then becomes invalid because of its location. What process copies this symlink?

Jimlinz added a commit to linz/geostore that referenced this issue Sep 7, 2022
@mikelane

mikelane commented Sep 7, 2022

Same issue here:

ApiStack: deploying...
[0%] start: Publishing 1b6dbb23b3c3f79282451c8e554007aa248e56a5de62e6953aae3613f5aa75a7:000000000000-us-east-1
[0%] start: Publishing 2e706fb417e3931d17b1b32088bd6513701d72e45000cf192c55513da8d28d68:000000000000-us-east-1
[50%] success: Published 2e706fb417e3931d17b1b32088bd6513701d72e45000cf192c55513da8d28d68:000000000000-us-east-1
Error: ENOENT: no such file or directory, open '<redacted>/cdk.out/asset.1b6dbb23b3c3f79282451c8e554007aa248e56a5de62e6953aae3613f5aa75a7/.venv/bin/python'
make: *** [deploylocal] Error 1

How are folks getting around this?

@lrav35

lrav35 commented Sep 7, 2022

Unfortunately, I am temporarily removing poetry from my project in hopes that I can get a deploy working.

@mikelane

mikelane commented Sep 7, 2022

> Unfortunately, I am temporarily removing poetry from my project in hopes that I can get a deploy working.

I was able to deploy to localstack by exporting to a requirements.txt file and removing the poetry.lock and pyproject.toml.

@lrav35

lrav35 commented Sep 7, 2022

> > Unfortunately, I am temporarily removing poetry from my project in hopes that I can get a deploy working.
>
> I was able to deploy to localstack by exporting to a requirements.txt file and removing the poetry.lock and pyproject.toml.

And while doing so, did you not experience the invalid symlink referenced by @ryanandonian earlier? /var/lang/bin/python3.9

@mergify mergify bot closed this as completed in #21945 Sep 7, 2022
mergify bot pushed a commit that referenced this issue Sep 7, 2022
It looks like something was changed in the base image and there is no longer write access to the `/tmp` directory which causes bundling with poetry to fail (see linked issue). This PR updates the Dockerfile to create a new cache location for both `pip` and `poetry` and switches to using a virtualenv for python so that it is no longer using root.

To test this I executed the `integ.function.poetry` integration test both before (to reproduce the error) and after the fix. I'm actually not sure why our integration tests didn't start failing in the pipeline. The only thing I can think of is that we are caching the docker images and it just hasn't pulled down a newer one that has this issue.

fixes #21867


@github-actions

github-actions bot commented Sep 7, 2022

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

@mikelane

mikelane commented Sep 7, 2022

> > > Unfortunately, I am temporarily removing poetry from my project in hopes that I can get a deploy working.
> >
> > I was able to deploy to localstack by exporting to a requirements.txt file and removing the poetry.lock and pyproject.toml.
>
> And while doing so, did you not experience the invalid symlink referenced by @ryanandonian earlier? /var/lang/bin/python3.9

No. I was able to deploy completely.

@lrav35

lrav35 commented Sep 7, 2022

@corymhall what mechanism is responsible for copying that python binary to the output asset? And is @mikelane 's outcome consistent with your understanding of how that copy works?

@corymhall
Contributor

@lrav35 I think the solution may be to fix #19231, and we might need to change the cp to cp -rTL
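
The -L flag tells cp to copy what a symlink points to rather than the link itself. To illustrate that difference, here is a rough TypeScript sketch of a dereferencing copy using only the Node.js standard library. The helper name copyDereferenced is hypothetical and this is not the code the bundling step runs. It assumes every link target exists at copy time (as it would inside the build container) and does not guard against symlink cycles.

```typescript
import * as fs from "fs";
import * as path from "path";

// Copy `src` into `dest`, replacing every symlink with a copy of the file or
// directory it points to (roughly what `cp -rL` does). Sketch only: it throws
// on dangling links and does not detect symlink cycles.
function copyDereferenced(src: string, dest: string): void {
  // statSync follows symlinks, so a link to a file is reported as a file.
  const stats = fs.statSync(src);
  if (stats.isDirectory()) {
    fs.mkdirSync(dest, { recursive: true });
    for (const name of fs.readdirSync(src)) {
      copyDereferenced(path.join(src, name), path.join(dest, name));
    }
  } else {
    // copyFileSync reads through the link and writes a regular file.
    fs.copyFileSync(src, dest);
  }
}
```

With a copy like this, .venv/bin/python in the output would be a regular file, so zipping the asset on the host no longer depends on /var/lang/bin/python3.9 existing there.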

@jarikujansuu

Has anyone actually managed to deploy a lambda with version 2.41.0, which includes the fix, while still using Poetry? Or does it require removing the Poetry configs, as @mikelane seems to have done to fix his case?

We are still getting this #21867 (comment), so to me it seems that the fix doesn't actually fix Poetry deployments. 🤔

@ingwinlu

ingwinlu commented Sep 8, 2022

After cleaning up all build images (to fetch the new one) and updating the cdk client, I now get a new error, which suggests that poetry bundling is still NOT fixed:

ProcessingStack: deploying...
[0%] start: Publishing e57c1acaa363d7d2b81736776007a7091bc73dff4aeb8135627c4511a51e7dca:-eu-central-1
[0%] start: Publishing 48bdbc3f4b00bca6e8144fd245215c01162ca512dee572774ede7cb47b76dec2:-eu-central-1
[0%] start: Publishing d44f2a56a3a0f6ecfec6d446185c76efe63c4f7923895a3cb519a376fcf6733a:-eu-central-1
[0%] start: Publishing 6576fdfcddb7632f85ce4116724f02a67be14d774d9c842485e94bb27471c68b:-eu-central-1
[0%] start: Publishing c9fe44ba762eac93f0e43d444bd00e8ad9b00953c1c12ec7222194eebd41b98a:-eu-central-1
[0%] start: Publishing 1f9ac0545aeefd2846dc014e31440d3190e05c78a505bde2f8075c4abfd91f64:-eu-central-1
[0%] start: Publishing 34cac136e95b1e1188a27046014d0687d5d3258444bc2c8e35910338edef63b9:-eu-central-1
[0%] start: Publishing b1a37e84acd5a768046284311ce4fb4c19342efea345628583f1dee48f036d54:-eu-central-1
[0%] start: Publishing bf52cef4e2c1275e148f2ae050723804c56953d472ccbccfe53f855cae48dfdf:-eu-central-1
[0%] start: Publishing 6beb458200738fdaff86bf2747b3979f2de6b1dc90092aacee8aac91ba5b9efc:-eu-central-1
[0%] start: Publishing e60ad4db29e0c718d4aabb857a8b3a4aed1d7551ff914cb972482681e7c9f3bf:-eu-central-1
[9%] success: Published e57c1acaa363d7d2b81736776007a7091bc73dff4aeb8135627c4511a51e7dca:-eu-central-1
Error: ENOENT: no such file or directory, stat '/home/<snip>/infrastructure/cdk.out/asset.b1a37e84acd5a768046284311ce4fb4c19342efea345628583f1dee48f036d54/.venv/bin/python'

l0b0 added a commit to linz/geostore that referenced this issue Sep 8, 2022
In case there's another case of a dependency blocking everything, such as
<aws/aws-cdk#21867>.

(Not setting to 0 (unlimited) in case of any future bugs in Dependabot
creating crazy amounts of PRs.)
@corymhall corymhall reopened this Sep 8, 2022
@corymhall
Contributor

@ingwinlu @jarikujansuu Our integration tests are not failing with that error message. Can someone provide an example that I can use to reproduce the error?

@l0b0
Contributor

l0b0 commented Sep 8, 2022

@corymhall Example branch and run.

@darylweir
Author

darylweir commented Sep 8, 2022

I got into a weird state with this one. I used my example project from the first post in the issue, and after upgrading to 2.41.0 I was able to cdk deploy the lambda.

Next, I added the bundling workaround suggested above by @corymhall (this matched the current state of my real project). This broke the deploy with the error ENOENT no such file or directory <blahblah>/.venv/bin/python.

So then I removed the bundling workaround again, and the deploy was still broken 🤔 This caused a lot of head-scratching, but I eventually figured out that the broken run with the workaround in place had created a .venv directory under my lambda dir. And the symlink there was still trying to be included in subsequent deploys. This is, I guess, another symptom of #19231.

Out of curiosity, I tried Cory's suggestion of changing the cp command here to include the -L flag, and that allowed me to deploy the lambda even with the bundling workaround in place. So it seems Cory's suggestion was on the money.

TL;DR: if you ever tried the POETRY_VIRTUALENVS_IN_PROJECT workaround, update to 2.41.0, remove the workaround, delete any virtualenv created in your lambda source directory, and hopefully your deploy will work again.

@corymhall
Contributor

@darylweir thanks for the analysis! Can anyone still having the issue try what @darylweir has suggested and let me know if you are still having any issues?

@darylweir
Author

> Example branch and run.

@l0b0 from the output of that CI run, it seems to still be using the Dockerfile from 2.39, not the new one. Some caching issue maybe?

@ingwinlu

ingwinlu commented Sep 8, 2022

@corydozen I still need to look into the precise issue that is going on, but I can +1 the behaviour that @darylweir explained above.
I am also seeing some "zip file" issues where zipped files seem to be broken as well, but I have not managed to make them reproducible with a minimal example yet.

@maj5004

maj5004 commented Sep 8, 2022

> @darylweir thanks for the analysis! Can anyone still having the issue try what @darylweir has suggested and let me know if you are still having any issues?

@darylweir @corymhall I apologize as this may be a dumb question -

How are you folks editing the library itself? I see how that update would fix the issue we are having, but I'm unsure how we would edit that file and use the edited version.

@l0b0
Contributor

l0b0 commented Sep 8, 2022

@darylweir I'm not familiar enough with the CDK output to detect what you mean, are you referring to the "Digest: sha256:e49b1f5fbe105cf14ed831ba8008d5f67ba1b6b2b68936b5d6b8093a88561e23" line? At least for public.ecr.aws/sam/build-python3.8:latest it says it's pulling a new version, so that doesn't seem to be cached. I tried re-running and it's still pulling the same version.

@darylweir
Author

darylweir commented Sep 8, 2022

> How are you folks editing the library itself? I see how that update would fix the issue we are having, but I'm unsure how we would edit that file and use the edited version.

@maj5004 Well, I did it in a hacky way: I just manually edited node_modules/@aws-cdk/aws-lambda-python-alpha/lib/bundling.js in my local project but that is definitely not how you're supposed to do it 😅

I guess the actual instructions are these ones

@jarikujansuu

@corymhall after removing the previous POETRY_VIRTUALENVS_IN_PROJECT hack and deleting the .venv directories that were created in the lambda directories, deployment works now 👍

@corymhall
Contributor

@l0b0 it looks like your aws-lambda-python-alpha library version has not been updated

https://github.com/linz/geostore/blob/7514cd850a2236df27ca60afd340c5733680eafc/poetry.lock#L93

@ingwinlu

ingwinlu commented Sep 8, 2022

i executed the following steps to "cleanup" the aftermath:

  • find . -type d -name ".venv" -exec rm -rf {} +
    cleans up the temporarily created .venv dirs; I also ran this for "venv" (but that should not be of consequence)
  • docker system prune --all
    to make sure the freshest builder image is used
  • poetry update
    to use the latest cdk lib version
  • npm install -g aws-cdk
    to get the latest cli
  • made sure the bundling options are turned off again (i.e. removed the venv-in-project workaround); not sure if they would interfere with the new image
  • (added) emptied the s3 assets bucket, as it might hold stored assets with the same hash that are actually not working (bad zip file errors)

At first this still failed for me during deployment (synth worked fine) with the error below; the last step above solved it.

Resource handler returned message: "Could not unzip uploaded file. Please check your file, then try to upload again. (Service: Lambda, Status Code: 400, Request ID: 6f9e1931-a4fe-4737-b8a6-c642cafaf2a4)" (RequestToken: cd12fec8-3316-05e3-84b8-03d5119939ac, HandlerErrorCode: InvalidRequest)
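
For those who prefer not to run find with rm -rf by hand, the first cleanup step can be sketched in TypeScript with the Node.js standard library. The helper name removeVenvDirs is made up for illustration. Unlike the find command above, this sketch skips node_modules, which is an assumption made here to keep the walk short.

```typescript
import * as fs from "fs";
import * as path from "path";

// Delete every directory named `.venv` below `root` and return the removed
// paths. Hypothetical helper; it skips node_modules (an assumption of this
// sketch, the find command above does not).
function removeVenvDirs(root: string, dirName = ".venv"): string[] {
  const removed: string[] = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const full = path.join(root, entry.name);
    if (entry.name === dirName) {
      fs.rmSync(full, { recursive: true, force: true });
      removed.push(full);
    } else if (entry.name !== "node_modules") {
      removed.push(...removeVenvDirs(full, dirName));
    }
  }
  return removed;
}
```

Running removeVenvDirs(".") from the project root and checking the returned list before the next cdk synth gives a quick record of which lambda directories still carried a leftover virtualenv.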

@corymhall
Contributor

@ingwinlu it may be an issue with a previously uploaded bad asset. If the bundling is fixed, but the asset hash hasn't changed from when the bundling was broken it won't upload the new asset (since it thinks it still exists). Can you see whether the asset hash has changed between deployments?
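
To make the reasoning above concrete, here is a deliberately simplified TypeScript model of hash-keyed publishing. It is an illustration only and not the CDK's actual implementation: the Bucket type and the sourceHash and publish functions are invented for this sketch, and the assets/<hash>.zip key layout is taken from the log output earlier in this thread.

```typescript
import * as crypto from "crypto";

// Simplified model: the object key is derived from a hash of the *source*,
// and an upload is skipped whenever that key already exists in the bucket.
// All names here are hypothetical; this is not CDK code.
type Bucket = Map<string, string>;

function sourceHash(source: string): string {
  return crypto.createHash("sha256").update(source).digest("hex");
}

function publish(bucket: Bucket, source: string, bundledZip: string): "uploaded" | "skipped" {
  const key = `assets/${sourceHash(source)}.zip`;
  if (bucket.has(key)) {
    // An object with this key exists, so it is assumed to be good.
    return "skipped";
  }
  bucket.set(key, bundledZip);
  return "uploaded";
}

const demoBucket: Bucket = new Map();
const demoSource = "import httpx\n";
publish(demoBucket, demoSource, "corrupt zip from broken bundling"); // "uploaded"
publish(demoBucket, demoSource, "good zip from fixed bundling"); // "skipped"
demoBucket.clear(); // emptying the bucket forces a fresh upload
publish(demoBucket, demoSource, "good zip from fixed bundling"); // "uploaded"
```

In this model the corrupt zip stays in place until either the source hash changes or the stored object is removed, which matches the behaviour described above.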

@ingwinlu

ingwinlu commented Sep 8, 2022

aaah I forgot the assets s3 bucket. that would explain some of the "weird" behavior I noticed where things did not run the way I expected.

instead of comparing hashes I whacked the bucket with an empty operation and that seems to have done it. All stacks could be deployed again.

big thanks to all involved for fast responses + helpful comments even though I barely provided any data to go on.

@ryanandonian
Contributor

ryanandonian commented Sep 8, 2022

Following your steps listed there @ingwinlu , coupled with upgrading the version to "@aws-cdk/aws-lambda-python-alpha": "^2.41.0-alpha.0", I was able to successfully deploy 🎉

Thanks to everyone who worked on getting this fix out quickly!

kodiakhq bot added a commit to linz/geostore that referenced this issue Sep 13, 2022
chrispy2day added a commit to chrispy2day/amazon-sagemaker-automatic-deploy-mlflow-model that referenced this issue Oct 14, 2022
9 participants