EC2 EBS race condition when parallelizing jobs #459

drts01 · 2018-02-11T03:01:28Z

When trying to run multiple jobs in parallel, each job can see the same drive letter as available; see bootstrap-vz/bootstrapvz/providers/ec2/ebsvolume.py, line 27 in f71eac2:

for letter in string.ascii_lowercase[5:]:

So when it tries to mount, all jobs but one will be unable to actually mount.

andsens · 2018-02-11T10:18:50Z

Hm, good point. A simple `lockfile -r -1 -s 5 /var/run/bootstrap-vz-ec2-volume-attachment` invocation should suffice to fix this (maybe with a short preceding notice explaining why the process may halt for a little bit). Unless there is some Pythonic way of doing this?
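One possible Pythonic equivalent, sketched here rather than taken from the bootstrap-vz code base, is an advisory lock from the standard library's `fcntl` module. The function name `volume_attachment_lock` is hypothetical, the lock path is simply the one from the `lockfile` invocation above, and the sketch assumes a Linux build host where all parallel jobs can write to that path.

```python
import fcntl
import os
from contextlib import contextmanager

# Hypothetical default, borrowed from the lockfile invocation above.
LOCK_PATH = '/var/run/bootstrap-vz-ec2-volume-attachment'


@contextmanager
def volume_attachment_lock(path=LOCK_PATH):
    """Hold an exclusive advisory lock for the duration of the with-block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Blocks until no other process holds the lock on this file.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

The device letter selection and the attach call would then both go inside `with volume_attachment_lock():`, so that only one job at a time picks a letter. Since `flock` blocks silently, logging a notice before entering the block would cover the "why the process may halt" concern.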

Fixes: andsens#452

Adds support for building on EC2 hosts that have NVMe EBS devices. Introduces the DescribeInstances permission requirement for the calling role.

Changes the device naming logic on the host during build time:

- The new target volume will be mounted with the highest available DeviceName in the BlockDeviceMapping object, taking care to avoid assignments that are also allocated at launch to ephemeral devices (C5d, I3, F1, and M5d currently).
- The system device name will be identified by the difference between the existing block devices before and after the AttachVolume call has finished.

This does not address the race condition identified in andsens#459.
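The before/after comparison described above can be sketched as a set difference. This is an illustration only, not the code from the referenced change: the helper names `list_block_devices` and `find_new_device` are hypothetical, and reading `/sys/block` assumes a Linux host.

```python
import os


def list_block_devices(sys_block='/sys/block'):
    """Return the set of block device names the kernel currently exposes."""
    return set(os.listdir(sys_block))


def find_new_device(before, after):
    """Return the single device name present in `after` but not in `before`."""
    new_devices = after - before
    if len(new_devices) != 1:
        # Zero means the volume has not shown up yet; more than one means
        # something else attached a device in the meantime.
        raise RuntimeError('expected exactly one new block device, found: %s'
                           % sorted(new_devices))
    return new_devices.pop()
```

A caller would take one snapshot before the attach call and another once it has finished, then pass both to `find_new_device`. If two jobs attach volumes concurrently, the difference can contain more than one device, which is the same race this issue describes.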

john-pierce mentioned this issue Jul 16, 2018

NVMe support for ec2 provider #486

Closed
