This repository has been archived by the owner on Jun 10, 2019. It is now read-only.

EC2 EBS race condition when parallelizing jobs #459

Open
drts01 opened this issue Feb 11, 2018 · 1 comment

Comments

@drts01
Contributor

drts01 commented Feb 11, 2018

When trying to run multiple jobs in parallel, each job can see the same drive letter as available; see the loop that picks the letter:

for letter in string.ascii_lowercase[5:]:

So when the jobs try to mount, all but one will be unable to actually mount.

@andsens
Owner

andsens commented Feb 11, 2018

Hm, good point. A simple lockfile -r -1 -s 5 /var/run/bootstrap-vz-ec2-volume-attachment invocation should suffice to fix this (maybe with a small preceding notice explaining why the process may halt for a little bit). Unless there is some pythonic way of doing this?
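One possible pythonic equivalent, as a rough sketch rather than anything bootstrap-vz actually ships: hold an exclusive advisory lock with the standard library's fcntl.flock while picking the letter and attaching the volume. The lock path is the one suggested above; attachment_lock and pick_free_letter are hypothetical names made up for this example, and the approach assumes a Unix host where all jobs run on the same machine.

```python
import fcntl
import os
import string
from contextlib import contextmanager

# Path taken from the lockfile suggestion above.
LOCK_PATH = '/var/run/bootstrap-vz-ec2-volume-attachment'


@contextmanager
def attachment_lock(path=LOCK_PATH):
    """Hypothetical helper: hold an exclusive advisory lock on `path`."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Blocks until every other holder has released the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)


def pick_free_letter(used_letters):
    """Hypothetical helper: first letter from 'f' onward not already in use."""
    for letter in string.ascii_lowercase[5:]:
        if letter not in used_letters:
            return letter
    raise RuntimeError('No free device letter available')
```

The selection and the attach call would both need to sit inside the with attachment_lock(): block, otherwise a second job could still pick the same letter between the check and the attachment. Since flock is advisory, it only helps if every job takes the same lock.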

john-pierce added a commit to john-pierce/bootstrap-vz that referenced this issue Jul 16, 2018
Fixes: andsens#452

Adds support for building on EC2 hosts that have NVMe EBS devices.

Introduces the DescribeInstances permission requirement for the calling
role.

Changes the device naming logic on the host during build time.
- The new target volume will be mounted with the highest available
  DeviceName in the BlockDeviceMapping object, taking care to avoid
  assignments that also are allocated at launch to ephemeral devices
  (C5d, I3, F1, and M5d currently).
- The system device name will be identified by the difference between the
  existing block devices prior to and after the AttachVolume call has
  finished. This does not address the race condition identified in andsens#459.
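
The before/after comparison described in the commit message can be sketched roughly as follows. This is an illustration only and may differ from what the commit actually does: list_block_devices and find_new_device are hypothetical names, and reading /sys/block is an assumption about how the host's block devices get enumerated.

```python
import os


def list_block_devices(sys_block='/sys/block'):
    """Return the set of kernel block device names, e.g. 'xvdf' or 'nvme1n1'."""
    return set(os.listdir(sys_block))


def find_new_device(before, after):
    """Return the one device name present in `after` but not in `before`."""
    new = after - before
    if len(new) != 1:
        # With parallel jobs, two volumes can appear between the snapshots,
        # which is the race this issue describes.
        raise RuntimeError('Expected exactly one new device, found: %s' % sorted(new))
    return new.pop()
```

A caller would take one snapshot before AttachVolume and another once the attachment has finished, then pass both sets to find_new_device. Without a lock around that window, the difference can contain another job's volume as well.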