Make vm_id assignment more robust #714

Merged
merged 1 commit into from
Oct 31, 2024


olethanh
Collaborator

Remove the counter-based way of assigning a vm_id, as it did not work reliably.

Jira ticket: ALEPH-272

That method broke when persistent instances were loaded at startup. Since the "new" feature that allows persistent instances to survive an aleph-vm reboot, if an instance was started and aleph-vm was then stopped and restarted, the counter method could reassign its IP and break the existing VMs.
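To illustrate the failure mode described above, here is a minimal sketch, not the actual aleph-vm code. `CounterPool`, `load_persistent`, `lowest_free_vm_id` and `START_ID` are hypothetical names chosen for this example. It assumes the counter lives only in memory and restarts from its initial value when the process restarts, while persistent instances keep the vm_id they had before.

```python
START_ID = 4  # hypothetical first usable vm_id


class CounterPool:
    """Hypothetical pool that hands out vm_ids from an in-memory counter."""

    def __init__(self):
        self.counter = START_ID - 1  # assumption: reset on every process start
        self.used = set()  # vm_ids of the executions currently known to the pool

    def load_persistent(self, vm_id):
        # A persistent instance restored at startup keeps its previous vm_id,
        # but nothing advances the counter past it.
        self.used.add(vm_id)

    def next_vm_id(self):
        # Counter-based assignment: never looks at the ids already in use.
        self.counter += 1
        self.used.add(self.counter)
        return self.counter


def lowest_free_vm_id(used, start=START_ID, limit=2**16):
    """Return the lowest vm_id in [start, limit) that is not in `used`."""
    for candidate in range(start, limit):
        if candidate not in used:
            return candidate
    raise ValueError("No available value for vm_id.")


pool = CounterPool()
pool.load_persistent(4)  # instance that survived the restart
collides = pool.next_vm_id() == 4  # the counter hands out 4 a second time

free_id = lowest_free_vm_id({4})  # scanning the used ids skips 4
```

Under these assumptions the counter hands out an id that a restored instance already holds, whereas a lookup over the ids currently in use cannot collide.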

A secondary reason is that the feature wasn't working properly with the default settings, as 2**available_bits returned 1. That code path was therefore only used if the node owner tweaked some undocumented settings, which made problems hard to identify and debug on production nodes.
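The arithmetic behind that remark can be sketched as follows. This assumes the removed code compared the counter against `2**available_bits` after incrementing it, so the first counter value is 1. The exact shape of the check is an assumption, and `counter_branch_taken` is a hypothetical helper, not a function from aleph-vm.

```python
def counter_branch_taken(counter: int, available_bits: int) -> bool:
    """Hypothetical check: use the counter only while it is below 2**available_bits."""
    return counter < 2**available_bits


# 2**available_bits == 1 means available_bits == 0, so no counter value >= 1 passes.
default_case = [counter_branch_taken(c, 0) for c in (1, 2, 3)]

# With a tweaked setting (here 8 bits, an arbitrary example) the branch becomes reachable.
tweaked_case = [counter_branch_taken(c, 8) for c in (1, 2, 255, 256)]
```

With the bound equal to 1 the counter branch is dead code, which is consistent with the description that it only ran on nodes with non-default settings.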

Self proofreading checklist

  • The new code is clear, easy to read and well commented.
  • New code does not duplicate the functions of builtin or popular libraries.
  • An LLM was used to review the new code and look for simplifications.
  • New classes and functions contain docstrings explaining what they provide.
  • All new code is covered by relevant tests.
  • Documentation has been updated regarding these changes.


codecov bot commented Oct 28, 2024

Codecov Report

Attention: Patch coverage is 0% with 6 lines in your changes missing coverage. Please review.

Project coverage is 62.64%. Comparing base (7741ac8) to head (0da82f2).
Report is 1 commit behind head on main.

Files with missing lines | Patch % | Lines
src/aleph/vm/pool.py | 0.00% | 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #714      +/-   ##
==========================================
+ Coverage   62.59%   62.64%   +0.05%     
==========================================
  Files          69       69              
  Lines        6138     6131       -7     
  Branches      491      490       -1     
==========================================
- Hits         3842     3841       -1     
+ Misses       2152     2146       -6     
  Partials      144      144              


@olethanh olethanh force-pushed the ol-272-make-ip-adress-assignment-more-robust branch from 0aca695 to 0da82f2 Compare October 29, 2024 14:22
@nesitor nesitor merged commit 089ccef into main Oct 31, 2024
21 of 22 checks passed
@nesitor nesitor deleted the ol-272-make-ip-adress-assignment-more-robust branch October 31, 2024 13:48
Antonyjin pushed a commit that referenced this pull request Nov 19, 2024
3 participants