koordlet: fix prodReclaimablePredictor result to avoid influence of o… #2325

lijunxin559
Contributor

…versold

Ⅰ. Describe what this PR does

When Allocatable[mid] is calculated on a node that may be oversold, ProdReclaimableMetric can exceed NodeAllocatable * thresholdRatio, so the calculated Allocatable[mid] value accidentally includes the oversold part. The earlier attempt to change the computation model in PR #2291 was not sufficient: it erased the role of the prodPod estimation model, so the mid resources lost their more stable characteristics after the change. Further modifications to the prodPod estimation are therefore needed.

Ⅱ. Does this pull request fix one issue?

I optimized the behavior of ProdReclaimablePredictor: when it returns the prediction results, it now adjusts the values based on the node's runtime information, which in turn affects the collectMetric results. I also added the necessary related tests, which show that the modified calculations are reasonable.
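
To illustrate the idea of adjusting a prediction with node runtime information, here is a minimal sketch. It is not the code in peak_predictor.go: the helper name `capReclaimable`, its parameters, and the choice of "allocatable minus current usage" as the runtime bound are all assumptions made for this example. It only shows how a predicted reclaimable amount can be capped so that it never exceeds what the node can actually spare.

```go
package main

import "fmt"

// capReclaimable is a hypothetical helper, NOT the function used in this PR.
// It caps a predicted prod-reclaimable amount (milli-CPU) by the node's
// headroom, assumed here to be allocatable minus current usage.
// The result is never negative.
func capReclaimable(predicted, nodeAllocatable, nodeUsed int64) int64 {
	headroom := nodeAllocatable - nodeUsed
	if headroom < 0 {
		headroom = 0
	}
	if predicted < 0 {
		return 0
	}
	if predicted > headroom {
		// The prediction includes an oversold part; drop it.
		return headroom
	}
	return predicted
}

func main() {
	// Assumed numbers: a 32000m node using 10000m, while the predictor
	// reports 30000m reclaimable because prod requests are oversold.
	fmt.Println(capReclaimable(30000, 32000, 10000)) // prints 22000
}
```

With a bound like this applied to the returned prediction, the prodPod estimation model still drives the value in the normal case and the cap only takes effect when the node is oversold.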

Ⅲ. Describe how to verify it

Ⅳ. Special notes for reviews

Ⅴ. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test

@lijunxin559 lijunxin559 force-pushed the fix-prod-reclaimable-predictor-result-to-avoid-oversold branch from f068dab to a13d624 Compare January 21, 2025 03:20

codecov bot commented Jan 21, 2025

Codecov Report

Attention: Patch coverage is 66.66667% with 31 lines in your changes missing coverage. Please review.

Project coverage is 66.08%. Comparing base (79036cf) to head (7c9ac0b).
Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/koordlet/prediction/peak_predictor.go | 74.32% | 13 Missing and 6 partials ⚠️ |
| pkg/koordlet/metrics/resource_summary.go | 0.00% | 7 Missing ⚠️ |
| .../koordlet/statesinformer/impl/states_nodemetric.go | 58.33% | 3 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2325      +/-   ##
==========================================
- Coverage   66.09%   66.08%   -0.02%     
==========================================
  Files         458      458              
  Lines       54200    54270      +70     
==========================================
+ Hits        35823    35862      +39     
- Misses      15803    15828      +25     
- Partials     2574     2580       +6     

| Flag | Coverage Δ |
|---|---|
| unittests | 66.08% <66.66%> (-0.02%) ⬇️ |

@lijunxin559 lijunxin559 force-pushed the fix-prod-reclaimable-predictor-result-to-avoid-oversold branch 2 times, most recently from 4f05190 to 6cfc773 Compare January 21, 2025 06:41
…versold

Signed-off-by: lijunxin <lijunxin.ljx@alibaba-inc.com>
@lijunxin559 lijunxin559 force-pushed the fix-prod-reclaimable-predictor-result-to-avoid-oversold branch from 6cfc773 to 7c9ac0b Compare January 21, 2025 09:44
Member

@saintube saintube left a comment

/lgtm

@saintube saintube added the lgtm label Jan 21, 2025
@saintube
Member

PTAL /cc @zwzhang0107 @hormes @jasonliu747

@koordinator-bot koordinator-bot bot merged commit 98057ae into koordinator-sh:main Jan 23, 2025
22 checks passed