-
Notifications
You must be signed in to change notification settings - Fork 696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't overwrite PYTHONUNBUFFERED in PyTorchJob (and others) #1921
Comments
It makes sense. |
Feel free to open PR. |
If users can config env "PYTHONUNBUFFERED",they can also config other env like "WORLD_SIZE"、“RANK”. This is a common problem. Whether all the env setting place should keep the user configured value. |
I don't think that we should allow users to modify the |
So training-operator
|
Or, we can add validations to reject Jobs with managed envs like |
Yep! Using webhooks to validate managed environment variables should work. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
/remove-lifecycle stale |
@tenzen-y I am interested to take on this ticket. |
@tariq-hasan Sorry, this depends on some other issues. So, this is not a good first issue... |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
/remove lifecycle-stale |
/remove-lifecycle stale |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
/remove-lifecycle stale @tariq-hasan Hi, this was unblocked by #1993. |
Sounds good. Thank you so much. /assign |
The PyTorch controller sets
PYTHONUNBUFFERED=0
below, independent of wether or not the user has configured this variable themselves.training-operator/pkg/controller.v1/pytorch/envvar.go
Lines 54 to 60 in bb2b58a
When I set this value myself in a
PyTorchJob
as shown below, the resulting pod created by the controller has a secondPYTHONUNBUFFERED
environment variable set by the controller, which overwrites the user-defined value.The behavior that I would expect is that the PyTorch job controller sets
PYTHONUNBUFFERED
if and only if the user has not already set it in their ownenv
definition.Based on a quick
git grep
, I also expect this behavior to be occurring in thePaddleJob
andXGBoostJob
controllers as well.The text was updated successfully, but these errors were encountered: