NDArray.set() fails on Linux with "Inplace update to inference tensor outside InferenceMode" #1774
Comments
@demq
Could you try the setter with
This is in version 0.18.0, which was just released.
The behavior does not change when I use NDArray.set(NDIndex, Number); the error comes from the way PyTorch restricts modification of tensors after inference:
I see. OK, if this is a bug in PyTorch, could you report the issue to PyTorch to see whether they can solve it on their side?
@demq
I need to clarify the reason I think this issue is a bug in DJL.
DJL appears to be invoking the "InferenceMode" for the "newer" version of torch: https://github.com/deepjavalibrary/djl/blob/master/engines/pytorch/pytorch-native/src/main/native/ai_djl_pytorch_jni_PyTorchLibrary_inference.cc
The
I suppose the M1 version of the djl is compiled with the
DJL needs to either document this behavior or ensure that tensors can be modified after inference in the PyTorch implementation, so that the function behaves the same across all engines.
@demq Thank you so much for this detailed investigation! The purpose of the InferenceMode guard is to keep the array from being changed by autograd in inference mode. We should try to stay consistent with this. But in your case, where you want to modify the inference array, we will need to think about how to resolve it.
I have just updated the document of
I didn't duplicate the tensors inside DJL but left that to users, since it is good to keep the default behaviour the same as the engines'.
Description
The NDArray.set() fails when trying to update tensors in a post-processing stage with a message:
This behavior is observed when running the code with the PyTorch engine on a Linux machine; the same code runs without any errors on a Mac M1 Pro. The workaround is to first duplicate the tensor by calling NDArray.duplicate() and then perform the .set() on the new tensor.
Expected Behavior
The PyTorch implementation of NDArray should either duplicate the tensor when it is modified outside InferenceMode, or make these tensors immutable.
Error Message
How to Reproduce?
Create a custom QATranslator, override the processOutput() method like
Steps to reproduce
Create a QA predictor using a model based on the custom translator and the "PyTorch" engine. Run
predictor.predict()
on a Linux machine.
What have you tried to solve it?
Making a duplicate of the "output" tensors resolves the issue:
NDArray startLogits = list.get(0).duplicate();
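The duplicate-before-set pattern used in this workaround can be illustrated without a DJL dependency. The following is a plain-Java sketch, not DJL code: `SketchTensor`, its `inferenceOnly` flag, `DuplicateBeforeSet`, and `maskFirst` are hypothetical names invented for illustration, and the exception only imitates the spirit of the reported error. It assumes that an engine-owned output rejects in-place writes while a copy of it accepts them.

```java
/** Hypothetical stand-in for an engine tensor; this is NOT a DJL class. */
final class SketchTensor {
    private final float[] data;
    // Models a tensor that was created under an inference guard (assumption).
    private final boolean inferenceOnly;

    SketchTensor(float[] data, boolean inferenceOnly) {
        this.data = data.clone(); // defensive copy so callers cannot alias the storage
        this.inferenceOnly = inferenceOnly;
    }

    /** Returns an independent, writable copy, the role duplicate() plays in the workaround. */
    SketchTensor duplicate() {
        return new SketchTensor(data, false);
    }

    /** In-place update; rejected for inference-only tensors. */
    void set(int index, float value) {
        if (inferenceOnly) {
            throw new IllegalStateException("in-place update to inference-only tensor");
        }
        data[index] = value;
    }

    float get(int index) {
        return data[index];
    }
}

public class DuplicateBeforeSet {
    /** Overwrites the first element on a copy, leaving the engine-owned output untouched. */
    static SketchTensor maskFirst(SketchTensor output) {
        SketchTensor copy = output.duplicate();
        copy.set(0, -100000f);
        return copy;
    }

    public static void main(String[] args) {
        SketchTensor output = new SketchTensor(new float[] {1.5f, 2.5f}, true);
        try {
            output.set(0, 0f); // direct write on the inference output fails
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        SketchTensor masked = maskFirst(output); // write on the duplicate succeeds
        System.out.println(masked.get(0) + " " + output.get(0));
    }
}
```

The design choice here is that the copy is made explicitly by the caller, which matches the maintainers' stated preference for leaving duplication to users rather than copying implicitly inside the library.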
Environment Info
Please run the command
./gradlew debugEnv
from the root directory of DJL (if necessary, clone DJL first). It will output information about your system, environment, and installation that can help us debug your issue. Paste the output of the command below: