Issue Overview
Hello, and thank you for your hard work on this project! While using `verl`, I noticed a calculation error in the `compute_data_metric` function within `ray_trainer`. Specifically, the function indexes with a 2D int tensor instead of the expected bool tensor. Because of how PyTorch advanced indexing treats integer tensors, this produces a 3D tensor rather than the expected 1D tensor.
Problem Details
The core issue lies in the behavior of PyTorch advanced indexing when using non-boolean tensors:
Incorrect Mask Behavior:
Excessive Memory Usage: The resulting data shape unexpectedly inflates from `batch_size * sequence_length` to `batch_size * sequence_length * sequence_length`. For large sequence lengths (e.g., 16k), this causes the program to exceed available memory (CPU OOM), leading to worker termination. The system may output the following cryptic error:
Minimal Reproduction Code
Below is a minimal code example that demonstrates the issue:
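A minimal sketch of such a reproduction, assuming small hypothetical shapes and illustrative tensor names (`values` and `response_mask` here are stand-ins, not necessarily the variables used in `compute_data_metric`):

```python
import torch

# Hypothetical small shapes, chosen only so the result is easy to inspect.
batch_size, seq_len = 2, 4

# Stand-in for a per-token quantity of shape (batch_size, seq_len).
values = torch.arange(batch_size * seq_len, dtype=torch.float32).reshape(batch_size, seq_len)

# A 0/1 mask stored as an integer tensor (int64 by default), not as bool.
response_mask = torch.tensor([[1, 1, 0, 0],
                              [1, 1, 1, 0]])

# Integer tensor: each entry is treated as a row index into dim 0,
# so every mask entry pulls out a full row of length seq_len.
gathered = values[response_mask]

# Boolean tensor: entries are treated as a mask and selected element-wise.
masked = values[response_mask.bool()]

print(gathered.shape)  # torch.Size([2, 4, 4])
print(masked.shape)    # torch.Size([5])
```

With the integer mask the output has shape `(batch_size, seq_len, seq_len)`, which is where the extra factor of `sequence_length` in memory comes from.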
Proposed Fix
To resolve the issue, simply add `.bool()` to convert `response_mask` into a boolean mask tensor, so that indexing selects elements as expected.
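As a rough illustration of the fix, the sketch below wraps the conversion in a hypothetical helper (`select_response_values` is an assumed name for this example and is not part of `verl`):

```python
import torch

def select_response_values(values: torch.Tensor, response_mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: return a 1D tensor of the entries where the mask is non-zero."""
    # .bool() turns the 0/1 integer mask into a boolean mask, so PyTorch
    # performs masked selection instead of integer (gather-style) indexing.
    return values[response_mask.bool()]

values = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])
response_mask = torch.tensor([[1, 0, 1],
                              [0, 1, 1]])  # integer mask

selected = select_response_values(values, response_mask)
print(selected)  # tensor([1., 3., 5., 6.])

# For comparison: element counts with and without the conversion.
print(values[response_mask].numel())  # 18 == 2 * 3 * 3
print(selected.numel())               # 4
```

The boolean path never allocates more than `batch_size * sequence_length` elements, whereas the integer path always allocates `batch_size * sequence_length * sequence_length`.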