"Unable to find any free processors for remaining blocks" (compute_block_grid_mapping error) #236

Closed
vasdommes opened this issue May 1, 2024 · 0 comments · Fixed by #237

Comments

@vasdommes
Collaborator

Exception

On one of the machines (MacBook M1 Air), the following exception happened in unit tests (not reproduced on other machines):

compute_block_grid_mapping
 Random
-------------------------------------------------------------------------------
../test/src/unit_tests/cases/block_mapping.test.cxx:49
...............................................................................

../test/src/unit_tests/cases/block_mapping.test.cxx:53: FAILED:
due to unexpected exception with messages:
 num_nodes := 2
 procs_per_node := 10
 num_blocks := 1001 (0x3e9)
 in compute_block_grid_mapping() at ../src/sdpb_util/block_mapping/
 compute_block_grid_mapping.hxx:169: 
 Unable to find any free processors for remaining blocks
 Stacktrace:
 0# compute_block_grid_mapping(unsigned long const&, unsigned long const&,
 std::__1::vector<Block_Cost, std::__1::allocator<Block_Cost>>) in /Users/
 thomas.pochart/sdpb/build/unit_tests
 1# CATCH2_INTERNAL_TEST_28() in /Users/thomas.pochart/sdpb/build/unit_tests

The error is thrown by the following piece of code:

if(min_block == available_block_maps.at(0).end())
  {
    LOGIC_ERROR(
      "Unable to find any free processors for remaining blocks");
  }

Explanation

The problem is that we initially set the iterator min_block = available_block_maps.at(0).end(), assuming that end() is semantically equivalent to nullptr. Then we set min_block to a real iterator pointing to one of the available Block_Maps:

auto min_block(available_block_maps.at(0).end());
for(size_t node(0); node < num_nodes; ++node)
  {
    if(!available_block_maps[node].empty())
      {
        auto block = std::min_element(available_block_maps[node].begin(),
                                      available_block_maps[node].end());
        if(block->cost < min_cost)
          {
            min_block = block;

Now, if min_block is an iterator from, e.g., available_block_maps[1], comparing it to an iterator from available_block_maps.at(0) leads to undefined behaviour.

In this specific case, min_block was set to available_block_maps.at(1).begin(), and the comparison available_block_maps.at(0).end() == available_block_maps.at(1).begin() then returned true, leading to the exception shown above.

Why did it return true? This makes sense if we imagine that all data for available_block_maps is stored in a contiguous memory region, and iterator comparison is implemented via raw pointer comparison. In that case, the end() of the first vector coincides with the begin() of the next one.
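To make the "adjacent memory" picture concrete, here is a minimal sketch. It does not reproduce the undefined behaviour itself. Instead it carves two logical ranges out of a single array, where pointer comparison is well-defined, and shows that the one-past-the-end pointer of the first range equals the begin pointer of the second. The function name and the buffer are hypothetical and are not part of SDPB.

```cpp
#include <array>
#include <cassert>

// Hypothetical illustration: two logical ranges that live back to back in
// ONE array. Comparing pointers into the same array is well-defined, so
// this is safe, unlike comparing iterators of two different containers.
bool end_of_first_equals_begin_of_second()
{
  std::array<int, 6> storage{1, 2, 3, 4, 5, 6};

  // First range: elements [0, 3).
  const int *first_begin = storage.data();
  const int *first_end = first_begin + 3;

  // Second range: elements [3, 6).
  const int *second_begin = storage.data() + 3;

  // The first range is not empty, yet its end marker...
  assert(first_begin != first_end);
  // ...has the same address as the first element of the second range.
  return first_end == second_begin;
}
```

If two separate containers happen to be allocated back to back and their iterators compare like raw pointers, the same coincidence can occur, which would explain why the "nothing found" check fired even though a block had been found.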

@vasdommes vasdommes added the bug label May 1, 2024
@vasdommes vasdommes added this to the 3.0.0 milestone May 1, 2024
@vasdommes vasdommes self-assigned this May 1, 2024
vasdommes added a commit that referenced this issue May 1, 2024
…ompute_block_grid_mapping error)

The code was comparing iterators for different containers, which leads to undefined behaviour.

In the specific case where the exception was observed (on a specific machine),
  available_block_maps[1].begin() == available_block_maps[0].end()
returned true, unexpectedly.

We fix it by using std::optional<iterator> and checking for null explicitly.
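The idea of replacing the end() sentinel with std::optional<iterator> might look roughly like the sketch below. This is not the actual patch from #237. Block_Map_Sketch, Node_Blocks and find_min_block are hypothetical names introduced only for illustration, and the real Block_Map has more members than a cost.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for the real Block_Map; only the cost matters here.
struct Block_Map_Sketch
{
  double cost;
  bool operator<(const Block_Map_Sketch &other) const
  {
    return cost < other.cost;
  }
};

using Node_Blocks = std::vector<Block_Map_Sketch>;

// Returns an iterator to the cheapest block over all nodes, or std::nullopt
// if every node's list is empty. No iterator is ever compared against an
// iterator of a different container.
std::optional<Node_Blocks::const_iterator>
find_min_block(const std::vector<Node_Blocks> &available_block_maps)
{
  std::optional<Node_Blocks::const_iterator> min_block; // starts out empty
  for(const auto &node_blocks : available_block_maps)
    {
      if(node_blocks.empty())
        continue;
      const auto block
        = std::min_element(node_blocks.begin(), node_blocks.end());
      assert(block != node_blocks.end()); // same container, well-defined
      if(!min_block.has_value() || block->cost < (*min_block)->cost)
        min_block = block;
    }
  return min_block;
}
```

With this shape, the "no free processors" check becomes a test of min_block.has_value() rather than a comparison with available_block_maps.at(0).end(). An alternative would be to remember the node index alongside the iterator, so that any later comparison is made against the end() of the correct container.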