BP4 invalid blockID error #2339

Closed
caitlinross opened this issue Jun 17, 2020 · 2 comments

@caitlinross
Collaborator

@pnorbert I've run into a bug in ADIOS with the BP4 engine, and I think it may be related to #1639 (at least, I'm getting the same error as in that issue).

I'm using ADIS to read alongside a running XGC simulation. In ADIS, I use BeginStep/EndStep calls and make sure I only make Get calls once BeginStep has returned OK. If I use this approach to read a BP file that's no longer being written to, there's no problem. If I start the reader at the same time as the simulation, the first step runs fine, but after that I hit this error:

terminate called after throwing an instance of 'std::invalid_argument'
  what():  ERROR: invalid blockID 0 from steps start 0 in variable dpot, check argument to Variable<T>::SetBlockID, in call to Get

I also tried letting the sim get more of a head start and starting the reader after about 5 or 6 steps had already been written. I thought maybe there was somehow a bug that was causing BeginStep to return OK when it shouldn't, and that I was trying to do Get calls before those variables were actually in the file, but that doesn't appear to be the case.
Here's what I did:

  • let simulation complete steps 0-4
  • confirmed with bpls that all 5 steps were in the file
  • started reader (while simulation was working on step 5)
  • reader ran fine until step 5, where I got the above error (even though the simulation was already writing step 8 or 9 at that point and all the previous steps were in the file according to bpls)

I added some debugging output to ADIOS and think I managed to track down the problem (or at least part of it). It looks like m_StepsStart in the variable gets reset incorrectly at some point because m_FirstStreamingStep is not set correctly. Here's some output:

[ADIS] reader.BeginStep
BP4Reader::InitBuffer
BP4Deserializer::DefineVariableInEngineIOPerStep m_StepsStart set to 0
VariableBase::ResetStepsSelection zeroStart = 0, m_StepsStart 0, m_FirstStreamingStep 1
    setting m_StepsStart to 0
[ADIS] ADIOS stepstatus OK
[ADIS] source 3d is on step 0
[ADIS] about to make adios Get call
BP4Deserializer::InitVariableBlockInfo var dpot m_StepsStart 0, stepsCount 1
[ADIS] Get call done
[ADIS] reader.EndStep

[ADIS] reader.BeginStep
VariableBase::ResetStepsSelection zeroStart = 0, m_StepsStart 0, m_FirstStreamingStep 0
    setting m_StepsStart to 1
[ADIS] ADIOS stepstatus OK
[ADIS] source 3d is on step 1
[ADIS] about to make adios Get call
BP4Deserializer::InitVariableBlockInfo var dpot m_StepsStart 1, stepsCount 1
[ADIS] Get call done
[ADIS] reader.EndStep

... (same thing for steps 2-4, with m_StepsStart being set appropriately for that step, until step 5)

[ADIS] reader.BeginStep
BP4Reader::ProcessMetadataForNewSteps
BP4Deserializer::DefineVariableInEngineIOPerStep m_StepsStart set to 5
VariableBase::ResetStepsSelection zeroStart = 0, m_StepsStart 5, m_FirstStreamingStep 1
    setting m_StepsStart to 0
[ADIS] ADIOS stepstatus OK
[ADIS] source 3d is on step 5
[ADIS] about to make adios Get call
BP4Deserializer::InitVariableBlockInfo var dpot m_StepsStart 0, stepsCount 1
terminate called after throwing an instance of 'std::invalid_argument'
  what():  ERROR: invalid blockID 0 from steps start 0 in variable dpot, check argument to Variable<T>::SetBlockID, in call to Get

It seems like m_FirstStreamingStep should be 0 in that final call to VariableBase::ResetStepsSelection, but it isn't. I'm assuming that after m_IO.RemoveAllVariables() is called in BP4Reader::ProcessMetadataForNewSteps, the m_FirstStreamingStep flag is not being set appropriately when the variables are defined again. Naively it seems like it should be a relatively easy fix, but I'm not very familiar with this engine's implementation, so I wanted to post the issue. I'm happy to take a stab at fixing it, if I'm right about what the problem is.
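
The behaviour in the debug output can be replayed with a small toy model. This is only a sketch inferred from the log lines above, not the ADIOS2 source: ToyVariable, Redefine and ReplayTrace are hypothetical names, and the reset rule (force 0 while the flag is set, otherwise increment) and the point where the flag is cleared are assumptions chosen to match the printed values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the per-variable state printed in the debug output.
// Not the ADIOS2 class; the fields only mirror the names in the log.
struct ToyVariable
{
    std::size_t stepsStart = 0;
    bool firstStreamingStep = true; // assumed default for a newly defined variable
};

// Assumed reset rule, inferred from the log: while the flag is set the start
// is forced to 0; otherwise the start advances by one step.
void ResetStepsSelection(ToyVariable &var, bool zeroStart)
{
    if (zeroStart || var.firstStreamingStep)
    {
        var.stepsStart = 0;
        var.firstStreamingStep = false; // assumption: cleared after first use
    }
    else
    {
        ++var.stepsStart;
    }
}

// Assumed effect of removing and redefining the variables when new metadata
// arrives: a fresh object whose start is set to the current step.
ToyVariable Redefine(std::size_t step)
{
    ToyVariable var;
    var.stepsStart = step;
    return var;
}

// Replays a run in which the variable is defined at step 0 and redefined
// once at step redefineAt; returns the start value seen by each Get.
std::vector<std::size_t> ReplayTrace(std::size_t steps, std::size_t redefineAt)
{
    assert(redefineAt > 0 && "step 0 is the initial definition");
    std::vector<std::size_t> seen;
    ToyVariable var = Redefine(0);
    for (std::size_t step = 0; step < steps; ++step)
    {
        if (step == redefineAt)
        {
            var = Redefine(step);
        }
        ResetStepsSelection(var, false);
        seen.push_back(var.stepsStart);
    }
    return seen;
}
```

Under these assumptions, ReplayTrace(6, 5) returns 0, 1, 2, 3, 4, 0, which matches the m_StepsStart values in the output: the value set to 5 during redefinition is overwritten with 0 because the flag is set again. The model only reproduces the symptom; it does not say what the correct fix in the engine would be.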

build/OS info:

  • Ubuntu 18.04
  • ADIOS2 v2.6
  • gcc version 7.5.0
  • CMake 3.15.5

@pnorbert
Contributor

Please retest with master and let me know if this is still an issue.

@caitlinross
Collaborator Author

Just tried it out and everything's working. Thanks!
