Using --tdb true, --keep-tdb-mappings true returns "StoreConnection inValid (issued before a StoreConnection.release?)" #658

Closed
dougli1sqrd opened this issue Mar 24, 2020 · 10 comments · Fixed by #659

Comments

@dougli1sqrd
Contributor

I'm using the tdb feature in Robot to make querying of large ontologies easier with report.

When I issue the same robot command twice, using the same tdb store, I am getting "StoreConnection inValid (issued before a StoreConnection.release?)".

I expected that robot would reuse the existing store. Am I running this incorrectly?

My full command is:

ROBOT_JAVA_ARGS=-Xmx12G ./robot report --input extensions/go-lego.owl --tdb true --tdb-directory tdb/ -k true -p ../sparql/neo/profile.txt -o reports/violations.report --print 10

Thanks!

@beckyjackson
Contributor

Could you please point me to the file that you're using? Thank you!

@dougli1sqrd
Contributor Author

The ontology is http://snapshot.geneontology.org/ontology/extensions/go-lego.owl,
and one of the queries in the profile is:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?entity ?property ?value WHERE
{
  VALUES ?entity { <http://purl.obolibrary.org/obo/ECO_0000314> }
  VALUES ?property { "NOT rdfs:subClassOf" }
  VALUES ?value {<http://purl.obolibrary.org/obo/ECO_0000000> }

  FILTER NOT EXISTS {
  ?entity rdfs:subClassOf* ?value .
  }

}

For context, I'm implementing these as sparql: geneontology/pipeline#35 (comment), hopefully with report.

@dougli1sqrd
Contributor Author

It looks like I get the message any time I change the queries run against the ontology. I ran one query, and subsequent queries ran fine and quickly. Then, when I added a new query to the profile, I got the error.

@dougli1sqrd
Contributor Author

Actually, I think I get the error in this sequence: I run a query, then run another in which an error occurs (commonly I get the path to the queries wrong in the profile). When I run the command again, it gives me the error, at which point I have to remove the directory and have it reload.

@beckyjackson
Contributor

Thanks for the clarification.

I did the following:

  1. Report with valid query path - PASS
  2. Report with invalid query path - FAIL with StoreConnection inValid
  3. Report with valid query path - PASS

It looks like the StoreConnection inValid message occurs when an exception is thrown while generating the report and we then try to release the TDB location.

What's weird is that if you do the same thing with query, it gives you the correct exception and releases without an issue. I'll do some digging into this.

@dougli1sqrd
Contributor Author

Great, thanks so much! Life became a little easier after I stopped messing up my query paths. But thanks for looking into it.

@beckyjackson
Contributor

So it looks like the dataset is already being released somewhere on exception. I added some logging messages, and on success:

2020-03-26 07:53:22,220 INFO  org.obolibrary.robot.ReportOperation - Releasing dataset at tdb
2020-03-26 07:53:22,220 DEBUG TDB - <No txn>: Start flush delayed commits
2020-03-26 07:53:22,220 DEBUG TDB - <No txn>: End flush delayed commits
2020-03-26 07:53:22,221 INFO  org.obolibrary.robot.ReportOperation - Release complete!

On failure:

2020-03-26 07:54:17,476 DEBUG TDB - <No txn>: Start flush delayed commits
2020-03-26 07:54:17,476 DEBUG TDB - <No txn>: End flush delayed commits
2020-03-26 07:54:17,487 INFO  org.obolibrary.robot.ReportOperation - Releasing dataset at tdb
2020-03-26 07:54:17,487 WARN  org.obolibrary.robot.ReportOperation - StoreConnection inValid (issued before a StoreConnection.release?)

The "flush delayed commits" lines are logged when the dataset is released.

If I comment out the TDBFactory.release(dataset) line, it still gets released when there is an exception. On success, it does not get released (no flush in log). I might be missing something, but I'm thinking there's something in the TDB code that kills it.

Unfortunately, I can't figure out a good way to check if the dataset has been released. We are not doing this in ROBOT other than in the finally block for query/report operations:

try {
  report = getTDBReport(dataset, options);
} finally {
  // Close and release
  dataset.close();
  TDBFactory.release(dataset);
  if (!keepMappings) {
    // Maybe delete
    boolean success = IOHelper.cleanTDB(tdbDir);
    if (!success) {
      logger.error(String.format("Unable to remove directory '%s'", tdbDir));
    }
  }
}

Anyway, I'll make a PR that wraps this to ignore the exception on releasing a dataset.
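
For readers following along, here is a minimal sketch of what "wrapping the release" could look like. It is not the code from the actual PR, and it does not use the Jena API: FakeDataset, generate, and runReport are hypothetical stand-ins, and the sketch assumes that a second release throws an unchecked exception (IllegalStateException is used here as a placeholder for whatever TDB really throws). It also assumes, as the logs above suggest, that the store has already been released by the time the exception reaches the finally block.

```java
import java.util.ArrayList;
import java.util.List;

class GuardedRelease {
    /** Hypothetical stand-in for a TDB-backed dataset; NOT the Jena API. */
    static class FakeDataset {
        private boolean released = false;
        final List<String> log = new ArrayList<>();

        /** Releasing twice fails, mimicking the message seen in this issue. */
        void release() {
            if (released) {
                throw new IllegalStateException(
                    "StoreConnection inValid (issued before a StoreConnection.release?)");
            }
            released = true;
            log.add("released");
        }
    }

    /**
     * Simulates report generation. On a bad query path the store is released
     * before the exception propagates (an assumption based on the logs above).
     */
    static String generate(FakeDataset dataset, boolean validQueryPath) {
        if (!validQueryPath) {
            dataset.release();
            throw new IllegalArgumentException("query file not found");
        }
        return "report";
    }

    /** Runs the report and always attempts a release, tolerating a second one. */
    static String runReport(FakeDataset dataset, boolean validQueryPath) {
        try {
            return generate(dataset, validQueryPath);
        } finally {
            try {
                dataset.release();
            } catch (IllegalStateException e) {
                // Already released elsewhere: record it and move on so the
                // original exception (if any) is the one the caller sees.
                dataset.log.add("release skipped: " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        FakeDataset dataset = new FakeDataset();
        try {
            runReport(dataset, false);
        } catch (IllegalArgumentException e) {
            System.out.println("Caller sees: " + e.getMessage());
        }
        System.out.println("Log: " + dataset.log);
    }
}
```

The inner try/catch matters because, in Java, an exception thrown from a finally block replaces the one already in flight. Without the guard, the caller would see the release failure rather than the real cause (here, the bad query path).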

@beckyjackson
Contributor

When you have a chance, could you please test this fix?
Here is the JAR: https://build.obolibrary.io/job/ontodev/job/robot/job/658-fix/lastSuccessfulBuild/artifact/bin/robot.jar

@jamesaoverton
Member

@dougli1sqrd Can you please take a minute to test this?

@dougli1sqrd
Contributor Author

@beckyjackson I did the PASS, FAIL, PASS procedure you outlined above, and I didn't get any invalid error messages.

Thanks!
