Ensure valid checkpoints can be created when recovering from errors #392
Comments
Looks like this line sets the heritrix3/engine/src/main/java/org/archive/crawler/framework/CheckpointService.java Line 316 in 37ce8d6
It should be the last thing in the |
Only update last checkpoint stats if the checkpoint completed, for #392.
The stats are now only written if the checkpoint executes without throwing an exception, which is as good as it's likely to get for now.
We hit occasional problems during checkpoint writing. Mostly, a non-checkpoint log file is missing when we attempt to make a checkpoint, because of some earlier issue (not 100% clear what). This means that when you try to checkpoint, this happens:
But the checkpoint partially completes, so if you fix the problem (e.g. by adding an empty log file), and try to re-checkpoint, you get:
Which is all well and good, except that no coherent/consistent checkpoint has been written, so it's impossible to resume the crawl.
That check is implemented here:
heritrix3/engine/src/main/java/org/archive/crawler/framework/CheckpointService.java
Lines 252 to 259 in 37ce8d6
This could be addressed by making it possible to ignore the `lastCheckpointSnapshot` and force a checkpoint. Or, we could ensure that the `lastCheckpointSnapshot` field only gets updated after the checkpoint has executed successfully.

It is plausible that you might want to force a checkpoint even if no progress has been made, e.g. because it involved resolving an issue with the crawl state itself. But this seems like a rare exception.
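To make the two options concrete, here is a minimal sketch of the idea, not the actual `CheckpointService` code. The class `CheckpointSketch`, the `CheckpointWriter` interface, the `requestCheckpoint` method, and the `force` flag are all hypothetical names invented for illustration, and the snapshot is assumed to be a simple `long`. The only point is the ordering: the snapshot field is assigned after the write has returned without throwing.

```java
/**
 * Hypothetical sketch only; this is NOT the Heritrix CheckpointService.
 * All names except lastCheckpointSnapshot are invented for illustration.
 */
class CheckpointSketch {

    /** Stand-in for the progress snapshot recorded at the last good checkpoint. */
    private long lastCheckpointSnapshot = -1L;

    /** Hypothetical stand-in for the work of writing a checkpoint to disk. */
    interface CheckpointWriter {
        void write() throws Exception;
    }

    /**
     * Attempts a checkpoint.
     *
     * @return true if a checkpoint was written completely, false if it was
     *         skipped (no progress) or failed part-way through
     */
    boolean requestCheckpoint(long currentSnapshot, boolean force, CheckpointWriter writer) {
        // Skip only when nothing changed since the last *successful* checkpoint,
        // unless the caller explicitly forces one.
        if (!force && currentSnapshot == lastCheckpointSnapshot) {
            return false;
        }
        try {
            writer.write();
        } catch (Exception e) {
            // Leave lastCheckpointSnapshot untouched so a retry is not
            // rejected as "no progress" after the problem is fixed.
            return false;
        }
        // Last step: record the snapshot only once the write has succeeded.
        lastCheckpointSnapshot = currentSnapshot;
        return true;
    }
}
```

With this ordering, a checkpoint that fails because of a missing log file can simply be retried once the file is restored, and the `force` flag covers the rarer case where a checkpoint is wanted even though the snapshot is unchanged.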