Creator is in error state
while processing assets of bio.libretexts.org
#128
Comments
Those are nasty ones. Let's get in touch, because this may or may not be a regression in scraperlib.
Do you mean that it currently ignores all exceptions? That sounds like a bad idea. C-originated exceptions are all
What I observed now:
This means that:
As a reminder, all assets are passed to the libzim as bytes (coming from a BytesIO); see mindtouch/scraper/src/mindtouch2zim/asset.py, lines 157 to 160 in 7336f52.
Given the libzim message, I suspect the race condition might be that Python is freeing the bytes faster than the libzim is consuming them. But I have no clue how that can happen or what we should do about it. @rgaudin does it remind you of something? Did we have a recent change around this in python-libzim or python-scraperlib? About bytes and path manipulation, I recall changes around
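To make the suspicion above a bit more concrete, here is a minimal Python sketch, independent of libzim and of the scraper code. It does not reproduce the bug; it only shows that the `bytes` object returned by `BytesIO.getvalue()` is a separate object that stays valid as long as something holds a reference to it, even after the `BytesIO` itself is closed. The `snapshot` helper is a hypothetical name used for illustration only.

```python
import io


def snapshot(stream: io.BytesIO) -> bytes:
    """Return the stream content as an immutable bytes object.

    Hypothetical helper: the returned object is independent of the
    BytesIO lifecycle and lives as long as a reference to it exists.
    """
    return stream.getvalue()


stream = io.BytesIO(b"asset-content")
content = snapshot(stream)

# Closing the BytesIO does not invalidate the bytes we kept a reference to.
stream.close()
print(content)
```

If the race really is about lifetime, the question would then be whether a reference to `content` is still held on the Python side at the moment the libzim reads it. That is an assumption to verify, not a confirmed diagnosis.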
I think so as well ... now 🤣
See https://farm.openzim.org/pipeline/e7ddd2d3-eae7-43fc-94d7-0aa4b1e77d04
Source of the problem seems to be
I will restart the recipe on same worker and see if issue happens again.
Note that the issue looks transient: all assets fail to be added, then it works again, then it fails again, and so on.
We should probably catch `Creator is in error state.` exceptions and fail the scrape on these, since continuing is only going to create a broken ZIM.
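A rough sketch of what this fail-fast behaviour could look like, in plain Python. Everything here is an assumption for illustration: `add_asset` is a hypothetical callable standing in for whatever adds the asset to the ZIM, `FatalCreatorError` is a hypothetical exception type, and matching on the message text is only one possible way to recognise the error (the exact exception class raised is not stated in this thread).

```python
CREATOR_ERROR_MARKER = "Creator is in error state"


class FatalCreatorError(Exception):
    """Hypothetical exception meaning the whole scrape must stop."""


def is_creator_error(exc: BaseException) -> bool:
    # Assumption: the error is recognisable from its message text.
    return CREATOR_ERROR_MARKER in str(exc)


def process_asset(add_asset, path: str, content: bytes) -> bool:
    """Add one asset; abort the scrape if the creator is in error state.

    `add_asset` is a hypothetical callable taking (path, content).
    Returns True if the asset was added, False if it was skipped.
    """
    try:
        add_asset(path, content)
    except Exception as exc:
        if is_creator_error(exc):
            # Unrecoverable: continuing would only produce a broken ZIM.
            raise FatalCreatorError(f"aborting scrape while adding {path}") from exc
        # Any other failure only affects this asset, so report and skip it.
        print(f"Skipping asset {path}: {exc}")
        return False
    return True
```

With this shape, an ordinary asset failure is still tolerated, while a creator error propagates up and stops the run instead of being swallowed with the rest.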