Skip to content
This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

cannot identify image file - current_task_preview #3245

Closed
ederenn opened this issue Aug 28, 2018 · 5 comments · Fixed by #3661
Closed

cannot identify image file - current_task_preview #3245

ederenn opened this issue Aug 28, 2018 · 5 comments · Fixed by #3661
Assignees
Labels

Comments

@ederenn
Copy link

ederenn commented Aug 28, 2018

Encountered on:

Mac OS GOLEM Version: 0.17.0+dev99.g6d235f1

during computations of a task: helicopter.blend 400x400 10 frames 30 subtasks, 4 subtask had been left uncomputed

in golem.log

2018-08-28 18:53:36 ERROR    golem.client                        task_server.sync_network failed
Traceback (most recent call last):
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/client.py", line 1485, in _run
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/taskserver.py", line 140, in sync_network
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/taskserver.py", line 881, in __remove_old_tasks
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/taskmanager.py", line 787, in check_timeouts
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/core/common.py", line 147, in func_wrapper
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 131, in computation_failed
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 338, in _update_frame_task_preview
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 452, in __mark_sub_frame
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 354, in _open_frame_preview
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/.venv/lib/python3.6/site-packages/PIL/Image.py", line 2519, in open
OSError: cannot identify image file '/Users/a/Library/Application Support/golem/default/mainnet/res/64301ada-aae1-11e8-837b-7c3ec369ac11/tmp/current_task_preview1.PNG'
018-08-28 18:53:38 ERROR    golem.client                        task_server.sync_network failed
Traceback (most recent call last):
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/client.py", line 1485, in _run
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/taskserver.py", line 140, in sync_network
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/taskserver.py", line 881, in __remove_old_tasks
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/taskmanager.py", line 787, in check_timeouts
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/core/common.py", line 147, in func_wrapper
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 131, in computation_failed
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 338, in _update_frame_task_preview
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 452, in __mark_sub_frame
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 354, in _open_frame_preview
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/.venv/lib/python3.6/site-packages/PIL/Image.py", line 2519, in open
OSError: cannot identify image file '/Users/a/Library/Application Support/golem/default/mainnet/res/64301ada-aae1-11e8-837b-7c3ec369ac11/tmp/current_task_preview8.PNG'
2018-08-28 18:54:00 ERROR    golem.client                        task_server.sync_network failed
Traceback (most recent call last):
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/client.py", line 1485, in _run
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/taskserver.py", line 140, in sync_network
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/taskserver.py", line 881, in __remove_old_tasks
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/taskmanager.py", line 787, in check_timeouts
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/core/common.py", line 147, in func_wrapper
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 131, in computation_failed
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 338, in _update_frame_task_preview
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 452, in __mark_sub_frame
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 354, in _open_frame_preview
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/.venv/lib/python3.6/site-packages/PIL/Image.py", line 2519, in open
OSError: cannot identify image file '/Users/a/Library/Application Support/golem/default/mainnet/res/64301ada-aae1-11e8-837b-7c3ec369ac11/tmp/current_task_preview4.PNG'
018-08-28 18:54:03 CRITICAL twisted                             An exception was raised from application code while processing a reactor selectable
Traceback (most recent call last):
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/.venv/lib/python3.6/site-packages/twisted/python/log.py", line 103, in callWithLogger
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/.venv/lib/python3.6/site-packages/twisted/python/log.py", line 86, in callWithContext
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/.venv/lib/python3.6/site-packages/twisted/python/context.py", line 122, in callWithContext
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/.venv/lib/python3.6/site-packages/twisted/python/context.py", line 85, in callWithContext
    
--- <exception caught here> ---
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/.venv/lib/python3.6/site-packages/twisted/internet/kqreactor.py", line 293, in _doWriteOrRead
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/.venv/lib/python3.6/site-packages/twisted/internet/tcp.py", line 205, in doRead
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/.venv/lib/python3.6/site-packages/twisted/internet/tcp.py", line 211, in _dataReceived
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/network/transport/tcpnetwork.py", line 340, in dataReceived
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/network/transport/tcpnetwork.py", line 366, in _interpret
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/tasksession.py", line 123, in interpret
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/network/transport/session.py", line 76, in interpret
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/tasksession.py", line 485, in _react_to_want_to_compute_task
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/golem/task/taskmanager.py", line 396, in get_next_subtask
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/blender/task/blenderrendertask.py", line 484, in query_extra_data
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 338, in _update_frame_task_preview
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 452, in __mark_sub_frame
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/apps/rendering/task/framerenderingtask.py", line 354, in _open_frame_preview
    
  File "/Users/buildbot-worker/worker/buildpackage_macOS/build/.venv/lib/python3.6/site-packages/PIL/Image.py", line 2519, in open
    
builtins.OSError: cannot identify image file '/Users/a/Library/Application Support/golem/default/mainnet/res/64301ada-aae1-11e8-837b-7c3ec369ac11/tmp/current_task_preview3.PNG'

from res folder:

current_task_preview1

current_task_preview3

current_task_preview4

current_task_preview8

current_task_preview9

@Elfoniok
Copy link
Contributor

This bug is caused because preview file is handled in two separate threads.

/golem/task/taskserver.py", line 140, in sync_network
/golem/task/taskserver.py", line 881, in __remove_old_tasks
/golem/task/taskmanager.py", line 787, in check_timeouts
/golem/core/common.py", line 147, in func_wrapper
/apps/rendering/task/framerenderingtask.py", line 131, in computation_failed
/apps/rendering/task/framerenderingtask.py", line 338, in _update_frame_task_preview

and

/golem/network/transport/tcpnetwork.py", line 366, in _interpret
/golem/task/tasksession.py", line 123, in interpret
/golem/network/transport/session.py", line 76, in interpret
/golem/task/tasksession.py", line 485, in _react_to_want_to_compute_task
/golem/task/taskmanager.py", line 396, in get_next_subtask
/apps/blender/task/blenderrendertask.py", line 484, in query_extra_data
/apps/rendering/task/framerenderingtask.py", line 338, in _update_frame_task_preview

Preview file can be corupted this way, and PIL yields that it can not identify image file.

In general removing tasks in Golem is very poorl handled because of python
easier to ask for forgiveness than permission guidline, instead of ensuring that it is safe to delete task/subtask we just delete it and then handle lot of errors that it causes everywhere in datastructures of Golem.

We can go that way further and just provide proper locking of this file, or make it right this time.

@Elfoniok Elfoniok assigned shadeofblue and unassigned magdasta Sep 10, 2018
@shadeofblue
Copy link
Contributor

hmm... maybe the more interesting question is: why does __remove_old_tasks touch the same task that another node just issued a WantToComputeTask for (which suggests the task is still being computed)...

@etam etam assigned etam and unassigned shadeofblue Dec 3, 2018
@shadeofblue shadeofblue added _current and removed _next labels Dec 3, 2018
@etam
Copy link
Contributor

etam commented Dec 6, 2018

hmm... maybe the more interesting question is: why does __remove_old_tasks touch the same task that another node just issued a WantToComputeTask for (which suggests the task is still being computed)...

Because it's marking a subtask as timed out. Yeah, it's not something you wold expect in __remove_old_tasks.

@etam
Copy link
Contributor

etam commented Dec 10, 2018

So, every LoopingCallService runs in a separate thread. Can we move to Rust quicker? There a data race is a compilation error.

@etam
Copy link
Contributor

etam commented Dec 11, 2018

We're fixing a bigger issue here: #3660

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants