No Input node does not return STOP event #627

Open
haixuanTao opened this issue Aug 15, 2024 · 2 comments
Labels
bug (Something isn't working), python (Python API)

Comments

@haixuanTao
Collaborator

Describe the bug
A Python node with no inputs always returns None from .next(), although we would want it to return a STOP event so that we can stop the node gracefully.

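import pyarrow as pa
from dora import Node

node = Node()

# This node declares no inputs; it only publishes one output.
node.send_output("data", pa.array([1, 2, 3, 4, 5]))

# Currently returns None right away because the node has no open inputs.
event = node.next()

# Desired behavior: receive a STOP event instead of None.
assert event is not None, "we should expect a STOP event"
assert event["type"] == "STOP", "we should expect a STOP event"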

nodes:
  - id: no_input
    path: no_input.py
    outputs:
      - data

  - id: terminal-print
    build: cargo build -p terminal-print
    path: dynamic
    inputs:
      data: no_input/data

@github-actions bot added the bug and python labels on Aug 15, 2024

@phil-opp
Collaborator

That is expected with our current design. The event channel returns None as soon as all inputs of the node have been closed. This allows nodes to stop when they are no longer needed. The Stop event signals that the dataflow was stopped early as a consequence of a stop command. So it signals that the event channel might close sooner than usual.
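
To illustrate these semantics, here is a minimal sketch that uses a stand-in class rather than the real dora Node. FakeNode and its constructor argument are hypothetical and exist only for this illustration. The only behavior taken from the description above is that next() returns None once all inputs are closed, which for a node without inputs happens immediately.

```python
from collections import deque


class FakeNode:
    """Hypothetical stand-in for a dora node; not the real API."""

    def __init__(self, pending_events):
        # Events still queued on the (simulated) event channel.
        self._events = deque(pending_events)

    def next(self):
        # Mirror the described design: once nothing is left because all
        # inputs are closed, the channel is closed and next() returns None.
        if not self._events:
            return None
        return self._events.popleft()


def drain(node):
    """Collect event types until the event channel closes (None)."""
    seen = []
    while True:
        event = node.next()
        if event is None:
            break
        seen.append(event["type"])
    return seen


# A node with one input receives its events, then the channel closes.
with_input = drain(FakeNode([{"type": "INPUT"}, {"type": "STOP"}]))

# A node with no inputs has nothing to wait for, so it sees None right away.
no_input = drain(FakeNode([]))
```

In this simulation the no-input node never observes a STOP event, which matches the behavior reported in this issue.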

We could of course special-case nodes with no inputs and keep their event channels open as long as the dataflow runs. However, such behavior differences would be inconsistent and surprising, and also difficult to document clearly. Another disadvantage is that next() is blocking, so a call without a timeout would block until the user cancels the dataflow.

How about we provide an additional dataflow_stopped function instead? This function could be non-blocking and return true as soon as the node has received the manual stop signal. So it would be suitable for checking in a loop.
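
A rough sketch of how such a check might be used in a loop, assuming the proposed function existed. Everything here is hypothetical: dataflow_stopped is only a proposal in this thread, and StoppableNode, receive_stop, and produce are stand-ins that simulate the stop signal with a threading.Event instead of talking to a real dataflow.

```python
import threading


class StoppableNode:
    """Hypothetical stand-in; dataflow_stopped() is only a proposal here."""

    def __init__(self):
        self._stop_signal = threading.Event()

    def receive_stop(self):
        # Simulates the node receiving the manual stop signal.
        self._stop_signal.set()

    def dataflow_stopped(self):
        # Non-blocking: reports whether the stop signal has arrived.
        return self._stop_signal.is_set()


def produce(node, max_iterations=1000):
    """Emit outputs in a loop until the stop signal is observed."""
    sent = 0
    for i in range(max_iterations):
        if node.dataflow_stopped():
            break
        sent += 1  # placeholder for a real node.send_output(...) call
        if i == 2:
            node.receive_stop()  # simulate a stop command after three sends
    return sent


sent_count = produce(StoppableNode())
```

Because the check never blocks, a node without inputs could keep producing outputs and still exit promptly once the stop command arrives.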

@haixuanTao
Collaborator Author

That would be great! Maybe the name could be just finished(), which is slightly shorter?
