
Convert the delaying node to a gated node to demonstrate my original idea #4

Merged
2 commits merged into rtpsw:GH-35838 on Jun 14, 2023

Conversation

@westonpace (Author)

As I mentioned in the PR, feel free to take this idea or leave it.

@github-actions (bot) commented Jun 7, 2023

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

In the case of PARQUET issues on JIRA, the title also supports:

PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}


Comment on lines 1471 to 1472
static constexpr auto kKindName = "BackpressureDelayingNode";
static constexpr const char* kFactoryName = "backpressure_delay";
@rtpsw (Owner)

Change these constants to "GatedNode" and "gated".
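In other words, the rename being asked for, applied to the two constants shown above:

static constexpr auto kKindName = "GatedNode";
static constexpr const char* kFactoryName = "gated";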

@@ -1490,6 +1578,9 @@ void TestBackpressure(BatchesMaker maker, int num_batches, int batch_size,
std::string name() const { return name_prefix + ";" + (is_fast ? "fast" : "slow"); }
};

Gate gate;
GatedNodeOptions gate_options(&gate);
@rtpsw (Owner)

I don't see gate_options used anywhere. @westonpace, are you sure the tester is doing what you intended?

@westonpace (Author)

Good catch. I must be missing something or getting lucky with timing

@rtpsw (Owner) commented Jun 13, 2023

Overall, the approach looks good. However, see my comment - something seems to be missing.

@rtpsw (Owner) commented Jun 13, 2023

The conflict-resolved version I just committed includes a fix that uses the gated node in front of one of the source nodes, which I believe was the intention. This fix allowed the backpressure tests to pass.
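Roughly, the shape of that fix as a sketch (not the exact test code; the schema, generator, and join-option names below are placeholders): the "gated" factory is declared between one source and the asof join, so the test can hold that input's batches back while the other inputs run.

// Sketch only: assumes the arrow::acero namespace and the Gate/GatedNodeOptions
// test helpers from this PR; l_schema/l_gen/r_schema/r_gen/join_options are
// illustrative names, not the test's actual variables.
Gate gate;
GatedNodeOptions gate_options(&gate);

// One input goes source -> gated -> asof join; the other connects directly.
Declaration gated_source = Declaration::Sequence({
    {"source", SourceNodeOptions{l_schema, l_gen}},
    {"gated", gate_options},
});
Declaration other_source{"source", SourceNodeOptions{r_schema, r_gen}};
Declaration asof_join{"asofjoin", {gated_source, other_source}, join_options};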

@rtpsw (Owner) commented Jun 13, 2023

A local debug-run of this version caught something:

[ RUN      ] AsofJoinTest.BackpressureWithBatchesGen
/mnt/user1/tscontract/github/rtpsw/arrow/cpp/src/arrow/acero/asof_join_node_test.cc:1636: Failure
Value of: _fut77.Wait(::arrow::kDefaultAssertFinishesWaitSeconds)
  Actual: false
Expected: true
[  FAILED  ] AsofJoinTest.BackpressureWithBatchesGen (64008 ms)

@rtpsw (Owner) commented Jun 13, 2023

@westonpace, do you have an idea about the cause of the failure? If not, we may be better off using the existing approach in apache#35874 at least for the time being.

@icexelloss (Collaborator)

@rtpsw Do we know why the test fails with the gated version? This might be uncovering some bug.

@rtpsw (Owner) commented Jun 13, 2023

> @rtpsw Do we know why the test fails with the gated version? This might be uncovering some bug.

We do not know at the moment. It's not quick to determine, and for now I haven't made an attempt to debug more deeply.

@westonpace (Author)

I'll take a look at this now

@westonpace (Author)

I'm hitting this deadlock reliably now too. It appears that the process thread is exiting partway even though there is still data to process.

@westonpace (Author)

Actually, I take that back. It doesn't seem to be process thread related. The two ungated inputs pause (as expected) and then the test immediately releases the gate. The gated node then sends a big flood of batches and gets paused. Then the gated node isn't being unpaused for some reason.

@westonpace (Author)

Ah :) The gated node is not maintaining order. I will fix this.
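For reference, the shape of that fix as a rough sketch (not the PR's exact code; aside from ReleaseAllBatches, which comes up later in this thread, the member names are illustrative): batches that arrive while the gate is closed get buffered in arrival order and are forwarded FIFO once the gate opens, so downstream ordering is preserved.

// Sketch of the gated node's ordering-preserving release path.
std::mutex mutex_;
std::vector<ExecBatch> queued_;  // batches held back while the gate is closed

Status ReleaseAllBatches() {
  std::vector<ExecBatch> to_deliver;
  {
    // Take everything that queued up, keeping arrival order.
    std::lock_guard<std::mutex> lock(mutex_);
    to_deliver.swap(queued_);
  }
  // Forward FIFO so batches reach the output in the order they arrived.
  for (auto& batch : to_deliver) {
    ARROW_RETURN_NOT_OK(output_->InputReceived(this, std::move(batch)));
  }
  return Status::OK();
}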

@westonpace (Author)

@rtpsw @icexelloss

I believe I've solved the problem in the gated version. There were no issues with the asof join node. The only problems were:

  • I wasn't applying the gated node at all (as Yaron noticed)
  • The BusyWait could incorrectly let the test pass because it wasn't re-verifying the counters afterwards
  • The gated node would allow batches to get sent out of order

@icexelloss (Collaborator)

> @rtpsw @icexelloss
>
> I believe I've solved the problem in the gated version. There were no issues with the asof join node. The only problems were:
>
> * I wasn't applying the gated node at all (as Yaron noticed)
> * The BusyWait could incorrectly let the test pass because it wasn't re-verifying the counters afterwards
> * The gated node would allow batches to get sent out of order

Sweet. Thanks Weston!

bool callback_added = maybe_unlocked.TryAddCallback([this] {
  return [this](const Status& st) {
    DCHECK_OK(st);
    plan_->query_context()->ScheduleTask(
@icexelloss (Collaborator)

This reminds me of a (not so related) question I had. In the asof join node, should we call

output_->InputReceived(this, std::move(out_b)) within ScheduleTask (since this is on the processing thread)?

https://github.com/apache/arrow/blob/main/cpp/src/arrow/acero/asof_join_node.cc#LL1356C21-L1356C67

And in general, when to call InputReceived directly vs calling InputReceived within a ScheduleTask?

@westonpace (Author)

> This reminds me of a (not so related) question I had. In the asof join node, should we call
>
> output_->InputReceived(this, std::move(out_b)) within ScheduleTask (since this is on the processing thread)?

Yes, I noticed that as well. You probably should. That being said, it probably won't make too much difference since you're running everything single-threaded anyways. However, even that isn't really true. Any plan with an asof join node today effectively becomes a "two-threaded" program. There is one thread for everything leading up to the asof join node and one thread for the processing and everything downstream. If you use ScheduleTask then you'd just be transferring more work to the other thread so it might actually hurt.

So I was going to wait and make this suggestion when / if you add multithreading support to asof-join.

> And in general, when to call InputReceived directly vs calling InputReceived within a ScheduleTask?

You should call it directly if you are going to keep processing the data that is in the current thread's CPU cache. You should schedule a new task if you are going to start processing a batch of data that isn't in the current thread's CPU cache.

This case fits the second condition. However, this case is also a special case anyways. This is because the thread that executes that callback will be the unit test thread (as part of the call to ReleaseAllBatches). We definitely want to transfer to the scheduler anytime we are coming from "outside the exec plan".
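Concretely, the alternative being discussed would look roughly like this (a sketch only, not the current asof join code; it assumes a surrounding function that returns Status and the ScheduleTask overload that takes a task label, and the label string is just illustrative):

// Instead of calling output_->InputReceived(this, std::move(out_b)) directly
// from the processing thread, hand the downstream push to the scheduler:
ARROW_RETURN_NOT_OK(plan_->query_context()->ScheduleTask(
    [this, batch = std::move(out_b)]() mutable {
      return output_->InputReceived(this, std::move(batch));
    },
    "AsofJoinNode::ForwardOutput"));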

@rtpsw (Owner)

> Any plan with an asof join node today effectively becomes a "two-threaded" program. There is one thread for everything leading up to the asof join node and one thread for the processing and everything downstream.

A bit more accurately, this "thread-splitting" occurs for each as-of-join node in the plan. We could create an issue to replace the internal as-of-join thread with some kind of execution facility.

@icexelloss (Collaborator)

> Any plan with an asof join node today effectively becomes a "two-threaded" program. There is one thread for everything leading up to the asof join node and one thread for the processing and everything downstream.

Yeah, that has been a slight concern of mine. I don't love the fact that downstream work, e.g., projections, happens outside the serial execution thread. I was trying to get to a model where the asof join processing thread pulls data from the (buffered) asof join input queues, does the join, and sends the output downstream via the scheduler. So from the downstream node's point of view, the asof join processing thread is transparent to any other node and purely something internal to the asof join node.

I don't think it really brings any performance benefit (like you mentioned, it might actually make it run slower because more work is shifted to the scheduler thread), but I do like the simpler execution model (the processing thread being transparent). And if we want to do that, I think we can do it by calling output_->InputReceived(this, std::move(out_b)) within ScheduleTask. (Yaron is working on an internal benchmark suite, so once that is ready we can play with this more.)

@rtpsw (Owner) commented Jun 14, 2023

CI job failures seem unrelated.

rtpsw merged commit 4ecd7ed into rtpsw:GH-35838 on Jun 14, 2023