2.x: add Flowable.parallel() and parallel operators #4974

akarnokd · 2017-01-08T18:19:32Z

This PR adds the parallel() method to Flowable, which opens up a sub-DSL with parallel operations. (Note that only a few operators make sense in a parallel setting.)

This parallel sub-DSL is not limited to computation tasks, as it allows specifying both the parallelism and the Scheduler on which to run the parallel 'rails'. For example, you can have parallel downloads that block:

Flowable.range(1, 100)
    .parallel(10)
    .runOn(Schedulers.io())
    .map(v -> httpClient.blockingGet("http://server/item/" + v))
    .sequential()
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe(...);

codecov-io · 2017-01-08T18:32:19Z

Current coverage is 94.94% (diff: 74.69%)

Merging #4974 into 2.x will decrease coverage by 0.59%

@@                2.x      #4974   diff @@
==========================================
  Files           592        609     +17   
  Lines         37969      39186   +1217   
  Methods           0          0           
  Messages          0          0           
  Branches       5752       5968    +216   
==========================================
+ Hits          36273      37204    +931   
- Misses          741        955    +214   
- Partials        955       1027     +72

Powered by Codecov. Last update cd45675...14111b6

akarnokd · 2017-01-08T18:44:19Z

I'll restore the +95% coverage in a separate PR.

benjchristensen · 2017-01-09T17:34:46Z

I have a use case that could benefit from this, depending on how it is implemented. However, I don't see an API such as ParallelFlowable.merge(Flowable<Flowable<T>> flowables), which is what I'd need, so I'd have to do it manually.

Let me describe the type of parallel processing and see if your goal of ParallelFlowable matches it.

There are 100s of network connections, spread across n event loops (say 16). The semantic behavior is to merge the 100s of connections into a single stream, do a groupBy on all of them, and then do a scan on each GroupedFlowable. With normal Flowable this is bad, as it takes the 16 threads and synchronizes them all, even though each source Flowable is on one of the 16 threads, and each output GroupedFlowable can be processed concurrently again on those 16 threads.

In theory, a ParallelFlowable.merge(sourceFlowables).groupBy(...).scan(...) could allow the merge to support concurrent onNext and then ParallelFlowable.groupBy could re-emit a normal Flowable where scan works sequentially again.

Is this the type of thing you want ParallelFlowable to enable?

akarnokd · 2017-01-09T18:11:43Z

@benjchristensen No. ParallelFlowable optimizes for a fixed parallelism level with round-robin dispatch and round-robin join. The closest thing is the parallelStream() operator in Java 8 for computation-intensive tasks. Your case has an unknown number of inner sources to merge and an unknown number of groups that could appear.
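To illustrate what "fixed parallelism level with round-robin dispatch and round-robin join" means, here is a minimal single-threaded sketch in plain Java. It is not the ParallelFlowable implementation: the class and method names (RoundRobinSketch, dispatch, join) are hypothetical, and it ignores threading, backpressure and termination signals entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical illustration only; not RxJava code.
public class RoundRobinSketch {

    /** Deals the source items to a fixed number of rails in round-robin order. */
    public static <T> List<Queue<T>> dispatch(List<T> source, int parallelism) {
        List<Queue<T>> rails = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            rails.add(new ArrayDeque<>());
        }
        int index = 0;
        for (T item : source) {
            rails.get(index).add(item);
            index = (index + 1) % parallelism; // move on to the next rail
        }
        return rails;
    }

    /** Takes one item from each rail in turn until every rail is empty. */
    public static <T> List<T> join(List<Queue<T>> rails) {
        List<T> out = new ArrayList<>();
        boolean any = true;
        while (any) {
            any = false;
            for (Queue<T> rail : rails) {
                T item = rail.poll();
                if (item != null) {
                    out.add(item);
                    any = true;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Queue<Integer>> rails = dispatch(Arrays.asList(1, 2, 3, 4, 5, 6, 7), 3);
        System.out.println(rails);       // [[1, 4, 7], [2, 5], [3, 6]]
        System.out.println(join(rails)); // [1, 2, 3, 4, 5, 6, 7]
    }
}
```

The number of rails is fixed up front, which is why a merge over an unknown number of inner sources, or a groupBy with an unknown number of groups, does not fit this model.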

benjchristensen · 2017-01-09T20:33:35Z

Too bad. Maybe someday I'll get around to making a "ConcurrentFlowable" happen ... but it's been on my todo list for 3 years, so not counting on it :-)

akarnokd · 2017-01-13T12:40:54Z

@JakeWharton Do you want to review this or, if not, are you at least willing to accept it into RxJava 2?

artem-zinnatullin

A few nits.

artem-zinnatullin · 2017-01-15T19:31:37Z

src/main/java/io/reactivex/internal/operators/parallel/ParallelCollect.java

+            return;
+        }
+
+        int n = subscribers.length;


Same in other similar places would be good.

I don't do those unless the variable has to be accessed from an inner class.

In long methods, the reader has to spend extra time checking that it's not modified anywhere, but OK.

artem-zinnatullin · 2017-01-15T19:37:34Z

src/main/java/io/reactivex/internal/operators/parallel/ParallelFlatMap.java

+/**
+ * Flattens the generated Publishers on each rail.
+ *
+ * @param <T> the input value type


nit: naming generic type parameters Input and Output would remove such comments and make code slightly more readable. Or I and O as a reference to common I/O abbr.

This is an established naming pattern with other generic types of RxJava.

Sure, "just saying"

artem-zinnatullin · 2017-01-15T19:39:11Z

src/main/java/io/reactivex/internal/operators/parallel/ParallelFromArray.java

@@ -0,0 +1,51 @@
+/**
+ * Copyright 2016 Netflix, Inc.


I'll update the PR.

artem-zinnatullin · 2017-01-15T19:43:53Z

src/main/java/io/reactivex/internal/operators/parallel/ParallelJoin.java

+                } else {
+                    SimpleQueue<T> q = inner.getQueue();
+
+                    // FIXME overflow handling


Signal MBE (MissingBackpressureException)? When do you plan to resolve the FIXME? It'll lead to silently dropped values…

Adding it right now.
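The concern here is that a bounded queue's offer() returns false when the queue is full, and ignoring that result drops the value silently. The sketch below shows the general idea in plain Java. It is not the ParallelJoin fix: OverflowSketch, offerOrFail and QueueFullException are hypothetical names, with QueueFullException standing in for RxJava's MissingBackpressureException, and a real operator would route the error to the downstream onError instead of throwing.

```java
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical illustration only; not RxJava code.
public class OverflowSketch {

    /** Hypothetical stand-in for RxJava's MissingBackpressureException. */
    public static class QueueFullException extends RuntimeException {
        public QueueFullException(String message) {
            super(message);
        }
    }

    /** Enqueues the value, or fails loudly instead of dropping it when the queue is full. */
    public static <T> void offerOrFail(Queue<T> queue, T value) {
        if (!queue.offer(value)) {
            throw new QueueFullException("Queue full, value would be dropped: " + value);
        }
    }

    public static void main(String[] args) {
        Queue<Integer> queue = new ArrayBlockingQueue<>(2); // bounded to 2 items
        offerOrFail(queue, 1);
        offerOrFail(queue, 2);
        try {
            offerOrFail(queue, 3); // the queue is full at this point
        } catch (QueueFullException ex) {
            System.out.println("Overflow signalled: " + ex.getMessage());
        }
    }
}
```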

artem-zinnatullin · 2017-01-15T19:49:40Z

src/main/java/io/reactivex/internal/operators/parallel/ParallelPeek.java

+            } catch (Throwable ex) {
+                Exceptions.throwIfFatal(ex);
+                RxJavaPlugins.onError(ex);
+            }


Add a return to avoid the request in case of an error?

FlowableDoOnLifecycle doesn't return either. There is no good way to report an error and not inject a lot of overhead. See strict().

artem-zinnatullin · 2017-01-15T19:59:58Z

src/main/java/io/reactivex/parallel/ParallelFlowable.java

+     * times as this ParallelFlowable's parallelism level is.
+     * <p>
+     * No assumptions are made about the Scheduler's parallelism level,
+     * if the Scheduler's parallelism level is lwer than the ParallelFlowable's,


artem-zinnatullin · 2017-01-15T20:00:40Z

src/main/java/io/reactivex/parallel/ParallelFlowable.java

+     * times as this ParallelFlowable's parallelism level is.
+     * <p>
+     * No assumptions are made about the Scheduler's parallelism level,
+     * if the Scheduler's parallelism level is lwer than the ParallelFlowable's,


artem-zinnatullin · 2017-01-15T20:06:28Z

src/main/java/io/reactivex/parallel/ParallelFlowable.java

+    @BackpressureSupport(BackpressureKind.FULL)
+    @SchedulerSupport(SchedulerSupport.NONE)
+    @CheckReturnValue
+    public final Flowable<T> sequential() {


I think this should be a verb: sequentize()/etc to be consistent with other operators (which are verbs mostly).

Btw, reading chains like:

Flowable.range(1, 100)
    .parallel(10)
    .runOn(Schedulers.io())
    .map(v -> httpClient.blockingGet("http://server/item/" + v))
    .sequential()

feels strange because sequential after parallel looks like an operator that disables parallelization of the chain (of course it can't, but I dunno, it just reads strange to me).

This naming matches Java 8 Stream's parallel() and sequential() operators.

Makes sense, though JDK is not the best example of good naming.
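For reference, this is how the two Java 8 Stream operators behave, as a small JDK-only sketch (the class name StreamModeSketch and its helper methods are made up for illustration). One difference worth keeping in mind: in java.util.stream the mode applies to the whole pipeline, so a later sequential() really does switch the pipeline back to sequential execution, whereas in this PR the operators between parallel() and sequential() operate on the rails and sequential() joins the rails back into a Flowable.

```java
import java.util.stream.IntStream;

// Hypothetical illustration of the java.util.stream naming being referenced.
public class StreamModeSketch {

    /** A stream reports parallel mode after parallel() is called. */
    public static boolean isParallelAfterParallel() {
        return IntStream.rangeClosed(1, 100).parallel().isParallel();
    }

    /** A later sequential() switches the same pipeline back to sequential mode. */
    public static boolean isParallelAfterSequential() {
        return IntStream.rangeClosed(1, 100).parallel().sequential().isParallel();
    }

    /** The result does not depend on the mode: doubles 1..100 and sums them. */
    public static int doubledSum() {
        return IntStream.rangeClosed(1, 100)
                .parallel()
                .map(v -> v * 2)
                .sequential()
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(isParallelAfterParallel());   // true
        System.out.println(isParallelAfterSequential()); // false
        System.out.println(doubledSum());                // 10100
    }
}
```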

artem-zinnatullin · 2017-01-15T20:07:07Z

src/main/java/io/reactivex/parallel/ParallelFlowable.java

+     * @return the new Px instance
+     */
+    @CheckReturnValue
+    public final Flowable<T> sorted(Comparator<? super T> comparator) {


Matches the naming of Flowable.sorted().

artem-zinnatullin · 2017-01-15T20:11:34Z

What about tests and benchmark comparisons with the parallelization that you can achieve at the moment using existing RxJava APIs?

artem-zinnatullin · 2017-01-15T20:13:24Z

src/main/java/io/reactivex/Flowable.java

+     * and dispatches the upstream items to them in a round-robin fashion.
+     * <p>
+     * Note that the rails don't execute in parallel on their own and one needs to
+     * apply {@link ParallelFlowable#runOn(Scheduler)} to specify the Scheduler where


What about removing runOn and adding the Scheduler as a parameter to parallel()?

It follows the same logic as the regular factory methods such as just, range and fromIterable, which don't take a Scheduler; plus you can apply multiple runOn's on a sequence at different stages. For example, you can create a pipeline with 3 stages in total, each with parallelism = 2.

plus you can apply multiple runOn's on a sequence at different stages

Ah, that's nice, got it.

akarnokd · 2017-01-15T21:27:32Z

I've added a benchmark and here are the results (i7 4770K, Windows 7 x64, Java 8u112):

Raw data

Clearly, parallel has lower overhead than the flatMap-based, 1-element parallelism.

Comparing against groupBy, the benefits manifest with longer per-item computation but groupBy looks odd: in each compute/parallelism setup the numbers are really close to each other as if there wasn't actual parallel execution with groupBy. I have to investigate that further.

davidmoten

LGTM, I like it.

davidmoten · 2017-01-16T04:01:53Z

src/main/java/io/reactivex/internal/operators/parallel/ParallelJoin.java

+                    requested.addAndGet(-e);
+                }
+
+                int w = get();


I haven't seen this optimization before with missed (calling get before addAndGet). Is this particular to the ParallelJoin use case or do you expect to start applying it elsewhere too?

There are a couple of places that use this pattern: range, observeOn, fromArray.

I'm not going to apply them eagerly because it is another local variable/register to worry about when there are lots of other locals in deeper/user code.
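For readers unfamiliar with the pattern under discussion, the following is a minimal sketch of a queue-drain loop that reads the work-in-progress counter with get() before calling addAndGet(). It is a simplified illustration in plain Java, not the ParallelJoin code: the class MissedLoopSketch, its queue and the drained list are hypothetical. The idea is that if more work arrived while draining, the plain read already shows it, so the loop can go around again without performing the more expensive atomic update.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; the AtomicInteger value is the work-in-progress counter.
public class MissedLoopSketch extends AtomicInteger {

    private static final long serialVersionUID = 1L;

    private final Queue<Integer> queue = new ConcurrentLinkedQueue<>();
    // Only touched by whichever thread currently owns the drain loop.
    private final List<Integer> drained = new ArrayList<>();

    public void offer(int value) {
        queue.offer(value);
        drain();
    }

    public List<Integer> drained() {
        return drained;
    }

    private void drain() {
        if (getAndIncrement() != 0) {
            return; // another thread is draining and will observe the increment
        }
        int missed = 1;
        for (;;) {
            Integer v;
            while ((v = queue.poll()) != null) {
                drained.add(v);
            }
            int w = get(); // cheap read first
            if (w == missed) {
                // Nothing new seems to have arrived: try to leave the loop.
                missed = addAndGet(-missed);
                if (missed == 0) {
                    break;
                }
            } else {
                // More work was signalled meanwhile: loop again, skipping the atomic update.
                missed = w;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MissedLoopSketch sketch = new MissedLoopSketch();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    sketch.offer(i);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        System.out.println(sketch.drained().size()); // 4000
    }
}
```

The extra local variable w is the cost mentioned in the reply above: it is one more value to keep in a register alongside the other locals of the loop.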

akarnokd · 2017-01-16T08:58:46Z

Updated the groupBy benchmark. I forgot that v was constant and thus the group expression didn't create 1..4 groups. New results (i7 4790, Windows 7 x64, Java 8u112):

For smaller computation, parallel has less overhead. For longer computation, they are roughly next to each other. Parallel uses round-robin collection whereas flatMap collects from a source as long as it can.

akarnokd added 2.x Enhancement labels Jan 8, 2017

akarnokd added this to the 2.1 milestone Jan 8, 2017

akarnokd requested a review from JakeWharton January 9, 2017 15:23

artem-zinnatullin suggested changes Jan 15, 2017

artem-zinnatullin reviewed Jan 15, 2017

2.x: add ParallelFlowable

14111b6

akarnokd force-pushed the ParallelFlowable branch from 8c009a9 to 14111b6 Compare January 15, 2017 21:19

davidmoten approved these changes Jan 16, 2017

Fix groupBy benchmark

c4df76f

akarnokd merged commit 6c88036 into ReactiveX:2.x Jan 18, 2017

akarnokd deleted the ParallelFlowable branch January 18, 2017 22:24

This was referenced Jan 18, 2017

2.0.5 release preparations #4983

Closed

2.1.0 major feature additions #4954

Closed
