Significantly improve the performance of `EventLoopFuture.{and,when}AllSucceed`. #943

MrMage · 2019-04-03T15:09:32Z

~2.7x-12x speedups for EventLoopFuture.{and,when}AllSucceed.

Motivation:

Projects like Vapor (e.g. https://github.com/vapor/template-kit/blob/master/Sources/TemplateKit/Pipeline/TemplateSerializer.swift) often need to process large arrays of futures where (almost) all futures have already been fulfilled. Such frameworks will benefit tremendously from any possible performance improvement to EventLoopFuture.{and,when}AllSucceed.

Modifications:

Rewrote {and,when}AllSucceed with inspiration from {and,when}AllComplete.
Added tests for whenAllSucceed.
Added benchmarks for whenAllSucceed and completeAllSucceed.

Result:

Speedups:

future_whenallsucceed_100k_immediately_succeeded_off_loop: 1.6799349784851074 / 0.13164806365966797 = 12.77
future_whenallsucceed_100k_immediately_succeeded_on_loop: 0.5427179336547852 / 0.14923691749572754 = 3.64
future_whenallsucceed_100k_deferred_off_loop: 2.1837170124053955 / 0.8010389804840088 = 2.73
future_whenallsucceed_100k_deferred_on_loop: 0.542646050453186 / 0.2016160488128662 = 2.69

Original runtimes on a Core i7-6700k:

measuring: future_whenallsucceed_100k_immediately_succeeded_off_loop: 1.6799349784851074, 1.7866089344024658, 1.7765229940414429, 1.7685890197753906, 1.5157029628753662, 1.642738938331604, 1.6204090118408203, 1.5972599983215332, 1.487981915473938, 1.5103060007095337, 
measuring: future_whenallsucceed_100k_immediately_succeeded_on_loop: 0.5427179336547852, 0.5418450832366943, 0.5492779016494751, 0.5393710136413574, 0.5387519598007202, 0.5326310396194458, 0.5234640836715698, 0.5387870073318481, 0.5265940427780151, 0.5305559635162354, 
measuring: future_whenallsucceed_100k_deferred_off_loop: 2.1837170124053955, 2.3039519786834717, 2.1825860738754272, 2.3145300149917603, 2.4204330444335938, 2.4434269666671753, 2.2526910305023193, 2.2747159004211426, 2.1828880310058594, 2.404868006706238, 
measuring: future_whenallsucceed_100k_deferred_on_loop: 0.542646050453186, 0.6132320165634155, 0.5573389530181885, 0.5628399848937988, 0.5481549501419067, 0.5543719530105591, 0.5560970306396484, 0.5577199459075928, 0.5452660322189331, 0.5439059734344482, 
measuring: future_whenallcomplete_100k_immediately_succeeded_off_loop: 0.14835500717163086, 0.14790797233581543, 0.1478050947189331, 0.14823198318481445, 0.14922690391540527, 0.14710009098052979, 0.14890599250793457, 0.14921092987060547, 0.14850103855133057, 0.1489109992980957, 
measuring: future_whenallcomplete_100k_immediately_succeeded_on_loop: 0.15264499187469482, 0.14969193935394287, 0.14757001399993896, 0.1522080898284912, 0.14965295791625977, 0.15176701545715332, 0.1497650146484375, 0.1475529670715332, 0.14810407161712646, 0.15112793445587158, 
measuring: future_whenallcomplete_100k_deferred_off_loop: 0.9009320735931396, 0.8606759309768677, 0.8378961086273193, 0.8444629907608032, 0.8646301031112671, 0.8610509634017944, 0.8540569543838501, 0.8730940818786621, 0.904228925704956, 0.8585549592971802, 
measuring: future_whenallcomplete_100k_deferred_on_loop: 0.16903400421142578, 0.16870009899139404, 0.1682649850845337, 0.17055797576904297, 0.17044198513031006, 0.17038702964782715, 0.16999197006225586, 0.16937005519866943, 0.17072999477386475, 0.17130303382873535,

Runtimes without the fast-path for already-completed futures:

measuring: future_whenallsucceed_100k_immediately_succeeded_off_loop: 0.17186903953552246, 0.16920197010040283, 0.170989990234375, 0.17129099369049072, 0.17086505889892578, 0.17077898979187012, 0.16939306259155273, 0.17117297649383545, 0.17174100875854492, 0.16971802711486816, 
measuring: future_whenallsucceed_100k_immediately_succeeded_on_loop: 0.1687079668045044, 0.17134392261505127, 0.1697540283203125, 0.17058801651000977, 0.172510027885437, 0.1710280179977417, 0.16765403747558594, 0.1700659990310669, 0.17064499855041504, 0.17007005214691162, 
measuring: future_whenallsucceed_100k_deferred_off_loop: 0.8068110942840576, 0.8203119039535522, 0.8302290439605713, 0.8282400369644165, 0.8519179821014404, 0.8158940076828003, 0.8139890432357788, 0.8262540102005005, 0.8299790620803833, 0.8337689638137817, 
measuring: future_whenallsucceed_100k_deferred_on_loop: 0.19305706024169922, 0.1964479684829712, 0.19446706771850586, 0.19203698635101318, 0.19434797763824463, 0.19169199466705322, 0.1933000087738037, 0.19249296188354492, 0.19386506080627441, 0.1929779052734375,

(Values for whenallcomplete omitted because those methods have not changed.)

Runtimes with the fast-path for already-completed futures:

measuring: future_whenallsucceed_100k_immediately_succeeded_off_loop: 0.13164806365966797, 0.13887691497802734, 0.1480560302734375, 0.14489400386810303, 0.13263404369354248, 0.13490605354309082, 0.13424301147460938, 0.13548588752746582, 0.13828301429748535, 0.16892695426940918, 
measuring: future_whenallsucceed_100k_immediately_succeeded_on_loop: 0.14923691749572754, 0.15650200843811035, 0.13460004329681396, 0.13374602794647217, 0.13292694091796875, 0.148590087890625, 0.14843904972076416, 0.13741397857666016, 0.13701403141021729, 0.13629603385925293, 
measuring: future_whenallsucceed_100k_deferred_off_loop: 0.8010389804840088, 0.8186479806900024, 0.815155029296875, 0.8011970520019531, 0.7963169813156128, 0.8042939901351929, 0.8173959255218506, 0.7985730171203613, 0.8387489318847656, 0.8120409250259399, 
measuring: future_whenallsucceed_100k_deferred_on_loop: 0.2016160488128662, 0.2044239044189453, 0.1832129955291748, 0.21035611629486084, 0.20476889610290527, 0.20411407947540283, 0.20260095596313477, 0.2056729793548584, 0.20184290409088135, 0.2035679817199707,

MrMage · 2019-04-03T15:11:28Z

@swift-nio-bot test this please

MrMage · 2019-04-03T15:20:44Z

Happy to backport this to NIO1 so that Vapor 3 can benefit, by the way.

…llSucceed`.

MrMage · 2019-04-03T15:23:32Z

Also, thanks to @Mordil for providing a great example with _reduceCompletions0.

Mordil

I'll leave it to the NIO team to make any real decisions - but I left some thoughts on these changes.

Do you plan to add the fast-track path for {and,when}AllComplete?

Sources/NIO/EventLoopFuture.swift

Mordil · 2019-04-03T15:33:13Z

Sources/NIO/EventLoopFuture.swift

@@ -1044,7 +1054,83 @@ extension EventLoopFuture {
    ///     - on: The `EventLoop` on which the new `EventLoopFuture` callbacks will fire.
    /// - Returns: A new `EventLoopFuture` with all of the values fulfilled by the provided futures.
    public static func whenAllSucceed(_ futures: [EventLoopFuture<Value>], on eventLoop: EventLoop) -> EventLoopFuture<[Value]> {
-        return .reduce(into: [], futures, on: eventLoop) { (results, value) in results.append(value) }


Both of these methods were using .reduce, and I can't help but wonder if perhaps the performance boost could have been done in that method?

I'm sure reduce and fold have potential for further optimization, but I don't think their performance will come anywhere close to this bespoke implementation. For example, the argument to fold is a closure that returns a future. I.e. we'd need to create at least one extra future (several more with the current implementation of fold, actually) for each element in the array. That alone generates so much overhead that it can't match the performance of bespoke implementations.

There could be a point made about when the performance of a bespoke implementation is needed, however, Vapor & Co. process a ton of already-fulfilled futures where this is significant.

MrMage · 2019-04-03T16:22:57Z

Do you plan to add the fast-track path for {and,when}AllComplete?

Good idea. Done. Results:

measuring: future_whenallsucceed_100k_immediately_succeeded_off_loop: 0.1378859281539917, 0.13671696186065674, 0.13865399360656738, 0.13784301280975342, 0.1380389928817749, 0.13869500160217285, 0.1374189853668213, 0.13842201232910156, 0.13793694972991943, 0.13764894008636475, 
measuring: future_whenallsucceed_100k_immediately_succeeded_on_loop: 0.13898098468780518, 0.13831603527069092, 0.13934898376464844, 0.13744008541107178, 0.1376969814300537, 0.13988101482391357, 0.13975894451141357, 0.13864600658416748, 0.13968896865844727, 0.1402740478515625, 
measuring: future_whenallsucceed_100k_deferred_off_loop: 0.8700159788131714, 0.8707640171051025, 0.8721489906311035, 0.8901969194412231, 0.8609009981155396, 0.8743259906768799, 0.8825899362564087, 0.8498630523681641, 0.8573849201202393, 0.842851996421814, 
measuring: future_whenallsucceed_100k_deferred_on_loop: 0.2021169662475586, 0.20193004608154297, 0.2004399299621582, 0.20185601711273193, 0.20127105712890625, 0.19970500469207764, 0.20102405548095703, 0.19995999336242676, 0.2022179365158081, 0.17802095413208008, 
measuring: future_whenallcomplete_100k_immediately_succeeded_off_loop: 0.1101679801940918, 0.10986900329589844, 0.10995292663574219, 0.10981106758117676, 0.10877501964569092, 0.10875797271728516, 0.10990703105926514, 0.10846090316772461, 0.10954999923706055, 0.10873901844024658, 
measuring: future_whenallcomplete_100k_immediately_succeeded_on_loop: 0.10942995548248291, 0.11032795906066895, 0.10942709445953369, 0.10890007019042969, 0.10935795307159424, 0.10984694957733154, 0.10956501960754395, 0.10968005657196045, 0.11066591739654541, 0.1102520227432251, 
measuring: future_whenallcomplete_100k_deferred_off_loop: 0.8521190881729126, 0.8377310037612915, 0.8450978994369507, 0.8614420890808105, 0.851652979850769, 0.8680590391159058, 0.8755439519882202, 0.877761960029602, 0.8509070873260498, 0.857975959777832, 
measuring: future_whenallcomplete_100k_deferred_on_loop: 0.17288005352020264, 0.17352807521820068, 0.17383790016174316, 0.17525804042816162, 0.17185401916503906, 0.17182600498199463, 0.17424309253692627, 0.1721489429473877, 0.1710139513015747, 0.1730639934539795,

I.e. a nice 30% speedup on AllComplete as well.

Lukasa

I think this generally looks really good! I have left a note in the diff.

Sources/NIO/EventLoopFuture.swift

weissi

thank you very much, even performance tests 🙌! That looks good to me.

…e.{and,when}AllSucceed`. (apple#943)"

PopFlamingo · 2019-04-06T13:36:38Z

@MrMage Does this mean we can now use again the previous implementation of Vapor’s flatten now that you pushed your changes upstream ?

MrMage · 2019-04-06T13:44:56Z

@MrMage Does this mean we can now use again the previous implementation of Vapor’s flatten now that you pushed your changes upstream ?

Vapor's NIOKit is using NIO's whenAllSucceed and will receive these benefits "for free". Once this PR's backport to NIO 1 (#947) is merged and tagged, we can change Core's flatten method to also use NIO 1's new whenAll. I am going to take care of that.

* Cherry-pick "Significantly improve the performance of `EventLoopFuture.{and,when}AllSucceed`. (#943)" * NIO 1 fixes.

This was referenced Apr 3, 2019

Significantly improve the performance of `Collection<FutureType>.flatten vapor/core#199

Merged

Encoding template data and rendering templates has a lot of overhead working with already-fulfilled futures vapor/template-kit#52

Closed

Significantly improve the performance of `EventLoopFuture.{and,when}A…

f7f65a4

…llSucceed`.

MrMage force-pushed the faster-flatten branch from 486d648 to f7f65a4 Compare April 3, 2019 15:21

Mordil reviewed Apr 3, 2019

View reviewed changes

PR improvements.

52e24f0

MrMage force-pushed the faster-flatten branch from c8bd14d to 52e24f0 Compare April 3, 2019 16:20

Add the fast track for AllComplete as well.

fb21584

weissi added the semver/minor Adds new public API. label Apr 3, 2019

weissi added this to the 2.0.1 milestone Apr 3, 2019

MrMage mentioned this pull request Apr 3, 2019

[Discussion] Would tagging a 1.14 version be an option? #945

Closed

weissi added semver/patch No public API change. and removed semver/minor Adds new public API. labels Apr 3, 2019

Lukasa requested changes Apr 4, 2019

View reviewed changes

Sources/NIO/EventLoopFuture.swift Outdated Show resolved Hide resolved

MrMage added 2 commits April 4, 2019 10:40

Use future.inEventLoop instead of pointer comparison.

23a1114

Merge branch 'master' of github.com:apple/swift-nio into faster-flatten

f1c95e8

weissi approved these changes Apr 4, 2019

View reviewed changes

Lukasa approved these changes Apr 5, 2019

View reviewed changes

Lukasa merged commit 77d9025 into apple:master Apr 5, 2019

MrMage added a commit to MrMage/swift-nio that referenced this pull request Apr 5, 2019

Cherry-pick "Significantly improve the performance of `EventLoopFutur…

62e5650

…e.{and,when}AllSucceed`. (apple#943)"

MrMage mentioned this pull request Apr 5, 2019

Backport whenAll[Succeed] to NIO 1 #947

Merged

weissi pushed a commit that referenced this pull request Apr 8, 2019

Backport whenAll[Succeed] to NIO 1 (#947)

6b914e8

* Cherry-pick "Significantly improve the performance of `EventLoopFuture.{and,when}AllSucceed`. (#943)" * NIO 1 fixes.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Significantly improve the performance of `EventLoopFuture.{and,when}AllSucceed`. #943

Significantly improve the performance of `EventLoopFuture.{and,when}AllSucceed`. #943

MrMage commented Apr 3, 2019 •

edited

Loading

MrMage commented Apr 3, 2019

MrMage commented Apr 3, 2019

MrMage commented Apr 3, 2019

Mordil left a comment

Mordil Apr 3, 2019

MrMage Apr 3, 2019

MrMage commented Apr 3, 2019 •

edited

Loading

Lukasa left a comment

weissi left a comment

PopFlamingo commented Apr 6, 2019

MrMage commented Apr 6, 2019 •

edited

Loading

Significantly improve the performance of EventLoopFuture.{and,when}AllSucceed. #943

Significantly improve the performance of EventLoopFuture.{and,when}AllSucceed. #943

Conversation

MrMage commented Apr 3, 2019 • edited Loading

Motivation:

Modifications:

Result:

MrMage commented Apr 3, 2019

MrMage commented Apr 3, 2019

MrMage commented Apr 3, 2019

Mordil left a comment

Choose a reason for hiding this comment

Mordil Apr 3, 2019

Choose a reason for hiding this comment

MrMage Apr 3, 2019

Choose a reason for hiding this comment

MrMage commented Apr 3, 2019 • edited Loading

Lukasa left a comment

Choose a reason for hiding this comment

weissi left a comment

Choose a reason for hiding this comment

PopFlamingo commented Apr 6, 2019

MrMage commented Apr 6, 2019 • edited Loading

Significantly improve the performance of `EventLoopFuture.{and,when}AllSucceed`. #943

Significantly improve the performance of `EventLoopFuture.{and,when}AllSucceed`. #943

MrMage commented Apr 3, 2019 •

edited

Loading

MrMage commented Apr 3, 2019 •

edited

Loading

MrMage commented Apr 6, 2019 •

edited

Loading