enable grpc retry #322

Closed · wants to merge 9 commits into master from grpc_retry

Conversation

AntonioYuen (Contributor)

Enables retry for our readside gRPC services.

codecov bot commented Apr 30, 2021

Codecov Report

Merging #322 (424e634) into master (d3ecdc7) will increase coverage by 0.26%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #322      +/-   ##
==========================================
+ Coverage   83.77%   84.04%   +0.26%     
==========================================
  Files          46       46              
  Lines         900      915      +15     
  Branches       19       19              
==========================================
+ Hits          754      769      +15     
  Misses        146      146              
Impacted Files                                          Coverage Δ
...in/scala/com/namely/chiefofstate/NettyHelper.scala   100.00% <100.00%> (ø)
...namely/chiefofstate/readside/ReadSideHandler.scala   100.00% <100.00%> (ø)
...ly/chiefofstate/readside/ReadSideJdbcHandler.scala   100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

if (!isSuccess && (maxAttempts <= 0 || numAttempts >= maxAttempts)) {
  val backoffSeconds: Long = Math.min(maxBackoffSeconds, (minBackoffSeconds * Math.pow(1.1, numAttempts)).toLong)

  Thread.sleep(Duration.ofSeconds(backoffSeconds).toMillis)
}
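
For a sense of the sleep durations this formula produces, here is a minimal sketch, assuming purely illustrative values of minBackoffSeconds = 1 and maxBackoffSeconds = 30 (the real configuration values are not shown in this excerpt):

// Illustration only: hypothetical config values, not taken from the PR.
val minBackoffSeconds: Long = 1L
val maxBackoffSeconds: Long = 30L

// Same formula as the snippet above: capped exponential growth with base 1.1.
def backoff(numAttempts: Int): Long =
  Math.min(maxBackoffSeconds, (minBackoffSeconds * Math.pow(1.1, numAttempts)).toLong)

// backoff(1) == 1, backoff(10) == 2, backoff(20) == 6, backoff(30) == 17, backoff(40) == 30 (capped)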
Contributor
This is a bit dangerous because we are inside an actor: the projection is an actor making the gRPC call. Let's be careful here.

Contributor Author
What do you suggest in this scenario?

@Tochemey (Contributor) May 2, 2021

I think we should make the gRPC call asynchronous and make use of a Scala retry-on-Future mechanism. That will make it more resilient.
For instance, you can use these libraries:

This is just my opinion; I don't see why you should reimplement something that is already out there in the form of libraries in the JVM ecosystem.
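
As a rough sketch of that Future-based direction (akka.pattern.retry is used here only as one readily available option, not necessarily one of the libraries referred to above, and callReadSide is a hypothetical stand-in for the non-blocking gRPC call):

import scala.concurrent.duration._
import scala.concurrent.{ExecutionContext, Future}
import akka.actor.{ActorSystem, Scheduler}
import akka.pattern.retry

// Retries a non-blocking call; each attempt is rescheduled via the Akka scheduler,
// so no thread is parked between attempts.
def handleWithRetry[T](callReadSide: () => Future[T],
                       maxAttempts: Int,
                       delay: FiniteDuration)(implicit system: ActorSystem): Future[T] = {
  implicit val ec: ExecutionContext = system.dispatcher
  implicit val scheduler: Scheduler = system.scheduler
  retry(() => callReadSide(), maxAttempts, delay)
}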

Contributor

So instead of using the blocking stub of the gRPC client, we can use the non-blocking one.
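
A minimal sketch of the difference, with hypothetical message and client types standing in for the ScalaPB-generated classes (the actual chiefofstate service definitions are not reproduced here):

import scala.concurrent.Future

// Hypothetical message types, standing in for the generated protobuf classes.
final case class HandleReadSideRequest(payload: String)
final case class HandleReadSideResponse(successful: Boolean)

// Shape of the blocking stub: the calling thread is parked until the response arrives.
trait ReadSideBlockingClient {
  def handleReadSide(request: HandleReadSideRequest): HandleReadSideResponse
}

// Shape of the non-blocking stub: the call returns immediately with a Future,
// which composes with a retry-on-Future mechanism instead of Thread.sleep.
trait ReadSideAsyncClient {
  def handleReadSide(request: HandleReadSideRequest): Future[HandleReadSideResponse]
}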

Contributor

@zenyui we need to take this into consideration:
for the projection to listen to and read the persisted events, the sharded daemon is hooked into the existing actor system. With that kind of setup I have some worries around the execution context. Since the actor system is using the default dispatcher, the retry mechanism can hinder the performance of the system (write side) a bit. I think we may have to push this feature onto its own execution context to avoid hindering the whole actor system.
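
One way to isolate that work, sketched with a plain fixed thread pool (the pool size is arbitrary; looking up a dedicated Akka dispatcher via system.dispatchers.lookup would be the more idiomatic variant of the same idea):

import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, ExecutionContextExecutorService}

// Dedicated pool for read side retries and blocking calls, so that waiting on the
// remote gRPC service does not starve the default dispatcher running the write side.
val readSideEc: ExecutionContextExecutorService =
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(8))

Any retry loop or Future returned by the read side call would then run on readSideEc rather than on system.dispatcher.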

@AntonioYuen (Contributor Author) May 3, 2021

@Tochemey I'm fine with using the retry package; however, it uses promises, which essentially blocks an entire thread just like a Thread.sleep. Why would this be superior, aside from their jitter implementation?
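
Whether a given library actually parks a thread depends on its internals; as a neutral point of comparison, here is a minimal sketch of the two delay styles under discussion, using akka.pattern.after for the scheduler-based one (names here are illustrative only):

import scala.concurrent.duration._
import scala.concurrent.{ExecutionContext, Future}
import akka.actor.ActorSystem
import akka.pattern.after

// Thread.sleep(delay.toMillis) would hold the current thread for the whole delay.
// A scheduler-based delay starts the body only when the timer fires; no thread is parked.
def delayedAttempt[T](delay: FiniteDuration)(body: => Future[T])(implicit system: ActorSystem): Future[T] = {
  implicit val ec: ExecutionContext = system.dispatcher
  after(delay, system.scheduler)(body)
}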

Contributor Author

Funny thing is... I started with the amazon algo, then was asked to simplify it :D

Contributor

That is why I am saying we may have to consider a separate thread executor. Whatever we do, it needs its own thread executor to be handled properly. The libraries were just suggestions.

Contributor

The projection JDBC handler we are using does not have its own thread pool, unlike the Slick version we stopped using.


@Tochemey the call is already blocking, so I don't see the issue. I don't mind creating a new thread pool for remote calls, but using a Future won't change anything. Under the hood, they are likely just using sleep.

@Tochemey linked an issue May 2, 2021 that may be closed by this pull request
@AntonioYuen closed this May 5, 2021
@Tochemey deleted the grpc_retry branch August 25, 2021 12:23
Development

Successfully merging this pull request may close these issues:

Exponential backoff for projections
3 participants