Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-42090][3.2] Introduce sasl retry count in RetryingBlockTransferor #39710

Closed

Conversation

akpatnam25
Copy link

What changes were proposed in this pull request? This PR introduces sasl retry count in RetryingBlockTransferor.

Why are the changes needed?

Previously a boolean variable, saslTimeoutSeen, was used. However, the boolean variable wouldn't cover the following scenario:

  1. SaslTimeoutException
  2. IOException
  3. SaslTimeoutException
  4. IOException

Even though IOException at #2 is retried (resulting in increment of retryCount), the retryCount would be cleared at step #4. Since the intention of saslTimeoutSeen is to undo the increment due to retrying SaslTimeoutException, we should keep a counter for SaslTimeoutException retries and subtract the value of this counter from retryCount.

Does this PR introduce any user-facing change? No

How was this patch tested?

New test is added, courtesy of Mridul.

Closes #39611 from tedyu/sasl-cnt.

Authored-by: Ted Yu yuzhihong@gmail.com
Signed-off-by: Mridul Muralidharan <mridulgmail.com>

### What changes were proposed in this pull request?
This PR introduces sasl retry count in RetryingBlockTransferor.

### Why are the changes needed?
Previously a boolean variable, saslTimeoutSeen, was used. However, the boolean variable wouldn't cover the following scenario:

1. SaslTimeoutException
2. IOException
3. SaslTimeoutException
4. IOException

Even though IOException at apache#2 is retried (resulting in increment of retryCount), the retryCount would be cleared at step apache#4.
Since the intention of saslTimeoutSeen is to undo the increment due to retrying SaslTimeoutException, we should keep a counter for SaslTimeoutException retries and subtract the value of this counter from retryCount.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
New test is added, courtesy of Mridul.

Closes apache#39611 from tedyu/sasl-cnt.

Authored-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
@github-actions github-actions bot added the CORE label Jan 23, 2023
@akpatnam25
Copy link
Author

@mridulm @dongjoon-hyun backport to 3.2 for SPARK-42090
cc @tedyu

@akpatnam25
Copy link
Author

fyi, this was a clean backport.

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, will wait for CI to pass

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

mridulm pushed a commit that referenced this pull request Jan 24, 2023
### What changes were proposed in this pull request? This PR introduces sasl retry count in RetryingBlockTransferor.

### Why are the changes needed?
Previously a boolean variable, saslTimeoutSeen, was used. However, the boolean variable wouldn't cover the following scenario:

1. SaslTimeoutException
2. IOException
3. SaslTimeoutException
4. IOException

Even though IOException at #2 is retried (resulting in increment of retryCount), the retryCount would be cleared at step #4. Since the intention of saslTimeoutSeen is to undo the increment due to retrying SaslTimeoutException, we should keep a counter for SaslTimeoutException retries and subtract the value of this counter from retryCount.

### Does this PR introduce _any_ user-facing change? No

### How was this patch tested?
New test is added, courtesy of Mridul.

Closes #39611 from tedyu/sasl-cnt.

Authored-by: Ted Yu <yuzhihonggmail.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>

Closes #39710 from akpatnam25/SPARK-42090-backport-3.2.

Authored-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
@mridulm
Copy link
Contributor

mridulm commented Jan 24, 2023

Merged to 3.2.
Thanks for fixing this @akpatnam25 !

@mridulm mridulm closed this Jan 24, 2023
@dongjoon-hyun
Copy link
Member

cc @kazuyukitanimura since this lands at branch-3.2

sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
### What changes were proposed in this pull request? This PR introduces sasl retry count in RetryingBlockTransferor.

### Why are the changes needed?
Previously a boolean variable, saslTimeoutSeen, was used. However, the boolean variable wouldn't cover the following scenario:

1. SaslTimeoutException
2. IOException
3. SaslTimeoutException
4. IOException

Even though IOException at apache#2 is retried (resulting in increment of retryCount), the retryCount would be cleared at step apache#4. Since the intention of saslTimeoutSeen is to undo the increment due to retrying SaslTimeoutException, we should keep a counter for SaslTimeoutException retries and subtract the value of this counter from retryCount.

### Does this PR introduce _any_ user-facing change? No

### How was this patch tested?
New test is added, courtesy of Mridul.

Closes apache#39611 from tedyu/sasl-cnt.

Authored-by: Ted Yu <yuzhihonggmail.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>

Closes apache#39710 from akpatnam25/SPARK-42090-backport-3.2.

Authored-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants