Reactor::block_on can lead to connection errors #1710

Closed
tiagolobocastro opened this issue Aug 6, 2024 · 1 comment
Labels: BUG (Something isn't working)

tiagolobocastro commented Aug 6, 2024

Describe the bug
Connecting to a target sometimes fails with:

ERROR mayastor::spdk:nvme_fabric.c:596] Connect command failed, rc -125, trtype:TCP adrfam:IPv4 traddr:127.0.0.1 trsvcid:8420 subnqn:nqn.2019-05.io.openebs:replica-0-513   
ERROR mayastor::spdk:nvme_tcp.c:2017] Failed to poll NVMe-oF Fabric CONNECT command 

To Reproduce
There are no deterministic steps. It takes a bit of luck and a lot of concurrent nexus creation, nexus destruction and/or rebuilds.

Expected behavior
The connection to the target should succeed. Instead, either nexus creation fails or the rebuild I/O qpair fails to connect.

Additional context
The issue seems to be caused by the use of Reactor::block_on. Messages that have already been taken out of the thread message ring are not processed while block_on is running, so their handling is delayed until it returns.

As a workaround, increasing the fabrics connect timeout can help mitigate this.
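
To illustrate the suspected mechanism, here is a minimal toy model in Rust. It is not mayastor or SPDK code: `ToyReactor`, `Msg`, the tick counter and the deadline check are all hypothetical names and simplifications made up for this sketch. It assumes that a poll takes a batch of messages off the ring before handling them, and that a handler may run a `block_on`-style nested poll loop. Anything dequeued in the same batch as the blocking handler then has to wait until the nested loop returns.

```rust
use std::collections::VecDeque;

/// Hypothetical message types for the toy model.
enum Msg {
    /// Handler that runs a `block_on`-style nested poll loop for `ticks` ticks.
    BlockOn { ticks: u64 },
    /// Progresses the pending connect; must be handled before the deadline.
    PollConnect,
}

/// Toy stand-in for a reactor thread with a message ring (assumption, not the real type).
struct ToyReactor {
    ring: VecDeque<Msg>,
    now: u64,              // simulated clock, in ticks
    connect_deadline: u64, // simulated fabrics connect timeout
    connect_result: Option<bool>,
}

impl ToyReactor {
    /// Take up to `batch` messages out of the ring, then handle them in order.
    fn poll_once(&mut self, batch: usize) {
        let n = batch.max(1).min(self.ring.len());
        // Once drained, these messages are invisible to any nested poll.
        let taken: Vec<Msg> = self.ring.drain(..n).collect();
        for msg in taken {
            match msg {
                Msg::PollConnect => {
                    if self.connect_result.is_none() {
                        // The connect only succeeds if it is serviced in time.
                        self.connect_result = Some(self.now <= self.connect_deadline);
                    }
                }
                Msg::BlockOn { ticks } => self.block_on(ticks, batch),
            }
        }
    }

    /// Nested poll loop: keeps polling the ring, but cannot see messages
    /// that the outer `poll_once` has already dequeued.
    fn block_on(&mut self, ticks: u64, batch: usize) {
        for _ in 0..ticks {
            self.now += 1;
            self.poll_once(batch);
        }
    }
}

/// Queue a blocking handler followed by a connect poll and report the outcome.
fn connect_succeeds(batch: usize, block_ticks: u64, timeout: u64) -> bool {
    let mut r = ToyReactor {
        ring: VecDeque::from(vec![Msg::BlockOn { ticks: block_ticks }, Msg::PollConnect]),
        now: 0,
        connect_deadline: timeout,
        connect_result: None,
    };
    while !r.ring.is_empty() {
        r.poll_once(batch);
    }
    r.connect_result.unwrap_or(false)
}
```

In this model, `connect_succeeds(2, 10, 5)` returns `false`: both messages are dequeued together and the connect is only serviced after the 10-tick block, past its 5-tick deadline. `connect_succeeds(2, 10, 20)` returns `true`, which mirrors the workaround of raising the connect timeout. `connect_succeeds(1, 10, 5)` also returns `true`, because the connect message is still in the ring and the nested poll picks it up.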

tiagolobocastro commented

The fix was released as part of 2.7.1.
