writeQueue full regression #4179
Comments
We are in the process of binary searching for the change that caused this issue in our application. We will keep the application running for a couple of hours to see if that helps, then roll forward to the latest version, where gortsplib was updated again (#4181). Please let us know if you think a known bug in gortsplib might be causing this issue, so we can just deploy the latest version with the fix.
Commit 57addb1#diff-33ef32bf6c23acb95f5902d7097b7a1d5128ca061167ec0716715b0b9eeaa5f6 updated gortsplib from The commit before this change works fine. I tried the latest code and it fails the same way, with write queue full messages after about 4-5 hours. Can you please help with this issue? @aler9
@krusadellc thanks for the feedback, but of course I cannot roll back all the changes made inside gortsplib that made it possible to implement the new statistics system.
I do not expect a rollback :) Please let me know if there's anything I can do to help root cause it.
Does the freeze involve a single client at a time or all the clients together? Furthermore, version v1.11.2 came out today with an additional safety check on the function in charge of sending packets to clients. You can test it and check whether the bug persists.
The freeze is not limited to a single client: the whole of mediamtx freezes, even the API route on Trying to connect a new RTSP client also shows the same behavior. I already tried at commit 0a76806, but it was still failing the same way, so I don't think v1.11.2 would be any different.
If you want to debug further, in these cases you can use pprof, a feature that lets you list the active goroutines (it also exposes heap and memory profiles, but those are unrelated to this issue). You can enable it by setting When the issue occurs, download a list of all active goroutines by using:
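A minimal sketch of such a download in Go, assuming pprof is enabled and reachable at http://localhost:9999 (an assumed address; adjust it to your configuration). The path /debug/pprof/goroutine?debug=2 is the one served by Go's standard net/http/pprof handlers; the helper names and the output file name are hypothetical. The HTTP client uses a timeout so the tool does not hang if the server has stopped responding.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
	"time"
)

// goroutineDumpURL builds the URL of the full goroutine dump exposed by
// Go's net/http/pprof handlers. debug=2 requests full stack traces.
func goroutineDumpURL(base string) string {
	return strings.TrimRight(base, "/") + "/debug/pprof/goroutine?debug=2"
}

// fetchGoroutineDump downloads the dump and gives up after timeout, so the
// tool itself does not block forever on an unresponsive server.
func fetchGoroutineDump(base string, timeout time.Duration) ([]byte, error) {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(goroutineDumpURL(base))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// Assumption: pprof listens on localhost:9999; pass another base URL
	// as the first argument if your setup differs.
	base := "http://localhost:9999"
	if len(os.Args) > 1 {
		base = os.Args[1]
	}
	dump, err := fetchGoroutineDump(base, 10*time.Second)
	if err != nil {
		fmt.Fprintln(os.Stderr, "fetch failed:", err)
		os.Exit(1)
	}
	// Assumption: the output file name is arbitrary.
	if err := os.WriteFile("goroutines.txt", dump, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, "write failed:", err)
		os.Exit(1)
	}
	fmt.Printf("wrote %d bytes to goroutines.txt\n", len(dump))
}
```

Running it while the server is frozen either produces goroutines.txt or fails with a timeout, which is itself useful information about how far the freeze extends.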
I will try that out, but I am afraid the pprof endpoint may not respond when mediamtx freezes after the issue occurs.
Here's what pprof shows while mediamtx is in the frozen state:
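With a long goroutine dump, it can help to first count how many goroutines sit in each state; many goroutines parked in a state such as chan send would hint at where to look. The following is a minimal sketch, assuming the dump is in the debug=2 text format and has been saved to a file named goroutines.txt (a hypothetical name). The function name countStates is also hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"sort"
	"strings"
)

// headerRe matches the first line of each goroutine in a debug=2 dump,
// e.g. "goroutine 42 [chan send, 12 minutes]:", and captures the state.
var headerRe = regexp.MustCompile(`^goroutine \d+ [^\[]*\[([^\],]+)`)

// countStates returns how many goroutines are in each state.
func countStates(dump string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(dump))
	// Allow long lines; stack frames can exceed the default 64 KiB limit.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		if m := headerRe.FindStringSubmatch(sc.Text()); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	// Assumption: the dump was saved under this hypothetical file name.
	data, err := os.ReadFile("goroutines.txt")
	if err != nil {
		fmt.Fprintln(os.Stderr, "read failed:", err)
		os.Exit(1)
	}
	counts := countStates(string(data))

	// Sort states by descending count, then by name for stable output.
	states := make([]string, 0, len(counts))
	for s := range counts {
		states = append(states, s)
	}
	sort.Slice(states, func(i, j int) bool {
		if counts[states[i]] != counts[states[j]] {
			return counts[states[i]] > counts[states[j]]
		}
		return states[i] < states[j]
	})
	for _, s := range states {
		fmt.Printf("%6d  %s\n", counts[s], s)
	}
}
```

The summary only narrows the search; the stack traces of the goroutines in the most common state still have to be read to find what they are waiting on.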
Which version are you using?
1.11.1
Which operating system are you using?
Linux amd64 standard
Describe how to replicate the issue
We are using mediamtx for one of our products that does RTSP Streaming.
With around 16 clients connected, the streaming was working fine for days. We recently updated mediamtx to the latest version and we now start seeing writeQueue full errors after about 4 hours. We went back to the previous version from December and the errors disappeared.
So it seems a regression introduced sometime between the December and January releases might be causing this issue.
We went through the other issues related to writeQueue full and tried workarounds like increasing the buffer size, but that did not seem to help.
Any help would be appreciated. Thank you!
Server logs
No response
Network dump
No response