
The library consumes 100% CPU for iOS / tvOS #601

Closed
ogolosovskiy opened this issue Mar 9, 2019 · 19 comments

Labels
[core] Area: Changes in SRT library core


ogolosovskiy commented Mar 9, 2019

The library consumes 100% CPU on iOS / tvOS.
100% reproducible.
Tested on iPhone 5s and 6 Plus.
We see this problem both without a media connection and with one.

There is an endless loop here:

```cpp
int CEPoll::wait(const int eid, set<SRTSOCKET>* readfds, set<SRTSOCKET>* writefds,
                 int64_t msTimeOut, set<SYSSOCKET>* lrfds, set<SYSSOCKET>* lwfds)
{
   ....
   while (true)
   {
      ...
      #if (TARGET_OS_IOS == 1) || (TARGET_OS_TV == 1)
      #else
      CTimer::waitForEvent();
      #endif
   }
   return 0;
}
```

If we fix it like this, the problem is gone:

```cpp
// #if (TARGET_OS_IOS == 1) || (TARGET_OS_TV == 1)
// #else
CTimer::waitForEvent();
// #endif
```

My measurement results:

  • Without the fix: 100% CPU after library start
  • Without the fix: 100% CPU after starting a media connection (1 Mbit/s)
  • After the fix: 1-2% CPU after library start
  • After the fix: 25-40% CPU after starting a media connection (1 Mbit/s)

Profiler
https://i.gyazo.com/3334e3f324e93f7fb5637755273557a4.png


alexpokotilo commented Mar 9, 2019

Hi @ogolosovskiy,
I think you need to provide more details about how you use the SRT library in your case.
E.g., we use SRT to create an outgoing connection and send video/audio from iOS to a remote host.
In our case the SRT library doesn't eat CPU in the way you describe.

Please also take a look at this comment:
// for iOS setting a timeout of 1ms for kevent and not doing CTimer::waitForEvent(); in the code below
// gives us a 10% cpu boost.

at https://github.com/Haivision/srt/blob/master/srtcore/epoll.cpp#L444
This comment, as well as this code:

```cpp
#if (TARGET_OS_IOS == 1) || (TARGET_OS_TV == 1)
#else
CTimer::waitForEvent();
#endif
```

has been there since the initial commit, and it works at least in the case I described, so you need to describe your library usage more precisely. According to the comment, CTimer::waitForEvent() is skipped for iOS because of performance :)


ogolosovskiy commented Mar 9, 2019

Hi @alexpokotilo,
Yes, we are using it in a different manner:
we wait for incoming connections and create outgoing ones.
Before a new outgoing connection is created, the check at https://github.com/Haivision/srt/blob/master/srtcore/epoll.cpp#L421 is
always null, so it loops forever.

Gist
https://gist.github.com/ogolosovskiy/6ba3c650b46c656067ba9e1a2b8ae983


alexpokotilo commented Mar 9, 2019

Hi,
I'm not an SRT library owner; we just contribute to the library as a company, and in particular we submitted the Android and iOS build scripts/instructions, which is why I started checking your case, to make 100% sure it is not my team's fault that you have such a problem.
Unfortunately I can neither check your fix nor your approach, but your approach looks strange at first glance. If you can, just get the incoming connection/signal another way; that way SRT will work for sure.
I'm not sure anybody else is interested in a similar case on mobile, and the risk of the changes you propose is very high.

@ogolosovskiy

Hello Alex,

You are absolutely right. I don't like this fix either; I hope the SRT team can suggest a better solution.
I understand that most users just create an outgoing connection from a mobile phone; they don't need incoming connections on iOS. We use not client-to-server but client-to-client (P2P) connections.

But I am not sure there is no problem for client-to-server connections either. Can you apply my fix and check (outgoing-connection) CPU usage before and after?
Oleg

@alexpokotilo

Hi @ogolosovskiy,
unfortunately the proposed test gives us nothing, as the publishing case our team uses works fine without the fix, and I cannot test all the possible cases this fix could break, nor can I take responsibility for it.
If you still want to propose this fix, you need to prepare a pull request:

  • test all possible iOS cases to make sure your fix doesn't break other cases
  • open a pull request and ask the initial author for review. According to "git blame", the initial author was (Roman Diouskine) bdeb5d2.
    That user doesn't exist, so I think it's @rndi.

I'm not on the SRT board and cannot accept pull requests/ideas, so you have to follow the standard GitHub process to submit your fix.

@ethouris

Frankly, this thing has looked very weird to me since the beginning, but unfortunately I'm not an expert on kevent.

Let me describe the situation. Currently there's no way to wait simultaneously for an event on an SRT socket and on a system socket (we haven't changed this mechanism since UDT, and I still think we can do better; we just don't have enough manpower to implement all ideas and requests). So, as a halfway solution, the code waits up to 10 ms for an event to happen on an SRT socket and then re-checks the system sockets (with a 0 timeout this time). That is, the 10 ms wait for an SRT socket can end prematurely (it can be interrupted by a signal on a condition variable), but the next check of a system socket will happen at worst within the next 10 ms.

If you can experiment a little and check whether making this mechanism work the "default way" on this platform too is viable, I would appreciate it. The recommended way is: add a new macro at the beginning of the epoll.cpp file, off by default, like:

```cpp
#ifndef SRT_ENABLE_KEVENT_TIMEOUT
#define SRT_ENABLE_KEVENT_TIMEOUT 0
#endif
```

Then add it as an extra condition at lines 442 (where the timeout for kevent is decided) and 521 (which blocks the call to CTimer::waitForEvent()).

If you confirm it, we'll prepare a fix.

@maxsharabayko maxsharabayko added the [core] Area: Changes in SRT library core label Mar 11, 2019
@maxsharabayko maxsharabayko added this to the v.1.3.3 milestone Mar 11, 2019
@maxsharabayko maxsharabayko pinned this issue Mar 11, 2019
@ogolosovskiy

To be clear:
there are 2 separate problems here, not one.

  1. If we have no active connection at all, the "if (lrfds || lwfds)" check is always FALSE. So in this case it's simply an endless loop: no kevent() calls, nothing, just looping and consuming 100% CPU.

  2. The 1 ms wait timeout gives me not a 10% CPU "boost" but a 60% one.

Most clients create new connections; they don't need to listen for anything on iOS, so they don't face problem 1.
For the second, please confirm the problem. It's easy to check: create a low-bitrate media stream and compare CPU consumption with CTimer::waitForEvent() and without it.

@ethouris

What you say makes complete sense: if you don't have any system sockets to pick up, you don't wait at all, not even for a kevent timeout, because you never call kevent.

For 2, it depends on the definition of "boost". What I understood was to be gained here is to use less CPU uselessly, at least in the described cases.

@rndi , @jlsantiago0, can you elaborate on this?

I am not able to confirm anything because I don't have this platform at hand.

@ogolosovskiy

Fully agree. We should compare both: CPU and max packet rate.


rndi commented Mar 11, 2019

Hello @ogolosovskiy, @alexpokotilo,

Could you see how well this patch works for you? (It is pretty much what was originally suggested, but I would like to make sure we are testing with the same changes.)

```diff
--- a/srtcore/epoll.cpp	2018-06-26 14:09:07.000000000 -0700
+++ b/srtcore/epoll.cpp	2019-03-11 14:04:50.000000000 -0700
@@ -439,15 +439,7 @@
             }
          }
          #elif defined(OSX) || (TARGET_OS_IOS == 1) || (TARGET_OS_TV == 1)
-         #if (TARGET_OS_IOS == 1) || (TARGET_OS_TV == 1)
-         //
-         // for iOS setting a timeout of 1ms for kevent and not doing CTimer::waitForEvent(); in the code below
-         // gives us a 10% cpu boost.
-         //
-         struct timespec tmout = {0, 1000000};
-         #else
          struct timespec tmout = {0, 0};
-         #endif
          const int max_events = p->second.m_sLocals.size();
          struct kevent ke[max_events];

@@ -518,10 +510,7 @@
       if ((msTimeOut >= 0) && (int64_t(CTimer::getTime() - entertime) >= msTimeOut * int64_t(1000)))
          throw CUDTException(MJ_AGAIN, MN_XMTIMEOUT, 0);

-      #if (TARGET_OS_IOS == 1) || (TARGET_OS_TV == 1)
-      #else
       CTimer::waitForEvent();
-      #endif
    }

    return 0;
```

We've done some quick testing on our side and we are now getting <1% CPU usage when there is no connection in caller or listener modes (was 100% before the change). If you are still having trouble with this change in place, we will probably need to look at your API usage pattern next. If, on the other hand, you find that the issue is resolved, please open a pull request. Thanks!

@ogolosovskiy

Thank you @rndi.
It works for me, but I still have some questions.
I agree with @ethouris: the 100% CPU fix is not enough. We need to check the maximum bandwidth, and unfortunately I have no instrument for that.

Does the maximum bandwidth increase or decrease after this fix?
From common sense, CTimer::waitForEvent() should decrease bandwidth under high load.
Maybe a better solution is to set the timespec tmout to 100 ms and add a condition:

```cpp
if (lrfds == 0 && lwfds == 0)
    CTimer::waitForEvent();
```

Maybe something else? We can claim nothing without tests.

PS
Timer expiration is not an error, it's normal. It looks strange to use "throw CUDTException(MJ_AGAIN, MN_XMTIMEOUT, 0)" instead of a regular return. I am using a 200 ms epoll timeout, so I get 5 exceptions per second.

@ethouris

Can't do much about what was already done in UDT. I also agree that when no sockets were ready before the timeout it should simply return 0, but we can't change that so easily now that it's in use.

This whole system works more or less this way, at least in the default config for srt_epoll_wait:

  • The SRT subscribed sockets are checked (and added to output if readiness state is found)
  • If any system sockets subscribed, check them as well (<-- kevent timeout HERE)
  • If anything was collected in this above, exit with this state
  • If timeout passed, exit with timeout
  • Sleep on a timeout up to 100ms
    • Sleep by waiting on a condition variable (CTimer::waitForEvent())
    • Wake up on a signal from CTimer::triggerEvent() call
  • Repeat from the beginning

The "very best" solution would be to find some general-purpose event library and make the SRT socket state change a user-triggered event there, while the user FDs would be managed by that library. This way the waiting function could go to sleep unconditionally and get woken up when the first socket reports readiness, no matter from which domain, without a wakeup once per 100 ms. But for this very best solution we currently lack the manpower, and it's hard to say if we will find any in the near future.

So currently what can be done is to wait on EITHER kevent or the SRT CV. The problem is that there is no guarantee your epoll is subscribed to either of them, although we can state that subscribing to system sockets only makes no sense. Waiting on a kevent timeout may make sense as long as you have at least one system socket subscribed, so it MAY be a good idea to place the waiting timeout there, but only if any system sockets are subscribed. If there are none, then waiting in CTimer::waitForEvent is absolutely necessary.


alexpokotilo commented Mar 12, 2019

While trying to understand how CEPoll::wait works, I found and fixed an issue and sent pull request 6da5bd3.
@ogolosovskiy, I still don't get why we need to handle the lrfds/lwfds params in your case.
Maybe I missed something, but as far as I understand we don't need lrfds/lwfds in either listen or push mode.

@ogolosovskiy

@alexpokotilo
I don't know; the logic is complicated.
Some descriptors are handled here, some in another place.
For example, when we listen for new incoming connections we have an active socket, but lrfds and lwfds are empty; listening for new incoming connections is handled elsewhere.

@alexpokotilo

I probably missed the fact that lrfds/lwfds are not used by @ogolosovskiy at all :)
But why does epoll::wait support both system sockets and SRT sockets in the same call? If the library cannot do that gracefully, and the ability is not used in almost all cases, maybe it's time to remove it and simplify the implementation. I cannot imagine why we would need to use both SRT and non-SRT sockets in CEPoll::wait.


ogolosovskiy commented Mar 12, 2019

@alexpokotilo
I am planning to use system sockets, I just don't use them yet. We tunnel all traffic through one port; not only SRT, but all UDP traffic for our application.


alexpokotilo commented Mar 12, 2019

@ogolosovskiy
That's more a question about the API in general. Just imagine you didn't have the ability to add a system socket to the library; in that case you would simply have to handle the system socket yourself.
@ethouris
My question is about the SRT interface in general: if we cannot handle both SRT and system sockets so that we wake on either an SRT or a system event without "magic", maybe it's time to stop trying to do that at all and remove system sockets from the interface and processing?

@alexpokotilo

#605 was approved, by the way, so once I understand better what's going on here I will help with the fix and tests.

@rndi rndi closed this as completed in 6e84a19 Mar 14, 2019
@maxsharabayko maxsharabayko unpinned this issue Mar 15, 2019