Audio stutters when networking has latency (needs to be implemented using non-blocking networking) #10832
Comments
Upon further testing, it seems this problem only happens when there are players AND latency. |
I think the issue here is that the networking code generally uses blocking calls. I don't think anything can work unless all networking is done with non-blocking operations. In PPSSPP, when we hit a blocking operation (like reading from a file), we continue emulation - just on a different emulated PSP thread. To achieve this, we have to read files on a separate host OS thread from the emulator host OS thread, and it checks to see if things are complete. If we block on the main emulator host OS thread, then we cannot run any other PSP threads (including ones that generate audio.) This will cause audio to run poorly and may even cause the game to crash or misbehave. -[Unknown] |
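The pattern described above (run the blocking work on another host thread and only check for completion from the emulator thread) can be sketched roughly as follows. This is a minimal illustration, not PPSSPP's actual code: `AsyncOperation`, `Start` and `Poll` are hypothetical names, and a plain `std::thread` with an atomic flag stands in for whatever worker mechanism the emulator really uses.

```cpp
#include <atomic>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical helper: runs one blocking job on its own host thread so the
// emulator thread only ever polls for completion and never blocks.
class AsyncOperation {
public:
    ~AsyncOperation() {
        if (worker_.joinable())
            worker_.join();
    }

    // Starts the blocking job (for example a file read) on a separate host thread.
    void Start(std::function<int()> blockingJob) {
        if (worker_.joinable())
            worker_.join();  // wait for any previous job before reusing the object
        done_.store(false, std::memory_order_relaxed);
        worker_ = std::thread([this, job = std::move(blockingJob)] {
            result_ = job();
            done_.store(true, std::memory_order_release);
        });
    }

    // Called from the emulator thread. Returns false while the job is still running.
    bool Poll(int *result) {
        if (!done_.load(std::memory_order_acquire))
            return false;
        *result = result_;
        return true;
    }

private:
    std::thread worker_;
    std::atomic<bool> done_{false};
    int result_ = 0;
};
```

While `Poll` returns false, the emulator thread is free to keep running other PSP threads, including the ones that generate audio.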
@unknownbrackets can you point me to the main emulator host OS thread? The whole #9882 implementation uses non-blocking operations, but I still cannot figure out where to init the thread so that it is not inside the main emulator thread and can simulate timeouts properly. Many of the HLE implementations in sceNetAdhoc block the main thread. Where in the source code should I separate this out to make another host OS thread just for networking? Currently I am creating another thread inside the __NetAdhocInit function. I need to fix this issue first before continuing to fix #12874, as Digimon World Re:Digitize uses two threads, networking and user main. When the game starts the networking module, it heavily blocks the main thread because we are blocking inside the HLE call, which makes the game super slow even when the network is connected through a local router. I can't figure out how PPSSPP handles threads and does polling properly to avoid blocking inside the sceNetAdhoc HLE module when it isn't necessary. There are already sound and file I/O implementations that can continue emulation when they are supposed to block, but I cannot figure out from the source code how that works. Maybe I can mimic that thread implementation to implement this properly. |
I see a So, here's what I'd suggest:
If you skip 6 above, it would in theory even be fine to keep the requests all blocking - but it might be better to use -[Unknown] |
Thanks for the fast response and the advice. It looks like I am getting a general overview of how it works. On my amultios-osx branch, the underlying library implementation uses select, and I have all the flags to mark a send as finished; I just need to sync them into the main HLE thread. We have many threads to manage here in the NetPlayManager. From my experience there are many blocking calls in Adhoc, such as making a connection, sending data, updating the Adhoc control, and updating the matching context. Many of them are blocking calls that must have their own host thread. In my implementation, simulating them fast enough with 4 host threads on a single port would not do it, as the socket buffer overflows easily and cannot keep up with the stream. It looks like we need many managers to simulate the entire adhoc library here. I guess I need to separate the modules as on the real firmware, to simulate everything in its place. Sony's implementation looks like it calls the underlying library frequently, based on the error code returned. WDYT, would it be good if I implemented something like this?
SceNetModule (NetLibrary) needs 1 thread
SceNetAdhocModule (AdhocLibrary) needs 2 threads
sceNetAdhocctlModule (control library) needs 1 thread, or probably 2
sceNetAdhocMatching (matching library) needs 1 thread as well
|
Right so first, I would inventory the possible calls. For example: enum class NetOperationType {
CONNECT,
DISCONNECT,
SENDTO,
RECVFROM,
ACCEPT,
SELECT,
}; And then you would also define a structure for the data associated with each queue operation: struct NetOperation {
NetOperationType type;
SceUID threadID;
uint64_t socket;
std::vector<uint8_t> data;
// In game time - CoreTiming::GetGlobalTimeUsScaled().
uint64_t timeoutUs;
union {
struct {
sockaddr_in src;
} recvfrom;
...
};
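}; So, then, when the queue hits that operation it would do something like:
// recvfrom() returns a signed count (-1 on error), so don't store it in a size_t.
ssize_t bytes;
socklen_t srclen = sizeof(task.recvfrom.src);
bool done = false;
switch (task.type) {
case NetOperationType::RECVFROM:
// recvfrom() wants a sockaddr pointer and a pointer to the address length.
bytes = recvfrom(task.socket, task.data.data(), task.data.size(), 0, (sockaddr *)&task.recvfrom.src, &srclen);
if (bytes > 0 && (size_t)bytes == task.data.size()) {
// We received everything that was requested.
done = true;
} else if (bytes > 0) {
// Partial read: keep only the unfilled tail so the next try asks for the rest.
// (A full version would first copy out the bytes that just arrived.)
task.data = std::vector<uint8_t>(task.data.begin() + bytes, task.data.end());
}
break;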
case ...
}
if (done) {
std::lock_guard<std::mutex> guard(threadsToWakeLock_);
threadsToWake_.push_back(task.threadID);
} And somewhere in this manager on the queue, it'd probably do something like: while (!stopped) {
fd_set reads;
fd_set writes;
fd_set excepts;
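// Assumption: GetSocketsFromQueue() clears and fills the sets, and returns the highest fd it added.
int maxfd = GetSocketsFromQueue(&reads, &writes, &excepts);
// We won't notice new queued operations during timeout.
timeval timeout;
timeout.tv_sec = 0;
timeout.tv_usec = 10000;
// select() takes the highest fd plus one as its first argument.
select(maxfd + 1, &reads, &writes, &excepts, &timeout);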
ProcessTasks();
CheckTimeouts();
} I'm not sure, but I wouldn't necessarily try to put all the different full operations on netManager - but I haven't looked into it deeply. Generally, I think it's better to keep validating arguments, dealing with PSP structs, etc. inside the HLE funcs. I don't know if we need a separate thread for each host socket or socket type. I would think, using -[Unknown] |
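The hand-off through `threadsToWake_` in the sketch above has two sides: the network thread pushes finished thread IDs, and the emulator thread picks them up later. A rough sketch of both sides follows. `WakeList`, `Push` and `Drain` are hypothetical names, and the `SceUID` alias is only a stand-in for the emulator's own typedef.

```cpp
#include <cstdint>
#include <mutex>
#include <vector>

using SceUID = int32_t;  // stand-in for the emulator's SceUID typedef (assumption)

// Hypothetical hand-off point between the network thread and the emulator thread.
class WakeList {
public:
    // Called from the network thread when a queued operation completes.
    void Push(SceUID threadID) {
        std::lock_guard<std::mutex> guard(lock_);
        threadsToWake_.push_back(threadID);
    }

    // Called from the emulator thread. Swaps the list out so the lock is
    // held only briefly, then returns the IDs that are ready to be woken.
    std::vector<SceUID> Drain() {
        std::vector<SceUID> ready;
        std::lock_guard<std::mutex> guard(lock_);
        ready.swap(threadsToWake_);
        return ready;
    }

private:
    std::mutex lock_;
    std::vector<SceUID> threadsToWake_;
};
```

The emulator thread would call `Drain()` at some regular point and resume each returned PSP thread there, so the actual wake-up always happens on the emulator thread and not on the network thread.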
From my experience, multiplexing everything into one socket will absolutely break the emulation timing. I tried that before and got many mini freezes while waiting for packets. At minimum we need one socket to multiplex the full operation of one protocol. If we are in tunnel mode, this is enough. The 4 main sockets will be
Those 4 main sockets can be connected in the background ahead of time and just emulate the timing properly, either by connecting to the adhoc host or by doing pure peer-to-peer at game boot. What I cannot figure out, without a higher-layer NetSocketManager abstraction, is how to handle the proadhoc protocol, which does full P2P without multiplexing, and sync the timing in a background thread. We will have many sockets to handle here: syncing timeouts, making sure each socket is still connected, can be flushed, and so on. @anr2me might know better, as he is working on this to make blocking calls better. But without one thread per opened socket, I guess we will just end up with mini freezes again, which cause the audio stutter. To keep up PSP compatibility we need to write that abstraction layer first, before supplying it to the protocol underneath. Some kind of PacketManager would also be good: when traveling over the internet, compressing the packets with snappy helps a lot. I already tested it on Amultios and it does a great job, but someone needs to make sure the background threads of that many open sockets can be synced without multiplexing (that is a lot of tasks and threads running in the background). One game that opens many adhoc sockets is Phantasy Star Portable 2; it opens 20 or more in 4-player play. This game tends to break a lot because of failing to connect or waiting on something that is not synced properly due to packet loss; even on a local network it can still break often. This ticket can also be easily reproduced in Digimon World Re:Digitize: the game will just freeze and the audio will stop if we are not supplying any packet on a blocking call. Even if we supply one, the resource on the main thread is already locked by the game and it just keeps freezing. |
Well, nginx famously multiplexes - it doesn't have to be slow. As long as you do it with non-blocking operations, the idea is you're just moving everything along one thing at a time. For example, this would be slow:
If only 1 byte is available, that will block and slow everything else down. Instead, you want to "keep the line moving." Read what you can, and then keep going:
Non blocking operations are fast, because they read from the OS buffer. You want the multiplexer to just have ONE JOB. Its job is to talk to the network and just get stuff done. It shouldn't be making decisions, looping through arrays, writing structs to PSP RAM, deciding error codes, etc. That's a distraction from its one true purpose: quickly multiplexing on the data. Instead, all those distractions can be handled outside while that thread is probably busy reading or writing more data to sockets. Using a thread per socket just is the "blocking socket operations" model of thinking. Definitely makes sense to use multiple sockets for sure, though. If a game makes a blocking call, the socket should just be non-blocking still. We'll keep the HLE thread asleep, and the netManager will keep select() and multiplexing that socket each loop. When it finally reads all its data, we will wake the thread. That way, we never once blocked, but from the PSP game's perspective, it was the same. The HLE thread was blocked until the read was complete. It doesn't need to know the difference. -[Unknown] |
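The "read what you can, then keep going" idea can be sketched for a stream socket as below. This is a minimal POSIX illustration under stated assumptions, not the code from this thread: `PendingRead`, `ReadStatus`, `SetNonBlocking` and `ReadWhatYouCan` are hypothetical names, and a Windows build would need the Winsock equivalents.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <vector>

// Puts a socket into non-blocking mode (POSIX).
bool SetNonBlocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

enum class ReadStatus { COMPLETE, PENDING, CLOSED, FAILED };

// Hypothetical per-request state: the bytes the game asked for and how many arrived so far.
struct PendingRead {
    std::vector<uint8_t> buffer;  // sized to the number of bytes wanted
    size_t received = 0;
};

// Takes whatever the OS already has buffered and returns immediately.
ReadStatus ReadWhatYouCan(int fd, PendingRead &req) {
    while (req.received < req.buffer.size()) {
        ssize_t n = recv(fd, req.buffer.data() + req.received,
                         req.buffer.size() - req.received, 0);
        if (n > 0) {
            req.received += (size_t)n;
        } else if (n == 0) {
            return ReadStatus::CLOSED;   // peer closed the connection
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return ReadStatus::PENDING;  // nothing more right now; try again next loop
        } else if (errno != EINTR) {
            return ReadStatus::FAILED;
        }
    }
    return ReadStatus::COMPLETE;
}
```

The multiplexer would call `ReadWhatYouCan` whenever select() reports the socket readable, and only report the request as finished (so the sleeping PSP thread can be woken) once it returns `COMPLETE`.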
Aha, got it, that makes sense. We can implement it then by just implementing that multiplexer, which does not care how many fds are open underneath as long as they are still connected, and handle all the other distractions outside. One thing to handle is flushing the data outside and keeping track of which packets are in the queue and which fd each belongs to, discarding them when needed. This can be tricky, though I think I can make it work. A packet queue manager is absolutely a must here to handle those distractions later. I wonder if this could also be implemented on real PSP hardware, but it looks like a good approach to tackle this problem. |
Thanks, I was trying to understand how sceIo reading works (which usually has blocking behavior, just like a socket's recv function); I guess this is the correct procedure. It seems Jpcsp also simulates blocking behavior using an internal wait type (JPCSP_WAIT_NET)
|
Also remember that when receiving data over UDP you need to read all of the available data in the socket's buffer at once, because the OS will discard any leftover data. You can use the MSG_PEEK flag to get the exact size available to read/recv. |
Yep, I'm aware of that and handle it in Amultios on a per-packet basis; PTP does the same thing. This is where compression kicks in to reduce latency even more in internet play, and boom, we got Tekken working fast at 30 ms latency, almost like on a local network. Well, there is a lot to do here, but we can refactor it into much cleaner code and start fixing more calls in HLE. Anyway, if you need something tested on the PSP side, just drop me a message so we can dig into it more with automated tests. I need to write a remote logger homebrew that submits all PSP client requests at the same time when they are launched together, so we can inspect the behavior on many PSPs at once. Doing I/O over USB will crash the PSP test program if the USB is disconnected, so I need a remote PC that acts as a logging server at every checkpoint, substituting log calls over the network for USB on the host, so we can inspect the adhoc APIs better. |
Hm, I was under the impression that with UDP, you always got one (full) message per -[Unknown] |
Since you need to tell recv/recvfrom() the size you want to receive, a peek will be needed to prevent unexpected issues. If the size is smaller than the available data (the available data size can be smaller than the socket buffer size, depending on the sender), the leftover data will be discarded. If the size is bigger than the available data, recv/recvfrom will return an EAGAIN/EWOULDBLOCK error (non-blocking) or will wait for more data (blocking) before returning. |
Really? https://pubs.opengroup.org/onlinepubs/009695399/functions/recvfrom.html I would expect passing a 64K buffer to recvfrom would just always succeed. What error code does it give? If you're trying to reduce latency/audio impact/etc. you probably want to reduce total syscalls, if possible. -[Unknown] |
On a local network, passing a 64K buffer and receiving into it is no problem. I think it depends on the interface, because sending often fails in internet play if the buffer size is larger than the MTU. Receiving is something that can work, but I'm not sure about the performance penalty if we pass a large buffer in each call to recv. I'm not familiar yet with how PPSSPP handles the syscall; I can probably do that as well after understanding how it works. |
Well, when I say syscall I'm talking about OS (Linux, Windows, Android) overheads. It'd have to be benchmarked, but using a fixed 64K buffer and memcpy()ing out of it may be faster than making two syscalls. https://stackoverflow.com/questions/5103282/are-repeated-recv-calls-expensive -[Unknown] |
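The fixed-buffer approach mentioned above can be sketched like this. It is a POSIX-only illustration, not a benchmarked implementation: `RecvDatagram` and `MAX_DATAGRAM` are hypothetical names, and `MSG_DONTWAIT` (available on Linux and the BSDs, not on Windows) is used here only to avoid a separate call to make the socket non-blocking.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/socket.h>
#include <sys/types.h>
#include <vector>

// A UDP datagram can never be larger than 64 KiB, so one buffer of this size always fits it.
static const size_t MAX_DATAGRAM = 65536;

// Receives one datagram with a single syscall and copies it out of the scratch buffer.
// Returns true and fills `out` with exactly one datagram, or false if nothing
// is queued (errno is EAGAIN/EWOULDBLOCK) or a real error occurred.
bool RecvDatagram(int fd, std::vector<uint8_t> &out) {
    static thread_local uint8_t scratch[MAX_DATAGRAM];
    ssize_t n = recv(fd, scratch, sizeof(scratch), MSG_DONTWAIT);
    if (n < 0)
        return false;
    out.assign(scratch, scratch + n);
    return true;
}
```

Because the buffer is at least as large as any datagram, nothing is truncated and no MSG_PEEK call is needed first; each successful call returns one whole message of whatever size the sender used. Whether this is actually faster than peek plus recv would still have to be measured.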
I know some work was done toward using non-blocking networking calls. Has this improved? -[Unknown] |
The issue is simple: if you start online play, no matter whether it's local, Hamachi, OpenVPN or anything else, and you start WLAN (by doing something like entering multiplayer mode or selecting the guild hall menu), the game's audio suddenly stops following the frameskip and completely ignores it, though the video frameskips just fine as if nothing happened.
As a result, the audio crackles while the video does not; the game otherwise runs fine.
To reproduce this, simply stress PPSSPP until you can't run the game at full speed without auto frameskip. You'll notice the audio and video frameskip properly, as they should, at 100% speed. Then go to multiplayer mode in any game, and the audio will begin acting weird: crackling and out of sync.
The expected behavior is for the audio to work normally, as it would if the game were running at 100% speed.
Fun fact about this issue: the audio plays properly for a bit when you Alt-Tab.