-
-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: output_buffering = Off / 0
in php and sendfile lack support is destroying Nextcloud download speeds
#39384
Comments
Cc @nextcloud/server-backend |
Not a reply, just adding some historical context: Also an update for the broken link referenced at the above link: And explanation in PHP config sample since it's better than what's in the PHP docs:
|
As of my understanding, evert response was completely wrong: output_buffering works as any output buffer, it acumulates some data in memory in order to improve performance by not performing the output to the slower device /output every time that it is requested. Just imagine that you want to write 1.000.000 words, and you call the write function 1.000.000 times. It would call to the hdd device 1.000.000 times without buffering which is crazy, you kill performance and cpu with uneeded cycles. With a buffer of 4096 "words" you will reduce it to just 244 writes to the operating system. The slower the latency of the device, worse the problem gets. Having a buffer between 4kb to 8kb is basic for IO performance. I did some tests with 8192 and it did not perform better. With php recommended value of 4096 was best. It won't ever full your ram as every 4096 or after given delay, the buffer is flushed. The output device and operating system also have their buffers, even when the php buffer gets flushed, ir doesn't mean that the operating system will even write to disk, as every OS decides when and how to do it with their buffer policies and depending of the device. And modern nvme drives will cache even more before writting to disk anyway. And my experience that I was having downloading files slower the first time and faster following downloads of the same file, is due to Linux systems that do use unused ram as disk cache. So when you first read and write the file to the output without php bufffer, performance did suck because of device latency executing unneeded cycles. But following downloads were served directly from RAM, and that latency is so fast that this was not even a problem. You cant test this easily with my example and free enough RAM. If you download again and again the same file without php buffering, first time is slow, and following downlaods are fast. But if you perform two different file downloads with very big files alternatelively, like 50GB without buffering, speed is always slow. |
The only reason to flush a before before it gets filled is because you are streamming a pretty big response, and your write speed is so slow that it gets some time to the server to fill the buffer, so the client does not start receiving some data until it gets full. Just imagine an API that executes a very very slow SQL, and it outputs just "true" every 30 seconds. The client would close the connection with a http timeout, as you won't fit the 4kb buffer fast enough and the client would not receive any data if you don't flush the buffer (or close the socket). I cannot think of any use case for nextcloud that would fall into to this use case, as it mainly treats files that do fullfill any buffer in a blink of an eye. And regarding consistency between remote file transfers, files might get corrupted because of multiple factors, even is possible that the tcp protocol might eventually fail in very very very rare cases. TCP checksums are just 16 bytes wide, so collisions might happen and even a corrupted fragment might produce the same hash as a correct fragment. In network transfers between the client and the server would be important to perform an additional hash validation in the client, and later on in the server after writting it to the disk to check the file consistency to assure that the transfer is 100% correct and nobody, or any device or bug has altered the file on its destination. In file copies in the same machine this won't be even needed if you don't pre-process or manipulate files. |
output_buffering = Off / 0
in php is destroying Nextcloud download speeds
Are you certain that copying data from stream a to stream b would involve the output buffer? The documentation states1:
I'd say in your sample script, the output is echo and phpinfo. Footnotes |
Hi!, in my sample script the function my_stream_copy_to_stream does use fwrites. I'm not a php expert at all but this is common to many programming languages. Here states that fwrites does use an 8k buffer by default, but there is no explicit buffer set so it will use system default settings I guess: |
I can understand where you're coming from, but I doubt that writing a stream would use the PHP output buffer. Please let us know if you have any reliable information. |
But why Is nexctloud disabling output_buffering against official php recommendation? Even when it was disabled in the past when no clear reason and this repository is full of issues regarding nexctloud download speeds? This should be investigated furrher. Please don't close it still as I'm still researching this, and my examples already show a clear difference in performance. |
Thank you; we're looking forward to your research! |
I'm preparing the project to build a new docker images with some changes to see if it really makes a difference or not, compairing the official one with the customized. I will update this issue with what I discover thanks. |
Pydio also recommends disabling it, probably for the same reasons than Nextcloud.
|
output_buffering should be disable were needed and not globally, as disabling buffers for big responses like a file download can kill the server performance. And is very common having a reverse proxy in front of nextcloud for providing ssl connection encription and additional security, so you are forcing it to encript very small fragments very often, killing even more performance. |
I'm trying to build a docker image with enabled output_buffering without any luck. I tried replacing all references to output_buffering = 0 everywhere and building a new docker image, but it seems that the docker php install is overriding some params and is impossible to me to override them. Any ideas where I should set this? As php_info keeps me saying that output_buffering = 0 and I replaced it everywhere. |
It seems that in the past owncloud did support the sendFile apache function, that makes php delegate file downloads to apache or nginx so it was much faster, instead of doing it 100% in php. It seems that they did remove it because if not they could not guarantee the file locking feature: |
Just doing a bit recap of all download problemes from what I could see in nextcloud, owncloud and sabre repositories:
|
@yevon: I don't think OB is set anywhere directly by the community Docker image, since it's already disabled in PHP by default. But
If you comment out those references it'll still be off, since that's PHP's default, but you can then change it at your discretion. Just changed it in .htaccess to 4096 "successfully": |
Thanks! I will try this this week to see if it makes any difference. |
From what all I cand find, php download performance is terribly bad, and it is better to delegate this task to apache or nginx. It seems that all this began with the lack of sendfile support during downloads, it was disabled in the past when the file locking feature was introduced in owncloud and it has not been revised since then. Handling downloads directly in php seems to be slow, and it would be much faster letting nginx / apache do its work, that is downloading files. Owncloud is not supported anymore and they don't update the project. Should I open another issue to solve the issues with the file lock feature and the lack of sendfile support? I tried to discuss it here, but the project is closed: owncloud/core#40887 |
Can somebody help me to find the exact code in nextcloud to download files? It is completely really handled by sabre-io/http or it has some additional code? I will try to wait until nextcloud 28 is release to see if the updated sabre/io lib makes any difference. I also posted to see if sabre is going to support the sendfile feature during downloads. sabre-io/dav#120 |
output_buffering = Off / 0
in php is destroying Nextcloud download speedsoutput_buffering = Off / 0
in php and sendfile lack support is destroying Nextcloud download speeds
Sendfile was deleted in the past in this commmit: And this file now corresponds to this in nextcloud: server/lib/private/legacy/OC_Files.php Line 443 in 4c10dc2
Is this code still in use? It was disabled because with sendfile, once the file download starts, you cannot know when it finishes for unlocking it in php. But there are some fixes that could be done to solve this:
I have some doubts about how file locking works. If two users try to download a file at the same time, the second user has to wait until the first one finishes the download? It seems so by reading the code responsible of the download. |
Nice found! It seems that even when you enable output buffering, it cleans the buffer anyway here: https://github.com/nextcloud/server/blob/master/lib/private/Files/View.php#L393 update: Doesn't seem to use this at all during downloads isn't it? |
See #29467 (comment) |
Thanks I will try the patch to see the difference in performance. |
This issue has been automatically marked as stale because it has not had recent activity and seems to be missing some essential information. It will be closed if no further activity occurs. Thank you for your contributions. |
Bug description
It seems that all docker / helm and default installations of nextcloud do set the php param output_buffering = 0. After seeing a bug with the library used for nextcloud in / out #19567, I decided to futher investigate the download issues that I always had with nextcloud. It was even more noticeable in low power devices like a raspberry pi, or in mechanical drives. And in SSD drives I was getting very inconsistent download speeds, first time file download was always slow, and following downloads were faster.
So I decided to test this out on a 10gb ethernet local network nexcloud instance, with a fast nvme drive, a slow HDD and with output_buffering disabled, and output_buffering enabled set to default 4096. Results:
Nvme output_buffering disabled (600 - 880 MB/S inconsistent speeds)
HDD output_buffering_enabled (141 MB/S)
It there any reason to keep output_buffering disabled in php? As it seems to be made on purpose. If some code needs to perform a buffer flush it should do it, without needing to disable it globally.
Steps to reproduce
I just made a simple php program to test this out, emulating the function that nextcloud does use during file copy or download operations, it prints obtained speeds on those operations. Then you just have to disable output_buffering Off, or let it at 4096 which is the recommended default value to see the difference.
The code to test this out was a simple read and wirte operation on disk of a big file
Expected behavior
This parameter should be default in nextcloud:
output_buffering = 4096
Installation method
Community Docker image
Nextcloud Server version
27
Operating system
Debian/Ubuntu
PHP engine version
PHP 8.2
Web server
Apache (supported)
Database engine version
PostgreSQL
Is this bug present after an update or on a fresh install?
Fresh Nextcloud Server install
Are you using the Nextcloud Server Encryption module?
None
What user-backends are you using?
Configuration report
No response
List of activated Apps
No response
Nextcloud Signing status
No response
Nextcloud Logs
No response
Additional info
No response
The text was updated successfully, but these errors were encountered: