WebDAV extension MPUT for multiple file upload [WIP] #12001
Base64 encoding the files will make the files about 33% larger. This is a massive performance hit. Can't we use multipart for that? Also, what is the etag that the client sends to the server?
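The ~33% figure follows directly from how Base64 works: every 3 input bytes become 4 output bytes. A minimal Python check (the file size is illustrative):

```python
import base64
import os

# Base64 maps every 3 input bytes to 4 output bytes, so the encoded
# payload is 4/3 of the original (plus padding on the last block).
payload = os.urandom(300_000)  # stand-in for a 300 kB binary file
encoded = base64.b64encode(payload)

print(len(encoded))                 # 400000
print(len(encoded) / len(payload))  # ~1.33, i.e. ~33% larger
```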
Also, will this work with *.part uploads here? Otherwise the decoding of the base64 might reach the PHP memory limit very quickly, to my understanding.
Too bad we can't just send pure binary instead of XML. @LukasReschke you are talking about chunks, right?
Yes. Sorry for the incorrect term ;-)
I'd say chunks and MPUT here are quite the opposite. The purpose of chunks is to split a file into pieces and reduce the size of requests. With an MPUT request I guess we'd rather have much bigger requests, or at least we'd bundle multiple small files as long as the total payload is less than 10 MB.
- the MPUT is designed for multiple small files; chunking is out of scope
- I'm free to use whatever encoding; that's why we can specify the encoding as an attribute
- the etag is to be used for detection of changes, according to the If-Match header
- if we plan to use this as a WebDAV extension we shall stick to XML; we could also start to implement our own interface ..... 🙈
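For concreteness, a request body along the lines discussed above might look as follows. This is purely a sketch: no schema was specified in this thread, so every element and attribute name here is illustrative, and the etag values are placeholders.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical MPUT body: several small files in one request,
     each with a per-file encoding attribute and an etag for
     If-Match-style change detection. -->
<d:mput xmlns:d="DAV:">
  <d:file name="notes.txt" encoding="base64" etag="etag-1">
    SGVsbG8sIHdvcmxkIQ==
  </d:file>
  <d:file name="todo.txt" encoding="base64" etag="etag-2">
    QnV5IG1pbGs=
  </d:file>
</d:mput>
```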
Definitely a very interesting idea. It would be good to evaluate the possible performance improvements a bit more.
I'd prefer HTTP multipart, as there you can use binary. In general I think this is the wrong way to go. You'll open a big can of worms: what about HTTP timeouts? What if some files fail and some succeed? The problem of stuffing more into one pipe has already been solved and is called SPDY (which is also supported by our HTTP library). Then on the server side the code can stay the same and each file can be handled individually.
WebDAV defines the multistatus response, which can return an individual status for each file. But yes, I understand your concerns.
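For reference, the multistatus response mentioned here (207 Multi-Status, RFC 4918) carries one status per resource, so per-file success and failure can be reported in a single reply. The hrefs below are illustrative:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- 207 Multi-Status body: one <d:response> per uploaded file -->
<d:multistatus xmlns:d="DAV:">
  <d:response>
    <d:href>/remote.php/webdav/notes.txt</d:href>
    <d:status>HTTP/1.1 201 Created</d:status>
  </d:response>
  <d:response>
    <d:href>/remote.php/webdav/todo.txt</d:href>
    <d:status>HTTP/1.1 412 Precondition Failed</d:status>
  </d:response>
</d:multistatus>
```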
this would be worth testing - just like comparing this to mod_php and php-fpm
Hi guys, is it possible that your client is simply not optimized enough? The idea that a lot of small files can take a long time to upload makes a lot of sense to me, due to latency. But this should be largely solvable by using HTTP pipelining, which afaik is pretty widely supported. With HTTP pipelining the client basically keeps sending requests without waiting for a response from the server. The server just sends back all the response bodies in order, and clients that support this can figure it all out. Trying to combine this all in one HTTP request seems like a hack. It would be a valid hack if pipelining is for some reason impossible.
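On the wire, pipelining just means writing several requests back-to-back on one connection before reading any response. A minimal sketch in Python (host and paths are placeholders):

```python
# Two pipelined GET requests on a single HTTP/1.1 connection.
# The server must answer in FIFO order, so one slow request
# holds up every response behind it (head-of-line blocking).
host = "cloud.example.com"  # placeholder
paths = ["/files/a.txt", "/files/b.txt"]

wire = b"".join(
    f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
    for path in paths
)
# A pipelining client sends `wire` in one write, then parses the
# responses in the same order the requests were sent.
print(wire.count(b"GET "))  # 2
```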
@evert Pipelining makes sense for GET requests from a reliable webserver. In our case though:
So if the current request in the pipeline is taking long, the other requests are not processed, and because of our HTTP stack we don't have good insight into what is going on. That's why the client currently limits itself to 3 parallel requests and doesn't pipeline. I started investigating SPDY, which I see as superior to pipelining:
My understanding was that pipelining is something that the webserver handles (e.g. nginx) and would just fire up multiple PHP "threads" without waiting for requests to be fully handled. I suppose that's a wrong assumption then? Lacking pipelining though, I would suggest using
@evert Yes, from the PHP's perspective it is in parallel. However the webserver can only send back the reply as FIFO (it's a single pipeline after all, not a line with multiple channels). Another problem (why I mentioned only GET/idempotent is OK) is that one request might error out and then break the TCP connection, and then you don't really know what happened to the other requests in the pipeline. I had implemented pipelining in QNetworkAccessManager some years ago :-)
Isn't that the same for putting the requests in a single HTTP body? You're simply doing that on a different layer.
My point is the same here. Idempotence is actually a great point, because you can simply repeat the request. If SPDY makes this easier, that would be great as well. It addresses the issue in a similar way: by fixing it on the protocol level, as opposed to the application level.
How would the server side need to be changed if we use SPDY? As I understand it, it comes as a web server plugin which handles more requests through shared TCP connections. That would mean the server does not need general code changes to support it, but would "only" need to optimize handling of single requests in parallel. Another issue might be that not all cheap hosting offerings offer the SPDY module for the web server, but these would continue to run, just with not so cool performance. Sounds interesting!
My plan was to actually test different setups and measure the performance - @MorrisJobke you still owe me some dockers 😉
@DeepDiver1975 I know :(
I don't know how quickly you want to move to SPDY, but mod_spdy is not yet available on Apache 2.4, requires PHP-FPM in order to run at full speed, and a special version of mod_ssl which will be obsolete very soon.
@oparoz No worries, SPDY is just an alternative to HTTPS, it does not change the sync algorithm. All optional.. and still brainstorming here right now :)
Should we continue this discussion for 8.1?
(or during 8.1 but implement in a later release)
Discussion during 8.1 for sure
[Moved from #15264] You can apply it on top of https://github.com/owncloud/administration. If you notice performance improvements worth the extra code complexity, we can then make sure that the client takes advantage of it.
Unfortunately the Qt network stack currently doesn't support sending multipart requests except when using PUT and POST, so it would be easier client side to go that way instead of MPUT. Could you guys detect this from a different URI or from the multipart/mixed Content-Type?
@jturcotte Let's not go crazy here, POST/PUT is just fine. Everything else brings other issues with load balancers or proxies :-)
I agree that it would be more robust to define a new content type for a POST request than to introduce a new HTTP method.
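A multipart/mixed POST body of the kind proposed here can be assembled with nothing but the standard library. A sketch, assuming a simple per-file header layout (any real implementation would have to match whatever the server ends up parsing):

```python
import uuid

def build_multipart_mixed(files):
    """Bundle (filename, content_bytes) pairs into one multipart/mixed body.

    Unlike a Base64-in-XML scheme, the file contents stay binary,
    so there is no ~33% size penalty.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, content in files:
        headers = (
            f"--{boundary}\r\n"
            f'Content-Disposition: attachment; filename="{name}"\r\n'
            f"Content-Type: application/octet-stream\r\n"
            f"\r\n"
        ).encode("ascii")
        parts.append(headers + content + b"\r\n")
    body = b"".join(parts) + f"--{boundary}--\r\n".encode("ascii")
    return f"multipart/mixed; boundary={boundary}", body

# Bundle two small files into a single POST payload.
ctype, body = build_multipart_mixed([("a.txt", b"alpha"), ("b.txt", b"beta")])
print(ctype)  # multipart/mixed; boundary=<random hex>
```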
@DeepDiver1975 defer to 8.2?
We think we don't need bulk operations - closing this for now.
In order to increase sync performance for multiple files we will introduce a new HTTP verb: MPUT.
Requirements:
Request: