Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfc: Wouldn't UDP be able to benefit from a sendv() function? #4302

Closed
ronkorving opened this issue Dec 16, 2015 · 23 comments
Closed

rfc: Wouldn't UDP be able to benefit from a sendv() function? #4302

ronkorving opened this issue Dec 16, 2015 · 23 comments
Labels
dgram Issues and PRs related to the dgram subsystem / UDP. feature request Issues that request new features to be added to Node.js.

Comments

@ronkorving
Copy link
Contributor

Right now if you want to send multiple buffers, you have to concatenate them before calling send(). It seems however that libuv is perfectly fine with receiving a bunch of buffers. I can imagine that in particular use cases, this could speed up message delivery tremendously. Would love to hear thoughts on this.

@mscdex mscdex added dgram Issues and PRs related to the dgram subsystem / UDP. feature request Issues that request new features to be added to Node.js. labels Dec 16, 2015
@Fishrock123
Copy link
Contributor

cc @mcollina

@mcollina
Copy link
Member

I absolutely agree! This is supported by libuv, so I think we should add it here as well. Also, the performance of dgram is quite bad, so this would help a lot.

However, I think we should have a better send function, rather than introducing a sendv function.

What I would like to achieve is a send function with the following signatures:

  • socket.send(buf, offset, length, port, address[, callback]) - backward compatibility, the offset and length parameters are legacy code from an old node version that did not have buffers. I would add a deprecation for this signature.
  • socket.send(buf, urlObject, [, callback]) - where urlObject is what is returned from URL.parse, or a string, which is then parsed by URL.parse. Also this has the same signature for the rsinfo object when receiving a message, so a response is easily constructed.
  • socket.send(buf, port, host, [, callback]) - same as before, but with splitted port/host.
  • socket.send([buf1, buf2, ... ], urlObject [, callback]) - as @ronkorving said.
  • socket.send([buf1, buf2, ... ], port, host, [, callback]) - same as before, but with splitted port/host.

I can work on this. Anybody has any opinions on this job?

Thanks for ccing me @Fishrock123 :).

@mscdex
Copy link
Contributor

mscdex commented Dec 16, 2015

FWIW I'm -1 on deprecating the offset, length parameters because having those parameters makes things more efficient when you want to send a subset of data from a larger Buffer.

@mcollina
Copy link
Member

@mscdex I never got the use case for that. 100% of my usage with udp is allocating a buffer and copying stuff in it, and then send. You usually need to add some header, and do some data manipulation on the input. Having the multiple buffer form will be super handy in those cases.

IMHO, offset, length is not consistent with most of the node api: stream.write() does not have an offset and length. Anyway, I'm suggesting to deprecate it at some point, but we would need to keep it around for long for backward compatibility.

@mscdex what is your opinion on the rest of the changes?

@mscdex
Copy link
Contributor

mscdex commented Dec 17, 2015

Yes, it's unfortunate more of the node API doesn't support avoiding slicing by allowing offset, length arguments. I personally have encountered cases while creating protocol modules where I only want to send a subset of a Buffer that I already have. While the overhead of slicing is smaller than having to copy the data too, it's still overhead that could be avoided for such use cases.

The rest of the suggested changes look fine to me.

@ronkorving
Copy link
Contributor Author

@mcollina Thanks for being onboard with this! I love your further proposal. Let me know if there is anything I can do to help.

@mcollina
Copy link
Member

@nodejs/collaborators any opinions on this? It would be good to have some consensus on the direction we should take. We are missing a dgram/udp wg anyway, sorry for spamming everybody.

@ronkorving all the help you can give :D. Let's see what the others thinks about this, and then we can chat how we collaborate to get this done.

Currently we are missing a lot of benchmarks on dgram, and they are part of the net folder: https://github.com/nodejs/node/blob/master/benchmark/net/dgram.js. It is possibly worth adding a couple more (at least one involving a domain lookup, as that is done for each message), so that we do not create performance regressions. Also moving them out of net would make sense for me.

Also cc @trevnorris as for our nodeconf.eu discussion.

@indutny
Copy link
Member

indutny commented Dec 17, 2015

Sounds terrific! I'm all in for this.

@ronkorving
Copy link
Contributor Author

More benchmarks would definitely be great. We need to be able to measure that what we do pays off. To compare though, when I implemented a similar solution in fs.WriteStream the benefits (depending on buffer size) would go up to 100-fold. I can imagine it being even higher in the case of UDP, since it doesn't depend on disk-latency.

Moving them out of /net makes sense indeed.

@mcollina
Copy link
Member

Yes, it's unfortunate more of the node API doesn't support avoiding slicing by allowing offset, length arguments. I personally have encountered cases while creating protocol modules where I only want to send a subset of a Buffer that I already have. While the overhead of slicing is smaller than having to copy the data too, it's still overhead that could be avoided for such use cases.

I agree on the general topic. Did you experience this with dgram? The reason why I am asking is because of udp nature of messages, as you would need headers for every message you send, and in the current API you are forced to allocate a new packet and do a buffer.copy(), with all the downsides.

@mcollina
Copy link
Member

@indutny thanks for joining the dgram discussion. Are you ok with adding some more API signatures to send() as well? I have a question for you: my goal is to to use the signature socket.send(buf, urlObject, [, callback]) to wrap the DTLS context in the urlObject parameter (and rsinfo when receiving a message`, so the API looks the same. (See #2398 (comment)). I would like the new API to be forward-looking, and allow users to support DTLS without a full API change.

@indutny
Copy link
Member

indutny commented Dec 17, 2015

@mcollina I need more time to make a decision on this, but as something preliminary I wonder if it may be a good idea to introduce cork/uncork as we have in regular streams right now?

socket.cork();
socket.send(...);
socket.send(...);
socket.uncork();  // will actually do a single `sendv` call

@saghul
Copy link
Member

saghul commented Dec 17, 2015

I'm generally +1 on this. Quick question @mcollina, what's that urlObject about? We need the IP address and port, an URL object looks incorrect to me.

@mcollina
Copy link
Member

@indutny that's different. the multibuffer API from libuv allows us to send multiple buffers as a single message, so we do not have to allocate a single packet with headers + body. As far as I know, there is no multiple message support in libuv. However we can add that to avoid passing the JS/C++ barrier when sending multiple messages.

@saghul the whole problem is an API mismatch between what is being received in the socket.on('message', function (data, rsinfo) {}) event. port  and host information are in the rsinfo. The end goal for this change is to support DTLS: #2398 (comment). But I think we should split the work in two chunks.

First, let's implement:

  • socket.send(buf, port, host, [, callback])
  • socket.send([buf1, buf2, ... ], port, host, [, callback])

And then we can look into the rsinfo/url stuff later on.

@indutny
Copy link
Member

indutny commented Dec 17, 2015

@mcollina the API that I have proposed will do exactly what you say. It is just looks like the API that we have for TCP sockets, and this is what my question was about. Does it sound good to us, or is it too abstract?

@indutny
Copy link
Member

indutny commented Dec 17, 2015

Another idea of somewhat bigger complexity (this is downside): is to change the way the Buffers are concatenated. Instead of allocating a single storage immediately, we could just create some kind of JointBuffer, and allocate real Buffer lazily, when someone tries to access its data.

When someone will try to send this kind of JointBuffer via socket.send() - UDP code will be clever about it, and will avoid actually concatenating it.

@ronkorving
Copy link
Contributor Author

@indutny Honestly, when writing parsers, it would be great to be able to have JointBuffers to read from (crossing the boundary between multiple buffers). I can imagine stream.read(n) also being able to benefit from it. But I guess that's for another day.

@indutny
Copy link
Member

indutny commented Dec 18, 2015

@ronkorving this probably won't work well with V8, at least right now. What I called JointBuffers is actually something like LazyJointBuffers :)

@mcollina
Copy link
Member

@indutny I think it is not the correct API for UDP messaging. There is a clear difference between a tcp socket and a udp one: the first is used to talk only to one peer, while the other is built for talking to many peers at the same time. I think we should be able to "lock" the socket to a specific peer, but that's another API.

Something like this would do:

var msg = socket.createMessage(1223, 'localhost')
msg.write('aaaa')
msg.write('bbb')
msg.end()

I think that can be done in userland.

Regarding a buffer list, having something like https://github.com/rvagg/bl in core would be awesome. But I think it's a complete different topic.

@sam-github
Copy link
Contributor

I'm generally +1 on @mcollina 's suggestions.

One thing I'd add: dgram is the only one of the network libraries that I've found that requires buffers to be written to it. Bizarrely inconsistent. Why do we not allow strings or buffers to be sent?

@mcollina
Copy link
Member

@sam-github dgram currently accepts Buffer or String without encoding, which is completely inconsistent and error prone. This is due to the fact that we support a single send() function, using positional arguments without options. An API that accepts a options object would fix this.

@trevnorris
Copy link
Contributor

Allowing users to pass Strings is actually important for performance. This way the data can be immediately copied out and saved on the req. Then free'd in the destructor. Instead of needing to first create a Buffer object. The API should be made consistent to accept an encoding.

@sam-github
Copy link
Contributor

@mcollina what do you mean by "current"? Has this changed recently? A couple of weeks ago I reproed this, and got 'TypeError: First argument must be a buffer object', but maybe I was not on the latest node.

@trevnorris I thought buffers were forced because working in pure buffers aids performance, I didn't realize it hurt it sometimes.

rvagg pushed a commit that referenced this issue Feb 10, 2016
Added ability to dgram.send to send multiple buffers, _writev style.
The offset and length parameters in dgram.send are now optional.
Refactored the dgram benchmarks, and seperated them from net.
Added docs for the new signature.

Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
Fixes: #4302
PR-URL: #4374
scovetta pushed a commit to scovetta/node that referenced this issue Apr 2, 2016
Added ability to dgram.send to send multiple buffers, _writev style.
The offset and length parameters in dgram.send are now optional.
Refactored the dgram benchmarks, and seperated them from net.
Added docs for the new signature.

Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
Fixes: nodejs#4302
PR-URL: nodejs#4374
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dgram Issues and PRs related to the dgram subsystem / UDP. feature request Issues that request new features to be added to Node.js.
Projects
None yet
Development

No branches or pull requests

8 participants