-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add some additional utility methods to OsString and OsStr #1307
Conversation
I've actually been thinking about OsStr(ing) interfaces recently and was thinking about writing something up, but hadn't gotten around to it yet. In terms of this proposal, It isn't clear to me what the length and capacity related methods would do, though. The internal representation of OsString on Windows is not very meaningful and most of the details of it are intentionally not exposed by the interface. Would these methods give values related to the internal WTF-8 representation (which should be useless, because all consequences of that encoding should be hidden), or of the system ill-formed UTF-16 representation (which would require some kind of conversion)? |
Granted, they don't really have any semantic meaning, but they still have uses. Actually, come to think of it I forgot to add |
Good points. I can see why we'd want those, now. So I think I now like everything here except |
We usually add specific conversion methods for conversions that are expected to be done commonly (e.g. str → String). I don’t see people using OsString oftenly, let alone its conversion methods. I’d vote for leaving
|
ba84aa9
to
736bfce
Compare
Updated: removed |
Looks good. I had a look through and there weren't any other methods on However, it may be worthwhile (although out of scope for this RFC) defining an Substring operations can then be defined in terms of iterators over |
For I feel like these methods won't be used that often, but if you need to use I think this looks good to me. |
There's also the stable |
🔔 This RFC is now entering its week-long final comment period 🔔 I feel that these methods are quite conventional among collections, so the only particular question in my mind would be whether we want them at all. I originally thought that they're not useful because the length of an OS string doesn't actually mean anything in terms of decoding it, but it does mean something in terms of performance (a key insight here), which I feel is justification enough for their inclusion. |
I feel a bit iffy with the naming choice of |
@sfackler Isn't that the case for |
Well, |
@sfackler Working with strings with Windows API is already a minefield. If someone doesn't understand what |
I believe len should stick to the existing semantics (string, vec, all collections) and return the number of underlying items, in this case the utf16ish words. |
@arthurprs |
That would require iterating the OsString to calculate, when the Returning a byte count is portable, consistent and has practical usefulness for when building OsStrings. Without it, there's no direct relation between I don't think this breaks the encapsulation of the WTF8 encoding because len() is a lower level operation: it's the same reason that having |
Yeah, I believe you guys are right since the underlying data is actually wtf8. |
The libs team discussed this during triage yesterday and the decision was to merge, thanks again for the RFC @eefriedman! Tracking issue: rust-lang/rust#29453 |
Rendered