Skip to content
This repository was archived by the owner on Nov 9, 2019. It is now read-only.

WASI paths as &str and String #37

Merged
merged 2 commits into from
Jul 19, 2019
Merged

WASI paths as &str and String #37

merged 2 commits into from
Jul 19, 2019

Conversation

kubkon
Copy link
Member

@kubkon kubkon commented Jul 16, 2019

This PR address @sunfishcode concerns about proper/improper conversion from/to UTF-8 across different hosts raise in #34. As such, it verifies that input path (from Wasm) is a valid UTF-8, then performs a translation into correct encoding for the host (UTF-8 on *nix, UTF-16 on Windows), and performs any relevant calls involving the path.

@kubkon
Copy link
Member Author

kubkon commented Jul 16, 2019

cc @alexcrichton

@alexcrichton
Copy link
Collaborator

I may be missing something, but if everything coming from wasm is supposed to be utf-8 then could &str and String be used everywhere? How come OsString comes into the picture?

@kubkon
Copy link
Member Author

kubkon commented Jul 16, 2019

I may be missing something, but if everything coming from wasm is supposed to be utf-8 then could &str and String be used everywhere? How come OsString comes into the picture?

That's a good question. So OsStr and OsString are remnants of the original port of C impl to Rust. But now that you mention it, it might indeed be better and simpler to just use a &str and String. I'll investigate and come back to you on that.

@kubkon kubkon requested a review from sunfishcode July 16, 2019 15:40
@kubkon
Copy link
Member Author

kubkon commented Jul 18, 2019

@alexcrichton as usual your suggestion was spot-on, and I've now completely reworked how we handle WASI paths in wasi-common. I've created two wrapper types WasiPath and WasiPathBuf where the former wraps str and the latter String. Conversions from OS strings, byte arrays and u16 arrays are now handled via TryFrom trait. Have a look and lemme know what you think!

@kubkon kubkon changed the title Check if RawString operates on valid encodings Wrap WASI paths in WasiPath and WasiPathBuf types Jul 18, 2019
@alexcrichton
Copy link
Collaborator

👍

@kubkon
Copy link
Member Author

kubkon commented Jul 18, 2019

cc @sunfishcode

Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would you think of simplifying it even further, and just using &str and String instead of WasiPath and WasiPathBuf? We'd have some standalone utility functions to convert to/from [u8], OsString, and so on instead of try_from. That would make it immediately clear that these paths are plain Rust strings.

@kubkon
Copy link
Member Author

kubkon commented Jul 18, 2019

@sunfishcode, sure, I guess that would work too. I’m a fan of actually having a wrapper type with trait implementations on it, whereas utility functions returning a Result remind me of C approach more.

All in all though, it’s true that using utility functions and then operating on pure str and String types would remove a lot of the boilerplate required by WasiPath and WasiPathBuf types. So let’s go with your suggestion just to keep it as simple as possible. Let me rework it then and we’ll take it from there :-)

@kubkon
Copy link
Member Author

kubkon commented Jul 18, 2019

Hmm, now that I've taken a closer look at it, since we're already re-using a lot of Rust's libstd and hence trying to do things idiomatically as much as possible, I'm wondering if utility functions of this sort will not appear as too clunky.

@alexcrichton I'd love to hear your opinion on this: 1) would you leave path handling as is with WasiPath and WasiPathBuf and extend the two when and if needed, or 2) would you rather go for utility functions such as (as an example)

pub(crate) fn path_from_host<S: AsRef<OsStr>>(s: S) -> Result<String, host::__wasi_errno_t> {
    // ...
}

@kubkon
Copy link
Member Author

kubkon commented Jul 18, 2019

Also, another nice thing about current approach is that error mapping between String::from_utf8, etc., is hidden away inside trait implementations. Otherwise, for a call like

String::from_utf8(vec).map_err(|_| host::__WASI_EILSEQ)

I guess we'd either have to write a macro, or wrap it in a utility function creating ever so longer function names

pub(crate) fn path_from_host_utf8_vec(vec: Vec<u8>) -> Result<String, host::__wasi_errno_t> {
    String::from_utf8(vec).map_err(|_| host::__WASI_EILSEQ)
}

@alexcrichton
Copy link
Collaborator

Oh I don't have much of an opinion on API-design here other than the fundamentals should probably be str/String based at least because they're required to be utf-8 today. I'd defer to @sunfishcode's preference of using str and String as-is.

@kubkon
Copy link
Member Author

kubkon commented Jul 19, 2019

Oh I don't have much of an opinion on API-design here other than the fundamentals should probably be str/String based at least because they're required to be utf-8 today. I'd defer to @sunfishcode's preference of using str and String as-is.

Coolios! @sunfishcode thoughts? I'm OK with both approaches with ever so slight preference towards wrapper types, but I'm not insisting on this approach, so I'll also defer to you :-)

@sunfishcode
Copy link
Member

I can see it both ways. I think I like using String/str a little better here. We can name the conversion function dec_pathstr or pathstr_from or something to keep it short. I expect this codebase will often see people who don't know the codebase but do know wasm and WASI diving into the middle of it to see how something works. With String they don't have to read the definition of WasiPathBuf to see if it carries any special behaviors or invariants.

@kubkon
Copy link
Member Author

kubkon commented Jul 19, 2019

I can see it both ways. I think I like using String/str a little better here. We can name the conversion function dec_pathstr or pathstr_from or something to keep it short. I expect this codebase will often see people who don't know the codebase but do know wasm and WASI diving into the middle of it to see how something works. With String they don't have to read the definition of WasiPathBuf to see if it carries any special behaviors or invariants.

Sounds reasonable to me! Let me make the adjustments then :-)

@kubkon kubkon changed the title Wrap WASI paths in WasiPath and WasiPathBuf types WASI paths as &str and String Jul 19, 2019
@kubkon
Copy link
Member Author

kubkon commented Jul 19, 2019

@sunfishcode changes applied, ready for review :-)

@sunfishcode
Copy link
Member

Looks good, thanks!

@sunfishcode sunfishcode merged commit 08aa61f into CraneStation:master Jul 19, 2019
@kubkon kubkon deleted the rawstring branch July 19, 2019 19:01
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants