Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: extend range of possible implementations of bytereader/bytewriter #262

Merged
merged 5 commits into from
Mar 15, 2024

Conversation

bitwalker
Copy link
Contributor

This PR contains two main changes:

  1. Implement ByteReader/ByteWriter for any implementation of std::io::Read or std::io::Write, and remove the Sized bound on the traits, so as to support a wider array of implementations. NOTE: This is technically a breaking change for downstream crates that have their own implementations of Serializable and Deserializable, as they will need to add the ?Sized bound to a couple of the trait methods which were overly restrictive. I don't know how problematic that is, but actually has the effect of making those traits easier to work with, so it should be a net positive. The caveat is that it probably requires a minor version bump as a result to signal the change.
  2. Tweak how the winterfell crates are built, so that they are no-std by default, and are extended with additional features when the std feature is enabled, as is the recommended practice for such crates. This also comes with a deprecation of the boxed, collections, and strings modules (but not removal yet), as they are entirely redundant with liballoc.

As mentioned above, the boxed, collections, and strings modules of winter-utils are redundant. It seems the only reason they exist was due to a misunderstanding on how to set up a crate with no-std support. As part of the changes related to point 2 above, I have marked those modules as deprecated, with a helpful message informing downstream users how to address the deprecation, and updated all of the crates in the winterfell workspace to make use of alloc directly.

While many files are touched here, it's typically one or two lines per module to update the use statements, so should be trivial to skim through. However, I've broken up all the changes into relevant commits to make each easier to review independently. While in theory this could have been two separate PRs, I'm in a bit of a time crunch, so only had time to put this together as a single PR - hopefully that is acceptable, and that the extra work I put in to document each change in its individual commit makes up for the inconvenience.

This commit tweaks winter-utils so that rather than being a std-using
crate that conditionally becomes a no-std crate, it becomes a no-std
crate which conditionally adds functionality from libstd when the `std`
feature is enabled

NOTE: This makes the `boxed`, `collections`, and `string` modules
redundant (they technically were before too, but due to how the crate
was being built, it wasn't obvious). A subsequent commit will mark those
deprecated, and update all downstream crates to use stuff from liballoc
or libstd directly (the latter for crates which are never built no-std).
@facebook-github-bot
Copy link

Hi @bitwalker!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@facebook-github-bot
Copy link

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

Copy link
Collaborator

@irakliyk irakliyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thank you! I left one comment inline re aligning ByteReader with the standard library I/O traits.

One other question: can we get away without the change removing the Sized bound on the traits? I'm asking for two reasons:

  1. As you mentioned, this would mean a breaking change and, thus, we'd need to make v0.9.0 release (as opposed to v0.8.2 release - which would be much easer). On the other hand, if we do make a minor release, we could update the ByteReader as mentioned above.
  2. Adding + ?Sized to trait bounds any time we need to implement Serializable and Deserializable is a bit annoying and it will impact a lot of code downstream. Is there a way to avoid this?

Comment on lines +224 to +228
// Ultimately, this should be addressed by making the [ByteReader] trait align with the standard
// library I/O traits, so this is a temporary solution.
reader: RefCell<std::io::BufReader<&'a mut dyn std::io::Read>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - there is no good reason why peek_u8, has_more_bytes, and check_eor don't take a mutable reference. So, if we do change that, it would simplify ReaderAdapter implementation a bit, right?

Copy link
Contributor Author

@bitwalker bitwalker Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would certainly allow us to do away with the RefCell hack, and ensure that all uses of the ByteReader API are valid (i.e. there aren't hidden edge cases where a dynamic borrow check could fail).

I think it would also allow us to implement ByteReader for std::io::BufReader directly, and get rid of ReadAdapter entirely, which would be nice for sure. There is one API that would be nice to have, but is still unstable (has_data_left), but we can implement an alternative as long as we have a mutable reference on hand. There is also the question of fallible reads with std::io::BufReader, i.e. if you attempt to read N bytes, but only N-1 are available, but the internal buffer of the BufReader is say, N / 2, then we can't put the extra bytes back into the reader for a subsequent call - in other words, the data is lost.

That said, it's an edge case, and I'm not sure what the guarantees are of ByteReader with regard to read errors and subsequent reads following the error, but it seems to be implied that you could attempt to read the same data multiple times until you succeed, and ReadAdapter implements the necessary bookkeeping to support that.

@bitwalker
Copy link
Contributor Author

One other question: can we get away without the change removing the Sized bound on the traits? I'm asking for two reasons:

  1. As you mentioned, this would mean a breaking change and, thus, we'd need to make v0.9.0 release (as opposed to v0.8.2 release - which would be much easer). On the other hand, if we do make a minor release, we could update the ByteReader as mentioned above.

This is the most compelling reason to punt on that particular change IMO, since there isn't any way to avoid it causing compilation errors after the upgrade, even if it is trivial to handle, it is still a burden. I suppose the most pertinent question is: how many custom implementations of those traits are out in the wild? There isn't really any way to know I guess. I know we have some in Miden, but it's not a problem to fix those as part of upgrading winterfell in the project.

But yeah, if those suggested changes to ByteReader are desirable, then knocking out both changes in one release is probably ideal, since they are so closely related to one another.

  1. Adding + ?Sized to trait bounds any time we need to implement Serializable and Deserializable is a bit annoying and it will impact a lot of code downstream. Is there a way to avoid this?

This is one of the tradeoffs of defining traits with generic methods vs using dynamic dispatch with trait objects, i.e. &dyn ByteReader. There are also tricks for supporting both simultaneously (see erased-serde for an example of what I mean. Unfortunately that doesn't help us here, since the issue we're trying to avoid is breaking changes downstream.

I do think adding ?Sized pays its way though: the slight annoyance that comes from having to add it to some of the Serializable/Deserializable methods gains you considerable flexibility in interacting with those traits. At a minimum being able to pass anything that can coerce to &dyn ByteReader/ByteWriter makes for very ergonomic APIs.

Anyway, if the general feeling is that the project would rather punt on this for now, that's fine. I don't believe any of my other changes depended on it, so it should be straightforward to drop those commits and proceed without it.

@irakliyk
Copy link
Collaborator

I suppose the most pertinent question is: how many custom implementations of those traits are out in the wild?

If this just affected custom implementations of ByteReader and ByteWriter - I think we could even make it a patch release because I doubt there are many custom implementations of these (though, of course we don't know for sure). But my understanding is that it would also break any struct which implements Serializable and Deserializable traits - and there are a lot of those.

If my understanding is correct, I'm inclined to do the following:

  • Drop the ?sized changes from this PR and make a patch release.
  • Make these changes for the v0.9.0 release and also add the change to ByteReader to make the relevant methods take mutable references to self.

@bitwalker
Copy link
Contributor Author

I've created a new branch for the ?Sized changes and can open another PR for that after this is merged

Copy link
Collaborator

@irakliyk irakliyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks good! Thank you!

@irakliyk irakliyk changed the base branch from next to main March 15, 2024 18:57
@irakliyk
Copy link
Collaborator

oh - one last thing: could you rebase this PR from main. The next branch is for the v0.9 release.

@bitwalker
Copy link
Contributor Author

@irakliyk Haha oh man, I originally had based it on main, but then realized there was a next branch, and so went about resolving all the conflicts to rebase it on next 🤦. Luckily I think I still have that branch sitting around.

This commit provides implementations of the ByteReader and ByteWriter
traits for any implementation of `std::io::Read` and `std::io::Write` as
follows:

* The new `ReadAdapter` type can be instantiated with any implementation
  of `std::io::Read` to use that reader as a `ByteReader`. We do not
  automatically implement `ByteReader` for any `std::io` traits/types
  due to differences in the APIs (and the fact that some useful APIs are
  still unstable). The `ReadAdapter` type addresses those discrepancies.
  See the documentation for more details.
* We automatically implement `ByteWriter` for all `std::io::Write`
  implementations, this includes things like `Vec`.
winter-utils no longer has any use of these modules, and downstream
crates should depend on liballoc or libstd directly, depending on
whether they need to support no-std environments (same as is done in
winter-utils itself).

These deprecation warnings will ensure that any downstream users are
aware that these modules are going away, that they know how to fix it,
and gives them an opportunity to remove usages over time, rather than
simply removing these modules and breaking all downstream users.
This commit addresses all of the usages of re-exported libstd items from
winter-utils by using liballoc items directly.
@bitwalker
Copy link
Contributor Author

@irakliyk Done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants