Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let's make fewer breaking changes #259

Closed
thomaseizinger opened this issue Dec 14, 2022 · 14 comments
Closed

Let's make fewer breaking changes #259

thomaseizinger opened this issue Dec 14, 2022 · 14 comments

Comments

@thomaseizinger
Copy link
Contributor

thomaseizinger commented Dec 14, 2022

In the same spirit as multiformats/rust-multiaddr#71, I want to raise awareness about how disruptive the breaking changes in this fundamental library are for the ecosystem.

I have a few suggestions:

  • Error could be made non_exhaustive or completely opaque. The latter would be my personal preference as it gives you more flexibility as a library author.
  • Move MultihashDigest & Code to a separate crate. All we need in rust-multiaddr is the Multihash type which acts as a type-safe representation for a multihash. The machinery around MultihashDigest, the actual hash implementations etc could all live in a separate crate. This would reduce the API surface drastically and allow multihash to remain more stable (and almost dependency free)
  • Code uses conditional enum variants. That is not ideal. Features are additive across a dependency tree in Rust. Conditional enum variants can turn an exhaustive match in a library into a non-exhaustive one simply because the end-user activates more features of multihash.

Curious to hear your thoughts. (And happy to send PRs if we agree on something!)

@thomaseizinger thomaseizinger changed the title Please make fewer breaking changes Let's make fewer breaking changes Dec 14, 2022
@vmx
Copy link
Member

vmx commented Dec 14, 2022

  • Code uses conditional enum variants. That is not ideal. Features are additive across a dependency tree in Rust. Conditional enum variants can turn an exhaustive match in a library into a non-exhaustive one simply because the end-user activates more features of multihash.

Code was introduced for backwards compatibility and kind of an "easy to get started" way. Back then, without out it, rust-libp2p wouldn't have upgraded to rust-multihash, but kept their fork. Power users should always define their own code table. So perhaps moving into its own crate makes sense, so people that don't want to define their own table can just use that.

@thomaseizinger
Copy link
Contributor Author

  • Code uses conditional enum variants. That is not ideal. Features are additive across a dependency tree in Rust. Conditional enum variants can turn an exhaustive match in a library into a non-exhaustive one simply because the end-user activates more features of multihash.

Code was introduced for backwards compatibility and kind of an "easy to get started" way. Back then, without out it, rust-libp2p wouldn't have upgraded to rust-multihash, but kept their fork. Power users should always define their own code table. So perhaps moving into its own crate makes sense, so people that don't want to define their own table can just use that.

Thanks for sharing that bit of history! I wasn't aware of that. I am in favor of creating a standard-multihashes crate or something like that where all of this goes. (It should likely still not use conditional enum variants for the reasons stated above.)

@vmx
Copy link
Member

vmx commented Dec 15, 2022

I'd like to add that the idea that people would use a custom code table and not use the bundled conditional enum clearly failed. So what if we move that part into a separate crate without any features. It will just bundle some common ones (probably not all of the current ones). It would tick the box of "easy to get started" and if people that don't want to draw in hash functions they don't need, they can create their own table.

@thomaseizinger
Copy link
Contributor Author

Yeah that sounds pretty reasonable to me! :)

Would you move the custom-derive too? Something like:

  • multihash-codetable
  • multihash-codetable-derive

@vmx
Copy link
Member

vmx commented Dec 15, 2022

Would you move the custom-derive too?

It's already its own crate, so I would just keep its current name.

@thomaseizinger
Copy link
Contributor Author

Would you move the custom-derive too?

It's already its own crate, so I would just keep its current name.

Good point. By convention however, I'd expect the trait that the derive macro implements lives in the "sister"-crate, i.e. multihash-derive should derive a trait that lives in multihash. The trait that is being implement is however not necessary for the core Multihash datastructure.

We could fix this by extracting a single multihash-core crate that only has that data structure in it and leave the multihash crate otherwise as is.

I don't really mind either way but I think it would be good to honor these conventions so either:

  • Extract multihash-core
  • Create two new crates multihash-codetable and multihash-codetable-derive

(All names subject to bike-shedding.)

@vmx
Copy link
Member

vmx commented Dec 20, 2022

I would keep crate renames to a minimum, so that downstream users have as little work as possible.

@thomaseizinger
Copy link
Contributor Author

For users of multihash, we should be able to ship this as a non-breaking change by re-exporting doing pub use multihash_core::Multihash. Crates like multiaddr will then have manually change their dependency to only depend on multihash-core.

@thomaseizinger
Copy link
Contributor Author

thomaseizinger commented Dec 21, 2022

Personally, I'd prefer to pick alternative names. The name multihash-core doesn't mean much and it feels wrong to pay the cost of a less expressive name for the time coming vs spending a one-time effort now to fix it.

I think we can ship a transition to my proposed structure in a backwards-compatible way, such that users don't need to rename their crates right away but only once they want to fix the deprecation warnings. In general, switching to a different crate with the same API is really minimal effort.

  1. Create multihash-codetable crate by extracting the Code enum etc from multihash
  2. Remove Code from multihash and publish as a new breaking change.
  3. Create a patch-release for the version before the breaking change where we depend on the new minor version and re-export multihash. This is utilizing https://github.com/dtolnay/semver-trick.
  4. Within the same patch version, mark all items as deprecated in multihash, telling users to depend on multihash-codetable instead of they want to use these features.
  5. multihash-codetable can depend on the new minor version of multihash.

We should be able to do this because we are not actually changing any APIs in a backwards incompatible way but just need to align the types properly.

@vmx
Copy link
Member

vmx commented Dec 21, 2022

The process looks good. Though I would keep names. I would name what you call multihash-core simply multihash and name multihash-codetable-derive -> multihash-derive. The only change would be a new crate called multihash-codetable. This way we don't end up with deprecated crates for the sake of it. Though I don't have a really strong opinion about it, so input from others would be appreciated.

@thomaseizinger
Copy link
Contributor Author

The process looks good. Though I would keep names. I would name what you call multihash-core simply multihash

I think we are on the same page here. multihash-core was only a proposal if we wanted multihash to remain the crate with the code table in it. I prefer to strip multihash from the code table and only have to Multihash data structure in it.

and name multihash-codetable-derive -> multihash-derive. The only change would be a new crate called multihash-codetable. This way we don't end up with deprecated crates for the sake of it. Though I don't have a really strong opinion about it, so input from others would be appreciated.

I can definitely see the benefit in not abandoning crate names so I am happy to go that way.

Final question: Which crate do you see the MultihashDigest trait in?

  • If we keep it in multihash, then the naming of the multihash-derive crate follows conventions in the ecosystem (serde -> serde-derive, f.e.).
  • If the trait moves into multihash-codetable then the naming is a bit weird but I could get around to it.

Initially I thought that the trait would move to multihash-codetable because strictly speaking, it is not necessary for the Multihash data structure. However, now that I think about it, I think it would actually be better if that trait stays in the multihash crate. It will allow users to be abstract over code tables which seems like a nice property.

@vmx
Copy link
Member

vmx commented Dec 21, 2022

As I also propose to remove the feature flags from multihash-codetable (though that's open for discussion), it would be a "omg, it pulls in all those dependencies" crate. But that would exactly what I'm after. For toy projects, it doesn't matter, for bigger projects, define your own table.

This means that the multihash-codetable should only be used for the code table and nothing else. Hence I'm for keeping MultihashDigest in the multihash crate.

@thomaseizinger
Copy link
Contributor Author

As I also propose to remove the feature flags from multihash-codetable (though that's open for discussion), it would be a "omg, it pulls in all those dependencies" crate. But that would exactly what I'm after. For toy projects, it doesn't matter, for bigger projects, define your own table.

Sounds good to me.

This means that the multihash-codetable should only be used for the code table and nothing else. Hence I'm for keeping MultihashDigest in the multihash crate.

Sweet, I'll try to make some time for this tomorrow and prepare some PRs / branches.

@thomaseizinger
Copy link
Contributor Author

I don't think there is anything particularly actionable any more in here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants