Let's make fewer breaking changes #259

thomaseizinger · 2022-12-14T06:13:41Z

In the same spirit as multiformats/rust-multiaddr#71, I want to raise awareness about how disruptive the breaking changes in this fundamental library are for the ecosystem.

I have a few suggestions:

Error could be made non_exhaustive or completely opaque. The latter would be my personal preference as it gives you more flexibility as a library author.
Move MultihashDigest & Code to a separate crate. All we need in rust-multiaddr is the Multihash type which acts as a type-safe representation for a multihash. The machinery around MultihashDigest, the actual hash implementations etc could all live in a separate crate. This would reduce the API surface drastically and allow multihash to remain more stable (and almost dependency free)
Code uses conditional enum variants. That is not ideal. Features are additive across a dependency tree in Rust. Conditional enum variants can turn an exhaustive match in a library into a non-exhaustive one simply because the end-user activates more features of multihash.

Curious to hear your thoughts. (And happy to send PRs if we agree on something!)

The text was updated successfully, but these errors were encountered:

vmx · 2022-12-14T12:12:39Z

Code uses conditional enum variants. That is not ideal. Features are additive across a dependency tree in Rust. Conditional enum variants can turn an exhaustive match in a library into a non-exhaustive one simply because the end-user activates more features of multihash.

Code was introduced for backwards compatibility and kind of an "easy to get started" way. Back then, without out it, rust-libp2p wouldn't have upgraded to rust-multihash, but kept their fork. Power users should always define their own code table. So perhaps moving into its own crate makes sense, so people that don't want to define their own table can just use that.

thomaseizinger · 2022-12-14T20:56:20Z

Code uses conditional enum variants. That is not ideal. Features are additive across a dependency tree in Rust. Conditional enum variants can turn an exhaustive match in a library into a non-exhaustive one simply because the end-user activates more features of multihash.

Code was introduced for backwards compatibility and kind of an "easy to get started" way. Back then, without out it, rust-libp2p wouldn't have upgraded to rust-multihash, but kept their fork. Power users should always define their own code table. So perhaps moving into its own crate makes sense, so people that don't want to define their own table can just use that.

Thanks for sharing that bit of history! I wasn't aware of that. I am in favor of creating a standard-multihashes crate or something like that where all of this goes. (It should likely still not use conditional enum variants for the reasons stated above.)

vmx · 2022-12-15T09:45:13Z

I'd like to add that the idea that people would use a custom code table and not use the bundled conditional enum clearly failed. So what if we move that part into a separate crate without any features. It will just bundle some common ones (probably not all of the current ones). It would tick the box of "easy to get started" and if people that don't want to draw in hash functions they don't need, they can create their own table.

thomaseizinger · 2022-12-15T12:59:27Z

Yeah that sounds pretty reasonable to me! :)

Would you move the custom-derive too? Something like:

multihash-codetable
multihash-codetable-derive

vmx · 2022-12-15T13:35:39Z

Would you move the custom-derive too?

It's already its own crate, so I would just keep its current name.

thomaseizinger · 2022-12-20T01:29:58Z

Would you move the custom-derive too?

It's already its own crate, so I would just keep its current name.

Good point. By convention however, I'd expect the trait that the derive macro implements lives in the "sister"-crate, i.e. multihash-derive should derive a trait that lives in multihash. The trait that is being implement is however not necessary for the core Multihash datastructure.

We could fix this by extracting a single multihash-core crate that only has that data structure in it and leave the multihash crate otherwise as is.

I don't really mind either way but I think it would be good to honor these conventions so either:

Extract multihash-core
Create two new crates multihash-codetable and multihash-codetable-derive

(All names subject to bike-shedding.)

vmx · 2022-12-20T14:39:00Z

I would keep crate renames to a minimum, so that downstream users have as little work as possible.

thomaseizinger · 2022-12-21T01:12:10Z

For users of multihash, we should be able to ship this as a non-breaking change by re-exporting doing pub use multihash_core::Multihash. Crates like multiaddr will then have manually change their dependency to only depend on multihash-core.

thomaseizinger · 2022-12-21T01:27:37Z

Personally, I'd prefer to pick alternative names. The name multihash-core doesn't mean much and it feels wrong to pay the cost of a less expressive name for the time coming vs spending a one-time effort now to fix it.

I think we can ship a transition to my proposed structure in a backwards-compatible way, such that users don't need to rename their crates right away but only once they want to fix the deprecation warnings. In general, switching to a different crate with the same API is really minimal effort.

Create multihash-codetable crate by extracting the Code enum etc from multihash
Remove Code from multihash and publish as a new breaking change.
Create a patch-release for the version before the breaking change where we depend on the new minor version and re-export multihash. This is utilizing https://github.com/dtolnay/semver-trick.
Within the same patch version, mark all items as deprecated in multihash, telling users to depend on multihash-codetable instead of they want to use these features.
multihash-codetable can depend on the new minor version of multihash.

We should be able to do this because we are not actually changing any APIs in a backwards incompatible way but just need to align the types properly.

vmx · 2022-12-21T08:48:30Z

The process looks good. Though I would keep names. I would name what you call multihash-core simply multihash and name multihash-codetable-derive -> multihash-derive. The only change would be a new crate called multihash-codetable. This way we don't end up with deprecated crates for the sake of it. Though I don't have a really strong opinion about it, so input from others would be appreciated.

thomaseizinger · 2022-12-21T09:47:11Z

The process looks good. Though I would keep names. I would name what you call multihash-core simply multihash

I think we are on the same page here. multihash-core was only a proposal if we wanted multihash to remain the crate with the code table in it. I prefer to strip multihash from the code table and only have to Multihash data structure in it.

and name multihash-codetable-derive -> multihash-derive. The only change would be a new crate called multihash-codetable. This way we don't end up with deprecated crates for the sake of it. Though I don't have a really strong opinion about it, so input from others would be appreciated.

I can definitely see the benefit in not abandoning crate names so I am happy to go that way.

Final question: Which crate do you see the MultihashDigest trait in?

If we keep it in multihash, then the naming of the multihash-derive crate follows conventions in the ecosystem (serde -> serde-derive, f.e.).
If the trait moves into multihash-codetable then the naming is a bit weird but I could get around to it.

Initially I thought that the trait would move to multihash-codetable because strictly speaking, it is not necessary for the Multihash data structure. However, now that I think about it, I think it would actually be better if that trait stays in the multihash crate. It will allow users to be abstract over code tables which seems like a nice property.

vmx · 2022-12-21T10:17:25Z

As I also propose to remove the feature flags from multihash-codetable (though that's open for discussion), it would be a "omg, it pulls in all those dependencies" crate. But that would exactly what I'm after. For toy projects, it doesn't matter, for bigger projects, define your own table.

This means that the multihash-codetable should only be used for the code table and nothing else. Hence I'm for keeping MultihashDigest in the multihash crate.

thomaseizinger · 2022-12-21T10:30:05Z

As I also propose to remove the feature flags from multihash-codetable (though that's open for discussion), it would be a "omg, it pulls in all those dependencies" crate. But that would exactly what I'm after. For toy projects, it doesn't matter, for bigger projects, define your own table.

Sounds good to me.

This means that the multihash-codetable should only be used for the code table and nothing else. Hence I'm for keeping MultihashDigest in the multihash crate.

Sweet, I'll try to make some time for this tomorrow and prepare some PRs / branches.

thomaseizinger · 2023-04-14T13:31:21Z

I don't think there is anything particularly actionable any more in here.

thomaseizinger changed the title ~~Please make fewer breaking changes~~ Let's make fewer breaking changes Dec 14, 2022

thomaseizinger mentioned this issue Dec 14, 2022

Let's make fewer breaking changes multiformats/rust-multiaddr#71

Closed

thomaseizinger mentioned this issue Dec 20, 2022

update crate to edition 2021 multiformats/rust-multiaddr#72

Merged

vmx mentioned this issue Dec 21, 2022

Remove Identity hash from table, add helper function and examples of alternatives #201

Closed

This was referenced Jan 20, 2023

Split crate into multihash, multihash-codetable and multihash-derive #266

Closed

Tracking issue: Polish and stabilize the API #267

Closed

thomaseizinger mentioned this issue Jan 31, 2023

A question on semver stability paritytech/parity-scale-codec#405

Open

thomaseizinger closed this as completed Apr 14, 2023

thomaseizinger mentioned this issue Jul 20, 2023

chore: update libp2p to 0.52.1 paritytech/substrate#14429

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Let's make fewer breaking changes #259

Let's make fewer breaking changes #259

thomaseizinger commented Dec 14, 2022 •

edited

Loading

vmx commented Dec 14, 2022

thomaseizinger commented Dec 14, 2022

vmx commented Dec 15, 2022

thomaseizinger commented Dec 15, 2022

vmx commented Dec 15, 2022

thomaseizinger commented Dec 20, 2022

vmx commented Dec 20, 2022

thomaseizinger commented Dec 21, 2022

thomaseizinger commented Dec 21, 2022 •

edited

Loading

vmx commented Dec 21, 2022

thomaseizinger commented Dec 21, 2022

vmx commented Dec 21, 2022

thomaseizinger commented Dec 21, 2022

thomaseizinger commented Apr 14, 2023

Let's make fewer breaking changes #259

Let's make fewer breaking changes #259

Comments

thomaseizinger commented Dec 14, 2022 • edited Loading

vmx commented Dec 14, 2022

thomaseizinger commented Dec 14, 2022

vmx commented Dec 15, 2022

thomaseizinger commented Dec 15, 2022

vmx commented Dec 15, 2022

thomaseizinger commented Dec 20, 2022

vmx commented Dec 20, 2022

thomaseizinger commented Dec 21, 2022

thomaseizinger commented Dec 21, 2022 • edited Loading

vmx commented Dec 21, 2022

thomaseizinger commented Dec 21, 2022

vmx commented Dec 21, 2022

thomaseizinger commented Dec 21, 2022

thomaseizinger commented Apr 14, 2023

thomaseizinger commented Dec 14, 2022 •

edited

Loading

thomaseizinger commented Dec 21, 2022 •

edited

Loading