support enumerating known codes #58

Closed
mvdan opened this issue Nov 10, 2021 · 6 comments

Comments

@mvdan
Contributor

mvdan commented Nov 10, 2021

The table has four columns, ignoring the human-readable description:

  • Name
  • Tag (multihash/ipld/multiaddr/serialization/etc)
  • Code (i.e. integer value)
  • Status (draft/permanent)

Downstreams like go-ipfs want to be able to enumerate some of these known codes. For example, to tell the user which IPLD codecs are supported, they'd rather not hard-code this list, as it may grow over time.

One option is to indeed tell downstreams to hard-code the sets of multicodecs they support. This can be reasonable if the list is small or won't change much over time, and especially if explicit support needs to be added for more multicodecs, e.g. by implementing more IPLD codecs with their Encode/Decode interface.

Another option is to expose an API here. The internal mechanics could be code-generated, so they don't worry me. The API is the trickier bit. Below are some options:

  1. Exported lists, such as var TagMultihash = []Code{...}. A set of these lists for Tag, and perhaps another for Status.

  2. A single exported list with all fields, such as:

type TaggedCode struct {
    Code // embedded, so TaggedCode gets Code's String method
    Tag string // or perhaps an enum-like integer
    Status string // or perhaps "Draft bool"
}

var AllCodes = []TaggedCode{...}

If the user wants a filter in this scenario, such as by Tag, they would iterate over the list and filter as necessary.

  3. A query-like API, such as:

func ListCodes(byTag, byStatus string, fn func(Code) bool) { ... }

Internally, this would have to use a mechanism like option 1 or 2, so it doesn't actually help a ton. Another major drawback is you'd have to iterate over the entire list to know the number of them.
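To make the three shapes easier to compare, here is a rough sketch of how they might look side by side. This is not the package's generated code: the Code type is a local stand-in, the tables hold only a handful of hand-copied entries, filterByTag is a hypothetical helper, and treating an empty string as "match anything" in ListCodes is my own assumption.

```go
package main

import "fmt"

// Code is a local stand-in for the package's code type.
type Code uint64

// Option 1: one exported slice per tag (tiny hand-written sample).
var TagMultihash = []Code{0x00, 0x12}  // identity, sha2-256
var TagIPLD = []Code{0x55, 0x70, 0x71} // raw, dag-pb, dag-cbor

// Option 2: a single list carrying every column.
type TaggedCode struct {
	Code
	Tag    string
	Status string
}

var AllCodes = []TaggedCode{
	{Code: 0x00, Tag: "multihash", Status: "permanent"},
	{Code: 0x12, Tag: "multihash", Status: "permanent"},
	{Code: 0x55, Tag: "ipld", Status: "permanent"},
	{Code: 0x70, Tag: "ipld", Status: "permanent"},
	{Code: 0x71, Tag: "ipld", Status: "permanent"},
}

// filterByTag is the loop a downstream would write themselves under option 2.
func filterByTag(all []TaggedCode, tag string) []Code {
	var out []Code
	for _, tc := range all {
		if tc.Tag == tag {
			out = append(out, tc.Code)
		}
	}
	return out
}

// Option 3: a query-like API layered on top of option 2.
// Assumption: an empty filter string matches everything,
// and fn returning false stops the iteration early.
func ListCodes(byTag, byStatus string, fn func(Code) bool) {
	for _, tc := range AllCodes {
		if byTag != "" && tc.Tag != byTag {
			continue
		}
		if byStatus != "" && tc.Status != byStatus {
			continue
		}
		if !fn(tc.Code) {
			return
		}
	}
}

func main() {
	fmt.Println(len(TagIPLD), len(filterByTag(AllCodes, "ipld")))
	ListCodes("multihash", "permanent", func(c Code) bool {
		fmt.Printf("%#x\n", uint64(c))
		return true
	})
}
```

Note how option 3 can only report a count by walking the whole list with a counting callback, which is the drawback mentioned above.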

--

I think that, if we want to do this, options 1 or 2 seem best. I lean towards a minimal version of number 1 - expose slice variables for each of the tags, as that seems to be what the vast majority of users will want to filter by.

@mvdan
Contributor Author

mvdan commented Nov 10, 2021

The only major reason I can think of not to do this, beyond a potential "downstreams should only list the multicodecs they explicitly support", is that we don't want to bloat go-multicodec to the point that increasing the size of the CSV table will also significantly increase Go binary sizes.

Assuming we just add global slice variables, though, those should be entirely missing from a linked Go binary as long as nothing uses them.

@mvdan
Contributor Author

mvdan commented Nov 11, 2021

From @schomatis and @aschmahmann on Slack: they would also like to obtain information about a specific Code, such as whether a user-supplied Code has tag==ipld. So perhaps we can add a Tag() string method to the Code type.
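For illustration, a Tag() method along those lines could be a generated switch. The sketch below uses a local Code type and four hand-picked constants rather than the real generated table, and returning an empty string for unknown codes is an assumption on my part. isIPLD is a hypothetical helper showing the "does this user-supplied Code have tag==ipld" check.

```go
package main

import "fmt"

// Code is a local stand-in for the package's code type.
type Code uint64

// A few entries copied by hand from the table, for illustration only.
const (
	Sha2_256 Code = 0x12
	Raw      Code = 0x55
	DagPb    Code = 0x70
	DagCbor  Code = 0x71
)

// Tag returns the table's tag column for a known code.
// Assumption: unknown codes yield the empty string.
func (c Code) Tag() string {
	switch c {
	case Sha2_256:
		return "multihash"
	case Raw, DagPb, DagCbor:
		return "ipld"
	}
	return ""
}

// isIPLD reports whether a (possibly user-supplied) code is tagged ipld.
func isIPLD(c Code) bool {
	return c.Tag() == "ipld"
}

func main() {
	fmt.Println(isIPLD(DagCbor), isIPLD(Sha2_256)) // prints "true false"
}
```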

@willscott
Contributor

This does get embedded in a bunch of places. Would be nice to retain a way that doesn't grow larger if possible.
I am in support of being able to enumerate codes.

@mvdan
Contributor Author

mvdan commented Nov 11, 2021

@willscott keeping the list of all codes (be it in total, or by tag) necessarily requires keeping that list somewhere, so I'm not sure we can work around the binary/memory size growing larger if the list is actually used.

So I'm not sure what you mean when you say you want to be able to enumerate the codes, but also don't want your program to grow larger over time :)

The factor of growth should be tiny if we just keep very little information for each entry, though.

@willscott
Contributor

(that I don't want to also include the tag-based lookup)

@mvdan
Contributor Author

mvdan commented Nov 11, 2021

Perhaps a bit of both. How about:

func KnownCodes() []Code
// a func gives us room to tightly pack the list in the generated code, and expand it to a flat []Code on first use

func (Code) Tag() string // or perhaps an enum? don't think it really matters

Then, if someone wants to filter, they can loop and filter themselves.
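As a rough sketch of the "pack tightly, expand on first use" idea plus the caller-side filter loop: the delta encoding below is a made-up packing format chosen only for illustration, packedDeltas and the sync.Once plumbing are hypothetical names, and the five codes are a hand-copied sample rather than the generated table.

```go
package main

import (
	"fmt"
	"sync"
)

// Code is a local stand-in for the package's code type.
type Code uint64

// packedDeltas stores the sorted sample codes as gaps between neighbours:
// 0x00, 0x12, 0x55, 0x70, 0x71. Hypothetical format, not the real one.
var packedDeltas = []uint16{0x00, 0x12, 0x43, 0x1b, 0x01}

var (
	expandOnce sync.Once
	knownCodes []Code
)

// KnownCodes expands the packed table into a flat slice on first call
// and returns the same slice on every later call.
func KnownCodes() []Code {
	expandOnce.Do(func() {
		knownCodes = make([]Code, 0, len(packedDeltas))
		var cur Code
		for _, d := range packedDeltas {
			cur += Code(d)
			knownCodes = append(knownCodes, cur)
		}
	})
	return knownCodes
}

// Tag covers only the sample codes above; unknown codes return "".
func (c Code) Tag() string {
	switch c {
	case 0x00, 0x12:
		return "multihash"
	case 0x55, 0x70, 0x71:
		return "ipld"
	}
	return ""
}

func main() {
	// Downstreams loop and filter themselves.
	var ipld []Code
	for _, c := range KnownCodes() {
		if c.Tag() == "ipld" {
			ipld = append(ipld, c)
		}
	}
	fmt.Println(len(ipld), "ipld codes in the sample table")
}
```

Because the function returns the shared slice, callers in this sketch should treat it as read-only.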

mvdan added a commit that referenced this issue Dec 2, 2021
Right now, this is simply backed by a code-generated slice,
but the API being a function gives us some wiggle room in the future.

The test simply ensures the list is reasonably sane;
that it has many codes, no unexpected duplicates,
and that a few known ones are present in it.

Updates #58.
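
The sanity checks that commit message describes might look roughly like the sketch below. It is not the actual test from the commit: checkCodes is a hypothetical helper, the minimum count and the expected codes are parameters I made up, and it treats every duplicate as an error, whereas the commit only rules out unexpected ones.

```go
package main

import "fmt"

// Code is a local stand-in for the package's code type.
type Code uint64

// checkCodes verifies that the list is reasonably sane: it has at least
// minCount entries, contains no duplicates, and includes every code in mustHave.
func checkCodes(codes []Code, minCount int, mustHave ...Code) error {
	if len(codes) < minCount {
		return fmt.Errorf("got %d codes, want at least %d", len(codes), minCount)
	}
	seen := make(map[Code]bool, len(codes))
	for _, c := range codes {
		if seen[c] {
			return fmt.Errorf("duplicate code %#x", uint64(c))
		}
		seen[c] = true
	}
	for _, c := range mustHave {
		if !seen[c] {
			return fmt.Errorf("missing expected code %#x", uint64(c))
		}
	}
	return nil
}

func main() {
	// Hand-written sample standing in for the generated list.
	codes := []Code{0x00, 0x12, 0x55, 0x70, 0x71}
	if err := checkCodes(codes, 5, 0x12, 0x71); err != nil {
		fmt.Println("bad list:", err)
		return
	}
	fmt.Println("ok")
}
```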
@mvdan mvdan closed this as completed in 1418219 Dec 2, 2021