Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clear files structure for submodules in std #2538

Closed
timreichen opened this issue Aug 20, 2022 · 21 comments
Closed

clear files structure for submodules in std #2538

timreichen opened this issue Aug 20, 2022 · 21 comments

Comments

@timreichen
Copy link
Contributor

Is your feature request related to a problem? Please describe.

Submodules as in std/encoding have an unnecessary messy file structure and it will become even messier, if we don't introduce a strict file structure for submodules.

An example of that was the introduction of encoding/csv/stream.ts, which expands a submodule and necessarily added a encoding/csv directory for utility files. Now std has a encoding/csv directory only for the stream API while the normal csv API lives top level NOT in a directory called csv? This is very confusing and unexpected.

All of the submodules have a top level entry point (encoding/csv.ts, encoding/hex.ts for example), some have additional API (encoding/csv/stream.ts) or/and need util or/and wasm files, some need test files. As of now, multiple modules have their files distributed all over the encoding dir and some even outside std (see std/_wasm_varint)
Here some examples, how disconnected their data and files are:

toml

  • has toml.ts as top level entry point
  • has encoding/_toml for utilities
  • has test data in encoding/testdata

varint

  • has varint.ts as top level entry point
  • has ../_wasm_varint on std top level!

Describe the solution you'd like

  • Let every submodule have it's own directory living in encoding. These contain a mod.ts for the main logic (basically, what most of the submodules have in their top entry point), test files, util and wasm files/subdirectories.
    This ensures that all needed data and files for a submodule are grouped inside one directory and can be expanded by adding files inside the directory if necessary.
  • Let every submodule keep a top level entry point which reexports mod.ts and other functions if necessary.
    This will ensure that import will not break as it will forward the same data.

new structure

toml

  • encoding
    • toml.ts
    • toml
      • mod.ts
      • mod_test.ts
      • parser.ts
      • parser_test.ts
      • testdata
        • CRLF.toml

varint

  • encoding
    • varint.ts
    • varint
      • _wasm_varint
      • mod.ts
      • mod_test.ts

This is a big structural change, but worth it, because it introduces a functioning system for future additions and will get harder to fix over time.

Describe alternatives you've considered

keeping the file structure as-is

The more features and modules are introduced, the more messy it will get. testdata will contain a lot of unrelated files from different submodules. If more submodules are introduced that need a wasm directory, it will bloat top level std.

@lino-levan
Copy link
Contributor

I feel like this is being done with #2936. Thoughts @iuioiua?

@timreichen
Copy link
Contributor Author

@lino-levan not quite. #2936 works on a file level basis, but not std/encoding specifically.
If you agree to the proposed solution of this issue, I'll be glad to open some PRs. Concretely:

  • [name] (for example ascii85):
    • create std/encoding/[name] dir
    • move testdata from std/encoding/testdata to std/encoding/[name]/testdata
    • move all [name].ts and [name]_test.ts file into std/encoding/[name], and rename them to mod.ts and test.ts
    • create [name].ts on top level with reexports of std/encoding/[name]/mod.ts

@iuioiua
Copy link
Contributor

iuioiua commented Dec 19, 2022

Yes, it will be done in #2936. I've added encoding to the list of modules to be redone. However, it will have to be done with more care than the others, considering how deep the file structure goes.

Even though #2936 mainly deals with single-file exports, it can also tie in this restructure, IMO.

I agree - encoding's file structure is unintuitive and needs to be more straightforward, and the improvements in other modules could also benefit encoding's submodules.

@timreichen
Copy link
Contributor Author

My bad, didn't see the encoding on the list. If it has to be done with more care, maybe it should stay a separate issue?
We have to be careful on how the top level files are handled, test files and testdata should not have any impact on the consumers. Maybe we can start with creating splitting the testdata into separate dirs and work from there?

@iuioiua
Copy link
Contributor

iuioiua commented Dec 19, 2022

My bad, didn't see the encoding on the list. If it has to be done with more care, maybe it should stay a separate issue? We have to be careful on how the top level files are handled, test files and testdata should not have any impact on the consumers. Maybe we can start with creating splitting the testdata into separate dirs and work from there?

I literally just added it in 🤣

I'll be glad to open some PRs. Concretely:

  • [name] (for example ascii85):

    • create std/encoding/[name] dir
    • move testdata from std/encoding/testdata to std/encoding/[name]/testdata
    • move all [name].ts and [name]_test.ts file into std/encoding/[name], and rename them to mod.ts and test.ts
    • create [name].ts on top level with reexports of std/encoding/[name]/mod.ts

I agree with these, except for the last. Frankly, I don't think it's needed - std/encoding/mod.ts should suffice, IMO.

@iuioiua
Copy link
Contributor

iuioiua commented Dec 19, 2022

The restructure should only be done for those encoding types that are large enough that they justify their own submodule. For example, encoding/csv deserves a submodule, but encoding/ascii85 might not (though I'd need to hear other opinions before being more confident).

@timreichen
Copy link
Contributor Author

Do you agree tough that we would put (csv as an example) encoding/csv.ts into encoding/csv/mod.ts and have encoding/csv.ts reexport encoding/csv/mod.ts for modules where a submodule is appropriate?

@iuioiua
Copy link
Contributor

iuioiua commented Dec 19, 2022

I think encoding/csv/mod.ts should be the only entry point for anything encoding/csv. I don't think having encoding/csv.ts is worth having. Having two entry points may be confusing.

@kt3k
Copy link
Member

kt3k commented Dec 20, 2022

I'm in favor of creating single-export entrypoints in subdirs, and moving test fixtures into subdirs, but not in favor of moving encoding/$name.ts to encoding/$name/mod.ts. These APIs are exported from those paths for really long time. Moving them causes a lot of confusion in the ecosystem while the benefit of doing it doesn't seem obvious.

@iuioiua
Copy link
Contributor

iuioiua commented Dec 20, 2022

I'm in favor of creating single-export entrypoints in subdirs, and moving test fixtures into subdirs, but not in favor of moving encoding/$name.ts to encoding/$name/mod.ts.

Having both encoding/$name.ts and encoding/$name/mod.ts means there are two entry points, not one. It's a small detail, but I think it'd raise fewer questions by having everything within one folder. It's safe to do, provided a deprecation warning with sufficient migration time. Either way, I'm not hellbent on my view and even without it, these submodules will look much better.

@timreichen
Copy link
Contributor Author

I agree with @iuioiua. I think one entry point is more consistent, else the question can be asked why we don't have an entry point for every mod on the std/ top level too (for example std/path.ts).
But for now, I think we can leave the two with deprecation warnings and remove them all at once at a later point for less confusion.
std is not at v1, so important changes (especially for consistency) are not bad imo.

@kt3k
Copy link
Member

kt3k commented Dec 20, 2022

the question can be asked why we don't have an entry point for every mod on the std/ top level too (for example std/path.ts).

because we don't create an entrypoint in top level.

I'm not enough convinced encoding/$name/mod.ts is more 'consistent' than encoding/$name.ts. We currently don't have any entrypoint of the form cateogry/subcategory/mod.ts in std now.

@timreichen
Copy link
Contributor Author

because we don't create an entrypoint in top level.

Yes, so why do we treat submodules differently, other than historically?

I'm not enough convinced encoding/$name/mod.ts is more 'consistent' than encoding/$name.ts. We currently don't have any entrypoint of the form cateogry/subcategory/mod.ts in std now.

It is more consistent because in other modules we have a mod.ts for general imports. So I think one would expect a mod.ts for submodules as well if one is not aware of the historical context.
Plus since there are multiple different apis in some submodules and std introduces more and more single export files, not all are exported in the top level file (for browser compatibility reasons), therefore paths are not consistent anymore (for example: std/encoding/csv.ts BUT std/encoding/csv/stream.ts)

@kt3k
Copy link
Member

kt3k commented Dec 21, 2022

The current rule of thumb in encoding is:

  • parser/serializer (or decoder/encoder) are exported from encoding/$type.ts
  • stream related utils are exported from encoding/$type/stream.ts

I think the current state is enough consistent with this rule. Exporting paths are enough predictable by this rule.

So the transition from the above rule to the proposed rule doesn't improve the consistency in my view. And if it doesn't improve the consistency, we should stay in the current rule because these modules are heavily depended in the ecosystem, and the transition will affect too many modules.

@iuioiua
Copy link
Contributor

iuioiua commented Dec 26, 2022

Having encoding/$type/mod.ts instead of encoding/$type.ts makes more sense to me when thinking of encoding/$type as the module rather than encoding as a whole.

Either way, having both encoding/$type.ts and encoding/$type/mod.ts is a small price if we can progress with landing these PRs.

@kt3k
Copy link
Member

kt3k commented Dec 27, 2022

If we do breaking changes, what do you think about moving csv/toml/yaml/json/jsonc to top level?

@iuioiua
Copy link
Contributor

iuioiua commented Dec 27, 2022

I think that would make more sense than the current structure. The encoding module made sense to have in this repo's infancy. However, it's grown and evolved to a point that breaking it up might be justified.

@timreichen
Copy link
Contributor Author

@kt3k I think that is a good idea. What about front_matter and varint?

@iuioiua
Copy link
Contributor

iuioiua commented Jan 12, 2023

If we do breaking changes, what do you think about moving csv/toml/yaml/json/jsonc to top level?

@kt3k, should we pursue this change?

@kt3k
Copy link
Member

kt3k commented Jan 12, 2023

Core members have been discussing this slowly. Please wait for a moment

@kt3k
Copy link
Member

kt3k commented Mar 14, 2023

closing in favor of #3123

Thanks @timreichen for starting this discussion

@kt3k kt3k closed this as completed Mar 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants