Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: make requirements regarding URIs more explicit #73

Open
gburning opened this issue Dec 13, 2024 · 5 comments
Open

Suggestion: make requirements regarding URIs more explicit #73

gburning opened this issue Dec 13, 2024 · 5 comments

Comments

@gburning
Copy link

Had a problem that is fully a user error on my part but could have been alleviated with more explicit types and/or JSDoc/TSDoc comments.

The tool I'm building (a Node.js application) programmatically reads *.schema.json files from a specific directory. In my naiveté I was then using the resolved, absolute, paths directly with getSchema from the experimental API.

This worked fine on my Mac but became a problem on a system using drive letters (Windows in this case). The library (correctly) parsed D:\path\to\file.json as it having an invalid D: scheme.

Long story short, I realized after digging through the code that I needed to use either relative paths or convert them to valid URIs (using file://). Also, I just now noticed that this is indeed documented in the README.

This leads to my questions/suggestions:

  1. From my understanding it is preferable to use relative paths rather than absolute URIs (file://...)? Is this correct and if so, why? In my case the URIs are only used to resolve the path the schema files.
  2. Would it be possible to implement stronger types and/or JSDoc/TSDoc comments for the various exports? Particularly where URIs are expected. Perhaps even allowing URL objects using the built in type?
  • It would be a nice DX improvement in general to have at least some of the descriptions from the README as comment blocks. Jumping back and forth to the README gets fairly tedious and I had missed the part about URIs under "validation" since I'm not using the library for that.

Again, I realize this comes down to me missing these things in the documentation, but I believe it could be made more accessible with some tweaks. I'm also ready to help and pitch in if wanted, as long as I know what you would want.

@jdesrosiers
Copy link
Collaborator

Thanks for your feedback!

  1. From my understanding it is preferable to use relative paths rather than absolute URIs (file://...)? Is this correct and if so, why? In my case the URIs are only used to resolve the path the schema files.

That is correct. You already learned why the hard way. Absolute file paths couple your code to your computer. That path isn't going to be the same on your Mac as it is on your Windows machine. As soon as you try to run the code on a different computer, it will break. There are also security concerns. If you have a hardcoded absolute path in your code and you push that code to a public Github repo, you're sharing with the world some details about your computer. A hacker may be able to use that information to devise some kind of attack.

URIs are used exclusively (even if everything is local) because they are unifrom (the "U" in URI). Whether your schema comes from a webserver, a unix-based file system, or a windows file system (with their backwards slashes), URIs are a uniform way to identify that resource. The same relative path as a URI will work on Mac/Linux and Windows, but if you use a file system specific path, it now only works on Mac/Linux or Windows. I could try to accept paths as well as URIs, but I would have to convert them to URIs internally anyway because JSON Schema only uses URIs. I'd rather not do that because I think good design makes it easy to do the right thing and hard to do the wrong thing. I think that using file system paths is a bad choice and I think slowing people down to think about that choice is better than making it just work with whatever you throw at it.

  1. Would it be possible to implement stronger types and/or JSDoc/TSDoc comments for the various exports? Particularly where URIs are expected. Perhaps even allowing URL objects using the built in type?
  • It would be a nice DX improvement in general to have at least some of the descriptions from the README as comment blocks.

I would love to see any suggestions for improving the types, documentation, or error messages (which you didn't mention but seems to me like the thing that needed improvement in this case).

I'm open to suggestions, but I don't think there's anything we can do about the type of URI fields. There's no way for a type system to tell the difference between a file system path and a URI. They're both strings. It can't be changed to take a URL object because URLs and URIs aren't 100% the same thing. Things like urn: schemes wouldn't be supported anymore. I could allow a string or a URL, but I would just immediately convert it to string which doesn't make sense because it was probably parsed from a string just for the purposes of passing it to function. Again, I'd rather make it hard to do the wrong thing.

As for JSDoc/TSDoc comments, I do have a couple concerns in that area, but I think it's a great suggestion. My concern is maintenance. I don't want too much documentation cluttering the code and I don't want too much duplication with the README that can get out of sync. Brief descriptions with a link to the relevant section of the README is probably a good middle ground. I'm also leaning toward putting that in the .d.ts files rather than with the implementation, although I'm looking into transitioning to JSDoc for everything and that wouldn't be an option anymore, so maybe it makes sense to just put it with the implementation.

It seems to me that types and doc comments aren't really what would have saved you time here. What would have gotten you straightened out immediately would have been if you understood the error message you received. The message you got was, "The 'D:' URI scheme is not supported. Use the 'addUriSchemePlugin' function to add support for 'D:' URIs". This message misinterprets your file system path to be a URI, which can make this message hard to interpret. I'm open to suggestions on how to make this error message more clear.

I'm also ready to help and pitch in if wanted, as long as I know what you would want.

It would be great if you want to send a PR or two with some suggestions that are in line with by responses above. Thanks!

@gburning
Copy link
Author

If you have a hardcoded absolute path in your code and you push that code to a public Github repo, you're sharing with the world some details about your computer.

I agree wholeheartedly! As I said, in my case the path/URI is not included in any committed code. I only use it to load the schema with getSchema so I can easier manipulate it with the various APIs included in the library.

Essentially, my Node.js program is currently doing this:

import { pathToFileURL } from "node:url";
import { getSchema } from "@hyperjump/json-schema/experimental";

// Where `fullPath` is the resolved absolute path to the file
await getSchema(pathToFileURL(fullPath).toString())

I realize this may seem unnecessary since I convert it to a string regardless but this makes the requirement of a valid URI more explicit to me. The path, a loose string, may or may not be a valid URI, the URL definitely is.

I'd rather not do that because I think good design makes it easy to do the right thing and hard to do the wrong thing. I think that using file system paths is a bad choice and I think slowing people down to think about that choice is better than making it just work with whatever you throw at it.

I agree but I would argue that the opposite is currently true. The problem did not crop up until I needed to run this code on a different system where an absolute file path would not be interpreted as a valid URI (/path/to/file vs C:\path\to\file).

It seems to me that types and doc comments aren't really what would have saved you time here. What would have gotten you straightened out immediately would have been if you understood the error message you received.

I understand where you're coming from but it wasn't difficult for me to understand the error once it cropped up. To me the issue is rather about 2 things:

  1. This issue did not crop up on my machine until after I pushed the code and needed it to work on a different system. What I'm after is a way to catch the problem during the design phase rather than after pushing the code.

  2. I have a limited understanding of URIs and how they intersect/differ from URLs and file system paths. This is not the fault of the library of course, but I've found it hard to find good information on the topic that is easy to understand for a novice. RFC's are great but they are not always easily digestible.

I'm open to suggestions, but I don't think there's anything we can do about the type of URI fields. There's no way for a type system to tell the difference between a file system path and a URI. They're both strings.

Yeah, I understand the problem. Supporting URLs was a suggestion to alleviate the issue but I certainly see how that could create more confusion.

Again, I think what I'm after is a way to more easily do the right thing in the design phase. This is where my suggestion for more available documentation and/or stronger types was coming from.

I don't currently have any good suggestions for exactly how to alleviate this problem, but I will mull it over. At the end of the day, it may be that the issue is somewhat about understanding how the library is used in different environments/use cases.

As for JSDoc/TSDoc comments, I do have a couple concerns in that area, but I think it's a great suggestion. My concern is maintenance. I don't want too much documentation cluttering the code and I don't want too much duplication with the README that can get out of sync. Brief descriptions with a link to the relevant section of the README is probably a good middle ground. I'm also leaning toward putting that in the .d.ts files rather than with the implementation, although I'm looking into transitioning to JSDoc for everything and that wouldn't be an option anymore, so maybe it makes sense to just put it with the implementation.

I will look into it. A few questions:

  1. Would you be open to moving descriptions of methods to their respective definitions (as comment blocks) and then compiling an overall API documentation from there? That would help the problem of maintenance.
  2. Is there a particular reason for using external .d.ts files rather than using TypeScript for the source or at least .js with JSDoc Reference and compiling to .d.ts files?
  3. I'm looking into transitioning to JSDoc for everything
    In what way?

@jdesrosiers
Copy link
Collaborator

As I said, in my case the path/URI is not included in any committed code. I only use it to load the schema with getSchema so I can easier manipulate it with the various APIs included in the library.

Sorry, I did miss that when I read it the first time. The relative path recommendation is purely with respect to string literals. In your case, where you already have the full path computed for you, there's no value in converting it to a relative path before passing it to getSchema.

I think what I'm after is a way to more easily do the right thing in the design phase.

You're right, and you make some great points. The only way I can think of to make the types better is to introduce a URI type that just wraps a string. I'm not really fan of that. I personally would be annoyed if I had use this wrapper every time just to pass a string. It creates overhead and clutter. I don't see a good type-based solution to this problem. Types are never a replacement for proper testing and this might just be one of those cases.

Ultimately, what lead to this problem was introducing support for relative URIs. Originally, I required users to pass absolute URIs. If I still did that, you would have encountered this problem in your original implementation before testing on Windows.

I think this ends up only being a problem with absolute Mac/Linux paths. In all other cases you'll get an error that should point you in the right direction. So, the solution could be to detect when we receive an absolute Mac/Linux path and throw an error. That should be pretty easy to detect (if (resolvedUriScheme === "file" && originalUri.startsWith("/"))).

  1. Would you be open to moving descriptions of methods to their respective definitions (as comment blocks) and then compiling an overall API documentation from there? That would help the problem of maintenance.

Possibly. I think I'd like to see a demo of what that would look like before I say go.

  1. Is there a particular reason for using external .d.ts files rather than using TypeScript for the source or at least .js with JSDoc Reference and compiling to .d.ts files?

I don't use TypeScript for library work because I don't like the experience of digging through transpiled code when I need to debug. When typescript can run natively in all the places I might expect this library to run, then I'll consider using it, but as long as it requires transpiling, I'm a hard no. You might notice that I do use TypeScript for my tests. That's because it's just for development and doesn't impact users experience. The main reason I wanted to use TypeScript for my tests was as a minimal smoke test for my .d.ts type declarations.

Which brings me to, why not JSDoc? I've always been put off by JSDoc's horrible and verbose syntax that I felt contributed little value (I'm not a huge fan of static typing) and created a lot of clutter. However, recent enhancements have improved the experience greatly and I can now use TypeScript syntax directly in JSDoc. I've been using this approach in another project. It's not perfect, but it's good enough that I'm convinced to start using this approach as my default. Which is what I was referring to with ...

  1. I'm looking into transitioning to JSDoc for everything

    In what way?

Currently the only type checking I do is for the tests. That verifies the .d.ts types for the public interface, but there's no type checking for the internals of the library. Using the newer JSDoc TypeScript features I've been using elsewhere, I'd like to enable type checking for the whole library. That would include moving the type declarations from .d.ts files to JSDoc comments.

@gburning
Copy link
Author

The only way I can think of to make the types better is to introduce a URI type that just wraps a string. I'm not really fan of that. I personally would be annoyed if I had use this wrapper every time just to pass a string. It creates overhead and clutter. I don't see a good type-based solution to this problem. Types are never a replacement for proper testing and this might just be one of those cases.

I don't think a concrete type is the way to go, but maybe an alias or some kind of reusable JSDoc block. Ultimately I think this by making the documentation a little more accessible.

Some other ideas (good and bad):

  • Making relative paths opt-in with a note about ensuring the path is a URI? Not great perhaps, but makes it harder to do the wrong thing.
  • Warning when using paths that may not be URIs? Perhaps by testing string against common patterns (such as strings beginning with /Users/? Perhaps by checking if the library is used directly in a browser or something like Node?
    • Just realized this is essentially the same as what you mentioned
  • Modified entry point/library when using in Node and similar where absolute file paths are either discouraged or warned about.

Ultimately, what lead to this problem was introducing support for relative URIs. Originally, I required users to pass absolute URIs. If I still did that, you would have encountered this problem in your original implementation before testing on Windows.

True, that part is a bit of a catch 22 for me though. A large part of the reason why I opted to use your library is that it supports relative paths when resolving $refs. That makes it so I can write valid schemas locally for use with other tools; without worrying about publishing them first (something I am not doing at all).

  1. Would you be open to moving descriptions of methods to their respective definitions (as comment blocks) and then compiling an overall API documentation from there? That would help the problem of maintenance.

Possibly. I think I'd like to see a demo of what that would look like before I say go.

Noted!

I don't use TypeScript for library work because I don't like the experience of digging through transpiled code when I need to debug. When typescript can run natively in all the places I might expect this library to run, then I'll consider using it, but as long as it requires transpiling, I'm a hard no.

Fair enough. I don't if the compiler can be manipulated to transpile to something less cluttered. Regardless, I was mostly curious, it's not up to me.

Which brings me to, why not JSDoc? I've always been put off by JSDoc's horrible and verbose syntax that I felt contributed little value (I'm not a huge fan of static typing) and created a lot of clutter. However, recent enhancements have improved the experience greatly and I can now use TypeScript syntax directly in JSDoc. I've been using this approach in another project. It's not perfect, but it's good enough that I'm convinced to start using this approach as my default. Which is what I was referring to with ...

Currently the only type checking I do is for the tests. That verifies the .d.ts types for the public interface, but there's no type checking for the internals of the library. Using the newer JSDoc TypeScript features I've been using elsewhere, I'd like to enable type checking for the whole library. That would include moving the type declarations from .d.ts files to JSDoc comments.

Sounds like the way to go! I think I could contribute with some pull requests for types as JSDoc comments, perhaps including general documentation. Would require some kind of build step to bundle them as .d.ts files in the released packages I think but that should be doable with tsc.

@jdesrosiers
Copy link
Collaborator

I think detecting absolute paths and throwing an error is the way to go. Let me know if you want to make the PR, otherwise I'll get to it soon.

I think I could contribute with some pull requests for types as JSDoc comments

I've actually tried to do this before and ran into some issue that blocked me. I can't remember what it was. I've been slowly working towards v2 and given the difficult I was having with converting to JSDoc, I was considering just leaving it as a v2 task. I could try to find or recreate the branch I did that work and share that with you. If you can help figure out how to get past the blocker, that would help get that done sooner than v2. Once that's working, we can think about adding documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants