Typescript generation through oxc #197

Closed
indietyp opened this issue Dec 23, 2023 · 5 comments

@indietyp

I have a local branch with some experimentation, which replaces the current exporter with one based on oxc. The idea behind using oxc or similar is that while heavy on dependencies, it would allow us to create significantly more complex types more easily without worrying about creating valid typescript.

(For example, it would allow us to generate exported interfaces instead of type aliases)

If interested, I'd happily polish this (add tests, etc) and upstream the changes.

(This would also allow us to have a File object, and you add all statements to it one by one; it would also allow us to easily add functions to the export)

@oscartbeaumont
Member

I don't think this aligns with my vision for Specta but I do think you've got a couple of options.

In terms of interface vs type, I'm curious why your use case specifically benefits from using interface over type? Originally Specta used interface but we moved exclusively to type early on due to its many benefits (specta-rs/rspc#83).

I personally think it's out of Specta's scope to deal with advanced types such as classes.

With the Specta release candidates, I have been refactoring DataType and its related types to represent the Rust code as closely as possible and due to this, I don't think introducing any structs to represent Typescript classes makes sense as there is not really a direct Rust equivalent.

However, I think Specta can be the perfect building block for these sorts of abstractions. A great example of this is tauri-specta.

Specta's core has a type called FunctionDataType which is used to express functions. However, if you look closely you will notice it's only the "metadata" about the function and is missing the function's TypeScript implementation (e.g. async function name(args: type) { ...this..bit..is..missing..from..that..type... }).

Due to this, Specta doesn't really have function exporting built in; it's on the downstream user to define how to convert a FunctionDataType into a proper TypeScript function. I was originally hesitant about this design but I think it is the right level of abstraction, as there is no good way in Rust to express the JavaScript code that exists within the function body.
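To illustrate the split described above between function metadata and a downstream-supplied body, here is a minimal sketch. `FnMeta` and `render_function` are hypothetical names invented for this example; `FnMeta` is a simplified stand-in and not specta's actual `FunctionDataType`, and argument and return types are assumed to be already-rendered TypeScript strings.

```rust
/// Hypothetical stand-in for function "metadata" (not specta's `FunctionDataType`).
/// Types are assumed to be pre-rendered TypeScript strings.
pub struct FnMeta {
    pub name: String,
    pub args: Vec<(String, String)>, // (argument name, TypeScript type)
    pub result: String,              // TypeScript return type
    pub is_async: bool,
}

/// Combines the metadata with a body supplied by the downstream user.
/// The body is an opaque string; nothing here checks that it is valid JS/TS.
pub fn render_function(meta: &FnMeta, body: &str) -> String {
    let args = meta
        .args
        .iter()
        .map(|(name, ty)| format!("{name}: {ty}"))
        .collect::<Vec<_>>()
        .join(", ");
    let prefix = if meta.is_async { "async " } else { "" };
    format!(
        "export {prefix}function {}({args}): {} {{\n    {body}\n}}",
        meta.name, meta.result
    )
}
```

A downstream crate would own the `body` strings (for example, a call into its own IPC layer), which is why the metadata alone cannot produce a complete function.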

With all of this in mind, I think Specta's core types can be composed to express Typescript classes (similar to how FunctionDataType works). You could do something like the following and potentially even publish it as a dedicated crate.

use std::collections::HashMap;

use specta::datatype::{DataType, FunctionDataType};

pub struct TsClass {
    name: String,
    fields: HashMap<String, TsClassField>,
    methods: HashMap<String, TsMethod>,
}

pub struct TsClassField {
    ty: DataType,
    // Alternatively this could be `specta::datatype::LiteralType`, but that would limit
    // the ability to express something like a function call or `new Date()`, etc.
    default_value: Option<String>,
}

pub struct TsMethod {
    func: FunctionDataType,
    // You could potentially create better abstractions for expressing JS/TS code,
    // but `String` ends up being the most flexible.
    body: String,
}
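As a rough idea of how such a class description could be turned into TypeScript, the sketch below renders fields only. `ClassSketch`, `ClassField`, and `render_class` are hypothetical names; field types are plain strings rather than specta's `DataType`, methods are left out for brevity, and a `BTreeMap` is used instead of `HashMap` purely so the output order is deterministic.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified field: the type is a pre-rendered TypeScript string.
pub struct ClassField {
    pub ty: String,
    pub default_value: Option<String>,
}

/// Hypothetical, simplified class description (fields only).
pub struct ClassSketch {
    pub name: String,
    pub fields: BTreeMap<String, ClassField>,
}

/// Renders the class as TypeScript source, one field per line.
pub fn render_class(class: &ClassSketch) -> String {
    let mut out = format!("export class {} {{\n", class.name);
    for (name, field) in &class.fields {
        match &field.default_value {
            // The default value is emitted verbatim, so `new Date()` works as-is.
            Some(value) => out.push_str(&format!("    {name}: {} = {value};\n", field.ty)),
            None => out.push_str(&format!("    {name}: {};\n", field.ty)),
        }
    }
    out.push('}');
    out
}
```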

On a different note, it is also worth noting that Specta is not tied to any exporter. Its DataType system is designed to represent Rust types and leave the language-specific transforms for any exporter which could be implemented as an external crate. This means theoretically specta-oxc would be possible. If you were to look into this approach I would highly recommend building on the 2.0.0-rc.x releases as they have made some major changes to these types.

Long term I intend for language exporters to move out of the core specta crate. So this is definitely a supported use case!

Something you did bring up is File, which I don't think Specta has a good solution for right now. You could have a "fake" type (like #[derive(Type)] pub struct File;) and then manually remove it from the TypeMap so Specta doesn't export it, but with 2.0.0-rc.x I made the TypeMap immutable, which I think was an oversight. I will open another issue to fix that.

Sorry for the long explanation and feel free to ask any follow-up questions if anything I said was unclear!

Also, feel free to jump in the Discord if you do start working on anything in this space as questions are always welcome!

@indietyp
Author

Thank you so much for such a detailed and thoughtful answer!

The reason why I explored oxc instead of the current approach is primarily out of curiosity. The idea is simple: instead of building up a flat string as a representation, we build up an AST, that we then use to generate the bindings from.

What I mean specifically is that, instead of producing a string, the function that creates the TypeScript representation becomes:

pub(crate) fn datatype_inner<'a>(
    ctx: &ExportContext<'a>,
    typ: &DataType,
    type_map: &TypeMap,
) -> core::result::Result<TSType<'a>, ExportError> {
    Ok(match &typ {
        DataType::Any => ctx.ast.ts_any_keyword(SPAN),
        DataType::Unknown => ctx.ast.ts_unknown_keyword(SPAN),
        DataType::Primitive(p) => {
            let ctx = ctx.with(PathItem::Type(p.to_rust_str().into()));
            match p {
                primitive_def!(i8 i16 i32 u8 u16 u32 f32 f64) => ctx.ast.ts_number_keyword(SPAN),
                primitive_def!(usize isize i64 u64 i128 u128) => match ctx.cfg.bigint {
                    BigIntExportBehavior::String => ctx.ast.ts_string_keyword(SPAN),
                    BigIntExportBehavior::Number => ctx.ast.ts_number_keyword(SPAN),
                    BigIntExportBehavior::BigInt => type_reference(&ctx, BIGINT),
                    BigIntExportBehavior::Fail => {
                        return Err(ExportError::BigIntForbidden(ctx.export_path()));
                    }
                    BigIntExportBehavior::FailWithReason(reason) => {
                        return Err(ExportError::Other(ctx.export_path(), reason.to_owned()));
                    }
                },
                primitive_def!(String char) => ctx.ast.ts_string_keyword(SPAN),
                primitive_def!(bool) => ctx.ast.ts_boolean_keyword(SPAN),
            }
        }
        // (remaining `DataType` variants are not shown in this excerpt)
    })
}

Because we're now working with an AST instead of a string, we can manipulate it more easily, and malformed types are less likely to happen. TypeScript types can get quite complicated and this would be a kind of escape hatch. Then again, the number of dependencies increases quite a bit and I do not know if this is worth the projected benefit, but it could make future features a lot easier.
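To make the AST-versus-string argument concrete without pulling in oxc, here is a tiny hand-rolled sketch. `TsType` and `print` are hypothetical and unrelated to oxc's real AST types; the point is only that a printer working on a tree can decide about parentheses in one place, which is easy to get wrong with string concatenation.

```rust
/// Hypothetical miniature TypeScript type AST (not oxc's `TSType`).
pub enum TsType {
    Number,
    String,
    Boolean,
    Reference(String),
    Array(Box<TsType>),
    Union(Vec<TsType>),
}

/// Prints the AST as TypeScript source.
pub fn print(ty: &TsType) -> String {
    match ty {
        TsType::Number => "number".to_owned(),
        TsType::String => "string".to_owned(),
        TsType::Boolean => "boolean".to_owned(),
        TsType::Reference(name) => name.clone(),
        // A union inside an array needs parentheses: `(A | B)[]`, not `A | B[]`.
        TsType::Array(inner) => match **inner {
            TsType::Union(_) => format!("({})[]", print(inner)),
            _ => format!("{}[]", print(inner)),
        },
        TsType::Union(members) => members
            .iter()
            .map(print)
            .collect::<Vec<_>>()
            .join(" | "),
    }
}
```

With naive string building, appending `[]` to an already-rendered `number | string` would silently change the meaning of the type; the tree form keeps that information until printing.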

With the function definition, what I was looking at specifically was .d.ts signature creation. Right now, I am doing this as a string in a build script, but it would be cool if one could generate these out of specta itself, but I do understand why you chose not to!

On a different note, it is also worth noting that Specta is not tied to any exporter. Its DataType system is designed to represent Rust types

Yes! I have been playing around with DataType quite a bit to try to fit it into my specific model (although I feel like some of the things I have been doing could be considered illegal), and I love the simple representation of all types. specta is an exciting project that is imho the first one that gets this "right".

I don't think introducing any structs to represent Typescript classes makes sense as there is not really a direct Rust equivalent.

Totally, I didn't create the issue in hopes that they will be added (I don't really see how), but it may be an additional point towards language specific metadata that might be needed (I believe there was an issue about ts specific type overrides at one point?)

In terms of interface vs type, I'm curious why your use case specifically benefits from using interface over type?

I think it's mostly about convention. The current project I am working on uses interfaces more than types, and I know that mobx-state-tree discourages types for large types due to performance concerns, see: https://github.com/microsoft/TypeScript/wiki/Performance#preferring-interfaces-over-intersections.

@indietyp
Author

regarding the File mentioned, what I meant specifically is a new interface akin to:

let mut file = File::new();
file.export::<Example>();
file.export_datatype(xyz);
file.import::<Example2>("./relative-path.ts");

for ty in type_map.iter() {
    file.define(ty);
}

file.signature_fn(func1);

let mut buffer = vec![];
file.export(&mut buffer);

println!("{file}");

file.define would skip over types that we already exported.

This would have several upsides:

  • easily distinguish between exported and mentioned types
  • signature output (doesn't really need File)
  • automatically also output the types of mentioned types if desired (through an option like recurse), and then one could mark them to be exported (export takes precedence over define)
  • we're able to generate import statements for types that one might need (a bonus that might be implemented if deemed useful)

This was more of a spitballing idea than anything else, but thought it might be useful as it would/could unify the API interface, not only for typescript, but also across languages.
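A minimal sketch of the export/define semantics described above (define skips already-exported types, and export takes precedence over define). This `File` is hypothetical and works on type names and pre-rendered TypeScript bodies as strings rather than on Rust types or specta's `TypeMap`; the method signatures are assumptions made for the example.

```rust
use std::collections::BTreeMap;
use std::fmt;

/// Hypothetical file builder; names and bodies are plain strings.
#[derive(Default)]
pub struct File {
    imports: Vec<(String, String)>,      // (type name, module path)
    exported: BTreeMap<String, String>,  // name -> TypeScript body
    defined: BTreeMap<String, String>,   // name -> TypeScript body
}

impl File {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn import(&mut self, name: &str, path: &str) {
        self.imports.push((name.to_owned(), path.to_owned()));
    }

    /// Exports a type; this takes precedence over an earlier `define`.
    pub fn export(&mut self, name: &str, body: &str) {
        self.defined.remove(name);
        self.exported.insert(name.to_owned(), body.to_owned());
    }

    /// Defines a type without exporting it; skipped if already exported.
    pub fn define(&mut self, name: &str, body: &str) {
        if !self.exported.contains_key(name) {
            self.defined.insert(name.to_owned(), body.to_owned());
        }
    }
}

impl fmt::Display for File {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (name, path) in &self.imports {
            writeln!(f, "import type {{ {name} }} from \"{path}\";")?;
        }
        for (name, body) in &self.exported {
            writeln!(f, "export type {name} = {body};")?;
        }
        for (name, body) in &self.defined {
            writeln!(f, "type {name} = {body};")?;
        }
        Ok(())
    }
}
```

Keeping exported and merely defined types in separate maps is one simple way to get the "easily distinguish between exported and mentioned types" property from the list above.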

@oscartbeaumont
Member

Oxc is very cool but it being a heavier dependency definitely scares me away from using it. The DataType system is designed to be our own AST format and I would hope any Typescript bugs are caught with our unit-test suite or reported and fixed (I am pretty strict on adding tests for any bug I fix).

With the function definition, what I was looking at specifically was .d.ts signature creation. Right now, I am doing this as a string in a build script, but it would be cool if one could generate these out of specta itself, but I do understand why you chose not to!

I think the general reasoning behind not supporting it was that a function type means nothing if the function isn't implemented. If I had to guess though you want the types to be exported while leaving the implementation up to bindgen? If this is the case I would be happy to support this - #201.
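For the signature-only case discussed here, a declaration can be produced without any body at all. The helper below is a hypothetical sketch (it is not what #201 implements) that takes already-rendered TypeScript types and emits a `.d.ts`-style declaration.

```rust
/// Hypothetical helper: renders a `.d.ts`-style declaration with no implementation.
/// `args` holds (argument name, TypeScript type) pairs.
pub fn render_declaration(name: &str, args: &[(&str, &str)], result: &str) -> String {
    let args = args
        .iter()
        .map(|(arg, ty)| format!("{arg}: {ty}"))
        .collect::<Vec<_>>()
        .join(", ");
    format!("export declare function {name}({args}): {result};")
}
```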

towards language specific metadata that might be needed (I believe there was an issue about ts specific type overrides at one point?)

Yeah, I have been generally on the fence about supporting #110 but I think it will happen because it's kinda required to do all sorts of low-level stuff.

In terms of interface vs type, I'm curious why your use case specifically benefits from using interface over type?

You should be able to export a header on your file which will disable any linting tool, I personally think codegen stuff should be exempt anyway.
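As a small sketch of the header idea, assuming ESLint is the linter in use: a `/* eslint-disable */` block comment at the top of a file turns off all ESLint rules for that file. The `with_lint_header` function is a hypothetical helper, not part of Specta's API.

```rust
/// Hypothetical helper: prepends a header that disables ESLint for the generated file.
pub fn with_lint_header(generated: &str) -> String {
    format!("/* eslint-disable */\n// This file is generated. Do not edit.\n\n{generated}")
}
```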

In terms of performance as far as I am aware it doesn't make much of a difference. This video by Matt Pocock is a great watch and discusses this topic. It is 1 year old but I would be surprised if the recommendation has changed.

That being said, the recommendation you linked is specifically about extending interfaces instead of using intersections, so idk.

Supporting both would mean two entirely different code paths which I think would be a little too much extra to maintain, especially needing to duplicate a lot of test cases.

This could be a custom exporter, but personally I think just sticking with type is fine, and if it becomes a problem in the future you could easily fix all types at once, given it's all codegen.

regarding the File mentioned, what I meant specifically is a new interface akin to:

Oh, whoops I completely misread this as JS/browser File.

I do like the idea of some high-level APIs in Specta for collecting types but I am still unsure where to draw the line. I wouldn't be surprised if for Specta v2 we have a specta-ext crate or something similar with a higher-level builder API like what you showed.

You may have noticed Specta's APIs are very low-level at the moment and this is because Specta was originally developed for rspc where high-level APIs weren't required but now that it has become a completely separate project more work does need to be done to make it more user-friendly.

It's also worth noting that Specta needs to keep a pretty stable API surface. If I want to make a breaking change to any function I have to do a major release. Doing a major release will break the impl specta::Type for X within any downstream crates (e.g. uuid, etc). Right now this isn't a huge deal because all of those implementations are within the specta crate, but I would like them to move into the downstream dependencies. Similar to how uuid has a serde feature flag, I would love it to have a specta flag. The main reason I want to make this change is that any dependency of Specta having a major release would require a Specta major release too, or adding a feature flag for each version of the crate, both of which are pretty unsustainable when we depend on like 10-15 crates for these impls.

@indietyp
Author

indietyp commented Dec 24, 2023

Oxc is very cool but it being a heavier dependency definitely scares me away from using it. The DataType system is designed to be our own AST format

Exactly, and my (naive) thought is/was that essentially what we would do is an AST translation between the different languages.

Just spitballing, but for example (again these are on the heavier side)

  • ts: specta (AST) -> oxc (AST) -> string
  • rust: specta (AST) -> rust-analyzer (AST) -> string
  • python: specta (AST) -> rust-python (AST) -> string

I could imagine that this could make developing exporters for some languages easier, and potentially allow us to unify using a trait interface, although due to the differing nature of all these languages I am unsure how beneficial this would be.
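One way the "AST translation behind a trait" idea could look is sketched below. Everything here is hypothetical: `Ty` is a toy stand-in for a language-agnostic type description (not specta's `DataType`), and `TsAst` and `PyAst` are hand-written miniatures rather than oxc or RustPython types.

```rust
/// Toy stand-in for a language-agnostic type description.
pub enum Ty {
    Bool,
    I32,
    Str,
    List(Box<Ty>),
}

/// Hypothetical two-stage exporter: lower to a language AST, then print it.
pub trait Exporter {
    type Ast;
    fn lower(&self, ty: &Ty) -> Self::Ast;
    fn print(&self, ast: &Self::Ast) -> String;
    fn export(&self, ty: &Ty) -> String {
        self.print(&self.lower(ty))
    }
}

pub enum TsAst {
    Keyword(&'static str),
    Array(Box<TsAst>),
}

pub struct TypeScript;

impl Exporter for TypeScript {
    type Ast = TsAst;
    fn lower(&self, ty: &Ty) -> TsAst {
        match ty {
            Ty::Bool => TsAst::Keyword("boolean"),
            Ty::I32 => TsAst::Keyword("number"),
            Ty::Str => TsAst::Keyword("string"),
            Ty::List(inner) => TsAst::Array(Box::new(self.lower(inner))),
        }
    }
    fn print(&self, ast: &TsAst) -> String {
        match ast {
            TsAst::Keyword(keyword) => (*keyword).to_owned(),
            TsAst::Array(inner) => format!("{}[]", self.print(inner)),
        }
    }
}

pub enum PyAst {
    Name(&'static str),
    Subscript(&'static str, Box<PyAst>),
}

pub struct Python;

impl Exporter for Python {
    type Ast = PyAst;
    fn lower(&self, ty: &Ty) -> PyAst {
        match ty {
            Ty::Bool => PyAst::Name("bool"),
            Ty::I32 => PyAst::Name("int"),
            Ty::Str => PyAst::Name("str"),
            Ty::List(inner) => PyAst::Subscript("list", Box::new(self.lower(inner))),
        }
    }
    fn print(&self, ast: &PyAst) -> String {
        match ast {
            PyAst::Name(name) => (*name).to_owned(),
            PyAst::Subscript(base, inner) => format!("{base}[{}]", self.print(inner)),
        }
    }
}
```

The associated `Ast` type lets each language keep its own tree shape while sharing the `export` entry point, though, as noted above, real languages differ enough that the shared surface may stay this thin.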

In any case, if this is worth exploring, which I am unsure about, it shouldn't be done in specta itself, but through a separate crate like specta-export or similar.

If I had to guess though you want the types to be exported while leaving the implementation up to bindgen?

Correct, thank you for #201, this was exactly what I was looking for.

You should be able to export a header on your file which will disable any linting tool, I personally think codegen stuff should be exempt anyway.

Yes, I think this is more a matter of taste. It also isn't something I feel strongly about, just something I was a bit surprised by at first glance. I also believe that tools like eslint can convert between both, so it isn't much of a bother. Nowadays, using either doesn't break anything. As an example:

// interface Fruit {
//     flavor: string
// }
type Fruit = {flavor: string};

interface Apple extends Fruit {
    origin: string
}

interface Orange extends Fruit {
    color: string
}

type Persimmon = Fruit & {seeds: number}

Uncomment the interface (and comment out the type alias) to see that both compile. This wasn't possible before TypeScript 2.2, but I don't think that is really a concern here.
