SIP-40 - Name Based XML Literals #42

Closed
wants to merge 2 commits into from

scala-improvement-bot
Collaborator

This pull request has been automatically created to import proposals from scala/docs.scala-lang

@julienrf julienrf changed the title SIP-NN - Name Based XML Literals SIP-39 - Name Based XML Literals Jun 30, 2022
@julienrf julienrf changed the title SIP-39 - Name Based XML Literals SIP-40 - Name Based XML Literals Jun 30, 2022
@julienrf julienrf requested review from gabro, lrytz and sjrd June 30, 2022 15:22
@julienrf
Contributor

Thank you @Atry, I have assigned a team of reviewers to the proposal.

Member

@sjrd sjrd left a comment

This is a fairly old SIP, which already went before the previous SIP committee back in 2019. At the time, the feedback of the committee was
https://contributors.scala-lang.org/t/sip-name-based-xml-literals/2175/44?u=sjrd
with clarifications at
https://contributors.scala-lang.org/t/sip-name-based-xml-literals/2175/47?u=sjrd

I don't think there's much to change in the SIP text (except my one comment below) at this point. It remains to be seen whether the new SIP committee structure and members will be more receptive to this proposal than the old one.

Personally, I would like to see how (or whether) this desugaring could be used to provide a front-end for Laminar. To me, Laminar's API is the gold standard in terms of HTML builder APIs. The onus does not need to be on the author (it could happen during the experimentation phase by others, including me), although its existence would be a big plus for me.

Comment on lines +34 to +35
xml.attributes.title(xml.values.`my-title`),
xml.texts.line1,
Member

The usage of identifiers for values and texts doesn't seem Scala-ish to me. Syntactically, in XML, values and texts are strings. They should stay strings in Scala at the level of this name-based desugaring. So IMO this should be:

Suggested change
- xml.attributes.title(xml.values.`my-title`),
- xml.texts.line1,
+ xml.attributes.title(xml.value("my-title")),
+ xml.text("line1"),

If the provider library really wants to do type-checking of values, they can do so with literal string types. For example, to check that an HTML element's checked attribute always has the value "checked", it can demand the literal type Text["checked"].

Contributor

A related discussion: ScalablyTyped/Converter#343

Member

I don't think the erasure issue is a problem here. For text, I believe a single result type will be enough anyway. And for value, I would expect a unique method like

def value(s: String): Value[s.type] = ???

to be sufficient. At call site, the expected type Value["checked"] would be enough.

If really that's not good enough, there's always @targetName.
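
To make the suggestion above concrete, here is a minimal Scala 3 sketch of a single `value` method whose result carries the literal type of its argument. The names `Value`, `value` and `checked` are hypothetical and only illustrate the idea; they are not taken from the SIP text or from any existing library.

```scala
// Hypothetical wrapper: S records the literal type of the wrapped string.
final case class Value[S <: String](raw: S)

// One method is enough: the result type depends on the argument's singleton type.
def value(s: String): Value[s.type] = Value[s.type](s)

// Hypothetical attribute builder that demands the literal value "checked".
def checked(v: Value["checked"]): String = "checked=" + v.raw

@main def literalValueDemo(): Unit =
  println(checked(value("checked")))
  // checked(value("unchecked")) // expected to be rejected by the type checker
```

Since the literal type exists only at compile time, a library written this way would do the check statically and needs no overloads that differ only after erasure.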

Contributor

Value["checked"] would be good. I tried this approach in Scala 2, but some issues that I no longer remember prevented it from working. I guess it should work in Scala 3, because the literal type support is more solid.

Contributor

@Atry Atry Jul 9, 2022

ScalablyTyped is using some weird trick to mimic literal types https://github.com/ScalablyTyped/Converter/blob/dfaf906684b3dbe74aff26457b4e900e8be8bb77/docs/encoding.md?plain=1#L251

I forgot the exact issue, but the Scala 2 compiler tends to use the widened type wherever possible, which sometimes results in compile errors that could be avoided if the parameter were inferred as a literal type.

Contributor

Another reason is that when I was writing this proposal I was trying to support Scala 2.12 or even 2.11, where literal types are not available.

Contributor

@Atry Atry Jul 9, 2022

I can update this proposal to use string parameters instead of dynamics. Maybe also for attribute names and tag names to avoid name clash with toString?
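
The `toString` clash mentioned above can be illustrated with a small Scala 3 sketch. It assumes a provider that resolves names through `scala.Dynamic`; the object `values` below is hypothetical and is not the SIP's actual desugaring target.

```scala
import scala.language.dynamics

// Hypothetical provider object: unknown member names are routed to selectDynamic.
object values extends Dynamic:
  def selectDynamic(name: String): String = name

@main def dynamicClashDemo(): Unit =
  // No member called `my-title` exists, so this becomes selectDynamic("my-title").
  println(values.`my-title`)
  // `toString` is a real member of every object, so selectDynamic is never consulted here.
  println(values.toString)
```

Passing names as ordinary string arguments avoids this, because a string argument cannot be shadowed by an inherited member.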

@lrytz
Member

lrytz commented Jul 8, 2022

The proposal is well thought out, and I don't have any new feedback. I agree with @sjrd that using identifiers to represent values and texts should be reconsidered.

My main concern is that XML literals are officially marked "for removal" (https://dotty.epfl.ch/docs/reference/dropped-features/xml.html). So re-engineering XML literals only makes sense if there's consensus that they are going to stay around. If not, this proposal should be reworked to build on string interpolation.

What's the status of https://github.com/lampepfl/xml-interpolator, how does it relate to that?

@gabro gabro left a comment

Ok, I've read the proposal and also tried to catch up with the related discussions.
My opinion is that if we decide to maintain XML in the language, then yes, I think a name-based approach would be ideal, for the reasons outlined in this proposal (plus another note about IDE support I've added inline).

That said, the main question to me is: is XML relevant enough to warrant embedding a different language entirely in Scala?

Note: I am assuming this proposal would mean to re-engineer all the existing XML support in Scala 3, so I will not apply the "it's already there, why remove it" argument

I don't have concrete data (would love to see some!) but as far as I know:

  • XML was originally included in Scala because XML was then what JSON is today
  • Under this assumption, I don't think this holds true today: XML is drastically less used than it was 14 years ago
  • The biggest win I see would be to increase the ergonomics of writing HTML in the context of web applications (either in Scala.js or JVM)

Am I missing some other big high-level goals here (I surely could be!)?

Otherwise from my point of view this reduces to: is writing HTML in Scala a big enough goal to embed XML (to some degree) in the language, so that people already doing it can benefit from it, and other people may start doing it because of this?

Also, if we do this, what's the rationale for not embedding other "pragmatic" languages such as JSON or YAML (instead of literals + 3rd party libraries)? I don't think it would be entirely crazy, but where do we draw the line?

### Goals

* Keeping source-level backward compatibility to existing symbol-based XML literals in most use cases of `scala-xml`.
* Allowing schema-aware XML literals, i.e. static type varying according to tag names, similar to the current TypeScript and [Binding.scala](https://github.com/ThoughtWorksInc/Binding.scala) behavior.

what's the TypeScript behavior you're referring to here?

As far as I know, TypeScript supports JSX, but the typing support is very limited: a well-known problem is that you can't restrict the type of children of a JSX node, since the type of a JSX expression is a "black box" (to use their own words).

Contributor

@Atry Atry Jul 9, 2022

I am not a TypeScript expert. I know <MyComponent/> would be typed as MyComponent but I don't know more details. Do you know more about how TypeScript type-checks JSX? What does "black box" actually mean? Do they translate XML literals to JS before type-checking or after?

@gabro gabro Jul 9, 2022

I consider myself an expert TypeScript user, but my knowledge of compiler internals is partial. The main issue with JSX in TS is that the result type of any JSX expression is JSX.Element, which is monomorphic.

This means you can't write a MyList component that only accepts ListItem as children:

<MyList>
  <ListItem /> // ok
  <WrongComponent /> // can't get an error here
</MyList>

because all of those tags will just resolve to the JSX.Element type.

I suspect TypeScript type-checks JSX before emitting any JS, since it even has the option of preserving JSX in the emitted code (see the jsx: "preserve" compiler option), but you still get type-checking in that case.

TypeScript allows swapping out the implementation that JSX is emitted to: the default is React.createElement, but you can choose a different one when using alternative libraries that make use of JSX (like preact, Vue, etc.). This does not affect type checking, since they all must return JSX.Element anyway.

There are a bunch more details to how TypeScript works (e.g. how it manages functions vs classes for component definitions, how it distinguishes between native and user-defined elements, and so on), but I think the main takeaway is that it's ultimately quite different from what is being proposed here.

The very core of the proposal (being able to get different types for different XML tags) is precisely what TypeScript is lacking.
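
For contrast with the JSX example above, here is a minimal Scala 3 sketch of what distinct types per tag make possible. The types `ListItem`, `Paragraph`, `MyList` and the builder `myList` are hypothetical stand-ins for what a provider library might expose; the sketch is not the desugaring defined in the SIP.

```scala
// Hypothetical element types: each tag has its own static type.
sealed trait Node
final case class ListItem(label: String) extends Node
final case class Paragraph(label: String) extends Node
final case class MyList(items: List[ListItem]) extends Node

// The builder accepts only ListItem children, unlike a monomorphic JSX.Element.
def myList(children: ListItem*): MyList = MyList(children.toList)

@main def tagTypesDemo(): Unit =
  val ok = myList(ListItem("a"), ListItem("b"))
  println(ok.items.map(_.label))
  // myList(Paragraph("oops")) // would not type-check: Paragraph is not a ListItem
```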

|XML is parsed by ...|compiler|compiler|library, IDE, and other code browsers including Github, Jekyll (if syntax highlighting is wanted)|
|Is third-party schema-less XML library supported?|No, unless using white box macros|Yes|Yes|
|Is third-party schema-aware XML library supported?|No, unless using white box macros|Yes|No, unless using white box macros|
|How to highlight XML syntax?|By regular highlighter grammars|By regular highlighter grammars|By special parsing rule for string content|
@gabro gabro Jul 9, 2022

Not sure about this comparison: from a practical point of view, having first-party support in the language means each Scala highlighter grammar has to carry around an XML grammar parser.

On the other hand, having it restricted to special interpolators like xml"<hello />" allows grammars to leverage the language embedding feature of TM grammars (supported by most editors), so you can just specify that whatever is within that literal is XML, and piggyback on existing XML grammars.

I'm probably oversimplifying, but I just wanted to note that I don't think that it would be significantly more work to support XML highlighting in the string interpolation scenario.
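
As a rough illustration of the interpolator form discussed above, a user-defined `xml"..."` interpolator can be declared in Scala 3 as an extension on `StringContext`. The sketch below only assembles the string; `XmlSource` is a hypothetical placeholder, and a real library would parse or validate the content, possibly with a macro.

```scala
// Hypothetical result type; no XML parsing happens in this sketch.
final case class XmlSource(text: String)

extension (sc: StringContext)
  // Enables the xml"..." syntax by delegating to the standard s-interpolator.
  def xml(args: Any*): XmlSource = XmlSource(sc.s(args*))

@main def interpolatorDemo(): Unit =
  val tag = "hello"
  println(xml"<$tag />".text)
```

Because the XML lives inside a string literal with a fixed prefix, an editor grammar can treat everything between the quotes as embedded XML, which is the point made in the comment above.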

|Is third-party schema-less XML library supported?|No, unless using white box macros|Yes|Yes|
|Is third-party schema-aware XML library supported?|No, unless using white box macros|Yes|No, unless using white box macros|
|How to highlight XML syntax?|By regular highlighter grammars|By regular highlighter grammars|By special parsing rule for string content|
|Can presentation compiler perform code completion for schema-aware XML literals?|No|Yes|No|

Another point of comparison, where this proposal probably comes out on top, is IDE support for navigation (go to definition, find references, etc.).

A name-based approach would allow things like "find all the usages of <div />" pretty much automatically (provided that dotc outputs all the relevant information to the SemanticDB), whereas using a string literal would require non-trivial work (to parse and index all the XML literals, and even that would be string-based, so it could only guess matches across literals based on the tag name)

@odersky
Contributor

odersky commented Jul 21, 2022

I think XML literals should be in line to be removed. There might be an option to support them under some legacy mode. But we really should base future work on string interpolator-based solutions.

@gabro gabro left a comment

To summarize my position on this proposal:

  • XML literals are scheduled for removal in Scala 3
  • this SIP essentially proposes to keep them and make them name-based instead
  • I don't think the proposal provides enough motivation to keep XML literals. The main use case seems to be HTML templates, which in my opinion is not a big enough target to warrant keeping such a big feature in the language

Given the above, I recommend against accepting the proposal as is.

Possible areas of explorations may be:

  • a metaprogramming based approach, which may be possible in user space
  • a preprocessing approach: much like JavaScript and TypeScript support JSX syntax via a specific extension (.jsx / .tsx), one may think of a name-based substitution performed on .scalax files. This may not be entirely on-brand for Scala, but it may be a cheap solution to experiment with the name-based approach.

@sjrd
Member

sjrd commented Aug 15, 2022

If we chose to invest in XML literals and revive them for the next 10 years, I think this name-based solution would be very nice, if not ideal, and I would be in favor of this proposal.

However, I can't bring myself to root for XML literal support in the language, mainly because of the following earlier comment:

Also, if we do this, what's the rationale for not embedding other "pragmatic" languages such as JSON or YAML (instead of literals + 3rd party libraries)? I don't think it would be entirely crazy, but where do we draw the line?

XML is definitely less relevant now than JSON or YAML. So it makes little sense to invest in XML literals without investing in those other two, and then perhaps others. For that, we need a general way of dealing with embedded languages. And, despite shortcomings, string interpolation fills that need.

Therefore, my recommendation would be to reject this proposal at the upcoming meeting.

@gabro
gabro commented Aug 26, 2022

This proposal was rejected at today's SIP meeting.

While the proposed implementation scheme seems promising, there is general consensus against maintaining XML as a first-party citizen in the language since:

  • XML is not as relevant as it used to be many years ago when it was first introduced in the language
  • with that in mind, it's not clear why we should invest in XML specifically, as opposed to adding support for other languages like JSON, or YAML, which are currently more popular

The general recommended direction would then be to proceed with the deprecation and removal of XML literals from the language, and keep using string interpolation and possibly metaprogramming for embedding other languages.

Thanks @Atry, regardless of the rejection, your proposal is very well made and thorough and we hope there may be a chance to revisit this using user-land features.
