Add Pandoc like behavior for nested container directives #25

Open
wants to merge 3 commits into
base: main
from

Conversation

@ugogon ugogon commented Feb 12, 2024

Initial checklist

  • I read the support docs
  • I read the contributing guide
  • I agree to follow the code of conduct
  • I searched issues and couldn’t find anything (or linked relevant results below)
  • If applicable, I’ve added docs and tests

Description of changes

I am once again trying to push this discussion. I want to switch from pandoc to remark, and this feature is one of the essential "missing" features. I know the CommonMark proposal is different, but it is so inconvenient. And I know you suggested adding it behind a flag, but as it is backwards compatible, I think it can be the default.

Try pandoc

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Feb 12, 2024
@ugogon ugogon force-pushed the pandoc-like-nested-directives branch from 50e7527 to 45e0d13 Compare February 12, 2024 10:27
@ugogon ugogon force-pushed the pandoc-like-nested-directives branch from 45e0d13 to 1b7e22b Compare February 12, 2024 10:29
@wooorm
Member

wooorm commented Feb 12, 2024

Heya!

I know the CommonMark proposal is different, but it is so inconvenient.

The reason directives are a great idea, in my opinion, is not a particular syntax (which is nice or not nice or whatever, subjective), but that everyone implements it the same way.

It’s a proposal of course, so it’s not crystal clear how every edge case should work.

But the goal is for folks to implement them the same. The goal is for them to work on GitHub. In VS Code. In syntax highlighting libraries. In this project and all other projects.

So no, if the proposal says this doesn’t work, then this shouldn’t work by default.


Is your goal to match pandoc exactly? If so, that sounds more like a fork of this project.

How much markdown do you have? Can you regex some stuff? What are the alternatives?

@ugogon
Author

ugogon commented Feb 12, 2024

Thanks for the quick response. I see your point. But I'm unsure what "everyone" means. Pandoc is not using this syntax and it is still a proposal. They are still discussing it, it seems.

But anyway it is your project and I can add a flag. I do not want to fork as the syntax is too close.

And I do not need to copy pandoc exactly as I don't need every feature and I can regex a lot (e.g. they allow spaces between the colons and the directive name). But this nesting is not regexable if you have arbitrary depth.
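To make the "not regexable" point concrete, here is a minimal sketch of what such a converter could look like. It is only an illustration under assumptions: `renumberFences` is a hypothetical helper (not part of this extension), the input is assumed to be Pandoc-like (a fence followed by a name opens a container, a bare fence closes the innermost open one), and fenced code blocks containing colon fences are ignored. It needs a stack because the number of colons on a parent depends on how deep its children go.

```js
// Hypothetical converter, not part of micromark-extension-directive.
// Assumes Pandoc-like input: a fence followed by a name opens a container and
// a bare fence closes the innermost open one, whatever its colon count.
function renumberFences(markdown) {
  const lines = markdown.split('\n')
  const stack = [] // Open containers, innermost last.
  const done = [] // Containers whose nesting height is known.

  // Close the innermost container and tell its parent how deep it was.
  function close(closeIndex) {
    const node = stack.pop()
    node.close = closeIndex
    const parent = stack[stack.length - 1]
    if (parent) parent.height = Math.max(parent.height, node.height + 1)
    done.push(node)
  }

  lines.forEach((line, index) => {
    const match = /^(\s*):{3,}[ \t]*(\S.*?)?[ \t]*$/.exec(line)
    if (!match) return
    if (match[2]) {
      stack.push({open: index, indent: match[1], rest: match[2], height: 0})
    } else if (stack.length > 0) {
      close(index)
    }
  })

  // Containers that were never closed run to the end of the document.
  while (stack.length > 0) close(undefined)

  // Leaves get 3 colons, and every parent gets one more than its deepest child.
  for (const node of done) {
    const fence = ':'.repeat(3 + node.height)
    lines[node.open] = node.indent + fence + node.rest
    if (node.close !== undefined) {
      lines[node.close] = lines[node.close].replace(/:{3,}/, fence)
    }
  }

  return lines.join('\n')
}

console.log(renumberFences(':::outer\n:::inner\ntext\n:::\n:::'))
// ::::outer
// :::inner
// text
// :::
// ::::
```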

Edit: I want to add that I do not see any nesting in the proposal.

@wooorm
Member

wooorm commented Feb 12, 2024

But I'm unsure what everyone means

My goal is for everyone to implement that proposal.
Pandoc is an example of something that doesn’t move.
It’s one author. The main author of CM. And still Pandoc markdown is different from CM. After 10 years.

Edit: I want to add that I do not see any nesting in the proposal.

See:

That way, you can even nest blocks (think divs) by using successively fewer colons for each containing block.

In 3. Container Block Directives.

But sure: the generic directive proposal is vague. When vague, I went with how other things in CM work. A lot of this is modelled after how fenced code works:

````markdown
```js
console.log(1)
```
````

I do not want to fork as the syntax is too close.

Is it? My (educated) guess is that there are more differences from Pandoc. Are you talking about Pandoc’s fenced_divs or something else?

And I do not need to copy pandoc exactly as I don't need every feature and I can regex a lot (e.g. they allow spaces between the colons and the directive name). But this nesting is not regexable if you have arbitrary depth.

Here too then is an important difference: are we doing pandoc? Or just a different nesting? What’s the goal here?

How much markdown do you have? What are the alternatives?

@ugogon
Author

ugogon commented Feb 12, 2024

Ok, this is maybe a misunderstanding. I do not want this extension to copy pandoc nor its fenced_divs exactly. And I do not have or need alternatives. The amount of markdown I have is indefinite:

I am currently replacing a tool that is based on pandoc. I want the users of the tool to have minimal switching costs. For their current markdown I can certainly provide a converter script, but I just want my users to be able to nest directives nicely like they are used to.

The users usually want to nest but do not know how deep at the beginning. The constant addition of colons to the parents was one of the main complaints of those who have already switched to my tool. I'm currently using my fork for the tool, and this solves the problem for me and the users perfectly well. But as the issues and pull requests here suggest that my users are not the only ones who wish for a more flexible syntax, I opened this pull request.

I love your work and do not see any reason to fork, and I can understand your reasoning. I can simply add the flag and just turn it on for my application if this is fine with you. That will also solve the problem for me.

@wooorm
Member

wooorm commented Feb 12, 2024

I think these two points are at odds?

I want the users of the tool to have minimal switching costs.

I do not want this extension to copy pandoc nor its fenced_divs exactly


I just want my users to be able to nest directives nicely like they are used to.

With this extension people can nest fine:

::::outer
:::inner
:::
::::

And that nesting is the same as they do with code in markdown. Consistency. Nice.

The constant addition of colons to the parents was one of the main complaints of those who have already switched to my tool.

And what will happen if they can’t switch to the next tool? Or they come from I want to prevent people’s markdown breaking between tools. By having one version of markdown. That works like other tools.

I would love to hear more about the end users problems.

Perhaps education can also help them.

This behavior is also more similar to how this works in other tools btw. Such as raw string literals in Rust: r##"He said, "I want to include "# in the sentence"."##. You add more things around.

I would also assume that there are more adjacent children. So having to add colons to, say, 2 children is more work than adding them to 1 parent?

@ugogon
Author

ugogon commented Feb 13, 2024

I think these two points are at odds?

Of course, an exact copy would be great, but that would mean that a self-maintained fork would have to be written first.


With this extension people can nest fine:

Sure, you can nest. But fine? Different workflows lead to different needs. Do you add depth on the inside, or do you wrap things together more often? In our workflow we usually add more inner things later. desiprisg seems to do the same.

I would love to hear more about the end users problems.

E.g. from our codebase:

:::::columns-2-1
::::div
* Insert explicit equation of line into implicit equation of sphere
* Two solutions because of norm
  :::div{.small}
  $$
  0 = \norm{\vec{o} + t \vec{d} - \vec{c}} - r = \norm{\vec{o} + t \vec{d} - \vec{c}}^2 - r^2
  $$
  :::

_text below the list_
::::

![](intersection.svg)
:::::

It was not clear from the start how many block directives we need in the list as the creator builds the list item by item. Maybe we will later add another block directive inside div{.small}. Then we need to update every parent. This is inconvenient. Sure the current solution is nice if you want to wrap multiple items later on. But why not both: This PR allows for both.


I would also assume that there are more adjacent children. So having to add colons to, say, 2 children is more work than adding them to 1 parent?

The inconvenience does not arise from the number of colons but from when and where to add them. If you decide to add a block on the inside, you have to change every parent's opening and closing fence, wherever they are. But for the children you know the number of colons the moment you write them (simply one more than the parent). Again this problem is symmetric: if you want to wrap some blocks afterwards you would need to add colons at every child, which is likewise inconvenient. In our case, the former happens quite regularly, which led to this PR.
Again: The PR allows for both, so you can decide which style you want to use. You can even use the same amount of colons everywhere.


This behavior is also more similar to how this works in other tools btw. Such as raw string literals in Rust: r##"He said, "I want to include "# in the sentence"."##. You add more things around.

Yes, whenever you cannot (or do not want to) distinguish between opening and closing fences (and want to nest), you can decide on one convention: more characters around or more characters for the inner. Both conventions have advantages and disadvantages. But whenever you can distinguish opening from closing fences, you do not do this, e.g. brackets in every language or tags in HTML.


And what will happen if they can’t switch to the next tool? Or they come from I want to prevent people’s markdown breaking between tools. By having one version of markdown. That works like other tools.

I get that perspective, but I decided to go with remark/rehype/unified because it is so extensible. The architecture seems to have internalized extensions. The flexible pipeline structure will lead to a ton of incompatible markdown compilers.
The moment I went with a non-standard syntax (this includes proposals), I knew switching would become difficult.
And I chose convenience over (proposed) standardization.

I'm curious: as directives are a proposal, are there other CommonMark compilers that implement them?


I see that you like that it is closer to CommonMark's fenced code blocks and the initial proposal this way. But CommonMark is just designed that way because you cannot distinguish between opening and closing fences of a code block. Together with the (useful) rule that more or equal ticks close a block, we end up with the current behavior. But this makes iterative nesting a pain. For code blocks it's fine, as you usually do not nest more than once. For directives, deeper nesting is more common. But for directives it is possible to distinguish opening and closing fences while keeping the second rule. So why limit ourselves? It is so much more convenient, it is easy to parse, "backwards" compatible and behind a flag. Where is the problem?

And another question, as you like to stick to standardization: would you like, or maybe even maintain, a fork of this extension which implements the pandoc standard of fenced_divs? The only reason why I want this to be merged is that I do not want to end up like all the ⚠️ extensions that did not adapt to micromark changes: https://github.com/remarkjs/remark/blob/main/doc/plugins.md.

@wooorm
Member

wooorm commented Feb 13, 2024

Of course, an exact copy would be great, but that would mean that a self-maintained fork would have to be written first.

I have maintained this stuff for 10 years and it’s quite likely that I will maintain it for 10+ more. I get to do all that work. I don’t want to reverse engineer another unspecified markdown extension. I don’t see reverse-engineering pandoc as an improvement to markdown compatibility.

Again this problem is symmetric: if you want to wrap some blocks afterwards you would need to add colons at every child, which is likewise inconvenient. In our case, the former happens quite regularly, which led to this PR.

Yes, there is as much to say for either, 50/50, agreed. The likeness to other CM features tips that scale IMO.

Again: The PR allows for both, so you can decide which style you want to use. You can even use the same amount of colons everywhere.

Is that how Pandoc works? That it doesn’t matter how many closing colons there are? Does it matter how many opening colons there are? Interesting. Wild. Then it sounds like there are 2 things discussed. Either arbitrary colons or inverting nesting.

Yes, whenever you cannot (or do not want to) distinguish between opening and closing fences (and want to nest), you can decide on one convention: more characters around or more characters for the inner. Both conventions have advantages and disadvantages. But whenever you can distinguish opening from closing fences, you do not do this, e.g. brackets in every language or tags in HTML.

I don’t understand, in the rust example, the opening and closing can be disambiguated?

The flexible pipeline structure will lead to a ton of incompatible markdown compilers.

The AST stuff is fundamentally different from the parsing stuff / markdown syntax.

And, the stuff I work on is pretty popular. If you use directives and some competitor uses unified/micromark/etc too, then markdown is interchangeable. When we stick close to the directive proposal here, and other tool makers do too, that increases the likelihood of interchangeability.

But CommonMark is just designed that way because you cannot distinguish between opening and closing fences of a code block.

CM took this idea from old GFM. I remember there being bugs with nesting. Not sure it was allowed from the start. Perhaps but not sure.

But for directives it is possible to distinguish opening and closing fences while keeping the second rule.

What is the second rule?

But why not both: This PR allows for both.

"backwards" compatible […]. Where is the problem?

I think it gets confusing fast to allow arbitrary and unmatching colons.
XML/HTML at least have a corresponding closing name to the opening.
There is more that matching-or-more allows: the closing is optional.

::::parent
:::child
::::

child here is not closed, it runs to the end of its parent.
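A rough model of that matching-or-more rule, for readers who want to see it spelled out. This is a simplified sketch under assumptions, not the actual tokenizer of this extension: `parseContainers` is a hypothetical name, only container fences are recognized, and a bare fence is assumed to be checked against the outermost open container first, the way fenced code inside containers is handled.

```js
// Simplified model of the "matching-or-more" rule; not the real tokenizer.
function parseContainers(markdown) {
  const root = {name: 'root', size: 0, children: []}
  const stack = [root] // Open containers, outermost first.

  for (const line of markdown.split('\n')) {
    const match = /^\s*(:{3,})(\S*)\s*$/.exec(line)
    const top = stack[stack.length - 1]

    if (!match) {
      // Ordinary content line: it belongs to the innermost open container.
      top.children.push(line)
    } else if (match[2]) {
      // A fence followed by a name opens a child container.
      const node = {name: match[2], size: match[1].length, children: []}
      top.children.push(node)
      stack.push(node)
    } else {
      // A bare fence closes the outermost open container it is long enough
      // for, together with everything that is still open inside it.
      const size = match[1].length
      const index = stack.findIndex((node, i) => i > 0 && node.size <= size)
      if (index === -1) top.children.push(line)
      else stack.length = index
    }
  }

  return root
}

// `child` has no closing fence of its own; the 4-colon fence ends both.
const tree = parseContainers('::::parent\n:::child\ntext\n::::\nafter')
console.log(JSON.stringify(tree, null, 2))
```

In this model `after` ends up as a sibling of `parent`, because the four-colon fence is long enough for `parent` and therefore also ends the still-open `child`.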

And, the current choices allow for fragments.

Would you like or maybe even maintain a fork of this extension which implements the pandoc standard of fenced_divs? The only reason why I want this to be merged is that I do not want to end up like all the ⚠️ extensions that did not adapt to micromark changes: remarkjs/remark@main/doc/plugins.md.

No, I am not interested in maintaining that. I don’t think looking back at Pandoc moves the language forward.

@ugogon
Author

ugogon commented Feb 19, 2024

Is that how Pandoc works?

Yes you can try it: Try pandoc

Then it sounds like there are 2 things discussed. Either arbitrary colons or inverting nesting.

Yes, I can also add this as a third option.

I don’t understand, in the rust example, the opening and closing can be disambiguated?

Fine. I looked into the Rust example and you are right, but it also does not fit my argument. It is about escaping, not nesting. Of course you do not want to perturb the raw string, so obviously we need flexible delimiters; that is why they added the option for arbitrarily many #. Other languages solved that completely differently, e.g. LaTeX's \verb or Ruby's heredoc.

And, the stuff I work on is pretty popular. If you use directives and some competitor uses unified/micromark/etc too, then markdown is interchangeable. When we stick close to the directive proposal here, and other tool makers do too, that increases the likelihood of interchangeability.

That is a circular argument. If you merge this, all tools based on unified/micromark/remark will still be interchangeable, as they are based on this repo. For others it is very likely not interchangeable: e.g. Pandoc-based codebases are not interchangeable, and if another compiler implemented the whole proposal, it would not be interchangeable either, as you do not support the whole initial proposal:

  1. Spaces are allowed between fences and names and attributes:
    :: name [content] {key=val}
    
  2. Colons are allowed after attributes
    :::::::::::: SPOILER :::::::::::::
    We're going to spoil it in three
    easy steps:
    
    1. ready
    2. steady
    3. go
    ::::::::::::::::::::::::::::::::::
    
  3. If no name is provided it falls back to default
    ::: {.myClass}
    some markdown
    :::
    
  4. There is a default span as well:
    hello [world]{.myClass}
    

Note: These are all pandoc features as well. So the initial proposal is pandoc compatible in every regard except for the nesting.
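Points 1 to 3 are the kind of difference that can be handled per line, without tracking nesting. A minimal sketch, assuming only container fences (three or more colons) matter: `normalizeFence` is a hypothetical helper, and the `div` fallback name is an assumption chosen for illustration. Lines with a `[content]` label are left untouched.

```js
// Hypothetical per-line normalizer, not part of micromark-extension-directive.
// Rewrites a Pandoc-style opening fence so that the name directly follows the
// colons, trailing colons are dropped, and a missing name falls back to
// `fallbackName` (an assumed default).
function normalizeFence(line, fallbackName = 'div') {
  const match =
    /^(\s*)(:{3,})[ \t]*([A-Za-z][\w-]*)?[ \t]*(\{[^}]*\})?[ \t]*:*[ \t]*$/.exec(
      line
    )
  if (!match) return line

  const [, indent, colons, name, attributes] = match
  // A bare fence is a closing fence: leave it alone.
  if (!name && !attributes) return line

  return indent + colons + (name || fallbackName) + (attributes || '')
}

console.log(normalizeFence('::: {.myClass}'))
// :::div{.myClass}
```

Applied line by line, this covers the spacing and trailing-colon cases; the nesting itself still needs something stateful.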


Anyway, I don't think this discussion is getting us anywhere. So I close with one question: would you accept a PR where arbitrary colons, inverted colons, and code-block-like colons are three options set by a flag, with code-block-like colons as the default?

@ugogon ugogon closed this Feb 19, 2024

Hi! This was closed. Team: If this was merged, please describe when this is likely to be released. Otherwise, please add one of the no/* labels.

@ugogon ugogon reopened this Feb 19, 2024
@codecov-commenter

codecov-commenter commented Feb 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (7f23ba8) to head (d47b0c5).
Report is 9 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##              main       #25     +/-   ##
===========================================
  Coverage   100.00%   100.00%             
===========================================
  Files            9        18      +9     
  Lines         1416      2720   +1304     
===========================================
+ Hits          1416      2720   +1304     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Labels
🤞 phase/open Post is being triaged manually

3 participants