Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax proposal for new html attribute plugin #264

Closed
vincerubinetti opened this issue Aug 15, 2019 · 16 comments
Closed

Syntax proposal for new html attribute plugin #264

vincerubinetti opened this issue Aug 15, 2019 · 16 comments

Comments

@vincerubinetti
Copy link
Collaborator

We are a new plugin in #251 that would allow users to attach arbitrary HTML attributes to almost any element in markdown.

Briefly, it works like this: The user puts in an HTML comment like <!-- comment --> in their markdown next to the desired element. The comment will contain some kind of text indicating what HTML attribute to set and to what value. Then pandoc will pass the comment through untouched to the outputted .html. Then the plugin, when loading the manuscript, will scan the .html document for comments matching the agreed-upon syntax, parse them, and attach the attributes.


This issue is to discuss what the syntax inside the HTML comment should look like.

Here are some of the goals:

  • allow regular comment text in the same comment alongside the attribute keywords, anywhere in the comment
  • easy to remember and intuitive syntax, as consistent as possible with already-established standard syntaxes like HTML, CSS, jQuery, etc.
  • allow easy commenting in/out of individual attributes
  • avoid unintentional setting of attributes? (not sure how likely these collisions would be)

Here is my proposal

<!-- explanatory text $colspan="2" more explanatory text $style="color: red" even more explanatory text -->

The plugin will look for a $, followed by a continuous (no whitespace) word, followed by a =, followed by a pair of ".

With this, you can take out the $ to add/remove attributes fairly easily. I chose $ because jQuery uses it, and many people already are used to seeing it in HTML and javascript. We could also use #.

@agitter
Copy link
Member

agitter commented Aug 18, 2019

as consistent as possible with already-established standard syntaxes like HTML, CSS, jQuery, etc.

I agree we should aim to be consistent with these syntaxes. @vincerubinetti is there any other relevant context like the jQuery $ usage you noted that you could share to help us decide on the best syntax? I'm not very familiar with HTML and Javascript conventions.

@dhimmel
Copy link
Member

dhimmel commented Aug 28, 2019

<!-- explanatory text $colspan="2" more explanatory text $style="color: red" even more explanatory text -->

I like the proposal, but I think we should require that comments that wish to set attributions begin with setattrs (after stripping leading whitespace). This way we can avoid having to parse the majority of comments decreasing computation and avoiding any false positives that may arise. Some manuscripts may have an extreme number of comments, so I think we want avoid parsing comments that don't intend to use this feature.

As far as the attribute indicator of $ versus #, @vincerubinetti let's have you decide what's best absent a strong opinion from anyone else.

Sorry it took me a while to get around to commenting here. Looking forward to releasing #251.

@dhimmel
Copy link
Member

dhimmel commented Aug 28, 2019

I think we should require that comments that wish to set attributions begin with setattrs

I should also say that this would make me comfortable enabling the plugin in the default rootstock build, since I think it's highly unlikely someone would trigger this by mistake.

@agitter
Copy link
Member

agitter commented Aug 29, 2019

@dhimmel so your modified proposal is

<!-- setattrs explanatory text $colspan="2" more explanatory text $style="color: red" even more explanatory text -->

I have no objections to that.

@vincerubinetti
Copy link
Collaborator Author

vincerubinetti commented Aug 29, 2019

I think setattrs is unnecessary and adds complexity for what you have to remember. I already thought that it was really unlikely that anyone would accidentally trigger this without any special additions at all (ie colspan=2), but now with the added $ there's no way someone's going to use this by accident.

The html attribute names are pretty specific. No one's going to type colspan without knowing that it's an html attribute. And even if someone does accidentally trigger it, it will either change the look of the html a bit (in which case it's pretty easy to realize what's going on), or just do nothing if it's an invalid html attribute. In fact, even though there's a lot of attributes, almost all of them only apply to certain types of elements. So if you were to look through all the permutations of attributes and elements, the vast majority of them do nothing or nothing significant.

Also with regard to processing. Parsing strings in javascript is very fast and even with thousands of comments the time would be trivial (especially compared to the other processing being done by the other plugins on a manuscript of that size).

@vincerubinetti
Copy link
Collaborator Author

vincerubinetti commented Aug 29, 2019

Here is a test with 891 comments:
https://jsfiddle.net/zwL03hob/

median times on my local machine, in chrome:
find dom nodes: ~3 ms
parse comments: ~8 ms

The same test but with half of the comments starting with setattr and only parsing those:
https://jsfiddle.net/9541ujqo/

median times on my local machine, in chrome:
find dom nodes: ~3 ms
parse comments: ~6 ms

@dhimmel
Copy link
Member

dhimmel commented Aug 29, 2019

@dhimmel so your modified proposal is

Yes.

@vincerubinetti I agree the speed and reliability benefits of setattrs are modest. One other benefit of requiring a setattrs prefix is that it makes it easy to enable/disable a comment from being processed. It's often handy to be able to include HTML attributes in a disabled way, in case the user would like to activate them later. For example:

<!-- disable setattrs $colspan="2" $style="color: red" -->

would not set attributes. Without setattrs, the user would have to delete every $, which would not be too much of a problem unless there were many attributes... but it does require a bit more thought to re-enable them.

Here is a test with 891 comments:

The test is pretty cool and helpful. The type of comments where setattrs would show the greatest benefits will likely be very long comments. For example, perhaps someone will insert 1000 lines of commented source code to create a figure. Not that I suggest users do this, but that I think its possible. In this case, short-circuiting the evaluation of that comment could have more major time savings.

@vincerubinetti
Copy link
Collaborator Author

In this case, short-circuiting the evaluation of that comment could have more major time savings.

If you can provide me with some cases to test that would be helpful. But even with the most massive papers, the time savings is going to be negligible compared to the work the other plugins will also have to do, and I think it's an anti-pattern to worry about that.

If you want to be able to comment out a whole comment at once, I think that's valid. Though I don't imagine people would need to be commenting and uncommenting these types of comments very often. The use case for this plugin is already very edge case... mostly just for colspan.

@vincerubinetti
Copy link
Collaborator Author

Also maybe # instead of $ because maybe sometimes indicate a command line with $, though I've always seen it with a space in those cases.

@dhimmel
Copy link
Member

dhimmel commented Aug 29, 2019

If you can provide me with some cases to test that would be helpful

I imagine including source code for figures via comments would be one thing users will do. You could try putting this R-code for a figure from a recent publication. This file is 178 lines. In the worst case, we could imagine someone doing something like (on the order of 500 line comments) for all of their figures.

Also maybe # instead of $ because maybe sometimes indicate a command line with $, though I've always seen it with a space in those cases.

# makes sense to me... would it get confused with setting an HTML object id? If we require a setattrs prefix, we may not need a symbol at all. @vincerubinetti do you have a preference? If so, we could finalize two options and invite stakeholders to express their preference?

@vincerubinetti
Copy link
Collaborator Author

vincerubinetti commented Aug 29, 2019

Would you be able to put together a test dummy manuscript with the absolute worst case you'd ever expect (within reason) in regards to comments.

would it get confused with setting an HTML object id?

No, id is just id='whatever'. You're thinking of the CSS selector # which refers to the id, yes.

@dhimmel
Copy link
Member

dhimmel commented Aug 29, 2019

You're thinking of the CSS selector # which refers to the id, yes.

And Pandoc's header_attributes extension, which uses the following syntax:

{#identifier .class .class key=value key=value}

Would you be able to put together a test dummy manuscript with the absolute worst case you'd ever expect (within reason) in regards to comments.

Okay here's a manuscript with several 100+ line comments of code (python, R, shell, matlab, javascript). It's based off of the meta-review HTML: manuscript-with-comments.html.txt. It's not absolute worst case... some users may have bigger comments, but it should be enough for us to get meaningful performance estimates. Would also be good to see if attributes are extracted from any of these comments by the plugin.

@vincerubinetti
Copy link
Collaborator Author

And Pandoc's header_attributes extension, which uses the following syntax:

Yeah, Pandoc's format is clearly modeled after CSS, where # is an id and . is a class. CSS also has key=value, although in CSS it's [key=value].


Okay, I tested the plugin on the sample manuscript you gave me. There were 0 false positives (no comments were parsed at all), and it takes less than a millisecond on average. Tested it on mobile, and there is no perceptible lag from the plugin on my iPhone SE (fairly modern), or my HTC One (fairly old).

@vincerubinetti
Copy link
Collaborator Author

Some further testing:
https://jsfiddle.net/2vwedcgf/

Parsing the comment doesn't even take most of the time. Most of the time is taken by actually setting the attribute in the DOM. DOM manipulations are very slow compared to everything else, and as I said before, string parsing and manipulation in javascript is actually very fast.

At this point I would definitely say that testing or optimizing for speed is an anti pattern, especially considering we haven't done this for any of the other plugins. The discussion should be confined to what will be easiest for the user to remember, and what will cause the least false positives.

And as I've stated, I really think false positives are incredibly unlikely (and very very low consequences even if they do happen), so that just leaves which is easier to remember and/or use: $ or #.

The desire to be able to uncomment/comment a whole line of the plugin at once is something to be considered, but I really don't think people are going to need to switch these things on and off. The use case is already so edge. In fact, it's really unlikely people will set more than one attribute in a comment, or even use the plugin at all.

@dhimmel
Copy link
Member

dhimmel commented Sep 9, 2019

Okay, given that you didn't have any false positives on the comment-heavy example manuscript, let's proceed without setattrs as a prefix. If there is future demand to disable parsing of entire comments, we could always add a plugin setting to require settattrs.

Also I am okay enabling the plugin by default since it seems unlikely to cause much harm even if it does falsely trigger.

which is easier to remember and/or use: $ or #.

I don't have a strong opinion. @vincerubinetti your call.

Is #251 ready for review / merging?

@vincerubinetti
Copy link
Collaborator Author

Is #251 ready for review / merging?

Not ready yet. I'll have it by the end of today.

dhimmel pushed a commit that referenced this issue Sep 11, 2019
This build is based on
0dc5ea8.

This commit was created by the following Travis CI build and job:
https://travis-ci.com/manubot/rootstock/builds/127061386
https://travis-ci.com/manubot/rootstock/jobs/234117445

The full commit message that triggered this build is copied below:

Plugin to set HTML attributes via comments

Merges #251
Closes #264
Refs #240

Uses javascript to apply attributes extracted from HTML comments.
dhimmel pushed a commit that referenced this issue Sep 11, 2019
This build is based on
0dc5ea8.

This commit was created by the following Travis CI build and job:
https://travis-ci.com/manubot/rootstock/builds/127061386
https://travis-ci.com/manubot/rootstock/jobs/234117445

The full commit message that triggered this build is copied below:

Plugin to set HTML attributes via comments

Merges #251
Closes #264
Refs #240

Uses javascript to apply attributes extracted from HTML comments.
rhagenson pushed a commit to rhagenson/biological-messaging that referenced this issue Sep 18, 2019
This build is based on
0dc5ea8.

This commit was created by the following Travis CI build and job:
https://travis-ci.com/RHagenson/biological-messaging/builds/128096845
https://travis-ci.com/RHagenson/biological-messaging/jobs/236645304

The full commit message that triggered this build is copied below:

Plugin to set HTML attributes via comments

Merges manubot/rootstock#251
Closes manubot/rootstock#264
Refs manubot/rootstock#240

Uses javascript to apply attributes extracted from HTML comments.
rhagenson pushed a commit to rhagenson/biological-messaging that referenced this issue Sep 18, 2019
This build is based on
0dc5ea8.

This commit was created by the following Travis CI build and job:
https://travis-ci.com/RHagenson/biological-messaging/builds/128096845
https://travis-ci.com/RHagenson/biological-messaging/jobs/236645304

The full commit message that triggered this build is copied below:

Plugin to set HTML attributes via comments

Merges manubot/rootstock#251
Closes manubot/rootstock#264
Refs manubot/rootstock#240

Uses javascript to apply attributes extracted from HTML comments.
adebali pushed a commit to CompGenomeLab/lemur-manuscript-archive that referenced this issue Mar 4, 2020
Merges manubot/rootstock#251
Closes manubot/rootstock#264
Refs manubot/rootstock#240

Uses javascript to apply attributes extracted from HTML comments.
ploegieku added a commit to ploegieku/2023-functional-homology-paper that referenced this issue Aug 6, 2024
Merges manubot/rootstock#251
Closes manubot/rootstock#264
Refs manubot/rootstock#240

Uses javascript to apply attributes extracted from HTML comments.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants