Hey! Awesome work, the first person to write a micromark extension! I’d love to hear more about your experience.
I went through the code and added some questions/tips/bugs:
```js
import { html } from './html'

function wikiLink (opts = {}) {
  const aliasDivider = opts.aliasDivider || ':'

  const aliasMarker = aliasDivider
  // ^ Generally, I would suggest working internally on character codes instead
  // of strings though. In certain cases I had to work with string in micromark,
  // (such as w/ `<![CDATA[`), in which case I’ve used one index to keep track
  // of the position, and one string to track that “buffer”:
  // <https://github.com/micromark/micromark/blob/63cf51445077f18145d35dd3c506859d6f8195ce/lib/tokenize/html-text.js#L62-L64>
  // and
  // <https://github.com/micromark/micromark/blob/63cf51445077f18145d35dd3c506859d6f8195ce/lib/tokenize/html-text.js#L133>
  // It might be a micro optimization though: I found that using less variables
  // was the surest way to make the code small, and otherwise using string.
  // If you’re using codes, you could comment their names:
  // `(code === 91) { // left square bracket` or `(code === 91 /* '[' */) {`.
  const startMarker = '[['
  const endMarker = ']]'
  // ^ If these are always static two-character, maybe an extra state will help.
  // E.g., to parse a processing instruction (`<?x?>`) I didn’t use a buffer but
  // worked on the codes directly.

  function tokenize (effects, ok, nok) {
    var data
    var alias

    var aliasCursor = 0
    var startMarkerCursor = 0
    var endMarkerCursor = 0

    return start

    function start (code) {
      if (code !== startMarker.charCodeAt(startMarkerCursor)) return nok(code)
      // ^ this *should* not happen (as micromark knows you’re hooking into
      // `91`), but I do say to keep it in, just to be safe and signal here too
      // that you’re looking for a bracket.

      effects.enter('wikiLink')
      effects.enter('wikiLinkMarker')

      // ! You are missing a `effects.consume(code)`, or you should pass it to
      // the next with `return consumeStart(code)`.
      // While testing, you can use `micromark/lib`, which comes with
      // assertions to catch this!
      // <https://github.com/micromark/micromark#size--debug>
      return consumeStart
    }

    function consumeStart (code) {
      if (startMarkerCursor === startMarker.length) {
        effects.exit('wikiLinkMarker')
        // ! `return consumeData(code)`
        return consumeData
      }

      if (code !== startMarker.charCodeAt(startMarkerCursor)) {
        return nok(code)
      }

      effects.consume(code)
      startMarkerCursor++

      return consumeStart
    }

    function consumeData (code) {
      // ! You might also inline this into the above, because it’s called once.
      effects.enter('wikiLinkData')
      effects.enter('wikiLinkTarget')
      // ! `return consumeTarget(code)`
      return consumeTarget
    }

    function consumeTarget (code) {
      if (code === aliasMarker.charCodeAt(aliasCursor)) {
        if (!data) return nok(code)
        effects.exit('wikiLinkTarget')
        effects.enter('wikiLinkAliasMarker')
        // ! `return consumeAliasMarker(code)`
        return consumeAliasMarker
      }

      if (code === endMarker.charCodeAt(endMarkerCursor)) {
        if (!data) return nok(code)
        effects.exit('wikiLinkTarget')
        effects.exit('wikiLinkData')
        effects.enter('wikiLinkMarker')
        // ! `return consumeEnd(code)`
        return consumeEnd
      }

      // ! you might want to comment that this is CR, LF, CRLF, HT, VS, and
      // SP (whitespace, EOLs, EOF)
      if (!(code < 0 || code === 32)) {
        data = true
      }

      // ! One thing here that might be of interest to add, is support for
      // escapes: how should `[[my\]page]]` and `[a\:b]` work?
      // If that’s the case, you don’t need to care about all that is escapable
      // in Markdown (ASCII punctuation). In labels, which work somewhat
      // similar to this, I’m doing it like so:
      // <https://github.com/micromark/micromark/blob/63cf51445077f18145d35dd3c506859d6f8195ce/lib/tokenize/factory-label.js#L76>
      // you will probably also support escaping `aliasDivider.charCodeAt(0)`?
      // So in this tokenizer, you mark this as a [“string”](https://github.com/micromark/micromark/blob/63cf51445077f18145d35dd3c506859d6f8195ce/lib/tokenize/factory-label.js#L58)
      // but otherwise you “skip” over the escapes.
      // micromark will later go find all strings, which support character
      // references (`&amp;`) and character escapes (`\&`), and handle ’em.
      effects.consume()

      return consumeTarget
    }

    function consumeAliasMarker (code) {
      if (aliasCursor === aliasMarker.length) {
        effects.exit('wikiLinkAliasMarker')
        effects.enter('wikiLinkAlias')
        // ! `return consumeAlias(code)`
        return consumeAlias
      }

      if (code !== aliasMarker.charCodeAt(aliasCursor)) {
        return nok(code)
      }

      effects.consume(code)
      aliasCursor++

      return consumeAliasMarker
    }

    function consumeAlias (code) {
      if (code === endMarker.charCodeAt(endMarkerCursor)) {
        if (!alias) return nok(code)
        effects.exit('wikiLinkAlias')
        effects.exit('wikiLinkData')
        effects.enter('wikiLinkMarker')
        // ! `return consumeEnd(code)`
        return consumeEnd
      }

      if (!(code < 0 || code === 32)) {
        alias = true
      }

      // ! `code` must be consumed!!
      effects.consume()

      return consumeAlias
    }

    function consumeEnd (code) {
      if (endMarkerCursor === endMarker.length) {
        effects.exit('wikiLinkMarker')
        effects.exit('wikiLink')
        // ! `return ok(code)`
        return ok
      }

      if (code !== endMarker.charCodeAt(endMarkerCursor)) {
        return nok(code)
      }

      effects.consume(code)
      endMarkerCursor++

      // ! `return consumeEnd(code)`
      return consumeEnd
    }
  }

  // ! just fyi, no change needed: I call these things “constructs”, which could also have more fields
  var call = { tokenize: tokenize }

  return {
    text: { 91: call }
  }
}

export {
  wikiLink as syntax,
  html
}
```
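To make a few of the tips above concrete, here is a rough sketch of what the tokenizer could look like when it works on character codes directly, uses one extra state per bracket instead of cursors, consumes every code it accepts, and skips over `\]` and `\\` escapes in the target. It leaves out the alias part to stay short. The names `tokenizeWikiLink` and `run` are made up for this example, and `run` is only a toy stand-in that feeds codes to the states and logs calls; it is not how micromark itself drives a tokenizer.

```javascript
// Sketch only: `tokenizeWikiLink` and `run` are hypothetical names.
const leftSquareBracket = 91 // `[`
const backslash = 92 // `\`
const rightSquareBracket = 93 // `]`
const space = 32

function tokenizeWikiLink(effects, ok, nok) {
  let data = false

  return start

  function start(code) {
    if (code !== leftSquareBracket) return nok(code)
    effects.enter('wikiLink')
    effects.enter('wikiLinkMarker')
    effects.consume(code) // first `[`
    return startSecond
  }

  function startSecond(code) {
    if (code !== leftSquareBracket) return nok(code)
    effects.consume(code) // second `[`
    effects.exit('wikiLinkMarker')
    effects.enter('wikiLinkTarget')
    return target
  }

  function target(code) {
    if (code === null) return nok(code) // EOF before the closing marker
    if (code === rightSquareBracket) {
      if (!data) return nok(code)
      effects.exit('wikiLinkTarget')
      effects.enter('wikiLinkMarker')
      effects.consume(code) // first `]`
      return endSecond
    }
    // Negative codes are CR, LF, CRLF, HT, and VS; 32 is SP.
    if (!(code < 0 || code === space)) data = true
    effects.consume(code)
    return code === backslash ? targetEscape : target
  }

  function targetEscape(code) {
    // Only skip what matters here: an escaped `]` or `\`.
    if (code === rightSquareBracket || code === backslash) {
      effects.consume(code)
      return target
    }
    return target(code)
  }

  function endSecond(code) {
    if (code !== rightSquareBracket) return nok(code)
    effects.consume(code) // second `]`
    effects.exit('wikiLinkMarker')
    effects.exit('wikiLink')
    return ok
  }
}

// Toy driver (assumption: not micromark's real one). It feeds each character
// code, then `null` for EOF, and records every `effects` call.
function run(input) {
  const events = []
  let result = null
  const effects = {
    enter: (type) => events.push('enter:' + type),
    exit: (type) => events.push('exit:' + type),
    consume: (code) => events.push('consume:' + String.fromCharCode(code))
  }
  const ok = () => { result = 'ok' }
  const nok = () => { result = 'nok' }
  let state = tokenizeWikiLink(effects, ok, nok)
  const codes = [...input].map((c) => c.charCodeAt(0)).concat(null)
  for (const code of codes) {
    state = state(code)
    if (result) break
  }
  return { result, events }
}
```

With this toy driver, `run('[[a]]').result` and `run('[[my\\]page]]').result` come out as `'ok'`, while `run('[[]]')`, `run('[a]')`, and `run('[[a')` end in `'nok'`. Each state either consumes the code it was given or hands it on to another state, which is the thing the `micromark/lib` assertions check for.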