WIP: use dedent for de-indentation in lexer, fix #15184 #16699

a-mr · 2021-01-12T20:12:23Z

When the first line of a doc comment is empty, lexer.nim records the indentation at that line and keeps one space on the following lines:

##
## x

This is tokenized as a comment whose tok.literal keeps a space before x, which RST treats as a block quote.

a-mr · 2021-01-12T20:33:57Z

Bummer: there is no dedent in csources, because csources is at version 0.20.0.

It worked so well in my local copy :-(

timotheecour · 2021-01-13T00:25:16Z

@a-mr
you're hitting #16646, see workaround mentioned there which works:

in strutils, change as follows:

since (1, 3):
  func indentation*(s: string): Natural = ...
  func dedent*(s: string, count: Natural = indentation(s)): string {.rtl,
      extern: "nsuDedent".} = ...

after that you're hitting timotheecour#521,
for which there's also a workaround:
add this to compiler/lexer.nim:

proc dedent(a: string): string =
  # pending https://github.com/timotheecour/Nim/issues/521
  let b = a
  let i = indentation(b)
  dedent(a, i)

then this works

Araq · 2021-01-13T15:06:20Z

Why can't we fix the existing logic instead...

timotheecour · 2021-01-13T17:25:37Z

To avoid duplicating code. Code reuse is good.

a-mr · 2021-01-13T17:32:27Z

@Araq, @timotheecour: consider another edge case. Assume we really want to start a comment with a quote:

proc f* =
  ##   Quote
  ## Paragraph
  discard

The current logic sets the indentation to 0 at "Quote", then re-adapts it to 0 again at "Paragraph", so tok.literal wrongly comes out as:

Quote
Paragraph

To get the indentation right in such cases, we need to look ahead through the entire string and find the non-whitespace character with minimal indentation. That is what dedent does, so we are back to re-inventing dedent.

I think we can avoid the workarounds proposed by timotheecour and just copy a short (few lines long) implementation of dedent into lexer.nim.
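To illustrate the look-ahead described above, here is a minimal sketch of a minimal-indentation dedent over comment lines. The names minIndent and dedentLines are hypothetical, and the input is assumed to be the text of each line after the "##" marker has already been removed. This is not the stdlib dedent nor the code from this PR.

```nim
import std/strutils

proc minIndent(lines: openArray[string]): int =
  ## Smallest number of leading spaces among non-blank lines (0 if all are blank).
  ## Hypothetical helper, for illustration only.
  result = int.high
  for line in lines:
    if line.strip.len == 0: continue  # blank lines do not constrain indentation
    var n = 0
    while n < line.len and line[n] == ' ': inc n
    result = min(result, n)
  if result == int.high: result = 0

proc dedentLines(lines: openArray[string]): seq[string] =
  ## Removes the common indentation from every non-blank line.
  let n = minIndent(lines)
  for line in lines:
    if line.strip.len == 0: result.add ""
    else: result.add line[n .. ^1]

# Line bodies of the Quote/Paragraph example, with "##" already stripped.
let body = ["   Quote", " Paragraph"]
doAssert dedentLines(body) == @["  Quote", "Paragraph"]
```

With this approach "Quote" stays indented relative to "Paragraph", so RST would see it as a block quote, and the empty-first-line case ("", " x") yields "x" with no leading space.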

Alternative solution

We could specify that zero or one space after the comment marker corresponds to zero indentation, and that indentation grows linearly from there. This avoids any de-indentation except a trivial crop that can be done on each line separately: #x -> "x", # x -> "x", #  x -> " x", etc.

initial indentation, spaces after ##	resulting indentation
0	0
1	0
2	1
..	..
n	n-1

The side effect of this decision is that a block quote appears if, e.g., 2 spaces follow "#":

const pi = 3.14 ##  I made a mistake and put 2 spaces accidentally and would get a block quote all of a sudden :-)
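The per-line crop rule from the table above can be sketched as follows. The name cropComment is hypothetical, and the input is assumed to be the text following the comment marker. It shows how two spaces leave one space of indentation behind, which is the side effect just mentioned.

```nim
proc cropComment(afterMarker: string): string =
  ## Drops at most one leading space from the text following the comment marker.
  ## Hypothetical helper, for illustration only.
  if afterMarker.len > 0 and afterMarker[0] == ' ':
    afterMarker[1 .. ^1]
  else:
    afterMarker

doAssert cropComment("x") == "x"      # 0 spaces -> indentation 0
doAssert cropComment(" x") == "x"     # 1 space  -> indentation 0
doAssert cropComment("  x") == " x"   # 2 spaces -> indentation 1
```

Because each line is handled independently, no look-ahead over the whole comment is needed.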

Araq · 2021-01-14T10:12:49Z

> To avoid duplicating code. Code reuse is good.

So do it in a way that doesn't slow things down. Also: I avoid helpers from the stdlib nowadays because you never know if one gets "fixed" in an incompatible manner because it's "inconsistent with Python/Unix/some other proc somewhere else".

use dedent for de-indentation in lexer, fix nim-lang#15184

d96c9d7

a-mr closed this Jan 12, 2021

timotheecour mentioned this pull request Jan 13, 2021

bootstrap: Error: internal error: environment misses: s timotheecour/Nim#521

Open

timotheecour reopened this Jan 13, 2021

apply timotheecour's suggestion

32bd75b

just to make tests pass

d65f470

a-mr mentioned this pull request Jan 14, 2021

conservative approach to fix #15184 #16723

Merged

Araq closed this Jan 14, 2021

a-mr mentioned this pull request Mar 7, 2021

fix RST parsing when no indent after enum.item (fix #17249) #17257

Merged

timotheecour mentioned this pull request Apr 29, 2021

close #16646; since now works with bootstrap nim post csources_v1 #17895

Merged

