Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement RFC 416: lazy symbol semantic checking: enable cyclic deps, remove need for fwd decls, etc #18818

Closed
wants to merge 155 commits into from

Conversation

timotheecour
Copy link
Member

@timotheecour timotheecour commented Sep 7, 2021

IMO this is the most important feature that was missing in nim, lack of cyclic imports was causing headaches for project organization with long ranging effects. This PR fixes this.

  • fix lazy symbol semantic checking RFCs#416
  • enable cyclic deps
  • remove need for fwd decls
  • can lead to large compile time speedups (and smaller binaries) if many symbols are unused in the transitive depdendencies

semantic notes

epilogue: semchecking of unused symbols

  • epilogue: by default, unused symbols are semchecked at the end, after all modules have been processed; this can be controlled by a flag (eg for applications where compilation times are most important (eg inim), these can be skipped)

local epilogue is triggered by compiles and macro typed params

compiles is also lazy, but it triggers a local epilogue for the symbols in the generated code:

template bar(cond) =
  proc fn1()=fn2()
  proc fn2()=fn3()
  proc fn3()=
    static: doAssert cond
doAssert not compiles(bar(false))
doAssert compiles(bar(true))

likewise for typed macro params; note that the epilogue only concerns symbols in the generated code, it won't trigger exhaustive semchecking for unrelated code, eg:

template bar =
  import std/strutils
  proc fn: auto = split("ab")
echo compiles(bar())
  # will trigger semchecking of fn (even if fn is unused), but will not trigger semchecking
  # of unused symbols in strutils, by design

implementation notes

  • to enable keeping track of symbol context (feature, options, pushed pragmas, etc), I'm using an approach similar to PScope: the seq[TOptionEntry] is replaced by a heap allocated POptionEntry which contains parent references; this allows retrieving efficiently the correct context at declaration time
  • handling pre-existing fwd declarations turned out to be tricky (note that they're not needed after this PR, but the code should still work when those are defined in existing code); see more details here: Pr lazysemcheck exp4 timotheecour/Nim#809 (comment); I've handled these by adding both fwd and impl symbols to symbol table, and then if semchecking if triggered, I'm filtering out the impl declaration to avoid duplicate definition errors
  • XDeclaredButNotUsed, UnusedImport are processed after epilogue and work as you'd expect them to; they also work with nested symbols, unlike before PR

2-pass vs 3-pass rountine semchecking

in this PR, each symbol from a routine declaration is semchecked at most twice:

  • 1st pass creates the symbol and adds in to symbol table
  • 2nd pass (triggered by a symbol use, excluding generic pre-pass) semchecks both the declaration and the routine body

this works but has downside of causing all overloads to be fully semchecked when the name is used:

proc fn(a: int) = discard
proc fn(a: string) = discard
fn(1) # this currently triggers semcheck of both fn overloads (and their bodies)

proc fn2(a: int) = discard
type T = typeof fn(1) # this currently triggers semcheck of fn2 and its body even though semchecking the body should be deferred

in future work, this can be refined by using 3-pass semchecking, which splits the 2nd pass into a pass that only semchecks declaration, and if the symbol is accepted for sigmatch, then the routine body is semchecked only when needed, which enables more laziness and further reduces limitations on cycles

limitations to what declarations can be lazy semchecked

symbols that have an implicit behavior cannot be semchecked lazily (at least their declaration) because we'd have no way to guess whether an implicit rewrite can use those; this doesn't cover many symbols though; only the following are semchecked in a non-lazy way:

  • converters
  • routines with patterns (eg macro optOps{ (+|-|*) ** a }(a: TMat): untyped, see trmacros_various.nim)
  • type bound operators like =destroy and =copy as they affect semantics implicitly
  • procs annotated with {.gensym.} defined inside macros
  • methods: they almost work except for 1 case (see D20210909T094624) so for now they're processed non-lazily

future work can lift some of those restrictions

backward compatibility

this PR (after i remove temporary debugging/testing) should have no semantic differences by default; the lazy semchecking should only be enabled by a opt-in flag

TODO

  • remove hack in isCompilerProc
  • rename flags after bikeshedding, possibly --experimental:lazy (simple), or --lazy:on|off|aggressive where on would mean semcheck in epilogue, aggressive would mean skip smechecking of unused symbols in epilogue
  • a lot of remaining problems (eg from nimble package failures) are because effect inference currently returns pessimistic results in some cases with lazy semchecking; this is fixable with some care (and workaround is to force early semchecking one problematic declaration, via type _ = typeof(foo))

status

  • with lazy:off, everything should work as before
  • with lazy:on in config/config.nims:
    • nim bootstraps
    • sh build_all.sh works (including all tools that it builds)
    • ./koch --localdocs:/tmp/htmldocs docs works (including all the runnableExamples)
    • ./koch docs works (ditto)
    • nim CI tests now all pass (1755 tests, plus megatest which itself represents 626 tests) except for:
      • ic tests (4 fail / 14)
      • navigator tests (2/2; this is part of ic system)
      • (only fails on linux): tests/coroutines/tgc.nim + tests/coroutines/twait.nim, with same error
    • important_packages: (4+6+7)/(38+38+38) = 17/114 tests fail

links

compiler/modulegraphs.nim Outdated Show resolved Hide resolved
@Varriount Varriount changed the title fix RFC 416: lazy symbol semantic checking: enable cyclic deps, remove need for fwd decls, etc Implement RFC 416: lazy symbol semantic checking: enable cyclic deps, remove need for fwd decls, etc Sep 7, 2021
@Varriount
Copy link
Contributor

What are the semantic differences when the flag is enabled (aside from the general lazyness)? I assume that the intention here is to eventually make this mode the default.

@ringabout ringabout mentioned this pull request May 11, 2022
33 tasks
@Araq Araq mentioned this pull request Dec 29, 2022
12 tasks
@stale
Copy link

stale bot commented Jan 7, 2023

This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions.

@stale stale bot added the stale Staled PR/issues; remove the label after fixing them label Jan 7, 2023
@stale stale bot closed this Feb 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Staled PR/issues; remove the label after fixing them
Projects
None yet
Development

Successfully merging this pull request may close these issues.

lazy symbol semantic checking
3 participants