optimizing nim code for performance
Inline procs WILL be noticeably slower than templates in debug builds (or any build with --stacktrace) unless #13582 is merged and the user passes the --stacktrace:noinline option that it enables.
In debug builds, even with --stacktrace:off and regardless of #13582, I've observed that templates are a little faster than inline procs; with danger builds they run at the same speed (and maybe with -d:release too, I haven't checked).
However, IMO codegen can be improved to make inline procs and templates the same speed even in debug builds (with --stacktrace:noinline or --stacktrace:off) by removing the redundant assignment that an inline proc incurs.
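To make the comparison concrete, here is a minimal sketch of how the inline proc vs template difference could be measured. It is not the benchmark behind the observations above: the names addInline, addTemplate, sumWithProc, sumWithTemplate and timeIt are hypothetical, and the workload is deliberately trivial, so treat any timings as indicative only.

```nim
import std/[monotimes, times]

# Hypothetical helpers for illustration only; not from the original benchmark.
proc addInline(a, b: int): int {.inline.} =
  a + b

template addTemplate(a, b: int): int =
  a + b

proc sumWithProc(n: int): int =
  # Goes through the {.inline.} proc on every iteration.
  for i in 0 ..< n:
    result = addInline(result, i and 1)

proc sumWithTemplate(n: int): int =
  # Same work, but the template is expanded in place by the Nim compiler.
  for i in 0 ..< n:
    result = addTemplate(result, i and 1)

template timeIt(label: string, body: untyped) =
  # Wall-clock timing of `body` using the monotonic clock.
  let start = getMonoTime()
  body
  echo label, ": ", (getMonoTime() - start).inMilliseconds, " ms"

when isMainModule:
  const n = 50_000_000
  var a, b: int
  timeIt "inline proc":
    a = sumWithProc(n)
  timeIt "template":
    b = sumWithTemplate(n)
  # Both variants must compute the same value.
  doAssert a == b
  echo a
```

Assuming the file is saved as bench.nim (a made-up name), compiling it once with `nim c -r bench.nim`, once with `nim c -r --stacktrace:off bench.nim` and once with `nim c -r -d:danger bench.nim` should show whether the gap appears in each mode on a given machine.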
const t = "abcdefghijklmnopqrstuvwxyz".indent(65).cstring
Microbenchmarks are misleading: in reality you can also have I-cache problems, and then you appreciate that not everything was "optimized" by 8x loop unrolling with SSE instructions and 1KB lookup tables.
https://lists.llvm.org/pipermail/llvm-dev/2016-February/095032.html