optimizing nim code for performance

template vs inline performance

see https://github.com/nim-lang/Nim/pull/13536#issuecomment-594231970

Inline procs WILL be noticeably slower than templates in debug builds (or in any build with --stacktrace) unless #13582 is merged and the user passes the --stacktrace:noinline option that it enables.

In debug builds (even with --stacktrace:off, and regardless of #13582) I've observed that templates are a little bit faster than inline procs; in danger builds they run at the same speed (and maybe with -d:release too, I haven't checked).

However, IMO codegen can be improved to make inline procs and templates the same speed even in debug builds (with --stacktrace:noinline or --stacktrace:off) by removing the redundant assignment that an inline proc incurs.
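For readers who want to reproduce the comparison, here is a minimal sketch of the two forms being discussed. The names addInline and addTemplate are hypothetical and not taken from the linked PR; the loop only shows that both forms compute the same result and is not a benchmark.

```nim
# Hypothetical names, used only for illustration.

proc addInline(a, b: int): int {.inline.} =
  ## A real proc; the {.inline.} pragma asks the backend C compiler
  ## to inline calls to it.
  a + b

template addTemplate(a, b: int): int =
  ## Expanded at the call site by the Nim compiler itself,
  ## so there is no proc call left in the generated code.
  a + b

var total = 0
for i in 1 .. 10:
  total += addInline(i, 1)    # goes through the proc
  total += addTemplate(i, 1)  # substituted in place

echo total
```

To compare the build modes mentioned above, one could compile the same file (say, a hypothetical bench.nim with a longer loop) with `nim c -r bench.nim`, `nim c --stacktrace:off -r bench.nim` and `nim c -d:danger -r bench.nim`, and time each run.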

inline semantics

see https://github.com/nim-lang/RFCs/issues/198

pgo

see https://github.com/nim-lang/RFCs/issues/198#issuecomment-597402688

optimization tricks

const t = "abcdefghijklmnopqrstuvwxyz".indent(65).cstring

Microbenchmarks are misleading: in reality you can also have I-cache problems, and then you appreciate that not everything was "optimized" by 8 times loop unrolling with SSE instructions and 1KB lookup tables.

strlen(s) < 5 => strnlen(s, 5) < 5

https://lists.llvm.org/pipermail/llvm-dev/2016-February/095032.html
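The rewrite in the heading above can be sketched in Nim by importing the two C functions. This assumes a libc that provides strnlen (it is POSIX, not ISO C); the wrapper names libcStrlen, libcStrnlen, shorterThan5Slow and shorterThan5Fast are hypothetical. The point is that strnlen stops after at most 5 bytes, while strlen walks the whole string before the comparison is made.

```nim
import std/strutils  # only for `repeat`, used to build a long test string

# Assumption: the platform's <string.h> declares strnlen (POSIX.1-2008).
proc libcStrlen(s: cstring): csize_t
  {.importc: "strlen", header: "<string.h>".}
proc libcStrnlen(s: cstring, maxlen: csize_t): csize_t
  {.importc: "strnlen", header: "<string.h>".}

proc shorterThan5Slow(s: cstring): bool =
  ## Scans the whole string, however long, just to compare with 5.
  libcStrlen(s) < 5

proc shorterThan5Fast(s: cstring): bool =
  ## strnlen returns min(strlen(s), 5), so the comparison gives the
  ## same answer while reading at most 5 bytes.
  libcStrnlen(s, 5) < 5

let long = repeat('x', 100_000)
echo shorterThan5Slow("abc"), " ", shorterThan5Fast("abc")
echo shorterThan5Slow(long.cstring), " ", shorterThan5Fast(long.cstring)
```

Both procs agree on every input; only the amount of memory scanned differs.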


see the generated profile reports in nim-lang/Nim pull request #14417 ("The whole options module should be inline", by mratsim)
