v0.6.22
Highlights:
- Language and syntax
  - Support SNode trailing bits (#1558) (by Yuanming Hu)
- OpenGL backend
  - Support 'ti.asm' to insert embedded GLSL code (experimental) (#1573) (by 彭于斌)
- Performance improvements
  - Improve CUDA runtime performance with warp-level primitives (#1571) (by Yuanming Hu)
Full changelog:
- [cuda] [bug] Fix a CUDA codegen bug (#1592) (by Yuanming Hu)
- [test] Fix issues in "bls_particle_grid" tests caused by floating-point errors (#1590) (by Yuanming Hu)
- [llvm] Fix LLVM runtime sparse computation issues (#1582) (by Yuanming Hu)
- [OpenGL] Support 'ti.asm' to insert embedded GLSL code (experimental) (#1573) (by 彭于斌)
- [example] Upgrade mpm88 to new syntax (#1581) (by 彭于斌)
- [gui] [error] [linux] Better error message when the X display is not available (#1575) (by 彭于斌)
- [test] Skip mpm88 async on Appveyor (#1566) (by Ye Kuang)
- [Perf] Improve CUDA runtime performance with warp-level primitives (#1571) (by Yuanming Hu)
- [async] Fuse tasks only if they are either from the same kernel or arg-less (#1530) (by Ye Kuang)
- [cc] Support ActionRecorder in C backend (#1559) (by 彭于斌)
- [metal] Plug in the SNodeRep structs into codegen (#1480) (by Ye Kuang)
- [Lang] Support SNode trailing bits (#1558) (by Yuanming Hu)