[OpenGL] [perf] Support TLS to improve reduction performance #1574

Merged 4 commits, Jul 26, 2020

archibate
Collaborator

Related PR = #1336

Running on my poor Intel card:
fem99: 4.6 -> 40 (!!!!)
fem128: 18 -> 26
mpm_lagrangian_forces: 3.8 -> 15.7
misc/benchmark_reduction.py: GL_OUT_OF_MEMORY :(

No performance difference observed on the NV card. It seems NV's compiler is already capable of doing the TLS optimization for us?
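
For readers unfamiliar with the term, TLS here is thread-local storage: each thread accumulates its partial result in a private variable and writes to the shared result only once at the end. The sketch below is not the code from this PR. It is a plain-Python model in which ordinary loops stand in for GPU threads, a grid-stride work split is assumed, and every function name is hypothetical. It only counts how many updates reach the shared accumulator under each scheme.

```python
# Illustrative model only: plain Python standing in for GPU threads.
# None of these names come from Taichi or from this PR.

def reduce_with_global_atomics(data, num_threads):
    """Every element is added straight into the shared accumulator."""
    total = 0
    atomic_ops = 0
    for tid in range(num_threads):
        # Assumed grid-stride split: thread `tid` handles tid, tid + num_threads, ...
        for i in range(tid, len(data), num_threads):
            total += data[i]      # stands in for one atomic add on global memory
            atomic_ops += 1
    return total, atomic_ops


def reduce_with_tls(data, num_threads):
    """Each thread accumulates privately, then commits once."""
    total = 0
    atomic_ops = 0
    for tid in range(num_threads):
        local = 0                 # thread-local accumulator
        for i in range(tid, len(data), num_threads):
            local += data[i]      # no shared-memory traffic here
        total += local            # stands in for one atomic add per thread
        atomic_ops += 1
    return total, atomic_ops


data = list(range(1000))
naive_total, naive_ops = reduce_with_global_atomics(data, num_threads=8)
tls_total, tls_ops = reduce_with_tls(data, num_threads=8)
print(naive_total == tls_total, naive_ops, tls_ops)  # True 1000 8
```

In this model the shared accumulator is touched once per element without TLS and once per thread with it, which is the kind of saving that would matter most on hardware whose compiler does not perform this transformation on its own.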

(Partial) OFT:
[cli] [opengl] [metal] TI_MAKE_THREAD_LOCAL=0 to disable TLS on OpenGL/Metal.
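
A minimal sketch of how the switch could be applied when comparing runs with and without TLS, assuming the variable is read from the process environment when the program starts (the changelog line above gives only its name and value). The child process here is a placeholder that echoes the variable back; in practice it would be the Taichi script being benchmarked.

```python
import os
import subprocess
import sys

# Copy the current environment and add the switch named in the changelog line.
env = dict(os.environ, TI_MAKE_THREAD_LOCAL="0")

# Placeholder child process: it only echoes the variable back.
# Replace `-c child_code` with the path of the script to benchmark.
child_code = "import os; print(os.environ.get('TI_MAKE_THREAD_LOCAL'))"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
seen_by_child = result.stdout.strip()
print(seen_by_child)
```

Passing the variable through `env` keeps the parent process unchanged, so the same driver can launch one run with the switch set and one without.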

@archibate archibate changed the base branch from master to archibate-opengl July 24, 2020 09:37
@archibate archibate changed the title [OpenGL] [perf] [cli] Support TLS to improve reduction performance [OpenGL] [perf] Support TLS to improve reduction performance Jul 24, 2020
Member

@yuanming-hu yuanming-hu left a comment

Thank you! LGTM. My only concern is that thls seems to be used in place of tls in some places. Is that intentional?

@codecov

codecov bot commented Jul 25, 2020

Codecov Report

Merging #1574 into master will increase coverage by 0.05%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1574      +/-   ##
==========================================
+ Coverage   67.47%   67.53%   +0.05%     
==========================================
  Files          40       40              
  Lines        5624     5630       +6     
  Branches      982      982              
==========================================
+ Hits         3795     3802       +7     
+ Misses       1661     1660       -1     
  Partials      168      168              
Impacted Files                    Coverage Δ
python/taichi/lang/__init__.py    80.20% <ø> (-0.01%) ⬇️
python/taichi/lang/ops.py         93.33% <0.00%> (+0.81%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f528377...05ade2f.

@archibate
Collaborator Author

Thank you! LGTM. My only concern is that thls seems to be used in place of tls in some places. Is that intentional?

Sorry, that's intentional. For now all the buffer names (root, args, gtmp, ...) are 4 chars.
But I think it's OK to make tls 3 chars, since it's not a global buffer but a per-thread buffer. Thanks for pointing that out.

@archibate archibate changed the base branch from archibate-opengl to master July 26, 2020 09:04
@archibate archibate merged commit 0970902 into taichi-dev:master Jul 26, 2020
@FantasyVR FantasyVR mentioned this pull request Jul 29, 2020
3 participants