
[opengl] [perf] Use TI_AUTO_PROF in OpenGL backend runtime #1570

Closed

Conversation

archibate (Collaborator):

Related issue = #



```
          0.169 ms  0.03%  opengl_get_threads_per_group [18 x   9.378 us]
          0.137 ms  0.02%  compile               [7 x  19.550 us]
          3.498 ms  0.53%  link                  [7 x 499.657 us]
      6.088  s launch                        [25205 x 241.520 us]
          0.307  s  5.04%  dispatch_compute      [93605 x   3.277 us]
            171.665 ms 55.97%  glDispatchCompute     [93605 x   1.834 us]
             13.229 ms  4.31%  glMemoryBarrier       [93605 x 141.329 ns]
            121.825 ms 39.72%  [unaccounted]
          0.056  s  0.92%  launch:ext_arr1       [2400 x  23.377 us]
             55.077 ms 98.17%  add_buffer            [4800 x  11.474 us]
                 50.316 ms 91.36%  GLBuffer              [4800 x  10.483 us]
                     47.020 ms 93.45%  bind_data             [4800 x   9.796 us]
                      3.297 ms  6.55%  [unaccounted]
                  4.761 ms  8.64%  [unaccounted]
              1.027 ms  1.83%  [unaccounted]
          0.008  s  0.13%  add_buffer            [2400 x   3.377 us]
              7.083 ms 87.39%  GLBuffer              [2400 x   2.951 us]
                  6.223 ms 87.86%  bind_data             [2400 x   2.593 us]
                  0.860 ms 12.14%  [unaccounted]
              1.022 ms 12.61%  [unaccounted]
          5.686  s 93.40%  copy_back             [7200 x 789.696 us]
              5.665  s 99.64%  map                   [7200 x 786.832 us]
          0.031  s  0.51%  [unaccounted]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
```

I found that TI_AUTO_PROF is currently only used at compile time. I think it could also be useful for run-time performance tests; is this intentional?

@archibate archibate requested a review from yuanming-hu July 23, 2020 16:33
@archibate archibate changed the base branch from master to archibate-opengl July 24, 2020 13:12
yuanming-hu (Member) left a comment:


Cool! Thanks.

Would it be possible not to introduce benchmark_mpm99.py? We already have too many copies of it in the repo.

> I found that TI_AUTO_PROF is currently only used at compile time. I think it could also be useful for run-time performance tests; is this intentional?

That's intentional. Kernel profiling has very different needs from compiler profiling.

archibate (Collaborator, Author) commented Jul 25, 2020:

> Cool! Thanks.
>
> Would it be possible not to introduce benchmark_mpm99.py? We already have too many copies of it in the repo.
>
> > I found that TI_AUTO_PROF is currently only used at compile time. I think it could also be useful for run-time performance tests; is this intentional?
>
> That's intentional. Kernel profiling has very different needs from compiler profiling.

So we actually need multiple ScopedProfilers?

  1. compiler profiler (TI_AUTO_PROF, default on)
  2. runtime profiler (TI_RT_PROF, default off)
  3. Python profiler (ti.profiler, i.e. PythonProfiler from #1493, default off)

@archibate archibate added the wontfix We won't fix this issue or merge this PR label Jul 31, 2020
@archibate archibate closed this Aug 1, 2020