
[opengl] [perf] Use TI_AUTO_PROF in OpenGL backend runtime #1570

Closed

Conversation

archibate (Collaborator):

Related issue = #



```
          0.169 ms  0.03%  opengl_get_threads_per_group [18 x   9.378 us]
          0.137 ms  0.02%  compile               [7 x  19.550 us]
          3.498 ms  0.53%  link                  [7 x 499.657 us]
      6.088  s launch                        [25205 x 241.520 us]
          0.307  s  5.04%  dispatch_compute      [93605 x   3.277 us]
            171.665 ms 55.97%  glDispatchCompute     [93605 x   1.834 us]
             13.229 ms  4.31%  glMemoryBarrier       [93605 x 141.329 ns]
            121.825 ms 39.72%  [unaccounted]
          0.056  s  0.92%  launch:ext_arr1       [2400 x  23.377 us]
             55.077 ms 98.17%  add_buffer            [4800 x  11.474 us]
                 50.316 ms 91.36%  GLBuffer              [4800 x  10.483 us]
                     47.020 ms 93.45%  bind_data             [4800 x   9.796 us]
                      3.297 ms  6.55%  [unaccounted]
                  4.761 ms  8.64%  [unaccounted]
              1.027 ms  1.83%  [unaccounted]
          0.008  s  0.13%  add_buffer            [2400 x   3.377 us]
              7.083 ms 87.39%  GLBuffer              [2400 x   2.951 us]
                  6.223 ms 87.86%  bind_data             [2400 x   2.593 us]
                  0.860 ms 12.14%  [unaccounted]
              1.022 ms 12.61%  [unaccounted]
          5.686  s 93.40%  copy_back             [7200 x 789.696 us]
              5.665  s 99.64%  map                   [7200 x 786.832 us]
          0.031  s  0.51%  [unaccounted]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
```

I found that TI_AUTO_PROF is currently only used at compile time. I think it could also be useful for run-time performance tests; is this intentional?

@archibate archibate requested a review from yuanming-hu July 23, 2020 16:33
@archibate archibate changed the base branch from master to archibate-opengl July 24, 2020 13:12
yuanming-hu (Member) left a comment:


Cool! Thanks.

Would it be possible not to introduce benchmark_mpm99.py? We already have too many copies of it in the repo.

> I found that TI_AUTO_PROF is currently only used at compile time. I think it could also be useful for run-time performance tests; is this intentional?

That's intentional. Kernel profiling has very different needs from compiler profiling.

archibate (Collaborator, Author) commented Jul 25, 2020:

> Cool! Thanks.
>
> Would it be possible not to introduce benchmark_mpm99.py? We already have too many copies of it in the repo.
>
> > I found that TI_AUTO_PROF is currently only used at compile time. I think it could also be useful for run-time performance tests; is this intentional?
>
> That's intentional. Kernel profiling has very different needs from compiler profiling.

So we actually need multiple ScopedProfilers?

  1. compiler profiler (TI_AUTO_PROF, default on)
  2. runtime profiler (TI_RT_PROF, default off)
  3. Python profiler (ti.profiler, i.e. PythonProfiler from #1493, default off)

@archibate archibate added the wontfix We won't fix this issue or merge this PR label Jul 31, 2020
@archibate archibate closed this Aug 1, 2020