
Refactor and class split #4432

Closed
wants to merge 4 commits

Conversation


@Esteb37 Esteb37 commented Jul 26, 2024

Summary:
Big classes are scary ☹️

This diff subdivides the tests into categories and moves them out of the App class into free functions in the gpuinfo namespace; the App class is now responsible only for persisting device information and configuration.

Differential Revision: D60290882
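The split described above can be sketched roughly as follows. This is a hypothetical Python analogue (the actual code is C++), and the names `tex_bandwidth` and `buf_cacheline_size` are illustrative, not the real signatures:

```python
# Hypothetical analogue of the refactor: App only persists device
# information and configuration, while the tests live as free
# functions (standing in for the C++ gpuinfo namespace) that take
# the App as a parameter instead of being methods on it.

class App:
    """Persists device information and configuration only."""

    def __init__(self, device_name, config):
        self.device_name = device_name
        self.config = config

# "gpuinfo" free functions, grouped by category rather than
# attached to App.
def tex_bandwidth(app):
    return f"tex_bandwidth on {app.device_name}"

def buf_cacheline_size(app):
    return f"buf_cacheline_size on {app.device_name}"

app = App("Samsung S22", {})
print(tex_bandwidth(app))
```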


pytorch-bot bot commented Jul 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4432

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit ba7f3ab with merge base b7c8378:

NEW FAILURE - The following job has failed:

  • pull / unittest / macos / macos-job (gh)
    RuntimeError: Failed to compile /var/folders/bm/fnn3xd1d39lcpbxrgwys1c140000gn/T/tmpxdba_hzc/data.json to /var/folders/bm/fnn3xd1d39lcpbxrgwys1c140000gn/T/tmpxdba_hzc/data.pte. Set ET_EXIR_SAVE_FLATC_INPUTS_ON_FAILURE=1 to save input files on failure.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Jul 26, 2024
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D60290882

Esteb37 pushed a commit to Esteb37/executorch that referenced this pull request Jul 26, 2024
Esteban Padilla Cerdio added 4 commits July 30, 2024 11:52
Summary:
Pull Request resolved: pytorch#4336

This diff introduces a profiler that obtains the maximum and minimum bandwidth for reading unique addresses from 3D textures in each of its dimensions, using the following shader, where A is a 3D texture and B is a writeonly buffer.

The calculation of the texel position depends on the dimension being benchmarked:

x: pos = ivec3(offset, 0, 0)
y: pos = ivec3(0, offset, 0)
z: pos = ivec3(0, 0, offset)

  void main() {
    vec4 sum = vec4(0);
    const uint workgroup_width = local_group_size * niter * ${NUNROLL};
    uint offset = (gl_WorkGroupID[0] * workgroup_width  + gl_LocalInvocationID[0]) & addr_mask;

    int i = 0;
    for (; i < niter; ++i)
    {
        sum *= texelFetch(A, pos, 0);
        offset = (offset + local_group_size) & addr_mask;
        ...
        ...
        sum *= texelFetch(A, pos, 0);
        offset = (offset + local_group_size) & addr_mask;
    }

    vec4 zero = vec4(i>>31);

    B[gl_LocalInvocationID[0]] = sum + zero;
  }

The address mask allows us to control how many unique addresses we are accessing. If the number of unique vectors we want to read is 3, the offset will jump between three unique addresses throughout the iterations, giving us the bandwidth for that specific data size. If the size of the unique data read is larger than the work group size, then each run will have its own block of data to read, defined by the initial offset calculation, where the offset is obtained from the workgroup ID and the local invocation ID.
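The masking scheme can be simulated with a short sketch. This is a minimal Python model of the shader's offset arithmetic, assuming the number of unique addresses is a power of two (so the mask is `num_unique - 1`); all names are illustrative:

```python
def simulate_offsets(num_unique, local_group_size, niter, start=0):
    """Model the offsets one invocation visits in the shader loop.

    num_unique must be a power of two so that (num_unique - 1) is a
    valid address mask; the mask wraps every stride back into a block
    of num_unique unique addresses.
    """
    addr_mask = num_unique - 1
    offset = start & addr_mask
    visited = []
    for _ in range(niter):
        visited.append(offset)
        offset = (offset + local_group_size) & addr_mask
    return visited

# With 4 unique addresses and a stride of 1, the offsets cycle
# 0, 1, 2, 3, 0, 1, ... so only 4 distinct texels are ever fetched.
print(simulate_offsets(num_unique=4, local_group_size=1, niter=8))
```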

Finally, we make sure to use the `sum` and `i` variables so that the compiler's optimizer does not flatten the loops.

For a Samsung S22, the bandwidth behaves like this for each of the dimensions.
{F1767497386}

Comparing the bandwidth for the X dimension to OpenCL, which was obtained through [ArchProbe](https://github.com/microsoft/ArchProbe), we can observe that, although the behavior is the same, Vulkan has an increased bandwidth for most access sizes.

{F1767497972}

Comparing against buffers, we can observe that texture bandwidth is similar to that of regular buffers, but still much smaller than that of UBOs at small access sizes.

 {F1767497707}

Reviewed By: jorgep31415

Differential Revision: D59980139
Summary:
Pull Request resolved: pytorch#4337

Now that the tool is getting larger, a configuration file for defining which tests to run and which to skip, as well as for specifying values like thresholds and ranges, comes in handy. This diff adds support for a JSON config file with specifications for each test.

Reviewed By: jorgep31415

Differential Revision: D60060188
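A config loader along these lines might look like the following sketch. The JSON schema shown is invented for illustration (field names like `enabled`, `threshold`, and `niter` are assumptions, not the tool's actual format):

```python
import json

# Hypothetical config: which tests to run or skip, plus per-test values.
CONFIG_TEXT = """
{
  "tex_bandwidth": {"enabled": true, "threshold": 0.01, "niter": 100},
  "buf_cacheline_size": {"enabled": false}
}
"""

def load_config(text):
    """Parse the JSON config; tests not listed default to enabled."""
    return json.loads(text)

cfg = load_config(CONFIG_TEXT)
enabled = [name for name, spec in cfg.items() if spec.get("enabled", True)]
print(enabled)  # only tex_bandwidth is enabled
```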
Summary:
Pull Request resolved: pytorch#4421

This diff introduces a metric to calculate the maximum concurrent cache line accesses for each dimension of a 3D texture. The experiment works by allowing each thread to access a different texel on the texture and slowly increasing the number of threads, until the cache line is no longer able to handle all simultaneous accesses. By detecting a jump in latency, we can define the optimal maximum size that can be accessed concurrently on each dimension.

NOTE: ArchProbe uses this information to [obtain a supposed cache line size for textures](https://fburl.com/98xiou3g). However, it is unclear why they define the cache line size as the ratio of the larger concurrency value over the lower, times the texel size. It is also unclear how to extend their calculations to three dimensions.

TODO: Understand the relationship between concurrency and cache line size, and modify this metric to output the cache line size.
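The jump detection described above can be sketched as follows; the sample measurements and the 1.5x jump threshold are illustrative assumptions, not the tool's actual values:

```python
def find_concurrency_limit(latencies, jump_ratio=1.5):
    """Return the last thread count before latency jumps.

    latencies: (nthreads, latency) pairs measured with increasing
    thread counts. A jump beyond jump_ratio marks the point where the
    cache line can no longer serve all accesses concurrently.
    """
    for (n_prev, t_prev), (_n, t) in zip(latencies, latencies[1:]):
        if t > t_prev * jump_ratio:
            return n_prev
    return latencies[-1][0]

# Illustrative data: latency stays flat up to 16 threads, then jumps.
data = [(2, 1.0), (4, 1.0), (8, 1.1), (16, 1.05), (32, 2.4)]
print(find_concurrency_limit(data))  # 16
```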

For a Samsung S22, the latency graph looks like this:

 {F1780375117}

Reviewed By: copyrightly

Differential Revision: D60246121
Summary:
Pull Request resolved: pytorch#4432

Big classes are scary ☹️

This diff subdivides the tests into categories and moves them out of the App class into free functions in the gpuinfo namespace; the App class is now responsible only for persisting device information and configuration.

Reviewed By: jorgep31415

Differential Revision: D60290882
@facebook-github-bot

This pull request has been merged in e03181d.

Labels: CLA Signed, fb-exported, Merged