[BE] single dim check helper #1192
Merged
vmoens added a commit that referenced this pull request on Jan 26, 2025
ghstack-source-id: 2a530b64cbc285544cf4abb7b2970d1f8ffee321
Pull Request resolved: #1192
vmoens added a commit that referenced this pull request on Feb 4, 2025
ghstack-source-id: 6606e4b96061f73b98787b25129c29671a78dc1e
Pull Request resolved: #1192
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 55.3940μs | 21.0232μs | 47.5666 KOps/s | 47.7521 KOps/s | |
test_plain_set_stack_nested | 50.2240μs | 21.1877μs | 47.1972 KOps/s | 46.7442 KOps/s | |
test_plain_set_nested_inplace | 0.2305ms | 23.3646μs | 42.7998 KOps/s | 42.3122 KOps/s | |
test_plain_set_stack_nested_inplace | 60.8840μs | 22.7876μs | 43.8834 KOps/s | 43.8717 KOps/s | |
test_items | 38.3410μs | 4.1956μs | 238.3456 KOps/s | 241.8737 KOps/s | |
test_items_nested | 0.7202ms | 0.4012ms | 2.4924 KOps/s | 2.4269 KOps/s | |
test_items_nested_locked | 1.0868ms | 0.4060ms | 2.4631 KOps/s | 2.4146 KOps/s | |
test_items_nested_leaf | 0.2341ms | 76.1676μs | 13.1289 KOps/s | 12.9780 KOps/s | |
test_items_stack_nested | 0.7625ms | 0.4036ms | 2.4778 KOps/s | 2.4105 KOps/s | |
test_items_stack_nested_leaf | 0.1510ms | 77.5087μs | 12.9018 KOps/s | 12.6439 KOps/s | |
test_items_stack_nested_locked | 0.4954ms | 0.4020ms | 2.4878 KOps/s | 2.4198 KOps/s | |
test_keys | 27.1810μs | 3.5634μs | 280.6330 KOps/s | 280.6363 KOps/s | |
test_keys_nested | 0.2894ms | 0.1642ms | 6.0891 KOps/s | 6.0130 KOps/s | |
test_keys_nested_locked | 0.6334ms | 0.1688ms | 5.9241 KOps/s | 5.8139 KOps/s | |
test_keys_nested_leaf | 0.1979ms | 0.1411ms | 7.0892 KOps/s | 6.9455 KOps/s | |
test_keys_stack_nested | 0.2940ms | 0.1624ms | 6.1557 KOps/s | 6.0318 KOps/s | |
test_keys_stack_nested_leaf | 0.1999ms | 0.1422ms | 7.0332 KOps/s | 6.9487 KOps/s | |
test_keys_stack_nested_locked | 0.2948ms | 0.1690ms | 5.9186 KOps/s | 5.8444 KOps/s | |
test_values | 7.7986μs | 1.0978μs | 910.9397 KOps/s | 944.7889 KOps/s | |
test_values_nested | 0.1474ms | 62.9654μs | 15.8817 KOps/s | 15.7656 KOps/s | |
test_values_nested_locked | 0.1211ms | 62.3900μs | 16.0282 KOps/s | 15.2336 KOps/s | |
test_values_nested_leaf | 0.1226ms | 71.3467μs | 14.0161 KOps/s | 13.3743 KOps/s | |
test_values_stack_nested | 0.1114ms | 62.7262μs | 15.9423 KOps/s | 15.8112 KOps/s | |
test_values_stack_nested_leaf | 0.1472ms | 72.0882μs | 13.8719 KOps/s | 13.8173 KOps/s | |
test_values_stack_nested_locked | 0.1311ms | 62.5139μs | 15.9964 KOps/s | 15.6216 KOps/s | |
test_membership | 22.4020μs | 0.8575μs | 1.1662 MOps/s | 1.1808 MOps/s | |
test_membership_nested | 28.5130μs | 2.9023μs | 344.5576 KOps/s | 334.2481 KOps/s | |
test_membership_nested_leaf | 56.3650μs | 2.9280μs | 341.5319 KOps/s | 331.1453 KOps/s | |
test_membership_stacked_nested | 23.1630μs | 2.8834μs | 346.8094 KOps/s | 330.1801 KOps/s | |
test_membership_stacked_nested_leaf | 46.6870μs | 2.9231μs | 342.1008 KOps/s | 311.7206 KOps/s | |
test_membership_nested_last | 27.4510μs | 4.3772μs | 228.4560 KOps/s | 224.4534 KOps/s | |
test_membership_nested_leaf_last | 32.0490μs | 4.3776μs | 228.4372 KOps/s | 226.5486 KOps/s | |
test_membership_stacked_nested_last | 28.9040μs | 5.1106μs | 195.6707 KOps/s | 227.7240 KOps/s | |
test_membership_stacked_nested_leaf_last | 24.2450μs | 5.1910μs | 192.6400 KOps/s | 224.1463 KOps/s | |
test_nested_getleaf | 37.7710μs | 10.6481μs | 93.9134 KOps/s | 95.0714 KOps/s | |
test_nested_get | 44.4030μs | 10.1073μs | 98.9388 KOps/s | 99.7321 KOps/s | |
test_stacked_getleaf | 38.5630μs | 10.6001μs | 94.3385 KOps/s | 95.5766 KOps/s | |
test_stacked_get | 33.2320μs | 10.1168μs | 98.8457 KOps/s | 100.0133 KOps/s | |
test_nested_getitemleaf | 30.0050μs | 11.2103μs | 89.2037 KOps/s | 88.5064 KOps/s | |
test_nested_getitem | 64.2500μs | 10.8035μs | 92.5629 KOps/s | 93.0639 KOps/s | |
test_stacked_getitemleaf | 34.8050μs | 11.0983μs | 90.1040 KOps/s | 88.9266 KOps/s | |
test_stacked_getitem | 34.1840μs | 10.6237μs | 94.1292 KOps/s | 93.5148 KOps/s | |
test_lock_nested | 0.5397ms | 0.4047ms | 2.4712 KOps/s | 2.4317 KOps/s | |
test_lock_stack_nested | 0.6483ms | 0.4157ms | 2.4055 KOps/s | 2.3279 KOps/s | |
test_unlock_nested | 0.5278ms | 0.3347ms | 2.9878 KOps/s | 2.9358 KOps/s | |
test_unlock_stack_nested | 0.5810ms | 0.3392ms | 2.9480 KOps/s | 2.8956 KOps/s | |
test_flatten_speed | 0.1792ms | 98.6519μs | 10.1366 KOps/s | 9.9916 KOps/s | |
test_unflatten_speed | 1.0653ms | 0.5222ms | 1.9151 KOps/s | 1.8920 KOps/s | |
test_common_ops | 5.4736ms | 0.8327ms | 1.2010 KOps/s | 1.1701 KOps/s | |
test_creation | 25.4780μs | 2.5177μs | 397.1937 KOps/s | 398.9191 KOps/s | |
test_creation_empty | 35.8270μs | 12.9520μs | 77.2084 KOps/s | 79.5355 KOps/s | |
test_creation_nested_1 | 42.3790μs | 15.9093μs | 62.8565 KOps/s | 64.6569 KOps/s | |
test_creation_nested_2 | 44.0620μs | 20.5928μs | 48.5608 KOps/s | 49.6124 KOps/s | |
test_clone | 0.1395ms | 13.5349μs | 73.8829 KOps/s | 72.1528 KOps/s | |
test_getitem[int] | 0.8421ms | 12.6102μs | 79.3011 KOps/s | 77.0860 KOps/s | |
test_getitem[slice_int] | 0.1452ms | 24.8130μs | 40.3014 KOps/s | 39.9243 KOps/s | |
test_getitem[range] | 0.1743ms | 50.6078μs | 19.7598 KOps/s | 20.0240 KOps/s | |
test_getitem[tuple] | 0.1253ms | 20.0512μs | 49.8724 KOps/s | 49.0721 KOps/s | |
test_getitem[list] | 0.1623ms | 45.4184μs | 22.0175 KOps/s | 21.8117 KOps/s | |
test_setitem_dim[int] | 57.9280μs | 26.2591μs | 38.0820 KOps/s | 37.4237 KOps/s | |
test_setitem_dim[slice_int] | 83.0950μs | 50.6163μs | 19.7565 KOps/s | 18.8132 KOps/s | |
test_setitem_dim[range] | 0.1301ms | 76.0775μs | 13.1445 KOps/s | 12.7630 KOps/s | |
test_setitem_dim[tuple] | 84.2270μs | 41.0990μs | 24.3315 KOps/s | 23.3920 KOps/s | |
test_setitem | 0.1784ms | 21.1192μs | 47.3503 KOps/s | 46.9634 KOps/s | |
test_set | 0.2297ms | 20.6319μs | 48.4686 KOps/s | 48.4636 KOps/s | |
test_set_shared | 0.3907ms | 0.1792ms | 5.5801 KOps/s | 5.4301 KOps/s | |
test_update | 0.1942ms | 24.0813μs | 41.5260 KOps/s | 41.4814 KOps/s | |
test_update_nested | 0.2045ms | 33.2469μs | 30.0780 KOps/s | 28.8998 KOps/s | |
test_update__nested | 0.5030ms | 33.6445μs | 29.7226 KOps/s | 28.7421 KOps/s | |
test_set_nested | 62.8170μs | 22.5939μs | 44.2598 KOps/s | 42.3589 KOps/s | |
test_set_nested_new | 0.2159ms | 26.9331μs | 37.1290 KOps/s | 36.2472 KOps/s | |
test_select | 0.2277ms | 44.3999μs | 22.5226 KOps/s | 21.8626 KOps/s | |
test_select_nested | 0.1198ms | 63.4614μs | 15.7576 KOps/s | 15.4924 KOps/s | |
test_exclude_nested | 0.1880ms | 81.5228μs | 12.2665 KOps/s | 12.1178 KOps/s | |
test_empty[True] | 0.7425ms | 0.4143ms | 2.4139 KOps/s | 2.3905 KOps/s | |
test_empty[False] | 8.5383μs | 1.3840μs | 722.5237 KOps/s | 722.0446 KOps/s | |
test_unbind_speed | 0.3677ms | 0.2722ms | 3.6733 KOps/s | 3.6498 KOps/s | |
test_unbind_speed_stack0 | 0.4897ms | 0.2698ms | 3.7069 KOps/s | 3.6747 KOps/s | |
test_unbind_speed_stack1 | 0.1095s | 0.7353ms | 1.3599 KOps/s | 1.2258 KOps/s | |
test_split | 0.1104s | 1.7520ms | 570.7912 Ops/s | 622.3769 Ops/s | |
test_chunk | 0.1111s | 1.7517ms | 570.8762 Ops/s | 564.6371 Ops/s | |
test_consolidate_njt[False-None] | 8.7233ms | 8.2213ms | 121.6356 Ops/s | 109.7262 Ops/s | |
test_creation[device0] | 0.2930ms | 93.9816μs | 10.6404 KOps/s | 10.5741 KOps/s | |
test_creation_from_tensor | 3.6181ms | 98.6942μs | 10.1323 KOps/s | 10.3074 KOps/s | |
test_add_one[memmap_tensor0] | 0.1447ms | 4.9595μs | 201.6350 KOps/s | 196.6132 KOps/s | |
test_contiguous[memmap_tensor0] | 21.9510μs | 0.5136μs | 1.9469 MOps/s | 1.9828 MOps/s | |
test_stack[memmap_tensor0] | 31.6590μs | 3.3965μs | 294.4247 KOps/s | 289.3672 KOps/s | |
test_memmaptd_index | 0.3445ms | 0.2291ms | 4.3643 KOps/s | 4.4189 KOps/s | |
test_memmaptd_index_astensor | 0.5295ms | 0.3176ms | 3.1488 KOps/s | 3.1683 KOps/s | |
test_memmaptd_index_op | 1.3744ms | 0.5991ms | 1.6691 KOps/s | 1.6377 KOps/s | |
test_serialize_model | 0.2240s | 0.1359s | 7.3608 Ops/s | 8.7265 Ops/s | |
test_serialize_model_pickle | 0.4988s | 0.3993s | 2.5043 Ops/s | 2.5014 Ops/s | |
test_serialize_weights | 0.1239s | 0.1128s | 8.8670 Ops/s | 8.6534 Ops/s | |
test_serialize_weights_returnearly | 0.1676s | 0.1583s | 6.3180 Ops/s | 6.4288 Ops/s | |
test_serialize_weights_pickle | 0.6394s | 0.4743s | 2.1082 Ops/s | 2.3264 Ops/s | |
test_serialize_weights_filesystem | 0.2555s | 0.1617s | 6.1824 Ops/s | 7.0406 Ops/s | |
test_serialize_model_filesystem | 0.1606s | 0.1488s | 6.7195 Ops/s | 6.5642 Ops/s | |
test_reshape_pytree | 57.6770μs | 26.0092μs | 38.4479 KOps/s | 37.8690 KOps/s | |
test_reshape_td | 66.1830μs | 31.9855μs | 31.2641 KOps/s | 29.9470 KOps/s | |
test_view_pytree | 61.9460μs | 25.6658μs | 38.9623 KOps/s | 38.2859 KOps/s | |
test_view_td | 94.3860μs | 38.1934μs | 26.1826 KOps/s | 25.2423 KOps/s | |
test_unbind_pytree | 67.6770μs | 29.5825μs | 33.8038 KOps/s | 33.8301 KOps/s | |
test_unbind_td | 0.3147ms | 41.0377μs | 24.3679 KOps/s | 24.8524 KOps/s | |
test_split_pytree | 0.1500ms | 29.4327μs | 33.9758 KOps/s | 34.1927 KOps/s | |
test_split_td | 0.9351ms | 48.6750μs | 20.5444 KOps/s | 22.1601 KOps/s | |
test_add_pytree | 99.8860μs | 35.8855μs | 27.8664 KOps/s | 27.8949 KOps/s | |
test_add_td | 0.1082ms | 56.0799μs | 17.8317 KOps/s | 16.8923 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1397ms | 66.3026μs | 15.0824 KOps/s | 14.7621 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 1.3503ms | 0.1703ms | 5.8713 KOps/s | 5.7658 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1094ms | 46.2457μs | 21.6236 KOps/s | 20.6165 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2014ms | 0.1177ms | 8.4967 KOps/s | 8.4591 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 62.7970μs | 28.3352μs | 35.2918 KOps/s | 35.6824 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1365ms | 58.1278μs | 17.2035 KOps/s | 16.9894 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1481ms | 78.1925μs | 12.7890 KOps/s | 12.5438 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1532ms | 65.9806μs | 15.1560 KOps/s | 14.7142 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2356ms | 0.1066ms | 9.3803 KOps/s | 9.1558 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4269ms | 0.2137ms | 4.6789 KOps/s | 4.5661 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1098ms | 47.7283μs | 20.9519 KOps/s | 20.8819 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1408ms | 65.8891μs | 15.1770 KOps/s | 14.3457 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4669ms | 0.1034ms | 9.6717 KOps/s | 9.9403 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3135ms | 0.2029ms | 4.9279 KOps/s | 4.9236 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4720ms | 0.2310ms | 4.3299 KOps/s | 4.1899 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3129ms | 0.1097ms | 9.1120 KOps/s | 8.9539 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1319ms | 63.6583μs | 15.7089 KOps/s | 15.2185 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 97.9520μs | 48.5044μs | 20.6167 KOps/s | 20.1425 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3316ms | 0.1564ms | 6.3928 KOps/s | 6.3658 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2169ms | 0.1014ms | 9.8657 KOps/s | 9.6746 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 71.1730μs | 21.1950μs | 47.1810 KOps/s | 44.5872 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1171ms | 66.7033μs | 14.9918 KOps/s | 14.4092 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1810ms | 85.8327μs | 11.6506 KOps/s | 11.8533 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1576ms | 68.8006μs | 14.5348 KOps/s | 14.4905 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2872ms | 0.2139ms | 4.6748 KOps/s | 4.5362 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.6076ms | 1.3595ms | 735.5546 Ops/s | 704.1992 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3820ms | 0.2081ms | 4.8049 KOps/s | 4.6361 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.7770ms | 0.8393ms | 1.1914 KOps/s | 1.2134 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.7329ms | 0.4521ms | 2.2118 KOps/s | 2.0935 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.4608ms | 2.7689ms | 361.1505 Ops/s | 359.5036 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 96.5310μs | 37.6581μs | 26.5547 KOps/s | 24.3326 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5855ms | 33.8054μs | 29.5811 KOps/s | 29.5943 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 81.0310μs | 31.4210μs | 31.8259 KOps/s | 30.8003 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 77.4340μs | 23.1034μs | 43.2837 KOps/s | 43.6349 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 77.7750μs | 31.7505μs | 31.4955 KOps/s | 29.7550 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2965ms | 22.9682μs | 43.5385 KOps/s | 43.6374 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 95.0280μs | 53.4659μs | 18.7035 KOps/s | 18.0779 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3854ms | 20.1386μs | 49.6559 KOps/s | 48.6971 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1051ms | 45.9548μs | 21.7605 KOps/s | 21.0601 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 75.7010μs | 18.7980μs | 53.1971 KOps/s | 53.2836 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1434ms | 46.1577μs | 21.6649 KOps/s | 20.9334 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 66.3440μs | 18.5920μs | 53.7866 KOps/s | 52.8480 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1385ms | 54.9278μs | 18.2057 KOps/s | 17.9438 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0276ms | 19.6252μs | 50.9548 KOps/s | 49.3633 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 95.2680μs | 45.9803μs | 21.7485 KOps/s | 21.1340 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 54.2110μs | 18.6819μs | 53.5277 KOps/s | 53.9221 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1195ms | 46.5800μs | 21.4685 KOps/s | 21.1626 KOps/s | |
test_compile_indexing[int-pytree-eager] | 56.1250μs | 18.6954μs | 53.4892 KOps/s | 53.3884 KOps/s | |
test_mod_add[eager] | 0.1320ms | 36.5039μs | 27.3944 KOps/s | 27.5986 KOps/s | |
test_mod_add[compile] | 0.1182ms | 63.3197μs | 15.7929 KOps/s | 15.0821 KOps/s | |
test_mod_add[compile-overhead] | 0.1054ms | 62.1797μs | 16.0824 KOps/s | 15.0924 KOps/s | |
test_mod_wrap[eager] | 0.3607ms | 0.2196ms | 4.5534 KOps/s | 4.3443 KOps/s | |
test_mod_wrap[compile] | 2.3716ms | 0.2284ms | 4.3786 KOps/s | 4.2995 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3570ms | 0.2264ms | 4.4173 KOps/s | 4.3499 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.3961ms | 11.1322ms | 89.8298 Ops/s | 72.8044 Ops/s | |
test_mod_wrap_and_backward[compile] | 11.6346ms | 10.7608ms | 92.9301 Ops/s | 82.0639 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.6310ms | 10.8344ms | 92.2986 Ops/s | 84.8779 Ops/s | |
test_seq_add[eager] | 0.2147ms | 0.1185ms | 8.4372 KOps/s | 8.2533 KOps/s | |
test_seq_add[compile] | 0.1338ms | 76.1717μs | 13.1282 KOps/s | 12.9701 KOps/s | |
test_seq_add[compile-overhead] | 0.1335ms | 76.2428μs | 13.1160 KOps/s | 13.0029 KOps/s | |
test_seq_wrap[eager] | 0.6768ms | 0.4504ms | 2.2204 KOps/s | 2.2040 KOps/s | |
test_seq_wrap[compile] | 0.3377ms | 0.2448ms | 4.0849 KOps/s | 4.1018 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3902ms | 0.2433ms | 4.1105 KOps/s | 4.1153 KOps/s | |
test_func_call_runtime[False-eager] | 0.7105ms | 0.5450ms | 1.8349 KOps/s | 1.8430 KOps/s | |
test_func_call_runtime[False-compile] | 0.6983ms | 0.4496ms | 2.2242 KOps/s | 2.2661 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5607ms | 0.4467ms | 2.2388 KOps/s | 2.2813 KOps/s | |
test_func_call_runtime[True-eager] | 0.9863ms | 0.7423ms | 1.3472 KOps/s | 1.3095 KOps/s | |
test_func_call_runtime[True-compile] | 0.8482ms | 0.4697ms | 2.1290 KOps/s | 2.1608 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 1.0125ms | 0.4713ms | 2.1219 KOps/s | 2.1665 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9211ms | 0.5377ms | 1.8598 KOps/s | 1.8793 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7808ms | 0.4500ms | 2.2221 KOps/s | 2.2960 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 1.4053ms | 0.4566ms | 2.1901 KOps/s | 2.2661 KOps/s | |
test_func_call_cm_runtime[True-eager] | 0.9925ms | 0.8971ms | 1.1147 KOps/s | 1.1089 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9220ms | 0.8011ms | 1.2484 KOps/s | 1.2541 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.9376ms | 0.8078ms | 1.2379 KOps/s | 1.2471 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6093ms | 1.9134ms | 522.6396 Ops/s | 519.8251 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9393ms | 0.5482ms | 1.8240 KOps/s | 1.8795 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 1.0457ms | 0.5523ms | 1.8106 KOps/s | 1.8845 KOps/s | |
test_distributed | 0.2321ms | 0.1256ms | 7.9626 KOps/s | 7.7598 KOps/s | |
test_tdmodule | 44.8040μs | 27.4781μs | 36.3927 KOps/s | 36.0437 KOps/s | |
test_tdmodule_dispatch | 85.6200μs | 49.9539μs | 20.0185 KOps/s | 20.0257 KOps/s | |
test_tdseq | 89.5270μs | 33.1119μs | 30.2006 KOps/s | 32.8252 KOps/s | |
test_tdseq_dispatch | 88.1950μs | 56.0515μs | 17.8407 KOps/s | 17.8366 KOps/s | |
test_instantiation_functorch | 1.7894ms | 1.5063ms | 663.8608 Ops/s | 643.5744 Ops/s | |
test_exec_functorch | 0.3296ms | 0.1770ms | 5.6500 KOps/s | 5.4358 KOps/s | |
test_exec_functional_call | 0.3366ms | 0.1697ms | 5.8933 KOps/s | 5.7280 KOps/s | |
test_exec_td_decorator | 0.4519ms | 0.2322ms | 4.3073 KOps/s | 4.2042 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1363ms | 0.6671ms | 1.4990 KOps/s | 1.4592 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1778ms | 0.6698ms | 1.4930 KOps/s | 1.4908 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7328ms | 0.5290ms | 1.8903 KOps/s | 1.8410 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7501ms | 0.5320ms | 1.8797 KOps/s | 1.8535 KOps/s | |
test_to_module_speed[True] | 2.1768ms | 1.3128ms | 761.7471 Ops/s | 736.6011 Ops/s | |
test_to_module_speed[False] | 2.1182ms | 1.2884ms | 776.1305 Ops/s | 758.9429 Ops/s | |
test_tc_init | 91.5210μs | 48.5200μs | 20.6101 KOps/s | 21.5002 KOps/s | |
test_tc_init_nested | 0.1876ms | 97.0134μs | 10.3079 KOps/s | 10.7453 KOps/s | |
test_tc_first_layer_tensor | 15.1780μs | 1.5245μs | 655.9513 KOps/s | 657.1073 KOps/s | |
test_tc_first_layer_nontensor | 29.7250μs | 4.7539μs | 210.3537 KOps/s | 212.1530 KOps/s | |
test_tc_second_layer_tensor | 28.2530μs | 2.8011μs | 357.0061 KOps/s | 345.4772 KOps/s | |
test_tc_second_layer_nontensor | 32.7110μs | 5.9419μs | 168.2949 KOps/s | 165.5416 KOps/s | |
test_unbind | 0.2194s | 12.8985ms | 77.5283 Ops/s | 63.9222 Ops/s | |
test_full_like | 8.2801ms | 7.0149ms | 142.5527 Ops/s | 120.4196 Ops/s | |
test_zeros_like | 4.4683ms | 2.7184ms | 367.8591 Ops/s | 214.8217 Ops/s | |
test_ones_like | 3.6256ms | 3.0356ms | 329.4206 Ops/s | 289.7849 Ops/s | |
test_clone | 4.9840ms | 4.6649ms | 214.3690 Ops/s | 141.8689 Ops/s | |
test_squeeze | 56.1450μs | 12.0020μs | 83.3193 KOps/s | 81.7545 KOps/s | |
test_unsqueeze | 0.2662ms | 89.7582μs | 11.1410 KOps/s | 11.1802 KOps/s | |
test_split | 0.3660ms | 0.1911ms | 5.2322 KOps/s | 5.1209 KOps/s | |
test_permute | 0.3363ms | 0.1943ms | 5.1480 KOps/s | 4.9653 KOps/s | |
test_stack | 27.5389ms | 24.4998ms | 40.8167 Ops/s | 38.2520 Ops/s | |
test_cat | 28.3284ms | 24.5552ms | 40.7246 Ops/s | 39.6024 Ops/s |
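
The Change column is empty in this copy of the benchmark table. As a rough aid for reading it, the sketch below computes the relative difference between the Ops and Ops on Repo HEAD cells of a row. The helper names (`parse_ops`, `relative_change`, `change_for_row`) are hypothetical and not part of the PR or the CI tooling, and the unit table only assumes the three suffixes that appear above (`Ops/s`, `KOps/s`, `MOps/s`).

```python
# Hypothetical helpers for reading the benchmark table; not part of the PR.

# Multipliers for the unit suffixes that appear in the Ops columns.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '47.5666 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput difference of the PR run versus repo HEAD.

    Positive means the PR run was faster, negative means slower.
    """
    pr = parse_ops(ops_pr)
    head = parse_ops(ops_head)
    return (pr - head) / head


def change_for_row(row: str) -> float:
    """Compute the relative change for one pipe-separated table row."""
    cells = [c.strip() for c in row.split("|")]
    # Column order: Name, Max, Mean, Ops, Ops on Repo HEAD, Change (empty).
    return relative_change(cells[3], cells[4])


if __name__ == "__main__":
    row = "test_zeros_like | 4.4683ms | 2.7184ms | 367.8591 Ops/s | 214.8217 Ops/s | |"
    print(f"{change_for_row(row):+.1%}")
```

Most rows differ by a few percent in either direction, so single-run differences of that size may well be measurement noise rather than an effect of the change.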
Labels
BE: Better errors, logs, docs or test utils
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Stack from ghstack (oldest at bottom):