[Feature] cat_tensors and stack_tensors #1017
Merged
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 44.7130μs | 21.0366μs | 47.5362 KOps/s | 50.6876 KOps/s | |
test_plain_set_stack_nested | 50.4030μs | 21.0042μs | 47.6096 KOps/s | 50.9806 KOps/s | |
test_plain_set_nested_inplace | 68.3770μs | 22.8817μs | 43.7030 KOps/s | 46.7940 KOps/s | |
test_plain_set_stack_nested_inplace | 50.4140μs | 22.8333μs | 43.7957 KOps/s | 46.3075 KOps/s | |
test_items | 25.0460μs | 4.2939μs | 232.8904 KOps/s | 240.7553 KOps/s | |
test_items_nested | 0.6461ms | 0.3660ms | 2.7321 KOps/s | 2.7844 KOps/s | |
test_items_nested_locked | 0.5923ms | 0.3626ms | 2.7582 KOps/s | 2.7737 KOps/s | |
test_items_nested_leaf | 0.1400ms | 68.9429μs | 14.5048 KOps/s | 14.4583 KOps/s | |
test_items_stack_nested | 0.5596ms | 0.3709ms | 2.6963 KOps/s | 2.7566 KOps/s | |
test_items_stack_nested_leaf | 0.1325ms | 71.5512μs | 13.9760 KOps/s | 13.8953 KOps/s | |
test_items_stack_nested_locked | 0.6854ms | 0.3693ms | 2.7080 KOps/s | 2.7635 KOps/s | |
test_keys | 24.6360μs | 3.5257μs | 283.6348 KOps/s | 286.8116 KOps/s | |
test_keys_nested | 0.2031ms | 0.1032ms | 9.6932 KOps/s | 10.1510 KOps/s | |
test_keys_nested_locked | 0.7681ms | 0.1076ms | 9.2916 KOps/s | 9.6737 KOps/s | |
test_keys_nested_leaf | 0.1815ms | 85.5944μs | 11.6830 KOps/s | 12.3041 KOps/s | |
test_keys_stack_nested | 0.2006ms | 99.7704μs | 10.0230 KOps/s | 10.0917 KOps/s | |
test_keys_stack_nested_leaf | 0.1507ms | 82.2631μs | 12.1561 KOps/s | 12.2327 KOps/s | |
test_keys_stack_nested_locked | 0.1977ms | 0.1055ms | 9.4761 KOps/s | 9.5651 KOps/s | |
test_values | 5.9190μs | 1.0354μs | 965.8001 KOps/s | 951.7237 KOps/s | |
test_values_nested | 0.1310ms | 73.4961μs | 13.6062 KOps/s | 13.8304 KOps/s | |
test_values_nested_locked | 0.1655ms | 72.3242μs | 13.8266 KOps/s | 13.6065 KOps/s | |
test_values_nested_leaf | 0.1146ms | 61.3789μs | 16.2922 KOps/s | 16.2896 KOps/s | |
test_values_stack_nested | 0.1352ms | 73.9305μs | 13.5262 KOps/s | 13.5828 KOps/s | |
test_values_stack_nested_leaf | 0.1137ms | 59.1848μs | 16.8962 KOps/s | 16.0841 KOps/s | |
test_values_stack_nested_locked | 0.1325ms | 72.9640μs | 13.7054 KOps/s | 13.5921 KOps/s | |
test_membership | 22.2210μs | 0.8648μs | 1.1564 MOps/s | 1.3577 MOps/s | |
test_membership_nested | 22.0410μs | 2.7808μs | 359.6120 KOps/s | 364.3838 KOps/s | |
test_membership_nested_leaf | 34.6440μs | 2.7786μs | 359.8887 KOps/s | 362.1230 KOps/s | |
test_membership_stacked_nested | 34.7640μs | 2.7829μs | 359.3409 KOps/s | 355.4980 KOps/s | |
test_membership_stacked_nested_leaf | 29.2850μs | 2.7970μs | 357.5213 KOps/s | 360.7501 KOps/s | |
test_membership_nested_last | 28.7740μs | 3.9779μs | 251.3906 KOps/s | 253.2901 KOps/s | |
test_membership_nested_leaf_last | 25.3670μs | 3.9894μs | 250.6645 KOps/s | 251.2412 KOps/s | |
test_membership_stacked_nested_last | 34.9850μs | 13.0150μs | 76.8346 KOps/s | 250.8916 KOps/s | |
test_membership_stacked_nested_leaf_last | 42.2790μs | 13.0254μs | 76.7731 KOps/s | 249.3397 KOps/s | |
test_nested_getleaf | 35.8470μs | 10.7982μs | 92.6079 KOps/s | 92.6843 KOps/s | |
test_nested_get | 39.9440μs | 10.1876μs | 98.1584 KOps/s | 97.7910 KOps/s | |
test_stacked_getleaf | 38.5820μs | 10.6407μs | 93.9784 KOps/s | 92.8633 KOps/s | |
test_stacked_get | 32.9210μs | 10.2546μs | 97.5171 KOps/s | 98.1585 KOps/s | |
test_nested_getitemleaf | 36.0870μs | 11.1858μs | 89.3994 KOps/s | 90.1416 KOps/s | |
test_nested_getitem | 40.8630μs | 10.3023μs | 97.0654 KOps/s | 95.7378 KOps/s | |
test_stacked_getitemleaf | 36.7580μs | 11.1009μs | 90.0831 KOps/s | 91.1472 KOps/s | |
test_stacked_getitem | 37.6600μs | 10.3423μs | 96.6906 KOps/s | 96.0176 KOps/s | |
test_lock_nested | 85.2494ms | 0.5787ms | 1.7281 KOps/s | 2.0123 KOps/s | |
test_lock_stack_nested | 0.6832ms | 0.4422ms | 2.2613 KOps/s | 2.1275 KOps/s | |
test_unlock_nested | 87.5026ms | 0.4939ms | 2.0247 KOps/s | 2.3846 KOps/s | |
test_unlock_stack_nested | 0.5289ms | 0.3608ms | 2.7719 KOps/s | 2.5842 KOps/s | |
test_flatten_speed | 0.1511ms | 86.8554μs | 11.5134 KOps/s | 10.9529 KOps/s | |
test_unflatten_speed | 0.5826ms | 0.4634ms | 2.1581 KOps/s | 2.1359 KOps/s | |
test_common_ops | 2.1612ms | 1.1389ms | 878.0446 Ops/s | 889.1541 Ops/s | |
test_creation | 25.0370μs | 2.0906μs | 478.3426 KOps/s | 468.0298 KOps/s | |
test_creation_empty | 69.9010μs | 18.9939μs | 52.6486 KOps/s | 58.4844 KOps/s | |
test_creation_nested_1 | 83.6860μs | 21.9964μs | 45.4620 KOps/s | 49.6712 KOps/s | |
test_creation_nested_2 | 78.9170μs | 26.4652μs | 37.7855 KOps/s | 41.3171 KOps/s | |
test_clone | 1.3190ms | 17.2755μs | 57.8856 KOps/s | 57.6042 KOps/s | |
test_getitem[int] | 0.7990ms | 16.7733μs | 59.6184 KOps/s | 56.8409 KOps/s | |
test_getitem[slice_int] | 0.1461ms | 31.5603μs | 31.6854 KOps/s | 31.4987 KOps/s | |
test_getitem[range] | 0.1760ms | 60.9876μs | 16.3968 KOps/s | 16.9056 KOps/s | |
test_getitem[tuple] | 0.1323ms | 25.2986μs | 39.5279 KOps/s | 39.0598 KOps/s | |
test_getitem[list] | 0.2114ms | 57.7782μs | 17.3076 KOps/s | 18.4563 KOps/s | |
test_setitem_dim[int] | 89.4960μs | 36.9943μs | 27.0312 KOps/s | 29.4319 KOps/s | |
test_setitem_dim[slice_int] | 0.1130ms | 61.4761μs | 16.2665 KOps/s | 16.2102 KOps/s | |
test_setitem_dim[range] | 0.1513ms | 84.0920μs | 11.8917 KOps/s | 11.7957 KOps/s | |
test_setitem_dim[tuple] | 89.0960μs | 49.9836μs | 20.0066 KOps/s | 19.9725 KOps/s | |
test_setitem | 0.3077ms | 29.9890μs | 33.3455 KOps/s | 33.8977 KOps/s | |
test_set | 0.1901ms | 29.3875μs | 34.0281 KOps/s | 34.9001 KOps/s | |
test_set_shared | 2.2471ms | 0.2136ms | 4.6807 KOps/s | 4.5985 KOps/s | |
test_update | 0.2171ms | 37.2533μs | 26.8433 KOps/s | 28.2941 KOps/s | |
test_update_nested | 0.2431ms | 47.9892μs | 20.8380 KOps/s | 21.7254 KOps/s | |
test_update__nested | 0.1417ms | 34.8513μs | 28.6934 KOps/s | 27.7388 KOps/s | |
test_set_nested | 0.1619ms | 32.0242μs | 31.2264 KOps/s | 31.7658 KOps/s | |
test_set_nested_new | 0.3030ms | 37.3727μs | 26.7575 KOps/s | 27.1542 KOps/s | |
test_select | 0.2865ms | 54.6530μs | 18.2973 KOps/s | 18.5697 KOps/s | |
test_select_nested | 0.1126ms | 58.9514μs | 16.9631 KOps/s | 16.7362 KOps/s | |
test_exclude_nested | 0.1397ms | 73.6306μs | 13.5813 KOps/s | 13.6081 KOps/s | |
test_empty[True] | 0.4126ms | 0.3184ms | 3.1402 KOps/s | 3.1363 KOps/s | |
test_empty[False] | 7.9323μs | 1.2070μs | 828.5310 KOps/s | 821.6175 KOps/s | |
test_unbind_speed | 0.5320ms | 0.3131ms | 3.1938 KOps/s | 3.2194 KOps/s | |
test_unbind_speed_stack0 | 0.4438ms | 0.2917ms | 3.4286 KOps/s | 3.3152 KOps/s | |
test_unbind_speed_stack1 | 93.6979ms | 0.7918ms | 1.2629 KOps/s | 1.4363 KOps/s | |
test_split | 2.2147ms | 2.0024ms | 499.3916 Ops/s | 453.0369 Ops/s | |
test_chunk | 96.6796ms | 2.1943ms | 455.7337 Ops/s | 443.8029 Ops/s | |
test_creation[device0] | 0.2335ms | 0.1178ms | 8.4903 KOps/s | 8.5049 KOps/s | |
test_creation_from_tensor | 3.4986ms | 0.1193ms | 8.3826 KOps/s | 8.4394 KOps/s | |
test_add_one[memmap_tensor0] | 0.4193ms | 7.5241μs | 132.9062 KOps/s | 130.1608 KOps/s | |
test_contiguous[memmap_tensor0] | 21.9710μs | 1.9080μs | 524.1139 KOps/s | 465.3542 KOps/s | |
test_stack[memmap_tensor0] | 52.4870μs | 5.6666μs | 176.4741 KOps/s | 175.6311 KOps/s | |
test_memmaptd_index | 1.0640ms | 0.4111ms | 2.4325 KOps/s | 2.4591 KOps/s | |
test_memmaptd_index_astensor | 0.7655ms | 0.4881ms | 2.0487 KOps/s | 2.0475 KOps/s | |
test_memmaptd_index_op | 1.4231ms | 1.0308ms | 970.1352 Ops/s | 968.8956 Ops/s | |
test_serialize_model | 0.2228s | 0.1368s | 7.3125 Ops/s | 8.3292 Ops/s | |
test_serialize_model_pickle | 0.4605s | 0.3945s | 2.5346 Ops/s | 2.5012 Ops/s | |
test_serialize_weights | 0.1260s | 0.1150s | 8.6969 Ops/s | 8.6466 Ops/s | |
test_serialize_weights_returnearly | 0.1737s | 0.1617s | 6.1855 Ops/s | 6.3695 Ops/s | |
test_serialize_weights_pickle | 0.4674s | 0.4056s | 2.4655 Ops/s | 2.4706 Ops/s | |
test_serialize_weights_filesystem | 0.2433s | 0.1560s | 6.4101 Ops/s | 6.9240 Ops/s | |
test_serialize_model_filesystem | 0.1608s | 0.1531s | 6.5319 Ops/s | 6.0602 Ops/s | |
test_reshape_pytree | 95.8550μs | 39.1263μs | 25.5582 KOps/s | 25.3373 KOps/s | |
test_reshape_td | 0.1165ms | 46.6501μs | 21.4362 KOps/s | 20.5426 KOps/s | |
test_view_pytree | 86.4200μs | 38.5141μs | 25.9645 KOps/s | 25.4928 KOps/s | |
test_view_td | 0.1205ms | 51.9134μs | 19.2628 KOps/s | 18.0086 KOps/s | |
test_unbind_pytree | 83.1950μs | 35.7007μs | 28.0107 KOps/s | 27.1948 KOps/s | |
test_unbind_td | 0.3729ms | 45.3959μs | 22.0284 KOps/s | 21.7814 KOps/s | |
test_split_pytree | 0.1172ms | 38.0960μs | 26.2495 KOps/s | 26.0629 KOps/s | |
test_split_td | 0.4499ms | 58.4542μs | 17.1074 KOps/s | 16.7770 KOps/s | |
test_add_pytree | 99.0140μs | 46.4898μs | 21.5101 KOps/s | 21.7092 KOps/s | |
test_add_td | 0.2993ms | 82.1712μs | 12.1697 KOps/s | 12.1235 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1082ms | 57.3897μs | 17.4247 KOps/s | 17.1528 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2972ms | 0.1773ms | 5.6389 KOps/s | 5.6184 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1076ms | 56.9732μs | 17.5521 KOps/s | 17.4254 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3250ms | 0.1440ms | 6.9436 KOps/s | 6.8964 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 62.9870μs | 21.9891μs | 45.4770 KOps/s | 43.6787 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1467ms | 66.5560μs | 15.0249 KOps/s | 15.2934 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1393ms | 74.2779μs | 13.4630 KOps/s | 13.3147 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1390ms | 67.9363μs | 14.7197 KOps/s | 14.6540 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3495ms | 0.1735ms | 5.7630 KOps/s | 5.6776 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3142ms | 0.1906ms | 5.2469 KOps/s | 5.2058 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 88.9650μs | 47.6041μs | 21.0066 KOps/s | 20.5942 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2321ms | 67.9849μs | 14.7092 KOps/s | 14.2165 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3720ms | 0.1768ms | 5.6551 KOps/s | 5.6820 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4249ms | 0.2865ms | 3.4903 KOps/s | 3.4644 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4390ms | 0.2040ms | 4.9019 KOps/s | 4.9307 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3647ms | 0.1764ms | 5.6692 KOps/s | 5.6972 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1501ms | 61.8085μs | 16.1790 KOps/s | 15.8505 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1444ms | 48.5114μs | 20.6137 KOps/s | 20.6837 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3417ms | 0.2285ms | 4.3757 KOps/s | 4.2695 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3832ms | 0.1749ms | 5.7186 KOps/s | 5.5544 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2415ms | 0.1030ms | 9.7050 KOps/s | 9.6112 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1513ms | 56.9324μs | 17.5647 KOps/s | 17.7854 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1431ms | 75.4073μs | 13.2613 KOps/s | 12.9639 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1276ms | 67.7230μs | 14.7660 KOps/s | 14.4908 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2949ms | 0.1918ms | 5.2141 KOps/s | 5.1880 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8119ms | 1.6277ms | 614.3721 Ops/s | 600.1101 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3801ms | 0.1898ms | 5.2688 KOps/s | 5.1616 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3346ms | 1.0776ms | 928.0252 Ops/s | 890.9976 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.8168ms | 0.4181ms | 2.3915 KOps/s | 2.3916 KOps/s | |
test_compile_assign_and_add_stack[eager] | 7.2442ms | 3.8691ms | 258.4589 Ops/s | 264.1793 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 90.6190μs | 33.8960μs | 29.5020 KOps/s | 28.4186 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.1682ms | 49.7515μs | 20.0999 KOps/s | 19.9207 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1301ms | 29.7818μs | 33.5775 KOps/s | 32.6615 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 83.4560μs | 28.6025μs | 34.9620 KOps/s | 33.9026 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 78.9670μs | 29.8286μs | 33.5248 KOps/s | 32.6738 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 68.8480μs | 28.2798μs | 35.3609 KOps/s | 34.2341 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1372ms | 73.4986μs | 13.6057 KOps/s | 13.2909 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5504ms | 28.3100μs | 35.3232 KOps/s | 34.1668 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1519ms | 68.8869μs | 14.5165 KOps/s | 14.4358 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 73.2060μs | 23.2811μs | 42.9534 KOps/s | 41.7517 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1412ms | 67.6986μs | 14.7714 KOps/s | 14.5649 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 76.5630μs | 22.9692μs | 43.5365 KOps/s | 41.6421 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1398ms | 73.3596μs | 13.6315 KOps/s | 13.6832 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9888ms | 27.8815μs | 35.8660 KOps/s | 35.3587 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1528ms | 67.9864μs | 14.7088 KOps/s | 14.4857 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 65.2610μs | 22.6728μs | 44.1058 KOps/s | 41.6704 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1265ms | 67.8449μs | 14.7395 KOps/s | 14.5781 KOps/s | |
test_compile_indexing[int-pytree-eager] | 64.3200μs | 22.8716μs | 43.7224 KOps/s | 42.2697 KOps/s | |
test_mod_add[eager] | 81.2410μs | 25.7676μs | 38.8085 KOps/s | 40.4954 KOps/s | |
test_mod_add[compile] | 0.1004ms | 38.8549μs | 25.7368 KOps/s | 25.7325 KOps/s | |
test_mod_add[compile-overhead] | 87.7430μs | 39.0213μs | 25.6270 KOps/s | 25.8294 KOps/s | |
test_mod_wrap[eager] | 0.2958ms | 0.2074ms | 4.8218 KOps/s | 4.8526 KOps/s | |
test_mod_wrap[compile] | 0.3143ms | 0.2322ms | 4.3057 KOps/s | 4.1339 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3663ms | 0.2290ms | 4.3669 KOps/s | 4.1559 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.0217ms | 10.8762ms | 91.9442 Ops/s | 86.2961 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.3347ms | 10.8252ms | 92.3771 Ops/s | 80.8981 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.4514ms | 10.9289ms | 91.5004 Ops/s | 81.1618 Ops/s | |
test_seq_add[eager] | 0.1640ms | 91.0197μs | 10.9866 KOps/s | 11.0788 KOps/s | |
test_seq_add[compile] | 0.3496ms | 64.3561μs | 15.5385 KOps/s | 15.2764 KOps/s | |
test_seq_add[compile-overhead] | 0.1316ms | 63.2669μs | 15.8060 KOps/s | 15.5219 KOps/s | |
test_seq_wrap[eager] | 0.4927ms | 0.3840ms | 2.6043 KOps/s | 2.5244 KOps/s | |
test_seq_wrap[compile] | 1.2587ms | 0.2673ms | 3.7405 KOps/s | 3.6305 KOps/s | |
test_seq_wrap[compile-overhead] | 1.3295ms | 0.2674ms | 3.7397 KOps/s | 3.6235 KOps/s | |
test_func_call_runtime[False-eager] | 0.6989ms | 0.5304ms | 1.8852 KOps/s | 1.8978 KOps/s | |
test_func_call_runtime[False-compile] | 0.9005ms | 0.5019ms | 1.9925 KOps/s | 1.9521 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.7127ms | 0.5042ms | 1.9835 KOps/s | 1.9382 KOps/s | |
test_func_call_runtime[True-eager] | 0.9064ms | 0.7532ms | 1.3277 KOps/s | 1.3330 KOps/s | |
test_func_call_runtime[True-compile] | 0.9677ms | 0.5157ms | 1.9391 KOps/s | 1.9133 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6128ms | 0.5148ms | 1.9426 KOps/s | 1.9006 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9655ms | 0.5299ms | 1.8873 KOps/s | 1.8946 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5950ms | 0.4986ms | 2.0055 KOps/s | 1.9501 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6303ms | 0.4994ms | 2.0023 KOps/s | 1.9497 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1501ms | 0.8778ms | 1.1392 KOps/s | 1.1208 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9431ms | 0.7358ms | 1.3591 KOps/s | 1.3238 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1486ms | 0.7396ms | 1.3521 KOps/s | 1.3124 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.7213ms | 1.8609ms | 537.3673 Ops/s | 524.3858 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.7781ms | 1.9095ms | 523.7050 Ops/s | 513.6993 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 3.1976ms | 1.9135ms | 522.6052 Ops/s | 513.4767 Ops/s | |
test_distributed | 0.2249ms | 0.1275ms | 7.8436 KOps/s | 7.6457 KOps/s | |
test_tdmodule | 39.7650μs | 18.5194μs | 53.9973 KOps/s | 55.4807 KOps/s | |
test_tdmodule_dispatch | 63.9890μs | 37.0338μs | 27.0023 KOps/s | 28.2044 KOps/s | |
test_tdseq | 55.8740μs | 21.7336μs | 46.0118 KOps/s | 48.4825 KOps/s | |
test_tdseq_dispatch | 62.4760μs | 43.1883μs | 23.1544 KOps/s | 24.5645 KOps/s | |
test_instantiation_functorch | 2.1026ms | 1.5857ms | 630.6233 Ops/s | 610.1194 Ops/s | |
test_instantiation_td | 1.8408ms | 1.1685ms | 855.8257 Ops/s | 839.7532 Ops/s | |
test_exec_functorch | 0.3196ms | 0.1841ms | 5.4325 KOps/s | 5.2359 KOps/s | |
test_exec_functional_call | 0.3700ms | 0.1736ms | 5.7602 KOps/s | 5.6842 KOps/s | |
test_exec_td | 0.2907ms | 0.1690ms | 5.9164 KOps/s | 5.8374 KOps/s | |
test_exec_td_decorator | 1.1106ms | 0.2244ms | 4.4562 KOps/s | 4.4168 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1173ms | 0.6497ms | 1.5393 KOps/s | 1.5378 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.9476ms | 0.6435ms | 1.5539 KOps/s | 1.5453 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7065ms | 0.4962ms | 2.0155 KOps/s | 1.9934 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6936ms | 0.4959ms | 2.0165 KOps/s | 1.9870 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.4206ms | 0.6217ms | 1.6084 KOps/s | 1.5906 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0771ms | 0.6242ms | 1.6021 KOps/s | 1.5959 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8509ms | 0.5127ms | 1.9504 KOps/s | 1.9283 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6014ms | 0.5085ms | 1.9664 KOps/s | 1.9358 KOps/s | |
test_to_module_speed[True] | 1.9574ms | 1.3014ms | 768.4037 Ops/s | 758.7925 Ops/s | |
test_to_module_speed[False] | 1.4279ms | 1.2595ms | 793.9578 Ops/s | 793.1632 Ops/s | |
test_tc_init | 96.1990μs | 45.9082μs | 21.7826 KOps/s | 22.8234 KOps/s | |
test_tc_init_nested | 0.1599ms | 90.5385μs | 11.0450 KOps/s | 11.7534 KOps/s | |
test_tc_first_layer_tensor | 15.7690μs | 1.5344μs | 651.7115 KOps/s | 653.9869 KOps/s | |
test_tc_first_layer_nontensor | 31.3180μs | 4.7473μs | 210.6453 KOps/s | 210.9332 KOps/s | |
test_tc_second_layer_tensor | 27.4010μs | 2.7992μs | 357.2472 KOps/s | 356.9168 KOps/s | |
test_tc_second_layer_nontensor | 32.8010μs | 6.0163μs | 166.2156 KOps/s | 164.0250 KOps/s | |
test_unbind | 0.5044s | 13.4453ms | 74.3752 Ops/s | 69.9757 Ops/s | |
test_full_like | 9.4372ms | 7.9070ms | 126.4702 Ops/s | 128.9854 Ops/s | |
test_zeros_like | 3.8023ms | 3.0114ms | 332.0745 Ops/s | 338.3099 Ops/s | |
test_ones_like | 4.4049ms | 3.5176ms | 284.2877 Ops/s | 286.0339 Ops/s | |
test_clone | 6.4870ms | 5.6469ms | 177.0875 Ops/s | 170.2166 Ops/s | |
test_squeeze | 61.7850μs | 12.8057μs | 78.0904 KOps/s | 79.1581 KOps/s | |
test_unsqueeze | 0.3234ms | 92.3337μs | 10.8303 KOps/s | 10.5073 KOps/s | |
test_split | 0.3580ms | 0.1957ms | 5.1094 KOps/s | 5.1089 KOps/s | |
test_permute | 0.4597ms | 0.2172ms | 4.6050 KOps/s | 4.3832 KOps/s | |
test_stack | 31.6412ms | 26.4287ms | 37.8377 Ops/s | 37.2424 Ops/s | |
test_cat | 34.5179ms | 26.1608ms | 38.2251 Ops/s | 37.8155 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1187ms | 13.8820μs | 72.0357 KOps/s | 68.7161 KOps/s | |
test_plain_set_stack_nested | 34.6320μs | 14.3680μs | 69.5993 KOps/s | 68.4647 KOps/s | |
test_plain_set_nested_inplace | 43.0030μs | 15.1310μs | 66.0895 KOps/s | 63.8824 KOps/s | |
test_plain_set_stack_nested_inplace | 45.3530μs | 15.2618μs | 65.5232 KOps/s | 65.2531 KOps/s | |
test_items | 25.3710μs | 2.8748μs | 347.8518 KOps/s | 342.7950 KOps/s | |
test_items_nested | 0.4053ms | 0.3342ms | 2.9919 KOps/s | 3.0658 KOps/s | |
test_items_nested_locked | 0.3881ms | 0.3359ms | 2.9772 KOps/s | 3.0453 KOps/s | |
test_items_nested_leaf | 80.7050μs | 55.7055μs | 17.9515 KOps/s | 18.0118 KOps/s | |
test_items_stack_nested | 0.4134ms | 0.3337ms | 2.9963 KOps/s | 3.0582 KOps/s | |
test_items_stack_nested_leaf | 86.0250μs | 56.8483μs | 17.5907 KOps/s | 17.5665 KOps/s | |
test_items_stack_nested_locked | 0.3945ms | 0.3406ms | 2.9359 KOps/s | 3.0204 KOps/s | |
test_keys | 28.3820μs | 3.4302μs | 291.5248 KOps/s | 290.3769 KOps/s | |
test_keys_nested | 84.5350μs | 55.2758μs | 18.0911 KOps/s | 18.0159 KOps/s | |
test_keys_nested_locked | 0.9192ms | 62.6166μs | 15.9702 KOps/s | 16.0996 KOps/s | |
test_keys_nested_leaf | 75.6350μs | 48.0275μs | 20.8214 KOps/s | 21.0892 KOps/s | |
test_keys_stack_nested | 81.8450μs | 57.4775μs | 17.3981 KOps/s | 17.6896 KOps/s | |
test_keys_stack_nested_leaf | 73.8840μs | 48.5032μs | 20.6172 KOps/s | 20.7444 KOps/s | |
test_keys_stack_nested_locked | 91.3360μs | 62.8773μs | 15.9040 KOps/s | 16.1822 KOps/s | |
test_values | 4.9253μs | 0.8645μs | 1.1567 MOps/s | 1.1863 MOps/s | |
test_values_nested | 81.1550μs | 40.9318μs | 24.4309 KOps/s | 24.3977 KOps/s | |
test_values_nested_locked | 70.9140μs | 43.2083μs | 23.1437 KOps/s | 23.2102 KOps/s | |
test_values_nested_leaf | 67.7040μs | 35.4223μs | 28.2308 KOps/s | 28.1116 KOps/s | |
test_values_stack_nested | 70.9040μs | 42.1724μs | 23.7122 KOps/s | 23.9249 KOps/s | |
test_values_stack_nested_leaf | 65.2940μs | 35.9556μs | 27.8120 KOps/s | 28.0420 KOps/s | |
test_values_stack_nested_locked | 76.8540μs | 43.6247μs | 22.9228 KOps/s | 22.8496 KOps/s | |
test_membership | 1.9051μs | 0.5005μs | 1.9981 MOps/s | 1.9858 MOps/s | |
test_membership_nested | 15.2710μs | 1.9167μs | 521.7427 KOps/s | 540.1543 KOps/s | |
test_membership_nested_leaf | 14.8710μs | 1.9202μs | 520.7925 KOps/s | 541.9083 KOps/s | |
test_membership_stacked_nested | 41.1630μs | 1.9895μs | 502.6454 KOps/s | 521.7488 KOps/s | |
test_membership_stacked_nested_leaf | 18.7810μs | 1.9984μs | 500.4037 KOps/s | 520.8548 KOps/s | |
test_membership_nested_last | 38.6620μs | 2.8049μs | 356.5209 KOps/s | 361.1016 KOps/s | |
test_membership_nested_leaf_last | 32.9520μs | 2.8124μs | 355.5693 KOps/s | 356.0597 KOps/s | |
test_membership_stacked_nested_last | 25.3820μs | 3.2012μs | 312.3855 KOps/s | 127.9481 KOps/s | |
test_membership_stacked_nested_leaf_last | 41.8620μs | 3.2170μs | 310.8448 KOps/s | 128.4369 KOps/s | |
test_nested_getleaf | 31.2020μs | 6.1417μs | 162.8220 KOps/s | 164.5920 KOps/s | |
test_nested_get | 50.5830μs | 5.7917μs | 172.6598 KOps/s | 174.4874 KOps/s | |
test_stacked_getleaf | 26.5320μs | 6.0815μs | 164.4319 KOps/s | 164.7605 KOps/s | |
test_stacked_get | 48.3030μs | 5.7885μs | 172.7563 KOps/s | 177.3633 KOps/s | |
test_nested_getitemleaf | 40.6720μs | 6.1783μs | 161.8559 KOps/s | 164.4556 KOps/s | |
test_nested_getitem | 50.9530μs | 5.8587μs | 170.6867 KOps/s | 174.9375 KOps/s | |
test_stacked_getitemleaf | 22.8210μs | 6.1452μs | 162.7281 KOps/s | 163.9711 KOps/s | |
test_stacked_getitem | 46.3820μs | 5.8056μs | 172.2466 KOps/s | 175.1370 KOps/s | |
test_lock_nested | 7.5981ms | 0.4268ms | 2.3432 KOps/s | 2.3496 KOps/s | |
test_lock_stack_nested | 0.4510ms | 0.3864ms | 2.5877 KOps/s | 2.6626 KOps/s | |
test_unlock_nested | 0.7884ms | 0.3616ms | 2.7659 KOps/s | 2.8144 KOps/s | |
test_unlock_stack_nested | 0.3832ms | 0.3265ms | 3.0623 KOps/s | 3.1813 KOps/s | |
test_flatten_speed | 0.1459ms | 68.9450μs | 14.5043 KOps/s | 14.3303 KOps/s | |
test_unflatten_speed | 0.3416ms | 0.2824ms | 3.5405 KOps/s | 3.4725 KOps/s | |
test_common_ops | 1.6584ms | 1.2823ms | 779.8577 Ops/s | 772.7646 Ops/s | |
test_creation | 26.8910μs | 1.4920μs | 670.2355 KOps/s | 674.4908 KOps/s | |
test_creation_empty | 48.9030μs | 15.7323μs | 63.5634 KOps/s | 58.8845 KOps/s | |
test_creation_nested_1 | 49.9130μs | 17.7069μs | 56.4751 KOps/s | 53.4713 KOps/s | |
test_creation_nested_2 | 69.5550μs | 20.2768μs | 49.3174 KOps/s | 46.7796 KOps/s | |
test_clone | 1.3640ms | 30.7958μs | 32.4720 KOps/s | 33.9466 KOps/s | |
test_getitem[int] | 1.1937ms | 16.5055μs | 60.5859 KOps/s | 60.4133 KOps/s | |
test_getitem[slice_int] | 0.1363ms | 28.2010μs | 35.4598 KOps/s | 34.9750 KOps/s | |
test_getitem[range] | 0.2246ms | 0.1106ms | 9.0417 KOps/s | 9.0926 KOps/s | |
test_getitem[tuple] | 0.1157ms | 23.7639μs | 42.0806 KOps/s | 41.6932 KOps/s | |
test_getitem[list] | 0.1923ms | 99.6896μs | 10.0311 KOps/s | 9.9289 KOps/s | |
test_setitem_dim[int] | 75.7750μs | 45.0289μs | 22.2080 KOps/s | 22.1983 KOps/s | |
test_setitem_dim[slice_int] | 0.1138ms | 67.8195μs | 14.7450 KOps/s | 14.7275 KOps/s | |
test_setitem_dim[range] | 0.1718ms | 0.1277ms | 7.8305 KOps/s | 7.7794 KOps/s | |
test_setitem_dim[tuple] | 84.0450μs | 60.7531μs | 16.4601 KOps/s | 16.4877 KOps/s | |
test_setitem | 82.1750μs | 43.4275μs | 23.0269 KOps/s | 23.0234 KOps/s | |
test_set | 77.9750μs | 42.1155μs | 23.7442 KOps/s | 22.1119 KOps/s | |
test_set_shared | 0.3648ms | 51.9203μs | 19.2603 KOps/s | 18.4760 KOps/s | |
test_update | 87.7160μs | 50.6474μs | 19.7443 KOps/s | 19.2375 KOps/s | |
test_update_nested | 0.1102ms | 58.5521μs | 17.0788 KOps/s | 17.0085 KOps/s | |
test_update__nested | 96.2260μs | 62.0708μs | 16.1106 KOps/s | 16.4924 KOps/s | |
test_set_nested | 80.3640μs | 44.8783μs | 22.2825 KOps/s | 20.4548 KOps/s | |
test_set_nested_new | 86.3550μs | 49.2207μs | 20.3167 KOps/s | 19.4937 KOps/s | |
test_select | 94.0850μs | 61.6555μs | 16.2192 KOps/s | 15.1605 KOps/s | |
test_select_nested | 74.1040μs | 41.8827μs | 23.8762 KOps/s | 22.9475 KOps/s | |
test_exclude_nested | 90.9160μs | 58.2758μs | 17.1598 KOps/s | 16.8980 KOps/s | |
test_empty[True] | 0.2976ms | 0.2451ms | 4.0802 KOps/s | 4.0378 KOps/s | |
test_empty[False] | 3.6543μs | 0.7445μs | 1.3433 MOps/s | 1.3539 MOps/s | |
test_to | 53.8230μs | 26.8876μs | 37.1919 KOps/s | 38.8816 KOps/s | |
test_to_nonblocking | 81.6650μs | 25.7351μs | 38.8574 KOps/s | 41.6309 KOps/s | |
test_unbind_speed | 1.0549ms | 0.2861ms | 3.4950 KOps/s | 3.6125 KOps/s | |
test_unbind_speed_stack0 | 0.3478ms | 0.2798ms | 3.5740 KOps/s | 3.7252 KOps/s | |
test_unbind_speed_stack1 | 92.2272ms | 0.7133ms | 1.4019 KOps/s | 1.4370 KOps/s | |
test_split | 94.0693ms | 2.1974ms | 455.0746 Ops/s | 454.9456 Ops/s | |
test_chunk | 94.2022ms | 2.1795ms | 458.8227 Ops/s | 452.6599 Ops/s | |
test_creation[device0] | 0.3487ms | 0.1290ms | 7.7533 KOps/s | 7.7473 KOps/s | |
test_creation_from_tensor | 0.3827ms | 0.1309ms | 7.6408 KOps/s | 7.3795 KOps/s | |
test_add_one[memmap_tensor0] | 0.2037ms | 9.4723μs | 105.5705 KOps/s | 107.7780 KOps/s | |
test_contiguous[memmap_tensor0] | 29.7220μs | 2.2270μs | 449.0314 KOps/s | 452.7398 KOps/s | |
test_stack[memmap_tensor0] | 32.7620μs | 6.9409μs | 144.0736 KOps/s | 145.4174 KOps/s | |
test_memmaptd_index | 1.2496ms | 0.4172ms | 2.3972 KOps/s | 2.3335 KOps/s | |
test_memmaptd_index_astensor | 0.7852ms | 0.4790ms | 2.0876 KOps/s | 2.0551 KOps/s | |
test_memmaptd_index_op | 1.4531ms | 1.0478ms | 954.3785 Ops/s | 933.0064 Ops/s | |
test_serialize_model | 0.1313s | 0.1297s | 7.7098 Ops/s | 7.7107 Ops/s | |
test_serialize_model_pickle | 1.3497s | 1.2133s | 0.8242 Ops/s | 0.8211 Ops/s | |
test_serialize_weights | 0.2215s | 0.1431s | 6.9901 Ops/s | 7.0033 Ops/s | |
test_serialize_weights_returnearly | 0.2139s | 55.5903ms | 17.9888 Ops/s | 17.5867 Ops/s | |
test_serialize_weights_pickle | 1.3477s | 1.2185s | 0.8207 Ops/s | 0.8214 Ops/s | |
test_reshape_pytree | 65.0740μs | 36.5079μs | 27.3913 KOps/s | 27.6701 KOps/s | |
test_reshape_td | 79.9350μs | 41.4845μs | 24.1054 KOps/s | 23.9730 KOps/s | |
test_view_pytree | 71.3440μs | 35.8581μs | 27.8877 KOps/s | 28.3878 KOps/s | |
test_view_td | 0.1034ms | 47.8645μs | 20.8923 KOps/s | 21.2465 KOps/s | |
test_unbind_pytree | 70.9650μs | 35.0765μs | 28.5091 KOps/s | 28.8410 KOps/s | |
test_unbind_td | 0.5645ms | 44.5459μs | 22.4488 KOps/s | 23.7090 KOps/s | |
test_split_pytree | 88.0060μs | 49.3435μs | 20.2661 KOps/s | 22.3143 KOps/s | |
test_split_td | 0.7004ms | 57.8948μs | 17.2727 KOps/s | 17.6016 KOps/s | |
test_add_pytree | 0.1033ms | 60.8205μs | 16.4418 KOps/s | 17.3135 KOps/s | |
test_add_td | 0.1475ms | 98.7086μs | 10.1308 KOps/s | 10.7343 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4109ms | 0.2079ms | 4.8107 KOps/s | 4.7130 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.1897ms | 0.1514ms | 6.6052 KOps/s | 6.5978 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1889ms | 0.1460ms | 6.8489 KOps/s | 6.8726 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2340ms | 0.1850ms | 5.4068 KOps/s | 5.3823 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 57.1530μs | 22.0005μs | 45.4535 KOps/s | 45.0307 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1060ms | 43.2378μs | 23.1279 KOps/s | 22.7405 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2169ms | 64.7634μs | 15.4408 KOps/s | 15.5707 KOps/s | |
test_compile_copy_nested[pytree-eager] | 80.5650μs | 49.2311μs | 20.3124 KOps/s | 20.3166 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4431ms | 0.3168ms | 3.1567 KOps/s | 3.1227 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2543ms | 0.2055ms | 4.8663 KOps/s | 4.7051 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1765ms | 0.1277ms | 7.8330 KOps/s | 7.8563 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1040ms | 63.7593μs | 15.6840 KOps/s | 16.3012 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3854ms | 0.3170ms | 3.1548 KOps/s | 3.1181 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.8053ms | 0.6442ms | 1.5523 KOps/s | 1.5957 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3358ms | 0.2452ms | 4.0779 KOps/s | 3.9674 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3828ms | 0.3170ms | 3.1546 KOps/s | 3.1301 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1178ms | 73.3292μs | 13.6371 KOps/s | 14.3756 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1832ms | 0.1289ms | 7.7597 KOps/s | 7.7860 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6338ms | 0.5491ms | 1.8210 KOps/s | 1.8591 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4527ms | 0.3196ms | 3.1286 KOps/s | 3.1294 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1731ms | 17.8228μs | 56.1079 KOps/s | 53.9691 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 55.8230μs | 26.7324μs | 37.4078 KOps/s | 36.2327 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1056ms | 70.6098μs | 14.1623 KOps/s | 14.1692 KOps/s | |
test_compile_copy_flat[pytree-eager] | 84.3050μs | 50.8665μs | 19.6593 KOps/s | 19.4607 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3703ms | 0.8430ms | 1.1862 KOps/s | 1.1179 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.7797ms | 3.3641ms | 297.2594 Ops/s | 300.5114 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.4203ms | 0.8430ms | 1.1862 KOps/s | 1.1285 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.6181ms | 3.3952ms | 294.5342 Ops/s | 307.6731 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2422ms | 0.1168ms | 8.5629 KOps/s | 8.8064 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.2281ms | 66.9749μs | 14.9310 KOps/s | 15.7547 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2463ms | 0.1066ms | 9.3784 KOps/s | 9.3679 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1540ms | 44.1800μs | 22.6347 KOps/s | 21.8530 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2324ms | 0.1072ms | 9.3268 KOps/s | 9.2301 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1831ms | 46.5129μs | 21.4994 KOps/s | 21.6267 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2438ms | 0.1454ms | 6.8778 KOps/s | 7.2631 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1639ms | 26.5428μs | 37.6750 KOps/s | 38.4994 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2353ms | 0.1388ms | 7.2048 KOps/s | 7.6115 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1289ms | 21.9547μs | 45.5483 KOps/s | 47.3672 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2905ms | 0.1391ms | 7.1875 KOps/s | 7.5451 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 58.5730μs | 20.8677μs | 47.9210 KOps/s | 47.1891 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1875ms | 0.1377ms | 7.2602 KOps/s | 7.1990 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4964ms | 26.2660μs | 38.0720 KOps/s | 39.1085 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1741ms | 0.1314ms | 7.6115 KOps/s | 7.4469 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 51.9830μs | 21.2306μs | 47.1019 KOps/s | 47.3956 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1738ms | 0.1320ms | 7.5780 KOps/s | 7.2455 KOps/s | |
test_compile_indexing[int-pytree-eager] | 60.8830μs | 21.1275μs | 47.3318 KOps/s | 47.1291 KOps/s | |
test_mod_add[eager] | 0.1114ms | 32.4380μs | 30.8280 KOps/s | 28.6254 KOps/s | |
test_mod_add[compile] | 0.1176ms | 70.8634μs | 14.1117 KOps/s | 13.5644 KOps/s | |
test_mod_add[compile-overhead] | 0.2623ms | 0.1350ms | 7.4082 KOps/s | 6.9957 KOps/s | |
test_mod_wrap[eager] | 0.3123ms | 0.2524ms | 3.9619 KOps/s | 4.0461 KOps/s | |
test_mod_wrap[compile] | 1.4095ms | 0.2966ms | 3.3713 KOps/s | 3.2432 KOps/s | |
test_mod_wrap[compile-overhead] | 7.7316ms | 4.0822ms | 244.9660 Ops/s | 271.4085 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5784ms | 1.3458ms | 743.0642 Ops/s | 693.3149 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5612ms | 1.3406ms | 745.9535 Ops/s | 693.7466 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3163ms | 0.8997ms | 1.1115 KOps/s | 988.0162 Ops/s | |
test_seq_add[eager] | 0.1684ms | 0.1025ms | 9.7536 KOps/s | 9.8832 KOps/s | |
test_seq_add[compile] | 0.4825ms | 80.8636μs | 12.3665 KOps/s | 12.1488 KOps/s | |
test_seq_add[compile-overhead] | 0.1979ms | 0.1139ms | 8.7797 KOps/s | 8.6954 KOps/s | |
test_seq_wrap[eager] | 0.4948ms | 0.3811ms | 2.6241 KOps/s | 2.5091 KOps/s | |
test_seq_wrap[compile] | 0.3755ms | 0.3151ms | 3.1736 KOps/s | 3.1021 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3725ms | 0.2207ms | 4.5309 KOps/s | 4.3533 KOps/s | |
test_func_call_runtime[False-eager] | 0.8888ms | 0.8046ms | 1.2428 KOps/s | 1.2743 KOps/s | |
test_func_call_runtime[False-compile] | 0.8979ms | 0.8138ms | 1.2288 KOps/s | 1.2505 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4371ms | 0.3643ms | 2.7452 KOps/s | 2.7369 KOps/s | |
test_func_call_runtime[True-eager] | 0.9917ms | 0.9052ms | 1.1048 KOps/s | 1.0816 KOps/s | |
test_func_call_runtime[True-compile] | 0.8836ms | 0.8180ms | 1.2225 KOps/s | 1.2277 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4487ms | 0.3866ms | 2.5869 KOps/s | 2.5923 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8265ms | 0.7379ms | 1.3553 KOps/s | 1.2888 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8707ms | 0.8037ms | 1.2442 KOps/s | 1.2594 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5692ms | 0.3686ms | 2.7132 KOps/s | 2.7411 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0792ms | 0.9942ms | 1.0058 KOps/s | 978.9854 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.9191ms | 0.8457ms | 1.1824 KOps/s | 1.1855 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4693ms | 0.4093ms | 2.4431 KOps/s | 2.4301 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6008ms | 2.0635ms | 484.6094 Ops/s | 474.9257 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9527ms | 0.8554ms | 1.1690 KOps/s | 1.1647 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4878ms | 0.4143ms | 2.4138 KOps/s | 2.4252 KOps/s | |
test_distributed | 4.5796ms | 0.2287ms | 4.3732 KOps/s | 8.7643 KOps/s | |
test_tdmodule | 0.1273ms | 14.5729μs | 68.6206 KOps/s | 63.7849 KOps/s | |
test_tdmodule_dispatch | 67.7240μs | 29.4261μs | 33.9834 KOps/s | 32.3058 KOps/s | |
test_tdseq | 35.8320μs | 15.6893μs | 63.7378 KOps/s | 60.3385 KOps/s | |
test_tdseq_dispatch | 61.7930μs | 31.9103μs | 31.3378 KOps/s | 29.7079 KOps/s | |
test_instantiation_functorch | 2.0213ms | 1.8847ms | 530.6000 Ops/s | 531.8011 Ops/s | |
test_instantiation_td | 1.8333ms | 1.2068ms | 828.6505 Ops/s | 822.1957 Ops/s | |
test_exec_functorch | 0.2661ms | 0.2114ms | 4.7307 KOps/s | 4.7917 KOps/s | |
test_exec_functional_call | 0.2648ms | 0.2096ms | 4.7711 KOps/s | 4.7582 KOps/s | |
test_exec_td | 0.2570ms | 0.2154ms | 4.6418 KOps/s | 4.6569 KOps/s | |
test_exec_td_decorator | 0.3423ms | 0.2572ms | 3.8885 KOps/s | 3.8651 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7847ms | 0.6886ms | 1.4522 KOps/s | 1.4391 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7553ms | 0.6861ms | 1.4575 KOps/s | 1.4451 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6567ms | 0.5763ms | 1.7353 KOps/s | 1.7233 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6346ms | 0.5771ms | 1.7329 KOps/s | 1.7254 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1917ms | 0.6741ms | 1.4834 KOps/s | 1.4730 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7984ms | 0.6747ms | 1.4821 KOps/s | 1.4669 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6935ms | 0.5900ms | 1.6948 KOps/s | 1.6846 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7011ms | 0.5892ms | 1.6972 KOps/s | 1.6773 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.7945ms | 8.3934ms | 119.1411 Ops/s | 118.1418 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.5850ms | 8.3534ms | 119.7115 Ops/s | 118.5730 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.2750ms | 8.1460ms | 122.7599 Ops/s | 121.7195 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.2446ms | 8.1286ms | 123.0227 Ops/s | 121.1466 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.7432ms | 19.5327ms | 51.1963 Ops/s | 50.9287 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.6557ms | 19.4764ms | 51.3442 Ops/s | 50.9635 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.5777ms | 19.3310ms | 51.7303 Ops/s | 51.4140 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.5725ms | 19.3731ms | 51.6179 Ops/s | 51.2725 Ops/s | |
test_to_module_speed[True] | 1.3639ms | 0.9353ms | 1.0691 KOps/s | 1.0570 KOps/s | |
test_to_module_speed[False] | 1.3430ms | 0.9094ms | 1.0996 KOps/s | 1.0700 KOps/s | |
test_tc_init | 66.9340μs | 34.1185μs | 29.3096 KOps/s | 26.9793 KOps/s | |
test_tc_init_nested | 98.2060μs | 68.5735μs | 14.5829 KOps/s | 13.3752 KOps/s | |
test_tc_first_layer_tensor | 4.2787μs | 0.6848μs | 1.4603 MOps/s | 1.4393 MOps/s | |
test_tc_first_layer_nontensor | 26.9710μs | 2.2841μs | 437.8054 KOps/s | 443.2155 KOps/s | |
test_tc_second_layer_tensor | 11.0907μs | 1.3954μs | 716.6184 KOps/s | 724.6366 KOps/s | |
test_tc_second_layer_nontensor | 25.7110μs | 3.0039μs | 332.9058 KOps/s | 336.0208 KOps/s | |
test_unbind | 0.1963s | 12.2915ms | 81.3572 Ops/s | 91.3793 Ops/s | |
test_full_like | 0.6636ms | 0.5733ms | 1.7443 KOps/s | 1.7432 KOps/s | |
test_zeros_like | 0.2754ms | 0.1980ms | 5.0512 KOps/s | 5.0540 KOps/s | |
test_ones_like | 0.2274ms | 0.1977ms | 5.0588 KOps/s | 5.0578 KOps/s | |
test_clone | 0.4401ms | 0.4144ms | 2.4130 KOps/s | 2.4135 KOps/s | |
test_squeeze | 35.3320μs | 10.0299μs | 99.7015 KOps/s | 100.6858 KOps/s | |
test_unsqueeze | 0.2505ms | 76.1898μs | 13.1251 KOps/s | 13.3852 KOps/s | |
test_split | 0.3997ms | 0.1648ms | 6.0681 KOps/s | 6.4027 KOps/s | |
test_permute | 0.2212ms | 0.1787ms | 5.5951 KOps/s | 5.5964 KOps/s | |
test_stack | 1.2594ms | 0.8646ms | 1.1565 KOps/s | 1.1892 KOps/s | |
test_cat | 1.2565ms | 1.2318ms | 811.8051 Ops/s | 811.7884 Ops/s |
vmoens added a commit that referenced this pull request on Oct 1, 2024
ghstack-source-id: 14fa71bdac21f1109a7e42c37357f3b62db9f402
Pull Request resolved: #1017
Labels: CLA Signed, enhancement
Stack from ghstack (oldest at bottom):