[Feature] cat_tensors and stack_tensors #1017

Merged (1 commit) on Oct 1, 2024

vmoens (Contributor) commented Oct 1, 2024

[ghstack-poisoned]
facebook-github-bot added the CLA Signed label Oct 1, 2024

github-actions bot commented Oct 1, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}22$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 44.7130μs 21.0366μs 47.5362 KOps/s 50.6876 KOps/s $\textbf{\color{#d91a1a}-6.22\%}$
test_plain_set_stack_nested 50.4030μs 21.0042μs 47.6096 KOps/s 50.9806 KOps/s $\textbf{\color{#d91a1a}-6.61\%}$
test_plain_set_nested_inplace 68.3770μs 22.8817μs 43.7030 KOps/s 46.7940 KOps/s $\textbf{\color{#d91a1a}-6.61\%}$
test_plain_set_stack_nested_inplace 50.4140μs 22.8333μs 43.7957 KOps/s 46.3075 KOps/s $\textbf{\color{#d91a1a}-5.42\%}$
test_items 25.0460μs 4.2939μs 232.8904 KOps/s 240.7553 KOps/s $\color{#d91a1a}-3.27\%$
test_items_nested 0.6461ms 0.3660ms 2.7321 KOps/s 2.7844 KOps/s $\color{#d91a1a}-1.88\%$
test_items_nested_locked 0.5923ms 0.3626ms 2.7582 KOps/s 2.7737 KOps/s $\color{#d91a1a}-0.56\%$
test_items_nested_leaf 0.1400ms 68.9429μs 14.5048 KOps/s 14.4583 KOps/s $\color{#35bf28}+0.32\%$
test_items_stack_nested 0.5596ms 0.3709ms 2.6963 KOps/s 2.7566 KOps/s $\color{#d91a1a}-2.19\%$
test_items_stack_nested_leaf 0.1325ms 71.5512μs 13.9760 KOps/s 13.8953 KOps/s $\color{#35bf28}+0.58\%$
test_items_stack_nested_locked 0.6854ms 0.3693ms 2.7080 KOps/s 2.7635 KOps/s $\color{#d91a1a}-2.01\%$
test_keys 24.6360μs 3.5257μs 283.6348 KOps/s 286.8116 KOps/s $\color{#d91a1a}-1.11\%$
test_keys_nested 0.2031ms 0.1032ms 9.6932 KOps/s 10.1510 KOps/s $\color{#d91a1a}-4.51\%$
test_keys_nested_locked 0.7681ms 0.1076ms 9.2916 KOps/s 9.6737 KOps/s $\color{#d91a1a}-3.95\%$
test_keys_nested_leaf 0.1815ms 85.5944μs 11.6830 KOps/s 12.3041 KOps/s $\textbf{\color{#d91a1a}-5.05\%}$
test_keys_stack_nested 0.2006ms 99.7704μs 10.0230 KOps/s 10.0917 KOps/s $\color{#d91a1a}-0.68\%$
test_keys_stack_nested_leaf 0.1507ms 82.2631μs 12.1561 KOps/s 12.2327 KOps/s $\color{#d91a1a}-0.63\%$
test_keys_stack_nested_locked 0.1977ms 0.1055ms 9.4761 KOps/s 9.5651 KOps/s $\color{#d91a1a}-0.93\%$
test_values 5.9190μs 1.0354μs 965.8001 KOps/s 951.7237 KOps/s $\color{#35bf28}+1.48\%$
test_values_nested 0.1310ms 73.4961μs 13.6062 KOps/s 13.8304 KOps/s $\color{#d91a1a}-1.62\%$
test_values_nested_locked 0.1655ms 72.3242μs 13.8266 KOps/s 13.6065 KOps/s $\color{#35bf28}+1.62\%$
test_values_nested_leaf 0.1146ms 61.3789μs 16.2922 KOps/s 16.2896 KOps/s $\color{#35bf28}+0.02\%$
test_values_stack_nested 0.1352ms 73.9305μs 13.5262 KOps/s 13.5828 KOps/s $\color{#d91a1a}-0.42\%$
test_values_stack_nested_leaf 0.1137ms 59.1848μs 16.8962 KOps/s 16.0841 KOps/s $\textbf{\color{#35bf28}+5.05\%}$
test_values_stack_nested_locked 0.1325ms 72.9640μs 13.7054 KOps/s 13.5921 KOps/s $\color{#35bf28}+0.83\%$
test_membership 22.2210μs 0.8648μs 1.1564 MOps/s 1.3577 MOps/s $\textbf{\color{#d91a1a}-14.82\%}$
test_membership_nested 22.0410μs 2.7808μs 359.6120 KOps/s 364.3838 KOps/s $\color{#d91a1a}-1.31\%$
test_membership_nested_leaf 34.6440μs 2.7786μs 359.8887 KOps/s 362.1230 KOps/s $\color{#d91a1a}-0.62\%$
test_membership_stacked_nested 34.7640μs 2.7829μs 359.3409 KOps/s 355.4980 KOps/s $\color{#35bf28}+1.08\%$
test_membership_stacked_nested_leaf 29.2850μs 2.7970μs 357.5213 KOps/s 360.7501 KOps/s $\color{#d91a1a}-0.90\%$
test_membership_nested_last 28.7740μs 3.9779μs 251.3906 KOps/s 253.2901 KOps/s $\color{#d91a1a}-0.75\%$
test_membership_nested_leaf_last 25.3670μs 3.9894μs 250.6645 KOps/s 251.2412 KOps/s $\color{#d91a1a}-0.23\%$
test_membership_stacked_nested_last 34.9850μs 13.0150μs 76.8346 KOps/s 250.8916 KOps/s $\textbf{\color{#d91a1a}-69.38\%}$
test_membership_stacked_nested_leaf_last 42.2790μs 13.0254μs 76.7731 KOps/s 249.3397 KOps/s $\textbf{\color{#d91a1a}-69.21\%}$
test_nested_getleaf 35.8470μs 10.7982μs 92.6079 KOps/s 92.6843 KOps/s $\color{#d91a1a}-0.08\%$
test_nested_get 39.9440μs 10.1876μs 98.1584 KOps/s 97.7910 KOps/s $\color{#35bf28}+0.38\%$
test_stacked_getleaf 38.5820μs 10.6407μs 93.9784 KOps/s 92.8633 KOps/s $\color{#35bf28}+1.20\%$
test_stacked_get 32.9210μs 10.2546μs 97.5171 KOps/s 98.1585 KOps/s $\color{#d91a1a}-0.65\%$
test_nested_getitemleaf 36.0870μs 11.1858μs 89.3994 KOps/s 90.1416 KOps/s $\color{#d91a1a}-0.82\%$
test_nested_getitem 40.8630μs 10.3023μs 97.0654 KOps/s 95.7378 KOps/s $\color{#35bf28}+1.39\%$
test_stacked_getitemleaf 36.7580μs 11.1009μs 90.0831 KOps/s 91.1472 KOps/s $\color{#d91a1a}-1.17\%$
test_stacked_getitem 37.6600μs 10.3423μs 96.6906 KOps/s 96.0176 KOps/s $\color{#35bf28}+0.70\%$
test_lock_nested 85.2494ms 0.5787ms 1.7281 KOps/s 2.0123 KOps/s $\textbf{\color{#d91a1a}-14.12\%}$
test_lock_stack_nested 0.6832ms 0.4422ms 2.2613 KOps/s 2.1275 KOps/s $\textbf{\color{#35bf28}+6.29\%}$
test_unlock_nested 87.5026ms 0.4939ms 2.0247 KOps/s 2.3846 KOps/s $\textbf{\color{#d91a1a}-15.09\%}$
test_unlock_stack_nested 0.5289ms 0.3608ms 2.7719 KOps/s 2.5842 KOps/s $\textbf{\color{#35bf28}+7.26\%}$
test_flatten_speed 0.1511ms 86.8554μs 11.5134 KOps/s 10.9529 KOps/s $\textbf{\color{#35bf28}+5.12\%}$
test_unflatten_speed 0.5826ms 0.4634ms 2.1581 KOps/s 2.1359 KOps/s $\color{#35bf28}+1.04\%$
test_common_ops 2.1612ms 1.1389ms 878.0446 Ops/s 889.1541 Ops/s $\color{#d91a1a}-1.25\%$
test_creation 25.0370μs 2.0906μs 478.3426 KOps/s 468.0298 KOps/s $\color{#35bf28}+2.20\%$
test_creation_empty 69.9010μs 18.9939μs 52.6486 KOps/s 58.4844 KOps/s $\textbf{\color{#d91a1a}-9.98\%}$
test_creation_nested_1 83.6860μs 21.9964μs 45.4620 KOps/s 49.6712 KOps/s $\textbf{\color{#d91a1a}-8.47\%}$
test_creation_nested_2 78.9170μs 26.4652μs 37.7855 KOps/s 41.3171 KOps/s $\textbf{\color{#d91a1a}-8.55\%}$
test_clone 1.3190ms 17.2755μs 57.8856 KOps/s 57.6042 KOps/s $\color{#35bf28}+0.49\%$
test_getitem[int] 0.7990ms 16.7733μs 59.6184 KOps/s 56.8409 KOps/s $\color{#35bf28}+4.89\%$
test_getitem[slice_int] 0.1461ms 31.5603μs 31.6854 KOps/s 31.4987 KOps/s $\color{#35bf28}+0.59\%$
test_getitem[range] 0.1760ms 60.9876μs 16.3968 KOps/s 16.9056 KOps/s $\color{#d91a1a}-3.01\%$
test_getitem[tuple] 0.1323ms 25.2986μs 39.5279 KOps/s 39.0598 KOps/s $\color{#35bf28}+1.20\%$
test_getitem[list] 0.2114ms 57.7782μs 17.3076 KOps/s 18.4563 KOps/s $\textbf{\color{#d91a1a}-6.22\%}$
test_setitem_dim[int] 89.4960μs 36.9943μs 27.0312 KOps/s 29.4319 KOps/s $\textbf{\color{#d91a1a}-8.16\%}$
test_setitem_dim[slice_int] 0.1130ms 61.4761μs 16.2665 KOps/s 16.2102 KOps/s $\color{#35bf28}+0.35\%$
test_setitem_dim[range] 0.1513ms 84.0920μs 11.8917 KOps/s 11.7957 KOps/s $\color{#35bf28}+0.81\%$
test_setitem_dim[tuple] 89.0960μs 49.9836μs 20.0066 KOps/s 19.9725 KOps/s $\color{#35bf28}+0.17\%$
test_setitem 0.3077ms 29.9890μs 33.3455 KOps/s 33.8977 KOps/s $\color{#d91a1a}-1.63\%$
test_set 0.1901ms 29.3875μs 34.0281 KOps/s 34.9001 KOps/s $\color{#d91a1a}-2.50\%$
test_set_shared 2.2471ms 0.2136ms 4.6807 KOps/s 4.5985 KOps/s $\color{#35bf28}+1.79\%$
test_update 0.2171ms 37.2533μs 26.8433 KOps/s 28.2941 KOps/s $\textbf{\color{#d91a1a}-5.13\%}$
test_update_nested 0.2431ms 47.9892μs 20.8380 KOps/s 21.7254 KOps/s $\color{#d91a1a}-4.08\%$
test_update__nested 0.1417ms 34.8513μs 28.6934 KOps/s 27.7388 KOps/s $\color{#35bf28}+3.44\%$
test_set_nested 0.1619ms 32.0242μs 31.2264 KOps/s 31.7658 KOps/s $\color{#d91a1a}-1.70\%$
test_set_nested_new 0.3030ms 37.3727μs 26.7575 KOps/s 27.1542 KOps/s $\color{#d91a1a}-1.46\%$
test_select 0.2865ms 54.6530μs 18.2973 KOps/s 18.5697 KOps/s $\color{#d91a1a}-1.47\%$
test_select_nested 0.1126ms 58.9514μs 16.9631 KOps/s 16.7362 KOps/s $\color{#35bf28}+1.36\%$
test_exclude_nested 0.1397ms 73.6306μs 13.5813 KOps/s 13.6081 KOps/s $\color{#d91a1a}-0.20\%$
test_empty[True] 0.4126ms 0.3184ms 3.1402 KOps/s 3.1363 KOps/s $\color{#35bf28}+0.12\%$
test_empty[False] 7.9323μs 1.2070μs 828.5310 KOps/s 821.6175 KOps/s $\color{#35bf28}+0.84\%$
test_unbind_speed 0.5320ms 0.3131ms 3.1938 KOps/s 3.2194 KOps/s $\color{#d91a1a}-0.79\%$
test_unbind_speed_stack0 0.4438ms 0.2917ms 3.4286 KOps/s 3.3152 KOps/s $\color{#35bf28}+3.42\%$
test_unbind_speed_stack1 93.6979ms 0.7918ms 1.2629 KOps/s 1.4363 KOps/s $\textbf{\color{#d91a1a}-12.07\%}$
test_split 2.2147ms 2.0024ms 499.3916 Ops/s 453.0369 Ops/s $\textbf{\color{#35bf28}+10.23\%}$
test_chunk 96.6796ms 2.1943ms 455.7337 Ops/s 443.8029 Ops/s $\color{#35bf28}+2.69\%$
test_creation[device0] 0.2335ms 0.1178ms 8.4903 KOps/s 8.5049 KOps/s $\color{#d91a1a}-0.17\%$
test_creation_from_tensor 3.4986ms 0.1193ms 8.3826 KOps/s 8.4394 KOps/s $\color{#d91a1a}-0.67\%$
test_add_one[memmap_tensor0] 0.4193ms 7.5241μs 132.9062 KOps/s 130.1608 KOps/s $\color{#35bf28}+2.11\%$
test_contiguous[memmap_tensor0] 21.9710μs 1.9080μs 524.1139 KOps/s 465.3542 KOps/s $\textbf{\color{#35bf28}+12.63\%}$
test_stack[memmap_tensor0] 52.4870μs 5.6666μs 176.4741 KOps/s 175.6311 KOps/s $\color{#35bf28}+0.48\%$
test_memmaptd_index 1.0640ms 0.4111ms 2.4325 KOps/s 2.4591 KOps/s $\color{#d91a1a}-1.08\%$
test_memmaptd_index_astensor 0.7655ms 0.4881ms 2.0487 KOps/s 2.0475 KOps/s $\color{#35bf28}+0.06\%$
test_memmaptd_index_op 1.4231ms 1.0308ms 970.1352 Ops/s 968.8956 Ops/s $\color{#35bf28}+0.13\%$
test_serialize_model 0.2228s 0.1368s 7.3125 Ops/s 8.3292 Ops/s $\textbf{\color{#d91a1a}-12.21\%}$
test_serialize_model_pickle 0.4605s 0.3945s 2.5346 Ops/s 2.5012 Ops/s $\color{#35bf28}+1.34\%$
test_serialize_weights 0.1260s 0.1150s 8.6969 Ops/s 8.6466 Ops/s $\color{#35bf28}+0.58\%$
test_serialize_weights_returnearly 0.1737s 0.1617s 6.1855 Ops/s 6.3695 Ops/s $\color{#d91a1a}-2.89\%$
test_serialize_weights_pickle 0.4674s 0.4056s 2.4655 Ops/s 2.4706 Ops/s $\color{#d91a1a}-0.21\%$
test_serialize_weights_filesystem 0.2433s 0.1560s 6.4101 Ops/s 6.9240 Ops/s $\textbf{\color{#d91a1a}-7.42\%}$
test_serialize_model_filesystem 0.1608s 0.1531s 6.5319 Ops/s 6.0602 Ops/s $\textbf{\color{#35bf28}+7.78\%}$
test_reshape_pytree 95.8550μs 39.1263μs 25.5582 KOps/s 25.3373 KOps/s $\color{#35bf28}+0.87\%$
test_reshape_td 0.1165ms 46.6501μs 21.4362 KOps/s 20.5426 KOps/s $\color{#35bf28}+4.35\%$
test_view_pytree 86.4200μs 38.5141μs 25.9645 KOps/s 25.4928 KOps/s $\color{#35bf28}+1.85\%$
test_view_td 0.1205ms 51.9134μs 19.2628 KOps/s 18.0086 KOps/s $\textbf{\color{#35bf28}+6.96\%}$
test_unbind_pytree 83.1950μs 35.7007μs 28.0107 KOps/s 27.1948 KOps/s $\color{#35bf28}+3.00\%$
test_unbind_td 0.3729ms 45.3959μs 22.0284 KOps/s 21.7814 KOps/s $\color{#35bf28}+1.13\%$
test_split_pytree 0.1172ms 38.0960μs 26.2495 KOps/s 26.0629 KOps/s $\color{#35bf28}+0.72\%$
test_split_td 0.4499ms 58.4542μs 17.1074 KOps/s 16.7770 KOps/s $\color{#35bf28}+1.97\%$
test_add_pytree 99.0140μs 46.4898μs 21.5101 KOps/s 21.7092 KOps/s $\color{#d91a1a}-0.92\%$
test_add_td 0.2993ms 82.1712μs 12.1697 KOps/s 12.1235 KOps/s $\color{#35bf28}+0.38\%$
test_compile_add_one_nested[tensordict-compile] 0.1082ms 57.3897μs 17.4247 KOps/s 17.1528 KOps/s $\color{#35bf28}+1.59\%$
test_compile_add_one_nested[tensordict-eager] 0.2972ms 0.1773ms 5.6389 KOps/s 5.6184 KOps/s $\color{#35bf28}+0.36\%$
test_compile_add_one_nested[pytree-compile] 0.1076ms 56.9732μs 17.5521 KOps/s 17.4254 KOps/s $\color{#35bf28}+0.73\%$
test_compile_add_one_nested[pytree-eager] 0.3250ms 0.1440ms 6.9436 KOps/s 6.8964 KOps/s $\color{#35bf28}+0.68\%$
test_compile_copy_nested[tensordict-compile] 62.9870μs 21.9891μs 45.4770 KOps/s 43.6787 KOps/s $\color{#35bf28}+4.12\%$
test_compile_copy_nested[tensordict-eager] 0.1467ms 66.5560μs 15.0249 KOps/s 15.2934 KOps/s $\color{#d91a1a}-1.76\%$
test_compile_copy_nested[pytree-compile] 0.1393ms 74.2779μs 13.4630 KOps/s 13.3147 KOps/s $\color{#35bf28}+1.11\%$
test_compile_copy_nested[pytree-eager] 0.1390ms 67.9363μs 14.7197 KOps/s 14.6540 KOps/s $\color{#35bf28}+0.45\%$
test_compile_add_one_flat[tensordict-compile] 0.3495ms 0.1735ms 5.7630 KOps/s 5.6776 KOps/s $\color{#35bf28}+1.51\%$
test_compile_add_one_flat[tensordict-eager] 0.3142ms 0.1906ms 5.2469 KOps/s 5.2058 KOps/s $\color{#35bf28}+0.79\%$
test_compile_add_one_flat[tensorclass-compile] 88.9650μs 47.6041μs 21.0066 KOps/s 20.5942 KOps/s $\color{#35bf28}+2.00\%$
test_compile_add_one_flat[tensorclass-eager] 0.2321ms 67.9849μs 14.7092 KOps/s 14.2165 KOps/s $\color{#35bf28}+3.47\%$
test_compile_add_one_flat[pytree-compile] 0.3720ms 0.1768ms 5.6551 KOps/s 5.6820 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_add_one_flat[pytree-eager] 0.4249ms 0.2865ms 3.4903 KOps/s 3.4644 KOps/s $\color{#35bf28}+0.75\%$
test_compile_add_self_flat[tensordict-eager] 0.4390ms 0.2040ms 4.9019 KOps/s 4.9307 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_add_self_flat[tensordict-compile] 0.3647ms 0.1764ms 5.6692 KOps/s 5.6972 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_add_self_flat[tensorclass-eager] 0.1501ms 61.8085μs 16.1790 KOps/s 15.8505 KOps/s $\color{#35bf28}+2.07\%$
test_compile_add_self_flat[tensorclass-compile] 0.1444ms 48.5114μs 20.6137 KOps/s 20.6837 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_add_self_flat[pytree-eager] 0.3417ms 0.2285ms 4.3757 KOps/s 4.2695 KOps/s $\color{#35bf28}+2.49\%$
test_compile_add_self_flat[pytree-compile] 0.3832ms 0.1749ms 5.7186 KOps/s 5.5544 KOps/s $\color{#35bf28}+2.96\%$
test_compile_copy_flat[tensordict-compile] 0.2415ms 0.1030ms 9.7050 KOps/s 9.6112 KOps/s $\color{#35bf28}+0.98\%$
test_compile_copy_flat[tensordict-eager] 0.1513ms 56.9324μs 17.5647 KOps/s 17.7854 KOps/s $\color{#d91a1a}-1.24\%$
test_compile_copy_flat[pytree-compile] 0.1431ms 75.4073μs 13.2613 KOps/s 12.9639 KOps/s $\color{#35bf28}+2.29\%$
test_compile_copy_flat[pytree-eager] 0.1276ms 67.7230μs 14.7660 KOps/s 14.4908 KOps/s $\color{#35bf28}+1.90\%$
test_compile_assign_and_add[tensordict-compile] 0.2949ms 0.1918ms 5.2141 KOps/s 5.1880 KOps/s $\color{#35bf28}+0.50\%$
test_compile_assign_and_add[tensordict-eager] 2.8119ms 1.6277ms 614.3721 Ops/s 600.1101 Ops/s $\color{#35bf28}+2.38\%$
test_compile_assign_and_add[pytree-compile] 0.3801ms 0.1898ms 5.2688 KOps/s 5.1616 KOps/s $\color{#35bf28}+2.08\%$
test_compile_assign_and_add[pytree-eager] 1.3346ms 1.0776ms 928.0252 Ops/s 890.9976 Ops/s $\color{#35bf28}+4.16\%$
test_compile_assign_and_add_stack[compile] 0.8168ms 0.4181ms 2.3915 KOps/s 2.3916 KOps/s $-0.00\%$
test_compile_assign_and_add_stack[eager] 7.2442ms 3.8691ms 258.4589 Ops/s 264.1793 Ops/s $\color{#d91a1a}-2.17\%$
test_compile_indexing[tensor-tensordict-compile] 90.6190μs 33.8960μs 29.5020 KOps/s 28.4186 KOps/s $\color{#35bf28}+3.81\%$
test_compile_indexing[tensor-tensordict-eager] 1.1682ms 49.7515μs 20.0999 KOps/s 19.9207 KOps/s $\color{#35bf28}+0.90\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1301ms 29.7818μs 33.5775 KOps/s 32.6615 KOps/s $\color{#35bf28}+2.80\%$
test_compile_indexing[tensor-tensorclass-eager] 83.4560μs 28.6025μs 34.9620 KOps/s 33.9026 KOps/s $\color{#35bf28}+3.12\%$
test_compile_indexing[tensor-pytree-compile] 78.9670μs 29.8286μs 33.5248 KOps/s 32.6738 KOps/s $\color{#35bf28}+2.60\%$
test_compile_indexing[tensor-pytree-eager] 68.8480μs 28.2798μs 35.3609 KOps/s 34.2341 KOps/s $\color{#35bf28}+3.29\%$
test_compile_indexing[slice-tensordict-compile] 0.1372ms 73.4986μs 13.6057 KOps/s 13.2909 KOps/s $\color{#35bf28}+2.37\%$
test_compile_indexing[slice-tensordict-eager] 0.5504ms 28.3100μs 35.3232 KOps/s 34.1668 KOps/s $\color{#35bf28}+3.38\%$
test_compile_indexing[slice-tensorclass-compile] 0.1519ms 68.8869μs 14.5165 KOps/s 14.4358 KOps/s $\color{#35bf28}+0.56\%$
test_compile_indexing[slice-tensorclass-eager] 73.2060μs 23.2811μs 42.9534 KOps/s 41.7517 KOps/s $\color{#35bf28}+2.88\%$
test_compile_indexing[slice-pytree-compile] 0.1412ms 67.6986μs 14.7714 KOps/s 14.5649 KOps/s $\color{#35bf28}+1.42\%$
test_compile_indexing[slice-pytree-eager] 76.5630μs 22.9692μs 43.5365 KOps/s 41.6421 KOps/s $\color{#35bf28}+4.55\%$
test_compile_indexing[int-tensordict-compile] 0.1398ms 73.3596μs 13.6315 KOps/s 13.6832 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_indexing[int-tensordict-eager] 0.9888ms 27.8815μs 35.8660 KOps/s 35.3587 KOps/s $\color{#35bf28}+1.43\%$
test_compile_indexing[int-tensorclass-compile] 0.1528ms 67.9864μs 14.7088 KOps/s 14.4857 KOps/s $\color{#35bf28}+1.54\%$
test_compile_indexing[int-tensorclass-eager] 65.2610μs 22.6728μs 44.1058 KOps/s 41.6704 KOps/s $\textbf{\color{#35bf28}+5.84\%}$
test_compile_indexing[int-pytree-compile] 0.1265ms 67.8449μs 14.7395 KOps/s 14.5781 KOps/s $\color{#35bf28}+1.11\%$
test_compile_indexing[int-pytree-eager] 64.3200μs 22.8716μs 43.7224 KOps/s 42.2697 KOps/s $\color{#35bf28}+3.44\%$
test_mod_add[eager] 81.2410μs 25.7676μs 38.8085 KOps/s 40.4954 KOps/s $\color{#d91a1a}-4.17\%$
test_mod_add[compile] 0.1004ms 38.8549μs 25.7368 KOps/s 25.7325 KOps/s $\color{#35bf28}+0.02\%$
test_mod_add[compile-overhead] 87.7430μs 39.0213μs 25.6270 KOps/s 25.8294 KOps/s $\color{#d91a1a}-0.78\%$
test_mod_wrap[eager] 0.2958ms 0.2074ms 4.8218 KOps/s 4.8526 KOps/s $\color{#d91a1a}-0.63\%$
test_mod_wrap[compile] 0.3143ms 0.2322ms 4.3057 KOps/s 4.1339 KOps/s $\color{#35bf28}+4.16\%$
test_mod_wrap[compile-overhead] 0.3663ms 0.2290ms 4.3669 KOps/s 4.1559 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_mod_wrap_and_backward[eager] 13.0217ms 10.8762ms 91.9442 Ops/s 86.2961 Ops/s $\textbf{\color{#35bf28}+6.55\%}$
test_mod_wrap_and_backward[compile] 12.3347ms 10.8252ms 92.3771 Ops/s 80.8981 Ops/s $\textbf{\color{#35bf28}+14.19\%}$
test_mod_wrap_and_backward[compile-overhead] 12.4514ms 10.9289ms 91.5004 Ops/s 81.1618 Ops/s $\textbf{\color{#35bf28}+12.74\%}$
test_seq_add[eager] 0.1640ms 91.0197μs 10.9866 KOps/s 11.0788 KOps/s $\color{#d91a1a}-0.83\%$
test_seq_add[compile] 0.3496ms 64.3561μs 15.5385 KOps/s 15.2764 KOps/s $\color{#35bf28}+1.72\%$
test_seq_add[compile-overhead] 0.1316ms 63.2669μs 15.8060 KOps/s 15.5219 KOps/s $\color{#35bf28}+1.83\%$
test_seq_wrap[eager] 0.4927ms 0.3840ms 2.6043 KOps/s 2.5244 KOps/s $\color{#35bf28}+3.16\%$
test_seq_wrap[compile] 1.2587ms 0.2673ms 3.7405 KOps/s 3.6305 KOps/s $\color{#35bf28}+3.03\%$
test_seq_wrap[compile-overhead] 1.3295ms 0.2674ms 3.7397 KOps/s 3.6235 KOps/s $\color{#35bf28}+3.21\%$
test_func_call_runtime[False-eager] 0.6989ms 0.5304ms 1.8852 KOps/s 1.8978 KOps/s $\color{#d91a1a}-0.67\%$
test_func_call_runtime[False-compile] 0.9005ms 0.5019ms 1.9925 KOps/s 1.9521 KOps/s $\color{#35bf28}+2.07\%$
test_func_call_runtime[False-compile-overhead] 0.7127ms 0.5042ms 1.9835 KOps/s 1.9382 KOps/s $\color{#35bf28}+2.34\%$
test_func_call_runtime[True-eager] 0.9064ms 0.7532ms 1.3277 KOps/s 1.3330 KOps/s $\color{#d91a1a}-0.40\%$
test_func_call_runtime[True-compile] 0.9677ms 0.5157ms 1.9391 KOps/s 1.9133 KOps/s $\color{#35bf28}+1.35\%$
test_func_call_runtime[True-compile-overhead] 0.6128ms 0.5148ms 1.9426 KOps/s 1.9006 KOps/s $\color{#35bf28}+2.21\%$
test_func_call_cm_runtime[False-eager] 0.9655ms 0.5299ms 1.8873 KOps/s 1.8946 KOps/s $\color{#d91a1a}-0.38\%$
test_func_call_cm_runtime[False-compile] 0.5950ms 0.4986ms 2.0055 KOps/s 1.9501 KOps/s $\color{#35bf28}+2.84\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6303ms 0.4994ms 2.0023 KOps/s 1.9497 KOps/s $\color{#35bf28}+2.70\%$
test_func_call_cm_runtime[True-eager] 1.1501ms 0.8778ms 1.1392 KOps/s 1.1208 KOps/s $\color{#35bf28}+1.64\%$
test_func_call_cm_runtime[True-compile] 0.9431ms 0.7358ms 1.3591 KOps/s 1.3238 KOps/s $\color{#35bf28}+2.66\%$
test_func_call_cm_runtime[True-compile-overhead] 1.1486ms 0.7396ms 1.3521 KOps/s 1.3124 KOps/s $\color{#35bf28}+3.03\%$
test_vmap_func_call_cm_runtime[eager] 2.7213ms 1.8609ms 537.3673 Ops/s 524.3858 Ops/s $\color{#35bf28}+2.48\%$
test_vmap_func_call_cm_runtime[compile] 2.7781ms 1.9095ms 523.7050 Ops/s 513.6993 Ops/s $\color{#35bf28}+1.95\%$
test_vmap_func_call_cm_runtime[compile-overhead] 3.1976ms 1.9135ms 522.6052 Ops/s 513.4767 Ops/s $\color{#35bf28}+1.78\%$
test_distributed 0.2249ms 0.1275ms 7.8436 KOps/s 7.6457 KOps/s $\color{#35bf28}+2.59\%$
test_tdmodule 39.7650μs 18.5194μs 53.9973 KOps/s 55.4807 KOps/s $\color{#d91a1a}-2.67\%$
test_tdmodule_dispatch 63.9890μs 37.0338μs 27.0023 KOps/s 28.2044 KOps/s $\color{#d91a1a}-4.26\%$
test_tdseq 55.8740μs 21.7336μs 46.0118 KOps/s 48.4825 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_tdseq_dispatch 62.4760μs 43.1883μs 23.1544 KOps/s 24.5645 KOps/s $\textbf{\color{#d91a1a}-5.74\%}$
test_instantiation_functorch 2.1026ms 1.5857ms 630.6233 Ops/s 610.1194 Ops/s $\color{#35bf28}+3.36\%$
test_instantiation_td 1.8408ms 1.1685ms 855.8257 Ops/s 839.7532 Ops/s $\color{#35bf28}+1.91\%$
test_exec_functorch 0.3196ms 0.1841ms 5.4325 KOps/s 5.2359 KOps/s $\color{#35bf28}+3.75\%$
test_exec_functional_call 0.3700ms 0.1736ms 5.7602 KOps/s 5.6842 KOps/s $\color{#35bf28}+1.34\%$
test_exec_td 0.2907ms 0.1690ms 5.9164 KOps/s 5.8374 KOps/s $\color{#35bf28}+1.35\%$
test_exec_td_decorator 1.1106ms 0.2244ms 4.4562 KOps/s 4.4168 KOps/s $\color{#35bf28}+0.89\%$
test_vmap_mlp_speed[True-True] 1.1173ms 0.6497ms 1.5393 KOps/s 1.5378 KOps/s $\color{#35bf28}+0.09\%$
test_vmap_mlp_speed[True-False] 0.9476ms 0.6435ms 1.5539 KOps/s 1.5453 KOps/s $\color{#35bf28}+0.56\%$
test_vmap_mlp_speed[False-True] 0.7065ms 0.4962ms 2.0155 KOps/s 1.9934 KOps/s $\color{#35bf28}+1.11\%$
test_vmap_mlp_speed[False-False] 0.6936ms 0.4959ms 2.0165 KOps/s 1.9870 KOps/s $\color{#35bf28}+1.49\%$
test_vmap_mlp_speed_decorator[True-True] 1.4206ms 0.6217ms 1.6084 KOps/s 1.5906 KOps/s $\color{#35bf28}+1.12\%$
test_vmap_mlp_speed_decorator[True-False] 1.0771ms 0.6242ms 1.6021 KOps/s 1.5959 KOps/s $\color{#35bf28}+0.39\%$
test_vmap_mlp_speed_decorator[False-True] 0.8509ms 0.5127ms 1.9504 KOps/s 1.9283 KOps/s $\color{#35bf28}+1.15\%$
test_vmap_mlp_speed_decorator[False-False] 0.6014ms 0.5085ms 1.9664 KOps/s 1.9358 KOps/s $\color{#35bf28}+1.58\%$
test_to_module_speed[True] 1.9574ms 1.3014ms 768.4037 Ops/s 758.7925 Ops/s $\color{#35bf28}+1.27\%$
test_to_module_speed[False] 1.4279ms 1.2595ms 793.9578 Ops/s 793.1632 Ops/s $\color{#35bf28}+0.10\%$
test_tc_init 96.1990μs 45.9082μs 21.7826 KOps/s 22.8234 KOps/s $\color{#d91a1a}-4.56\%$
test_tc_init_nested 0.1599ms 90.5385μs 11.0450 KOps/s 11.7534 KOps/s $\textbf{\color{#d91a1a}-6.03\%}$
test_tc_first_layer_tensor 15.7690μs 1.5344μs 651.7115 KOps/s 653.9869 KOps/s $\color{#d91a1a}-0.35\%$
test_tc_first_layer_nontensor 31.3180μs 4.7473μs 210.6453 KOps/s 210.9332 KOps/s $\color{#d91a1a}-0.14\%$
test_tc_second_layer_tensor 27.4010μs 2.7992μs 357.2472 KOps/s 356.9168 KOps/s $\color{#35bf28}+0.09\%$
test_tc_second_layer_nontensor 32.8010μs 6.0163μs 166.2156 KOps/s 164.0250 KOps/s $\color{#35bf28}+1.34\%$
test_unbind 0.5044s 13.4453ms 74.3752 Ops/s 69.9757 Ops/s $\textbf{\color{#35bf28}+6.29\%}$
test_full_like 9.4372ms 7.9070ms 126.4702 Ops/s 128.9854 Ops/s $\color{#d91a1a}-1.95\%$
test_zeros_like 3.8023ms 3.0114ms 332.0745 Ops/s 338.3099 Ops/s $\color{#d91a1a}-1.84\%$
test_ones_like 4.4049ms 3.5176ms 284.2877 Ops/s 286.0339 Ops/s $\color{#d91a1a}-0.61\%$
test_clone 6.4870ms 5.6469ms 177.0875 Ops/s 170.2166 Ops/s $\color{#35bf28}+4.04\%$
test_squeeze 61.7850μs 12.8057μs 78.0904 KOps/s 79.1581 KOps/s $\color{#d91a1a}-1.35\%$
test_unsqueeze 0.3234ms 92.3337μs 10.8303 KOps/s 10.5073 KOps/s $\color{#35bf28}+3.07\%$
test_split 0.3580ms 0.1957ms 5.1094 KOps/s 5.1089 KOps/s $\color{#35bf28}+0.01\%$
test_permute 0.4597ms 0.2172ms 4.6050 KOps/s 4.3832 KOps/s $\textbf{\color{#35bf28}+5.06\%}$
test_stack 31.6412ms 26.4287ms 37.8377 Ops/s 37.2424 Ops/s $\color{#35bf28}+1.60\%$
test_cat 34.5179ms 26.1608ms 38.2251 Ops/s 37.8155 Ops/s $\color{#35bf28}+1.08\%$
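
For readers checking the table, the Change column appears to be the relative difference between this PR's throughput (Ops) and the baseline (Ops on Repo HEAD). The snippet below is a minimal sketch under that assumption, not the benchmark workflow's actual code; `relative_change` is a hypothetical helper name, and it assumes both throughput values in a row use the same unit (Ops/s, KOps/s, or MOps/s).

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percentage change of the PR throughput relative to the repo HEAD baseline.

    Assumption: both arguments are expressed in the same unit.
    """
    return (ops - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the CPU table.
rows = {
    "test_plain_set_nested": (47.5362, 50.6876),
    "test_membership_stacked_nested_last": (76.8346, 250.8916),
    "test_split": (499.3916, 453.0369),
}

for name, (ops, ops_head) in rows.items():
    # Format with an explicit sign and two decimals, like the Change column.
    print(f"{name}: {relative_change(ops, ops_head):+.2f}%")
```

A negative value means the PR ran fewer operations per second than HEAD. The large swing on the `test_membership_stacked_nested*_last` rows is far outside the range of the other rows, so it is worth re-running those benchmarks before drawing conclusions.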


github-actions bot commented Oct 1, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}22$. Worsened: $\large\color{#d91a1a}13$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 0.1187ms 13.8820μs 72.0357 KOps/s 68.7161 KOps/s $\color{#35bf28}+4.83\%$
test_plain_set_stack_nested 34.6320μs 14.3680μs 69.5993 KOps/s 68.4647 KOps/s $\color{#35bf28}+1.66\%$
test_plain_set_nested_inplace 43.0030μs 15.1310μs 66.0895 KOps/s 63.8824 KOps/s $\color{#35bf28}+3.45\%$
test_plain_set_stack_nested_inplace 45.3530μs 15.2618μs 65.5232 KOps/s 65.2531 KOps/s $\color{#35bf28}+0.41\%$
test_items 25.3710μs 2.8748μs 347.8518 KOps/s 342.7950 KOps/s $\color{#35bf28}+1.48\%$
test_items_nested 0.4053ms 0.3342ms 2.9919 KOps/s 3.0658 KOps/s $\color{#d91a1a}-2.41\%$
test_items_nested_locked 0.3881ms 0.3359ms 2.9772 KOps/s 3.0453 KOps/s $\color{#d91a1a}-2.23\%$
test_items_nested_leaf 80.7050μs 55.7055μs 17.9515 KOps/s 18.0118 KOps/s $\color{#d91a1a}-0.33\%$
test_items_stack_nested 0.4134ms 0.3337ms 2.9963 KOps/s 3.0582 KOps/s $\color{#d91a1a}-2.02\%$
test_items_stack_nested_leaf 86.0250μs 56.8483μs 17.5907 KOps/s 17.5665 KOps/s $\color{#35bf28}+0.14\%$
test_items_stack_nested_locked 0.3945ms 0.3406ms 2.9359 KOps/s 3.0204 KOps/s $\color{#d91a1a}-2.80\%$
test_keys 28.3820μs 3.4302μs 291.5248 KOps/s 290.3769 KOps/s $\color{#35bf28}+0.40\%$
test_keys_nested 84.5350μs 55.2758μs 18.0911 KOps/s 18.0159 KOps/s $\color{#35bf28}+0.42\%$
test_keys_nested_locked 0.9192ms 62.6166μs 15.9702 KOps/s 16.0996 KOps/s $\color{#d91a1a}-0.80\%$
test_keys_nested_leaf 75.6350μs 48.0275μs 20.8214 KOps/s 21.0892 KOps/s $\color{#d91a1a}-1.27\%$
test_keys_stack_nested 81.8450μs 57.4775μs 17.3981 KOps/s 17.6896 KOps/s $\color{#d91a1a}-1.65\%$
test_keys_stack_nested_leaf 73.8840μs 48.5032μs 20.6172 KOps/s 20.7444 KOps/s $\color{#d91a1a}-0.61\%$
test_keys_stack_nested_locked 91.3360μs 62.8773μs 15.9040 KOps/s 16.1822 KOps/s $\color{#d91a1a}-1.72\%$
test_values 4.9253μs 0.8645μs 1.1567 MOps/s 1.1863 MOps/s $\color{#d91a1a}-2.50\%$
test_values_nested 81.1550μs 40.9318μs 24.4309 KOps/s 24.3977 KOps/s $\color{#35bf28}+0.14\%$
test_values_nested_locked 70.9140μs 43.2083μs 23.1437 KOps/s 23.2102 KOps/s $\color{#d91a1a}-0.29\%$
test_values_nested_leaf 67.7040μs 35.4223μs 28.2308 KOps/s 28.1116 KOps/s $\color{#35bf28}+0.42\%$
test_values_stack_nested 70.9040μs 42.1724μs 23.7122 KOps/s 23.9249 KOps/s $\color{#d91a1a}-0.89\%$
test_values_stack_nested_leaf 65.2940μs 35.9556μs 27.8120 KOps/s 28.0420 KOps/s $\color{#d91a1a}-0.82\%$
test_values_stack_nested_locked 76.8540μs 43.6247μs 22.9228 KOps/s 22.8496 KOps/s $\color{#35bf28}+0.32\%$
test_membership 1.9051μs 0.5005μs 1.9981 MOps/s 1.9858 MOps/s $\color{#35bf28}+0.62\%$
test_membership_nested 15.2710μs 1.9167μs 521.7427 KOps/s 540.1543 KOps/s $\color{#d91a1a}-3.41\%$
test_membership_nested_leaf 14.8710μs 1.9202μs 520.7925 KOps/s 541.9083 KOps/s $\color{#d91a1a}-3.90\%$
test_membership_stacked_nested 41.1630μs 1.9895μs 502.6454 KOps/s 521.7488 KOps/s $\color{#d91a1a}-3.66\%$
test_membership_stacked_nested_leaf 18.7810μs 1.9984μs 500.4037 KOps/s 520.8548 KOps/s $\color{#d91a1a}-3.93\%$
test_membership_nested_last 38.6620μs 2.8049μs 356.5209 KOps/s 361.1016 KOps/s $\color{#d91a1a}-1.27\%$
test_membership_nested_leaf_last 32.9520μs 2.8124μs 355.5693 KOps/s 356.0597 KOps/s $\color{#d91a1a}-0.14\%$
test_membership_stacked_nested_last 25.3820μs 3.2012μs 312.3855 KOps/s 127.9481 KOps/s $\textbf{\color{#35bf28}+144.15\%}$
test_membership_stacked_nested_leaf_last 41.8620μs 3.2170μs 310.8448 KOps/s 128.4369 KOps/s $\textbf{\color{#35bf28}+142.02\%}$
test_nested_getleaf 31.2020μs 6.1417μs 162.8220 KOps/s 164.5920 KOps/s $\color{#d91a1a}-1.08\%$
test_nested_get 50.5830μs 5.7917μs 172.6598 KOps/s 174.4874 KOps/s $\color{#d91a1a}-1.05\%$
test_stacked_getleaf 26.5320μs 6.0815μs 164.4319 KOps/s 164.7605 KOps/s $\color{#d91a1a}-0.20\%$
test_stacked_get 48.3030μs 5.7885μs 172.7563 KOps/s 177.3633 KOps/s $\color{#d91a1a}-2.60\%$
test_nested_getitemleaf 40.6720μs 6.1783μs 161.8559 KOps/s 164.4556 KOps/s $\color{#d91a1a}-1.58\%$
test_nested_getitem 50.9530μs 5.8587μs 170.6867 KOps/s 174.9375 KOps/s $\color{#d91a1a}-2.43\%$
test_stacked_getitemleaf 22.8210μs 6.1452μs 162.7281 KOps/s 163.9711 KOps/s $\color{#d91a1a}-0.76\%$
test_stacked_getitem 46.3820μs 5.8056μs 172.2466 KOps/s 175.1370 KOps/s $\color{#d91a1a}-1.65\%$
test_lock_nested 7.5981ms 0.4268ms 2.3432 KOps/s 2.3496 KOps/s $\color{#d91a1a}-0.27\%$
test_lock_stack_nested 0.4510ms 0.3864ms 2.5877 KOps/s 2.6626 KOps/s $\color{#d91a1a}-2.81\%$
test_unlock_nested 0.7884ms 0.3616ms 2.7659 KOps/s 2.8144 KOps/s $\color{#d91a1a}-1.72\%$
test_unlock_stack_nested 0.3832ms 0.3265ms 3.0623 KOps/s 3.1813 KOps/s $\color{#d91a1a}-3.74\%$
test_flatten_speed 0.1459ms 68.9450μs 14.5043 KOps/s 14.3303 KOps/s $\color{#35bf28}+1.21\%$
test_unflatten_speed 0.3416ms 0.2824ms 3.5405 KOps/s 3.4725 KOps/s $\color{#35bf28}+1.96\%$
test_common_ops 1.6584ms 1.2823ms 779.8577 Ops/s 772.7646 Ops/s $\color{#35bf28}+0.92\%$
test_creation 26.8910μs 1.4920μs 670.2355 KOps/s 674.4908 KOps/s $\color{#d91a1a}-0.63\%$
test_creation_empty 48.9030μs 15.7323μs 63.5634 KOps/s 58.8845 KOps/s $\textbf{\color{#35bf28}+7.95\%}$
test_creation_nested_1 49.9130μs 17.7069μs 56.4751 KOps/s 53.4713 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_creation_nested_2 69.5550μs 20.2768μs 49.3174 KOps/s 46.7796 KOps/s $\textbf{\color{#35bf28}+5.42\%}$
test_clone 1.3640ms 30.7958μs 32.4720 KOps/s 33.9466 KOps/s $\color{#d91a1a}-4.34\%$
test_getitem[int] 1.1937ms 16.5055μs 60.5859 KOps/s 60.4133 KOps/s $\color{#35bf28}+0.29\%$
test_getitem[slice_int] 0.1363ms 28.2010μs 35.4598 KOps/s 34.9750 KOps/s $\color{#35bf28}+1.39\%$
test_getitem[range] 0.2246ms 0.1106ms 9.0417 KOps/s 9.0926 KOps/s $\color{#d91a1a}-0.56\%$
test_getitem[tuple] 0.1157ms 23.7639μs 42.0806 KOps/s 41.6932 KOps/s $\color{#35bf28}+0.93\%$
test_getitem[list] 0.1923ms 99.6896μs 10.0311 KOps/s 9.9289 KOps/s $\color{#35bf28}+1.03\%$
test_setitem_dim[int] 75.7750μs 45.0289μs 22.2080 KOps/s 22.1983 KOps/s $\color{#35bf28}+0.04\%$
test_setitem_dim[slice_int] 0.1138ms 67.8195μs 14.7450 KOps/s 14.7275 KOps/s $\color{#35bf28}+0.12\%$
test_setitem_dim[range] 0.1718ms 0.1277ms 7.8305 KOps/s 7.7794 KOps/s $\color{#35bf28}+0.66\%$
test_setitem_dim[tuple] 84.0450μs 60.7531μs 16.4601 KOps/s 16.4877 KOps/s $\color{#d91a1a}-0.17\%$
test_setitem 82.1750μs 43.4275μs 23.0269 KOps/s 23.0234 KOps/s $\color{#35bf28}+0.01\%$
test_set 77.9750μs 42.1155μs 23.7442 KOps/s 22.1119 KOps/s $\textbf{\color{#35bf28}+7.38\%}$
test_set_shared 0.3648ms 51.9203μs 19.2603 KOps/s 18.4760 KOps/s $\color{#35bf28}+4.25\%$
test_update 87.7160μs 50.6474μs 19.7443 KOps/s 19.2375 KOps/s $\color{#35bf28}+2.63\%$
test_update_nested 0.1102ms 58.5521μs 17.0788 KOps/s 17.0085 KOps/s $\color{#35bf28}+0.41\%$
test_update__nested 96.2260μs 62.0708μs 16.1106 KOps/s 16.4924 KOps/s $\color{#d91a1a}-2.31\%$
test_set_nested 80.3640μs 44.8783μs 22.2825 KOps/s 20.4548 KOps/s $\textbf{\color{#35bf28}+8.93\%}$
test_set_nested_new 86.3550μs 49.2207μs 20.3167 KOps/s 19.4937 KOps/s $\color{#35bf28}+4.22\%$
test_select 94.0850μs 61.6555μs 16.2192 KOps/s 15.1605 KOps/s $\textbf{\color{#35bf28}+6.98\%}$
test_select_nested 74.1040μs 41.8827μs 23.8762 KOps/s 22.9475 KOps/s $\color{#35bf28}+4.05\%$
test_exclude_nested 90.9160μs 58.2758μs 17.1598 KOps/s 16.8980 KOps/s $\color{#35bf28}+1.55\%$
test_empty[True] 0.2976ms 0.2451ms 4.0802 KOps/s 4.0378 KOps/s $\color{#35bf28}+1.05\%$
test_empty[False] 3.6543μs 0.7445μs 1.3433 MOps/s 1.3539 MOps/s $\color{#d91a1a}-0.78\%$
test_to 53.8230μs 26.8876μs 37.1919 KOps/s 38.8816 KOps/s $\color{#d91a1a}-4.35\%$
test_to_nonblocking 81.6650μs 25.7351μs 38.8574 KOps/s 41.6309 KOps/s $\textbf{\color{#d91a1a}-6.66\%}$
test_unbind_speed 1.0549ms 0.2861ms 3.4950 KOps/s 3.6125 KOps/s $\color{#d91a1a}-3.25\%$
test_unbind_speed_stack0 0.3478ms 0.2798ms 3.5740 KOps/s 3.7252 KOps/s $\color{#d91a1a}-4.06\%$
test_unbind_speed_stack1 92.2272ms 0.7133ms 1.4019 KOps/s 1.4370 KOps/s $\color{#d91a1a}-2.44\%$
test_split 94.0693ms 2.1974ms 455.0746 Ops/s 454.9456 Ops/s $\color{#35bf28}+0.03\%$
test_chunk 94.2022ms 2.1795ms 458.8227 Ops/s 452.6599 Ops/s $\color{#35bf28}+1.36\%$
test_creation[device0] 0.3487ms 0.1290ms 7.7533 KOps/s 7.7473 KOps/s $\color{#35bf28}+0.08\%$
test_creation_from_tensor 0.3827ms 0.1309ms 7.6408 KOps/s 7.3795 KOps/s $\color{#35bf28}+3.54\%$
test_add_one[memmap_tensor0] 0.2037ms 9.4723μs 105.5705 KOps/s 107.7780 KOps/s $\color{#d91a1a}-2.05\%$
test_contiguous[memmap_tensor0] 29.7220μs 2.2270μs 449.0314 KOps/s 452.7398 KOps/s $\color{#d91a1a}-0.82\%$
test_stack[memmap_tensor0] 32.7620μs 6.9409μs 144.0736 KOps/s 145.4174 KOps/s $\color{#d91a1a}-0.92\%$
test_memmaptd_index 1.2496ms 0.4172ms 2.3972 KOps/s 2.3335 KOps/s $\color{#35bf28}+2.73\%$
test_memmaptd_index_astensor 0.7852ms 0.4790ms 2.0876 KOps/s 2.0551 KOps/s $\color{#35bf28}+1.58\%$
test_memmaptd_index_op 1.4531ms 1.0478ms 954.3785 Ops/s 933.0064 Ops/s $\color{#35bf28}+2.29\%$
test_serialize_model 0.1313s 0.1297s 7.7098 Ops/s 7.7107 Ops/s $\color{#d91a1a}-0.01\%$
test_serialize_model_pickle 1.3497s 1.2133s 0.8242 Ops/s 0.8211 Ops/s $\color{#35bf28}+0.38\%$
test_serialize_weights 0.2215s 0.1431s 6.9901 Ops/s 7.0033 Ops/s $\color{#d91a1a}-0.19\%$
test_serialize_weights_returnearly 0.2139s 55.5903ms 17.9888 Ops/s 17.5867 Ops/s $\color{#35bf28}+2.29\%$
test_serialize_weights_pickle 1.3477s 1.2185s 0.8207 Ops/s 0.8214 Ops/s $\color{#d91a1a}-0.09\%$
test_reshape_pytree 65.0740μs 36.5079μs 27.3913 KOps/s 27.6701 KOps/s $\color{#d91a1a}-1.01\%$
test_reshape_td 79.9350μs 41.4845μs 24.1054 KOps/s 23.9730 KOps/s $\color{#35bf28}+0.55\%$
test_view_pytree 71.3440μs 35.8581μs 27.8877 KOps/s 28.3878 KOps/s $\color{#d91a1a}-1.76\%$
test_view_td 0.1034ms 47.8645μs 20.8923 KOps/s 21.2465 KOps/s $\color{#d91a1a}-1.67\%$
test_unbind_pytree 70.9650μs 35.0765μs 28.5091 KOps/s 28.8410 KOps/s $\color{#d91a1a}-1.15\%$
test_unbind_td 0.5645ms 44.5459μs 22.4488 KOps/s 23.7090 KOps/s $\textbf{\color{#d91a1a}-5.32\%}$
test_split_pytree 88.0060μs 49.3435μs 20.2661 KOps/s 22.3143 KOps/s $\textbf{\color{#d91a1a}-9.18\%}$
test_split_td 0.7004ms 57.8948μs 17.2727 KOps/s 17.6016 KOps/s $\color{#d91a1a}-1.87\%$
test_add_pytree 0.1033ms 60.8205μs 16.4418 KOps/s 17.3135 KOps/s $\textbf{\color{#d91a1a}-5.03\%}$
test_add_td 0.1475ms 98.7086μs 10.1308 KOps/s 10.7343 KOps/s $\textbf{\color{#d91a1a}-5.62\%}$
test_compile_add_one_nested[tensordict-compile] 0.4109ms 0.2079ms 4.8107 KOps/s 4.7130 KOps/s $\color{#35bf28}+2.07\%$
test_compile_add_one_nested[tensordict-eager] 0.1897ms 0.1514ms 6.6052 KOps/s 6.5978 KOps/s $\color{#35bf28}+0.11\%$
test_compile_add_one_nested[pytree-compile] 0.1889ms 0.1460ms 6.8489 KOps/s 6.8726 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_add_one_nested[pytree-eager] 0.2340ms 0.1850ms 5.4068 KOps/s 5.3823 KOps/s $\color{#35bf28}+0.46\%$
test_compile_copy_nested[tensordict-compile] 57.1530μs 22.0005μs 45.4535 KOps/s 45.0307 KOps/s $\color{#35bf28}+0.94\%$
test_compile_copy_nested[tensordict-eager] 0.1060ms 43.2378μs 23.1279 KOps/s 22.7405 KOps/s $\color{#35bf28}+1.70\%$
test_compile_copy_nested[pytree-compile] 0.2169ms 64.7634μs 15.4408 KOps/s 15.5707 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_copy_nested[pytree-eager] 80.5650μs 49.2311μs 20.3124 KOps/s 20.3166 KOps/s $\color{#d91a1a}-0.02\%$
test_compile_add_one_flat[tensordict-compile] 0.4431ms 0.3168ms 3.1567 KOps/s 3.1227 KOps/s $\color{#35bf28}+1.09\%$
test_compile_add_one_flat[tensordict-eager] 0.2543ms 0.2055ms 4.8663 KOps/s 4.7051 KOps/s $\color{#35bf28}+3.43\%$
test_compile_add_one_flat[tensorclass-compile] 0.1765ms 0.1277ms 7.8330 KOps/s 7.8563 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_add_one_flat[tensorclass-eager] 0.1040ms 63.7593μs 15.6840 KOps/s 16.3012 KOps/s $\color{#d91a1a}-3.79\%$
test_compile_add_one_flat[pytree-compile] 0.3854ms 0.3170ms 3.1548 KOps/s 3.1181 KOps/s $\color{#35bf28}+1.17\%$
test_compile_add_one_flat[pytree-eager] 0.8053ms 0.6442ms 1.5523 KOps/s 1.5957 KOps/s $\color{#d91a1a}-2.72\%$
test_compile_add_self_flat[tensordict-eager] 0.3358ms 0.2452ms 4.0779 KOps/s 3.9674 KOps/s $\color{#35bf28}+2.79\%$
test_compile_add_self_flat[tensordict-compile] 0.3828ms 0.3170ms 3.1546 KOps/s 3.1301 KOps/s $\color{#35bf28}+0.78\%$
test_compile_add_self_flat[tensorclass-eager] 0.1178ms 73.3292μs 13.6371 KOps/s 14.3756 KOps/s $\textbf{\color{#d91a1a}-5.14\%}$
test_compile_add_self_flat[tensorclass-compile] 0.1832ms 0.1289ms 7.7597 KOps/s 7.7860 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_add_self_flat[pytree-eager] 0.6338ms 0.5491ms 1.8210 KOps/s 1.8591 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_add_self_flat[pytree-compile] 0.4527ms 0.3196ms 3.1286 KOps/s 3.1294 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_copy_flat[tensordict-compile] 0.1731ms 17.8228μs 56.1079 KOps/s 53.9691 KOps/s $\color{#35bf28}+3.96\%$
test_compile_copy_flat[tensordict-eager] 55.8230μs 26.7324μs 37.4078 KOps/s 36.2327 KOps/s $\color{#35bf28}+3.24\%$
test_compile_copy_flat[pytree-compile] 0.1056ms 70.6098μs 14.1623 KOps/s 14.1692 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_copy_flat[pytree-eager] 84.3050μs 50.8665μs 19.6593 KOps/s 19.4607 KOps/s $\color{#35bf28}+1.02\%$
test_compile_assign_and_add[tensordict-compile] 2.3703ms 0.8430ms 1.1862 KOps/s 1.1179 KOps/s $\textbf{\color{#35bf28}+6.11\%}$
test_compile_assign_and_add[tensordict-eager] 3.7797ms 3.3641ms 297.2594 Ops/s 300.5114 Ops/s $\color{#d91a1a}-1.08\%$
test_compile_assign_and_add[pytree-compile] 2.4203ms 0.8430ms 1.1862 KOps/s 1.1285 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_compile_assign_and_add[pytree-eager] 3.6181ms 3.3952ms 294.5342 Ops/s 307.6731 Ops/s $\color{#d91a1a}-4.27\%$
test_compile_indexing[tensor-tensordict-compile] 0.2422ms 0.1168ms 8.5629 KOps/s 8.8064 KOps/s $\color{#d91a1a}-2.76\%$
test_compile_indexing[tensor-tensordict-eager] 0.2281ms 66.9749μs 14.9310 KOps/s 15.7547 KOps/s $\textbf{\color{#d91a1a}-5.23\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.2463ms 0.1066ms 9.3784 KOps/s 9.3679 KOps/s $\color{#35bf28}+0.11\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1540ms 44.1800μs 22.6347 KOps/s 21.8530 KOps/s $\color{#35bf28}+3.58\%$
test_compile_indexing[tensor-pytree-compile] 0.2324ms 0.1072ms 9.3268 KOps/s 9.2301 KOps/s $\color{#35bf28}+1.05\%$
test_compile_indexing[tensor-pytree-eager] 0.1831ms 46.5129μs 21.4994 KOps/s 21.6267 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_indexing[slice-tensordict-compile] 0.2438ms 0.1454ms 6.8778 KOps/s 7.2631 KOps/s $\textbf{\color{#d91a1a}-5.30\%}$
test_compile_indexing[slice-tensordict-eager] 0.1639ms 26.5428μs 37.6750 KOps/s 38.4994 KOps/s $\color{#d91a1a}-2.14\%$
test_compile_indexing[slice-tensorclass-compile] 0.2353ms 0.1388ms 7.2048 KOps/s 7.6115 KOps/s $\textbf{\color{#d91a1a}-5.34\%}$
test_compile_indexing[slice-tensorclass-eager] 0.1289ms 21.9547μs 45.5483 KOps/s 47.3672 KOps/s $\color{#d91a1a}-3.84\%$
test_compile_indexing[slice-pytree-compile] 0.2905ms 0.1391ms 7.1875 KOps/s 7.5451 KOps/s $\color{#d91a1a}-4.74\%$
test_compile_indexing[slice-pytree-eager] 58.5730μs 20.8677μs 47.9210 KOps/s 47.1891 KOps/s $\color{#35bf28}+1.55\%$
test_compile_indexing[int-tensordict-compile] 0.1875ms 0.1377ms 7.2602 KOps/s 7.1990 KOps/s $\color{#35bf28}+0.85\%$
test_compile_indexing[int-tensordict-eager] 0.4964ms 26.2660μs 38.0720 KOps/s 39.1085 KOps/s $\color{#d91a1a}-2.65\%$
test_compile_indexing[int-tensorclass-compile] 0.1741ms 0.1314ms 7.6115 KOps/s 7.4469 KOps/s $\color{#35bf28}+2.21\%$
test_compile_indexing[int-tensorclass-eager] 51.9830μs 21.2306μs 47.1019 KOps/s 47.3956 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_indexing[int-pytree-compile] 0.1738ms 0.1320ms 7.5780 KOps/s 7.2455 KOps/s $\color{#35bf28}+4.59\%$
test_compile_indexing[int-pytree-eager] 60.8830μs 21.1275μs 47.3318 KOps/s 47.1291 KOps/s $\color{#35bf28}+0.43\%$
test_mod_add[eager] 0.1114ms 32.4380μs 30.8280 KOps/s 28.6254 KOps/s $\textbf{\color{#35bf28}+7.69\%}$
test_mod_add[compile] 0.1176ms 70.8634μs 14.1117 KOps/s 13.5644 KOps/s $\color{#35bf28}+4.03\%$
test_mod_add[compile-overhead] 0.2623ms 0.1350ms 7.4082 KOps/s 6.9957 KOps/s $\textbf{\color{#35bf28}+5.90\%}$
test_mod_wrap[eager] 0.3123ms 0.2524ms 3.9619 KOps/s 4.0461 KOps/s $\color{#d91a1a}-2.08\%$
test_mod_wrap[compile] 1.4095ms 0.2966ms 3.3713 KOps/s 3.2432 KOps/s $\color{#35bf28}+3.95\%$
test_mod_wrap[compile-overhead] 7.7316ms 4.0822ms 244.9660 Ops/s 271.4085 Ops/s $\textbf{\color{#d91a1a}-9.74\%}$
test_mod_wrap_and_backward[eager] 1.5784ms 1.3458ms 743.0642 Ops/s 693.3149 Ops/s $\textbf{\color{#35bf28}+7.18\%}$
test_mod_wrap_and_backward[compile] 1.5612ms 1.3406ms 745.9535 Ops/s 693.7466 Ops/s $\textbf{\color{#35bf28}+7.53\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3163ms 0.8997ms 1.1115 KOps/s 988.0162 Ops/s $\textbf{\color{#35bf28}+12.49\%}$
test_seq_add[eager] 0.1684ms 0.1025ms 9.7536 KOps/s 9.8832 KOps/s $\color{#d91a1a}-1.31\%$
test_seq_add[compile] 0.4825ms 80.8636μs 12.3665 KOps/s 12.1488 KOps/s $\color{#35bf28}+1.79\%$
test_seq_add[compile-overhead] 0.1979ms 0.1139ms 8.7797 KOps/s 8.6954 KOps/s $\color{#35bf28}+0.97\%$
test_seq_wrap[eager] 0.4948ms 0.3811ms 2.6241 KOps/s 2.5091 KOps/s $\color{#35bf28}+4.58\%$
test_seq_wrap[compile] 0.3755ms 0.3151ms 3.1736 KOps/s 3.1021 KOps/s $\color{#35bf28}+2.30\%$
test_seq_wrap[compile-overhead] 0.3725ms 0.2207ms 4.5309 KOps/s 4.3533 KOps/s $\color{#35bf28}+4.08\%$
test_func_call_runtime[False-eager] 0.8888ms 0.8046ms 1.2428 KOps/s 1.2743 KOps/s $\color{#d91a1a}-2.47\%$
test_func_call_runtime[False-compile] 0.8979ms 0.8138ms 1.2288 KOps/s 1.2505 KOps/s $\color{#d91a1a}-1.74\%$
test_func_call_runtime[False-compile-overhead] 0.4371ms 0.3643ms 2.7452 KOps/s 2.7369 KOps/s $\color{#35bf28}+0.30\%$
test_func_call_runtime[True-eager] 0.9917ms 0.9052ms 1.1048 KOps/s 1.0816 KOps/s $\color{#35bf28}+2.15\%$
test_func_call_runtime[True-compile] 0.8836ms 0.8180ms 1.2225 KOps/s 1.2277 KOps/s $\color{#d91a1a}-0.43\%$
test_func_call_runtime[True-compile-overhead] 0.4487ms 0.3866ms 2.5869 KOps/s 2.5923 KOps/s $\color{#d91a1a}-0.21\%$
test_func_call_cm_runtime[False-eager] 0.8265ms 0.7379ms 1.3553 KOps/s 1.2888 KOps/s $\textbf{\color{#35bf28}+5.16\%}$
test_func_call_cm_runtime[False-compile] 0.8707ms 0.8037ms 1.2442 KOps/s 1.2594 KOps/s $\color{#d91a1a}-1.20\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5692ms 0.3686ms 2.7132 KOps/s 2.7411 KOps/s $\color{#d91a1a}-1.02\%$
test_func_call_cm_runtime[True-eager] 1.0792ms 0.9942ms 1.0058 KOps/s 978.9854 Ops/s $\color{#35bf28}+2.74\%$
test_func_call_cm_runtime[True-compile] 0.9191ms 0.8457ms 1.1824 KOps/s 1.1855 KOps/s $\color{#d91a1a}-0.26\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4693ms 0.4093ms 2.4431 KOps/s 2.4301 KOps/s $\color{#35bf28}+0.53\%$
test_vmap_func_call_cm_runtime[eager] 2.6008ms 2.0635ms 484.6094 Ops/s 474.9257 Ops/s $\color{#35bf28}+2.04\%$
test_vmap_func_call_cm_runtime[compile] 0.9527ms 0.8554ms 1.1690 KOps/s 1.1647 KOps/s $\color{#35bf28}+0.37\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4878ms 0.4143ms 2.4138 KOps/s 2.4252 KOps/s $\color{#d91a1a}-0.47\%$
test_distributed 4.5796ms 0.2287ms 4.3732 KOps/s 8.7643 KOps/s $\textbf{\color{#d91a1a}-50.10\%}$
test_tdmodule 0.1273ms 14.5729μs 68.6206 KOps/s 63.7849 KOps/s $\textbf{\color{#35bf28}+7.58\%}$
test_tdmodule_dispatch 67.7240μs 29.4261μs 33.9834 KOps/s 32.3058 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_tdseq 35.8320μs 15.6893μs 63.7378 KOps/s 60.3385 KOps/s $\textbf{\color{#35bf28}+5.63\%}$
test_tdseq_dispatch 61.7930μs 31.9103μs 31.3378 KOps/s 29.7079 KOps/s $\textbf{\color{#35bf28}+5.49\%}$
test_instantiation_functorch 2.0213ms 1.8847ms 530.6000 Ops/s 531.8011 Ops/s $\color{#d91a1a}-0.23\%$
test_instantiation_td 1.8333ms 1.2068ms 828.6505 Ops/s 822.1957 Ops/s $\color{#35bf28}+0.79\%$
test_exec_functorch 0.2661ms 0.2114ms 4.7307 KOps/s 4.7917 KOps/s $\color{#d91a1a}-1.27\%$
test_exec_functional_call 0.2648ms 0.2096ms 4.7711 KOps/s 4.7582 KOps/s $\color{#35bf28}+0.27\%$
test_exec_td 0.2570ms 0.2154ms 4.6418 KOps/s 4.6569 KOps/s $\color{#d91a1a}-0.32\%$
test_exec_td_decorator 0.3423ms 0.2572ms 3.8885 KOps/s 3.8651 KOps/s $\color{#35bf28}+0.61\%$
test_vmap_mlp_speed[True-True] 0.7847ms 0.6886ms 1.4522 KOps/s 1.4391 KOps/s $\color{#35bf28}+0.91\%$
test_vmap_mlp_speed[True-False] 0.7553ms 0.6861ms 1.4575 KOps/s 1.4451 KOps/s $\color{#35bf28}+0.86\%$
test_vmap_mlp_speed[False-True] 0.6567ms 0.5763ms 1.7353 KOps/s 1.7233 KOps/s $\color{#35bf28}+0.70\%$
test_vmap_mlp_speed[False-False] 0.6346ms 0.5771ms 1.7329 KOps/s 1.7254 KOps/s $\color{#35bf28}+0.43\%$
test_vmap_mlp_speed_decorator[True-True] 1.1917ms 0.6741ms 1.4834 KOps/s 1.4730 KOps/s $\color{#35bf28}+0.71\%$
test_vmap_mlp_speed_decorator[True-False] 0.7984ms 0.6747ms 1.4821 KOps/s 1.4669 KOps/s $\color{#35bf28}+1.04\%$
test_vmap_mlp_speed_decorator[False-True] 0.6935ms 0.5900ms 1.6948 KOps/s 1.6846 KOps/s $\color{#35bf28}+0.61\%$
test_vmap_mlp_speed_decorator[False-False] 0.7011ms 0.5892ms 1.6972 KOps/s 1.6773 KOps/s $\color{#35bf28}+1.19\%$
test_vmap_transformer_speed[True-True] 8.7945ms 8.3934ms 119.1411 Ops/s 118.1418 Ops/s $\color{#35bf28}+0.85\%$
test_vmap_transformer_speed[True-False] 8.5850ms 8.3534ms 119.7115 Ops/s 118.5730 Ops/s $\color{#35bf28}+0.96\%$
test_vmap_transformer_speed[False-True] 8.2750ms 8.1460ms 122.7599 Ops/s 121.7195 Ops/s $\color{#35bf28}+0.85\%$
test_vmap_transformer_speed[False-False] 8.2446ms 8.1286ms 123.0227 Ops/s 121.1466 Ops/s $\color{#35bf28}+1.55\%$
test_vmap_transformer_speed_decorator[True-True] 19.7432ms 19.5327ms 51.1963 Ops/s 50.9287 Ops/s $\color{#35bf28}+0.53\%$
test_vmap_transformer_speed_decorator[True-False] 19.6557ms 19.4764ms 51.3442 Ops/s 50.9635 Ops/s $\color{#35bf28}+0.75\%$
test_vmap_transformer_speed_decorator[False-True] 19.5777ms 19.3310ms 51.7303 Ops/s 51.4140 Ops/s $\color{#35bf28}+0.62\%$
test_vmap_transformer_speed_decorator[False-False] 20.5725ms 19.3731ms 51.6179 Ops/s 51.2725 Ops/s $\color{#35bf28}+0.67\%$
test_to_module_speed[True] 1.3639ms 0.9353ms 1.0691 KOps/s 1.0570 KOps/s $\color{#35bf28}+1.15\%$
test_to_module_speed[False] 1.3430ms 0.9094ms 1.0996 KOps/s 1.0700 KOps/s $\color{#35bf28}+2.77\%$
test_tc_init 66.9340μs 34.1185μs 29.3096 KOps/s 26.9793 KOps/s $\textbf{\color{#35bf28}+8.64\%}$
test_tc_init_nested 98.2060μs 68.5735μs 14.5829 KOps/s 13.3752 KOps/s $\textbf{\color{#35bf28}+9.03\%}$
test_tc_first_layer_tensor 4.2787μs 0.6848μs 1.4603 MOps/s 1.4393 MOps/s $\color{#35bf28}+1.46\%$
test_tc_first_layer_nontensor 26.9710μs 2.2841μs 437.8054 KOps/s 443.2155 KOps/s $\color{#d91a1a}-1.22\%$
test_tc_second_layer_tensor 11.0907μs 1.3954μs 716.6184 KOps/s 724.6366 KOps/s $\color{#d91a1a}-1.11\%$
test_tc_second_layer_nontensor 25.7110μs 3.0039μs 332.9058 KOps/s 336.0208 KOps/s $\color{#d91a1a}-0.93\%$
test_unbind 0.1963s 12.2915ms 81.3572 Ops/s 91.3793 Ops/s $\textbf{\color{#d91a1a}-10.97\%}$
test_full_like 0.6636ms 0.5733ms 1.7443 KOps/s 1.7432 KOps/s $\color{#35bf28}+0.06\%$
test_zeros_like 0.2754ms 0.1980ms 5.0512 KOps/s 5.0540 KOps/s $\color{#d91a1a}-0.06\%$
test_ones_like 0.2274ms 0.1977ms 5.0588 KOps/s 5.0578 KOps/s $\color{#35bf28}+0.02\%$
test_clone 0.4401ms 0.4144ms 2.4130 KOps/s 2.4135 KOps/s $\color{#d91a1a}-0.02\%$
test_squeeze 35.3320μs 10.0299μs 99.7015 KOps/s 100.6858 KOps/s $\color{#d91a1a}-0.98\%$
test_unsqueeze 0.2505ms 76.1898μs 13.1251 KOps/s 13.3852 KOps/s $\color{#d91a1a}-1.94\%$
test_split 0.3997ms 0.1648ms 6.0681 KOps/s 6.4027 KOps/s $\textbf{\color{#d91a1a}-5.23\%}$
test_permute 0.2212ms 0.1787ms 5.5951 KOps/s 5.5964 KOps/s $\color{#d91a1a}-0.02\%$
test_stack 1.2594ms 0.8646ms 1.1565 KOps/s 1.1892 KOps/s $\color{#d91a1a}-2.75\%$
test_cat 1.2565ms 1.2318ms 811.8051 Ops/s 811.7884 Ops/s $+0.00\%$

@vmoens vmoens added the enhancement New feature or request label Oct 1, 2024
@vmoens vmoens merged commit b316aad into gh/vmoens/21/base Oct 1, 2024
50 of 51 checks passed
vmoens added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: 14fa71bdac21f1109a7e42c37357f3b62db9f402
Pull Request resolved: #1017
@vmoens vmoens deleted the gh/vmoens/21/head branch October 1, 2024 12:54