[Feature] intersection for assert_close #1078

Merged
merged 1 commit into from
Nov 7, 2024

@vmoens (Contributor) commented Nov 7, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 7, 2024
ghstack-source-id: 3ae83c4ef90a9377405aebbf1761ace1a39417b1
Pull Request resolved: #1078
@facebook-github-bot added the CLA Signed label Nov 7, 2024
@vmoens merged commit 5cf207a into gh/vmoens/35/base Nov 7, 2024
16 of 29 checks passed
@vmoens deleted the gh/vmoens/35/head branch November 7, 2024 10:10

github-actions bot commented Nov 7, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}26$. Worsened: $\large\color{#d91a1a}11$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 70.4110μs 17.5104μs 57.1091 KOps/s 54.9051 KOps/s $\color{#35bf28}+4.01\%$
test_plain_set_stack_nested 50.6140μs 18.0212μs 55.4902 KOps/s 53.8576 KOps/s $\color{#35bf28}+3.03\%$
test_plain_set_nested_inplace 78.4760μs 19.5953μs 51.0326 KOps/s 48.1463 KOps/s $\textbf{\color{#35bf28}+5.99\%}$
test_plain_set_stack_nested_inplace 73.7370μs 19.6234μs 50.9595 KOps/s 47.7196 KOps/s $\textbf{\color{#35bf28}+6.79\%}$
test_items 49.1610μs 4.1597μs 240.4033 KOps/s 236.2977 KOps/s $\color{#35bf28}+1.74\%$
test_items_nested 0.8502ms 0.3425ms 2.9196 KOps/s 2.9349 KOps/s $\color{#d91a1a}-0.52\%$
test_items_nested_locked 0.7845ms 0.3405ms 2.9365 KOps/s 2.9746 KOps/s $\color{#d91a1a}-1.28\%$
test_items_nested_leaf 0.1237ms 71.4296μs 13.9998 KOps/s 13.9658 KOps/s $\color{#35bf28}+0.24\%$
test_items_stack_nested 0.9002ms 0.3467ms 2.8843 KOps/s 2.9031 KOps/s $\color{#d91a1a}-0.65\%$
test_items_stack_nested_leaf 0.1298ms 74.5357μs 13.4164 KOps/s 13.6018 KOps/s $\color{#d91a1a}-1.36\%$
test_items_stack_nested_locked 0.5376ms 0.3455ms 2.8941 KOps/s 2.9562 KOps/s $\color{#d91a1a}-2.10\%$
test_keys 29.3550μs 3.5095μs 284.9377 KOps/s 283.6602 KOps/s $\color{#35bf28}+0.45\%$
test_keys_nested 0.3700ms 0.1361ms 7.3469 KOps/s 7.4075 KOps/s $\color{#d91a1a}-0.82\%$
test_keys_nested_locked 0.7576ms 0.1422ms 7.0299 KOps/s 7.1087 KOps/s $\color{#d91a1a}-1.11\%$
test_keys_nested_leaf 0.2026ms 0.1174ms 8.5208 KOps/s 8.6013 KOps/s $\color{#d91a1a}-0.94\%$
test_keys_stack_nested 0.3800ms 0.1379ms 7.2529 KOps/s 7.4380 KOps/s $\color{#d91a1a}-2.49\%$
test_keys_stack_nested_leaf 0.3144ms 0.1171ms 8.5417 KOps/s 8.7365 KOps/s $\color{#d91a1a}-2.23\%$
test_keys_stack_nested_locked 0.3667ms 0.1420ms 7.0423 KOps/s 7.2505 KOps/s $\color{#d91a1a}-2.87\%$
test_values 9.0688μs 1.0416μs 960.0246 KOps/s 962.3016 KOps/s $\color{#d91a1a}-0.24\%$
test_values_nested 0.1318ms 55.1109μs 18.1452 KOps/s 18.2633 KOps/s $\color{#d91a1a}-0.65\%$
test_values_nested_locked 0.1382ms 55.5146μs 18.0133 KOps/s 18.3543 KOps/s $\color{#d91a1a}-1.86\%$
test_values_nested_leaf 0.1592ms 60.4533μs 16.5417 KOps/s 16.1056 KOps/s $\color{#35bf28}+2.71\%$
test_values_stack_nested 0.1052ms 57.3120μs 17.4484 KOps/s 17.7028 KOps/s $\color{#d91a1a}-1.44\%$
test_values_stack_nested_leaf 0.1606ms 61.1289μs 16.3589 KOps/s 16.5560 KOps/s $\color{#d91a1a}-1.19\%$
test_values_stack_nested_locked 0.1265ms 57.0562μs 17.5266 KOps/s 17.7721 KOps/s $\color{#d91a1a}-1.38\%$
test_membership 5.7077μs 0.7529μs 1.3282 MOps/s 1.1160 MOps/s $\textbf{\color{#35bf28}+19.01\%}$
test_membership_nested 50.0330μs 2.7493μs 363.7333 KOps/s 355.8650 KOps/s $\color{#35bf28}+2.21\%$
test_membership_nested_leaf 30.7570μs 2.7612μs 362.1580 KOps/s 353.0754 KOps/s $\color{#35bf28}+2.57\%$
test_membership_stacked_nested 52.1670μs 2.7366μs 365.4200 KOps/s 361.6743 KOps/s $\color{#35bf28}+1.04\%$
test_membership_stacked_nested_leaf 23.3930μs 2.7593μs 362.4045 KOps/s 351.7767 KOps/s $\color{#35bf28}+3.02\%$
test_membership_nested_last 35.3260μs 4.0835μs 244.8853 KOps/s 236.8240 KOps/s $\color{#35bf28}+3.40\%$
test_membership_nested_leaf_last 24.2050μs 4.1180μs 242.8381 KOps/s 240.1128 KOps/s $\color{#35bf28}+1.13\%$
test_membership_stacked_nested_last 50.9450μs 5.3109μs 188.2914 KOps/s 110.5211 KOps/s $\textbf{\color{#35bf28}+70.37\%}$
test_membership_stacked_nested_leaf_last 47.7290μs 5.1942μs 192.5232 KOps/s 114.9693 KOps/s $\textbf{\color{#35bf28}+67.46\%}$
test_nested_getleaf 67.1150μs 10.9019μs 91.7275 KOps/s 95.1180 KOps/s $\color{#d91a1a}-3.56\%$
test_nested_get 31.9890μs 10.3574μs 96.5495 KOps/s 95.9231 KOps/s $\color{#35bf28}+0.65\%$
test_stacked_getleaf 59.1590μs 10.5822μs 94.4981 KOps/s 92.8159 KOps/s $\color{#35bf28}+1.81\%$
test_stacked_get 53.8300μs 10.2575μs 97.4893 KOps/s 98.6343 KOps/s $\color{#d91a1a}-1.16\%$
test_nested_getitemleaf 39.0220μs 11.2187μs 89.1370 KOps/s 89.4148 KOps/s $\color{#d91a1a}-0.31\%$
test_nested_getitem 50.3340μs 10.5811μs 94.5085 KOps/s 96.1897 KOps/s $\color{#d91a1a}-1.75\%$
test_stacked_getitemleaf 58.0880μs 11.1532μs 89.6603 KOps/s 91.4272 KOps/s $\color{#d91a1a}-1.93\%$
test_stacked_getitem 31.4890μs 10.6092μs 94.2579 KOps/s 95.5274 KOps/s $\color{#d91a1a}-1.33\%$
test_lock_nested 1.5121ms 0.4505ms 2.2197 KOps/s 2.1924 KOps/s $\color{#35bf28}+1.25\%$
test_lock_stack_nested 0.6380ms 0.4176ms 2.3945 KOps/s 2.4219 KOps/s $\color{#d91a1a}-1.13\%$
test_unlock_nested 1.1954ms 0.3751ms 2.6659 KOps/s 2.6768 KOps/s $\color{#d91a1a}-0.41\%$
test_unlock_stack_nested 0.7782ms 0.3382ms 2.9565 KOps/s 3.0256 KOps/s $\color{#d91a1a}-2.29\%$
test_flatten_speed 0.2106ms 93.8246μs 10.6582 KOps/s 10.9495 KOps/s $\color{#d91a1a}-2.66\%$
test_unflatten_speed 0.5798ms 0.4825ms 2.0724 KOps/s 2.0840 KOps/s $\color{#d91a1a}-0.56\%$
test_common_ops 4.1376ms 0.8144ms 1.2278 KOps/s 1.2385 KOps/s $\color{#d91a1a}-0.86\%$
test_creation 29.6750μs 2.0946μs 477.4164 KOps/s 466.8326 KOps/s $\color{#35bf28}+2.27\%$
test_creation_empty 26.2090μs 10.3109μs 96.9852 KOps/s 78.6322 KOps/s $\textbf{\color{#35bf28}+23.34\%}$
test_creation_nested_1 48.6510μs 13.3103μs 75.1297 KOps/s 64.1273 KOps/s $\textbf{\color{#35bf28}+17.16\%}$
test_creation_nested_2 81.9020μs 17.3175μs 57.7449 KOps/s 50.2341 KOps/s $\textbf{\color{#35bf28}+14.95\%}$
test_clone 0.1772ms 13.3777μs 74.7515 KOps/s 76.1069 KOps/s $\color{#d91a1a}-1.78\%$
test_getitem[int] 1.6421ms 12.7236μs 78.5938 KOps/s 79.3115 KOps/s $\color{#d91a1a}-0.90\%$
test_getitem[slice_int] 0.1536ms 24.5222μs 40.7793 KOps/s 40.9137 KOps/s $\color{#d91a1a}-0.33\%$
test_getitem[range] 0.1781ms 50.7266μs 19.7135 KOps/s 20.5372 KOps/s $\color{#d91a1a}-4.01\%$
test_getitem[tuple] 0.1820ms 19.9066μs 50.2345 KOps/s 50.4832 KOps/s $\color{#d91a1a}-0.49\%$
test_getitem[list] 0.3923ms 46.1813μs 21.6538 KOps/s 22.9391 KOps/s $\textbf{\color{#d91a1a}-5.60\%}$
test_setitem_dim[int] 83.5260μs 26.2299μs 38.1244 KOps/s 38.0245 KOps/s $\color{#35bf28}+0.26\%$
test_setitem_dim[slice_int] 97.4610μs 51.8491μs 19.2867 KOps/s 19.0932 KOps/s $\color{#35bf28}+1.01\%$
test_setitem_dim[range] 0.1926ms 77.0951μs 12.9710 KOps/s 13.6525 KOps/s $\color{#d91a1a}-4.99\%$
test_setitem_dim[tuple] 76.6230μs 41.2797μs 24.2250 KOps/s 24.2898 KOps/s $\color{#d91a1a}-0.27\%$
test_setitem 0.2004ms 20.6001μs 48.5435 KOps/s 45.6964 KOps/s $\textbf{\color{#35bf28}+6.23\%}$
test_set 0.1885ms 20.1333μs 49.6690 KOps/s 47.2386 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_set_shared 1.3740ms 0.1774ms 5.6384 KOps/s 5.6456 KOps/s $\color{#d91a1a}-0.13\%$
test_update 0.2024ms 22.4560μs 44.5315 KOps/s 39.5104 KOps/s $\textbf{\color{#35bf28}+12.71\%}$
test_update_nested 0.2447ms 32.8018μs 30.4862 KOps/s 28.9504 KOps/s $\textbf{\color{#35bf28}+5.30\%}$
test_update__nested 0.8406ms 33.8553μs 29.5375 KOps/s 31.0271 KOps/s $\color{#d91a1a}-4.80\%$
test_set_nested 0.2038ms 21.9433μs 45.5721 KOps/s 42.9382 KOps/s $\textbf{\color{#35bf28}+6.13\%}$
test_set_nested_new 0.2309ms 26.9773μs 37.0682 KOps/s 35.7726 KOps/s $\color{#35bf28}+3.62\%$
test_select 0.2773ms 42.7705μs 23.3806 KOps/s 22.7519 KOps/s $\color{#35bf28}+2.76\%$
test_select_nested 0.1300ms 60.1235μs 16.6324 KOps/s 16.0977 KOps/s $\color{#35bf28}+3.32\%$
test_exclude_nested 0.1650ms 74.4674μs 13.4287 KOps/s 13.0692 KOps/s $\color{#35bf28}+2.75\%$
test_empty[True] 0.4980ms 0.3532ms 2.8315 KOps/s 2.8113 KOps/s $\color{#35bf28}+0.72\%$
test_empty[False] 15.9022μs 1.2119μs 825.1188 KOps/s 815.0928 KOps/s $\color{#35bf28}+1.23\%$
test_unbind_speed 0.4650ms 0.2743ms 3.6462 KOps/s 3.7143 KOps/s $\color{#d91a1a}-1.83\%$
test_unbind_speed_stack0 0.4554ms 0.2618ms 3.8202 KOps/s 3.8733 KOps/s $\color{#d91a1a}-1.37\%$
test_unbind_speed_stack1 0.1168s 0.7472ms 1.3383 KOps/s 1.4299 KOps/s $\textbf{\color{#d91a1a}-6.40\%}$
test_split 0.1133s 1.7587ms 568.5958 Ops/s 566.9729 Ops/s $\color{#35bf28}+0.29\%$
test_chunk 0.1143s 1.7563ms 569.3739 Ops/s 563.0014 Ops/s $\color{#35bf28}+1.13\%$
test_consolidate_njt[False-None] 9.3864ms 8.7594ms 114.1626 Ops/s 120.8257 Ops/s $\textbf{\color{#d91a1a}-5.51\%}$
test_creation[device0] 3.3875ms 95.4549μs 10.4762 KOps/s 10.6706 KOps/s $\color{#d91a1a}-1.82\%$
test_creation_from_tensor 0.3212ms 95.3846μs 10.4839 KOps/s 10.2102 KOps/s $\color{#35bf28}+2.68\%$
test_add_one[memmap_tensor0] 0.5461ms 5.2387μs 190.8888 KOps/s 201.0165 KOps/s $\textbf{\color{#d91a1a}-5.04\%}$
test_contiguous[memmap_tensor0] 24.7660μs 0.5205μs 1.9213 MOps/s 1.9615 MOps/s $\color{#d91a1a}-2.05\%$
test_stack[memmap_tensor0] 73.2970μs 3.8045μs 262.8453 KOps/s 280.2827 KOps/s $\textbf{\color{#d91a1a}-6.22\%}$
test_memmaptd_index 1.1204ms 0.2446ms 4.0886 KOps/s 4.2188 KOps/s $\color{#d91a1a}-3.09\%$
test_memmaptd_index_astensor 0.8183ms 0.3243ms 3.0839 KOps/s 3.1511 KOps/s $\color{#d91a1a}-2.13\%$
test_memmaptd_index_op 1.0861ms 0.5955ms 1.6792 KOps/s 1.5944 KOps/s $\textbf{\color{#35bf28}+5.32\%}$
test_serialize_model 0.2333s 0.1420s 7.0443 Ops/s 8.4472 Ops/s $\textbf{\color{#d91a1a}-16.61\%}$
test_serialize_model_pickle 0.4432s 0.3936s 2.5405 Ops/s 2.5062 Ops/s $\color{#35bf28}+1.37\%$
test_serialize_weights 0.1235s 0.1177s 8.4979 Ops/s 8.7595 Ops/s $\color{#d91a1a}-2.99\%$
test_serialize_weights_returnearly 0.1858s 0.1621s 6.1686 Ops/s 6.4004 Ops/s $\color{#d91a1a}-3.62\%$
test_serialize_weights_pickle 0.5871s 0.4362s 2.2925 Ops/s 2.4549 Ops/s $\textbf{\color{#d91a1a}-6.61\%}$
test_serialize_weights_filesystem 0.1533s 0.1472s 6.7931 Ops/s 7.0814 Ops/s $\color{#d91a1a}-4.07\%$
test_serialize_model_filesystem 0.2630s 0.1642s 6.0911 Ops/s 6.4005 Ops/s $\color{#d91a1a}-4.83\%$
test_reshape_pytree 81.7920μs 26.9431μs 37.1153 KOps/s 34.6004 KOps/s $\textbf{\color{#35bf28}+7.27\%}$
test_reshape_td 86.0910μs 33.8448μs 29.5466 KOps/s 29.8931 KOps/s $\color{#d91a1a}-1.16\%$
test_view_pytree 86.4300μs 27.5423μs 36.3078 KOps/s 34.7735 KOps/s $\color{#35bf28}+4.41\%$
test_view_td 0.1012ms 38.5173μs 25.9624 KOps/s 26.0558 KOps/s $\color{#d91a1a}-0.36\%$
test_unbind_pytree 75.2900μs 30.3570μs 32.9414 KOps/s 33.4463 KOps/s $\color{#d91a1a}-1.51\%$
test_unbind_td 0.3661ms 39.6879μs 25.1966 KOps/s 25.6477 KOps/s $\color{#d91a1a}-1.76\%$
test_split_pytree 74.9790μs 30.4738μs 32.8151 KOps/s 33.1302 KOps/s $\color{#d91a1a}-0.95\%$
test_split_td 0.2315ms 44.7573μs 22.3427 KOps/s 22.3356 KOps/s $\color{#35bf28}+0.03\%$
test_add_pytree 0.1064ms 36.4779μs 27.4139 KOps/s 27.7740 KOps/s $\color{#d91a1a}-1.30\%$
test_add_td 0.1227ms 56.3619μs 17.7425 KOps/s 16.9446 KOps/s $\color{#35bf28}+4.71\%$
test_compile_add_one_nested[tensordict-compile] 0.1466ms 64.2046μs 15.5752 KOps/s 15.7887 KOps/s $\color{#d91a1a}-1.35\%$
test_compile_add_one_nested[tensordict-eager] 0.3938ms 0.1622ms 6.1637 KOps/s 6.2315 KOps/s $\color{#d91a1a}-1.09\%$
test_compile_add_one_nested[pytree-compile] 0.1418ms 48.2861μs 20.7099 KOps/s 21.7248 KOps/s $\color{#d91a1a}-4.67\%$
test_compile_add_one_nested[pytree-eager] 0.1991ms 0.1206ms 8.2944 KOps/s 8.3154 KOps/s $\color{#d91a1a}-0.25\%$
test_compile_copy_nested[tensordict-compile] 72.0850μs 26.0838μs 38.3380 KOps/s 38.8211 KOps/s $\color{#d91a1a}-1.24\%$
test_compile_copy_nested[tensordict-eager] 0.1110ms 53.4605μs 18.7054 KOps/s 17.9149 KOps/s $\color{#35bf28}+4.41\%$
test_compile_copy_nested[pytree-compile] 0.1424ms 79.8128μs 12.5293 KOps/s 12.3291 KOps/s $\color{#35bf28}+1.62\%$
test_compile_copy_nested[pytree-eager] 0.1376ms 68.4290μs 14.6137 KOps/s 14.5820 KOps/s $\color{#35bf28}+0.22\%$
test_compile_add_one_flat[tensordict-compile] 0.2041ms 0.1077ms 9.2889 KOps/s 9.2560 KOps/s $\color{#35bf28}+0.36\%$
test_compile_add_one_flat[tensordict-eager] 0.3903ms 0.2013ms 4.9676 KOps/s 5.0477 KOps/s $\color{#d91a1a}-1.59\%$
test_compile_add_one_flat[tensorclass-compile] 0.1103ms 47.2556μs 21.1615 KOps/s 21.1521 KOps/s $\color{#35bf28}+0.04\%$
test_compile_add_one_flat[tensorclass-eager] 0.5161ms 61.8962μs 16.1561 KOps/s 15.9525 KOps/s $\color{#35bf28}+1.28\%$
test_compile_add_one_flat[pytree-compile] 0.3903ms 0.1051ms 9.5174 KOps/s 9.6584 KOps/s $\color{#d91a1a}-1.46\%$
test_compile_add_one_flat[pytree-eager] 0.3456ms 0.2045ms 4.8907 KOps/s 4.8969 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_add_self_flat[tensordict-eager] 0.4026ms 0.2106ms 4.7493 KOps/s 4.7422 KOps/s $\color{#35bf28}+0.15\%$
test_compile_add_self_flat[tensordict-compile] 0.1946ms 0.1099ms 9.0956 KOps/s 9.3980 KOps/s $\color{#d91a1a}-3.22\%$
test_compile_add_self_flat[tensorclass-eager] 1.5357ms 57.2895μs 17.4552 KOps/s 18.2919 KOps/s $\color{#d91a1a}-4.57\%$
test_compile_add_self_flat[tensorclass-compile] 97.7820μs 47.9541μs 20.8533 KOps/s 21.4146 KOps/s $\color{#d91a1a}-2.62\%$
test_compile_add_self_flat[pytree-eager] 0.3317ms 0.1652ms 6.0537 KOps/s 6.1924 KOps/s $\color{#d91a1a}-2.24\%$
test_compile_add_self_flat[pytree-compile] 0.1974ms 0.1052ms 9.5080 KOps/s 9.5415 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_copy_flat[tensordict-compile] 0.1115ms 21.1873μs 47.1980 KOps/s 45.5611 KOps/s $\color{#35bf28}+3.59\%$
test_compile_copy_flat[tensordict-eager] 0.1367ms 58.9928μs 16.9512 KOps/s 17.0532 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_copy_flat[pytree-compile] 0.2118ms 85.8567μs 11.6473 KOps/s 11.4893 KOps/s $\color{#35bf28}+1.38\%$
test_compile_copy_flat[pytree-eager] 0.1244ms 72.2520μs 13.8404 KOps/s 13.7860 KOps/s $\color{#35bf28}+0.40\%$
test_compile_assign_and_add[tensordict-compile] 0.3165ms 0.2147ms 4.6568 KOps/s 4.7762 KOps/s $\color{#d91a1a}-2.50\%$
test_compile_assign_and_add[tensordict-eager] 1.6401ms 1.3049ms 766.3520 Ops/s 767.0459 Ops/s $\color{#d91a1a}-0.09\%$
test_compile_assign_and_add[pytree-compile] 0.2914ms 0.2068ms 4.8353 KOps/s 4.8553 KOps/s $\color{#d91a1a}-0.41\%$
test_compile_assign_and_add[pytree-eager] 1.0134ms 0.7898ms 1.2661 KOps/s 1.2702 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_assign_and_add_stack[compile] 0.8711ms 0.4748ms 2.1062 KOps/s 2.0697 KOps/s $\color{#35bf28}+1.77\%$
test_compile_assign_and_add_stack[eager] 3.4363ms 2.6891ms 371.8653 Ops/s 357.7998 Ops/s $\color{#35bf28}+3.93\%$
test_compile_indexing[tensor-tensordict-compile] 86.3910μs 37.5930μs 26.6007 KOps/s 27.3722 KOps/s $\color{#d91a1a}-2.82\%$
test_compile_indexing[tensor-tensordict-eager] 0.6073ms 34.2140μs 29.2278 KOps/s 29.6288 KOps/s $\color{#d91a1a}-1.35\%$
test_compile_indexing[tensor-tensorclass-compile] 94.3260μs 30.4124μs 32.8814 KOps/s 32.9412 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_indexing[tensor-tensorclass-eager] 65.2420μs 24.1024μs 41.4896 KOps/s 37.0586 KOps/s $\textbf{\color{#35bf28}+11.96\%}$
test_compile_indexing[tensor-pytree-compile] 74.5890μs 30.9747μs 32.2844 KOps/s 31.4034 KOps/s $\color{#35bf28}+2.81\%$
test_compile_indexing[tensor-pytree-eager] 2.6155ms 24.1127μs 41.4720 KOps/s 41.0482 KOps/s $\color{#35bf28}+1.03\%$
test_compile_indexing[slice-tensordict-compile] 0.1082ms 53.4179μs 18.7203 KOps/s 19.1678 KOps/s $\color{#d91a1a}-2.33\%$
test_compile_indexing[slice-tensordict-eager] 0.4944ms 19.0725μs 52.4314 KOps/s 49.8983 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1240ms 45.2246μs 22.1118 KOps/s 22.3926 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_indexing[slice-tensorclass-eager] 73.6870μs 18.9290μs 52.8291 KOps/s 50.8269 KOps/s $\color{#35bf28}+3.94\%$
test_compile_indexing[slice-pytree-compile] 0.1096ms 45.2803μs 22.0847 KOps/s 21.9144 KOps/s $\color{#35bf28}+0.78\%$
test_compile_indexing[slice-pytree-eager] 64.9910μs 18.9914μs 52.6555 KOps/s 50.9854 KOps/s $\color{#35bf28}+3.28\%$
test_compile_indexing[int-tensordict-compile] 0.1148ms 55.0539μs 18.1640 KOps/s 18.6566 KOps/s $\color{#d91a1a}-2.64\%$
test_compile_indexing[int-tensordict-eager] 1.0606ms 19.2592μs 51.9232 KOps/s 51.3887 KOps/s $\color{#35bf28}+1.04\%$
test_compile_indexing[int-tensorclass-compile] 0.1602ms 45.4149μs 22.0192 KOps/s 21.9463 KOps/s $\color{#35bf28}+0.33\%$
test_compile_indexing[int-tensorclass-eager] 63.8090μs 19.0408μs 52.5188 KOps/s 51.0800 KOps/s $\color{#35bf28}+2.82\%$
test_compile_indexing[int-pytree-compile] 0.1007ms 45.1155μs 22.1653 KOps/s 21.8955 KOps/s $\color{#35bf28}+1.23\%$
test_compile_indexing[int-pytree-eager] 80.1890μs 18.9749μs 52.7013 KOps/s 52.1701 KOps/s $\color{#35bf28}+1.02\%$
test_mod_add[eager] 75.4100μs 26.5404μs 37.6784 KOps/s 35.5643 KOps/s $\textbf{\color{#35bf28}+5.94\%}$
test_mod_add[compile] 97.5510μs 47.4627μs 21.0692 KOps/s 21.5414 KOps/s $\color{#d91a1a}-2.19\%$
test_mod_add[compile-overhead] 0.1083ms 46.0938μs 21.6949 KOps/s 21.2814 KOps/s $\color{#35bf28}+1.94\%$
test_mod_wrap[eager] 0.4438ms 0.2148ms 4.6561 KOps/s 4.3193 KOps/s $\textbf{\color{#35bf28}+7.80\%}$
test_mod_wrap[compile] 1.8285ms 0.2073ms 4.8247 KOps/s 4.7874 KOps/s $\color{#35bf28}+0.78\%$
test_mod_wrap[compile-overhead] 1.7710ms 0.2066ms 4.8406 KOps/s 4.8030 KOps/s $\color{#35bf28}+0.78\%$
test_mod_wrap_and_backward[eager] 15.5419ms 11.9981ms 83.3465 Ops/s 83.4327 Ops/s $\color{#d91a1a}-0.10\%$
test_mod_wrap_and_backward[compile] 20.0563ms 13.3668ms 74.8120 Ops/s 69.8325 Ops/s $\textbf{\color{#35bf28}+7.13\%}$
test_mod_wrap_and_backward[compile-overhead] 16.9536ms 13.3273ms 75.0341 Ops/s 71.5134 Ops/s $\color{#35bf28}+4.92\%$
test_seq_add[eager] 0.3169ms 92.2121μs 10.8446 KOps/s 10.1140 KOps/s $\textbf{\color{#35bf28}+7.22\%}$
test_seq_add[compile] 0.1515ms 61.7013μs 16.2071 KOps/s 16.3422 KOps/s $\color{#d91a1a}-0.83\%$
test_seq_add[compile-overhead] 0.1673ms 60.6011μs 16.5014 KOps/s 16.5142 KOps/s $\color{#d91a1a}-0.08\%$
test_seq_wrap[eager] 0.5965ms 0.3926ms 2.5473 KOps/s 2.4777 KOps/s $\color{#35bf28}+2.81\%$
test_seq_wrap[compile] 0.9703ms 0.2305ms 4.3389 KOps/s 4.2655 KOps/s $\color{#35bf28}+1.72\%$
test_seq_wrap[compile-overhead] 0.3503ms 0.2315ms 4.3200 KOps/s 4.2463 KOps/s $\color{#35bf28}+1.74\%$
test_func_call_runtime[False-eager] 0.9469ms 0.5520ms 1.8115 KOps/s 1.8236 KOps/s $\color{#d91a1a}-0.66\%$
test_func_call_runtime[False-compile] 0.5427ms 0.4403ms 2.2710 KOps/s 2.3132 KOps/s $\color{#d91a1a}-1.82\%$
test_func_call_runtime[False-compile-overhead] 1.0285ms 0.4430ms 2.2572 KOps/s 2.2968 KOps/s $\color{#d91a1a}-1.73\%$
test_func_call_runtime[True-eager] 0.8917ms 0.7712ms 1.2967 KOps/s 1.3258 KOps/s $\color{#d91a1a}-2.19\%$
test_func_call_runtime[True-compile] 0.9047ms 0.4770ms 2.0963 KOps/s 2.1158 KOps/s $\color{#d91a1a}-0.92\%$
test_func_call_runtime[True-compile-overhead] 0.5662ms 0.4714ms 2.1214 KOps/s 2.1040 KOps/s $\color{#35bf28}+0.83\%$
test_func_call_cm_runtime[False-eager] 0.7116ms 0.5446ms 1.8361 KOps/s 1.8545 KOps/s $\color{#d91a1a}-0.99\%$
test_func_call_cm_runtime[False-compile] 1.0945ms 0.4360ms 2.2938 KOps/s 2.3114 KOps/s $\color{#d91a1a}-0.76\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5568ms 0.4379ms 2.2835 KOps/s 2.2700 KOps/s $\color{#35bf28}+0.60\%$
test_func_call_cm_runtime[True-eager] 1.1289ms 0.8941ms 1.1184 KOps/s 1.1169 KOps/s $\color{#35bf28}+0.14\%$
test_func_call_cm_runtime[True-compile] 0.5865ms 0.5028ms 1.9887 KOps/s 1.9869 KOps/s $\color{#35bf28}+0.09\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6924ms 0.5014ms 1.9943 KOps/s 2.0148 KOps/s $\color{#d91a1a}-1.02\%$
test_vmap_func_call_cm_runtime[eager] 2.4051ms 1.8877ms 529.7358 Ops/s 524.4247 Ops/s $\color{#35bf28}+1.01\%$
test_vmap_func_call_cm_runtime[compile] 0.8919ms 0.5296ms 1.8882 KOps/s 1.8976 KOps/s $\color{#d91a1a}-0.50\%$
test_vmap_func_call_cm_runtime[compile-overhead] 1.2981ms 0.5282ms 1.8932 KOps/s 1.9056 KOps/s $\color{#d91a1a}-0.65\%$
test_distributed 0.3420ms 0.1275ms 7.8459 KOps/s 7.7003 KOps/s $\color{#35bf28}+1.89\%$
test_tdmodule 79.5970μs 17.8968μs 55.8758 KOps/s 49.0705 KOps/s $\textbf{\color{#35bf28}+13.87\%}$
test_tdmodule_dispatch 68.7780μs 35.9387μs 27.8252 KOps/s 25.4912 KOps/s $\textbf{\color{#35bf28}+9.16\%}$
test_tdseq 41.0270μs 21.3152μs 46.9149 KOps/s 42.3243 KOps/s $\textbf{\color{#35bf28}+10.85\%}$
test_tdseq_dispatch 64.4600μs 41.5470μs 24.0691 KOps/s 21.9119 KOps/s $\textbf{\color{#35bf28}+9.85\%}$
test_instantiation_functorch 1.7989ms 1.5425ms 648.2993 Ops/s 638.4870 Ops/s $\color{#35bf28}+1.54\%$
test_exec_functorch 0.4418ms 0.1814ms 5.5115 KOps/s 4.6966 KOps/s $\textbf{\color{#35bf28}+17.35\%}$
test_exec_functional_call 0.2758ms 0.1709ms 5.8518 KOps/s 5.6466 KOps/s $\color{#35bf28}+3.63\%$
test_exec_td_decorator 0.5465ms 0.2289ms 4.3695 KOps/s 4.3530 KOps/s $\color{#35bf28}+0.38\%$
test_vmap_mlp_speed_decorator[True-True] 1.1223ms 0.6412ms 1.5596 KOps/s 1.5563 KOps/s $\color{#35bf28}+0.21\%$
test_vmap_mlp_speed_decorator[True-False] 1.1928ms 0.6431ms 1.5551 KOps/s 1.5290 KOps/s $\color{#35bf28}+1.70\%$
test_vmap_mlp_speed_decorator[False-True] 0.7976ms 0.5289ms 1.8909 KOps/s 1.8914 KOps/s $\color{#d91a1a}-0.03\%$
test_vmap_mlp_speed_decorator[False-False] 0.7236ms 0.5270ms 1.8977 KOps/s 1.8634 KOps/s $\color{#35bf28}+1.84\%$
test_to_module_speed[True] 2.1068ms 1.3030ms 767.4419 Ops/s 778.9239 Ops/s $\color{#d91a1a}-1.47\%$
test_to_module_speed[False] 1.8611ms 1.2626ms 792.0377 Ops/s 795.0806 Ops/s $\color{#d91a1a}-0.38\%$
test_tc_init 0.1289ms 44.8888μs 22.2773 KOps/s 21.6130 KOps/s $\color{#35bf28}+3.07\%$
test_tc_init_nested 0.1885ms 90.8169μs 11.0112 KOps/s 10.7293 KOps/s $\color{#35bf28}+2.63\%$
test_tc_first_layer_tensor 29.7650μs 1.5228μs 656.6861 KOps/s 672.6572 KOps/s $\color{#d91a1a}-2.37\%$
test_tc_first_layer_nontensor 30.8780μs 4.6925μs 213.1039 KOps/s 211.4435 KOps/s $\color{#35bf28}+0.79\%$
test_tc_second_layer_tensor 43.1100μs 2.8274μs 353.6815 KOps/s 353.6848 KOps/s $-0.00\%$
test_tc_second_layer_nontensor 62.5270μs 6.0381μs 165.6150 KOps/s 162.3365 KOps/s $\color{#35bf28}+2.02\%$
test_unbind 0.2275s 13.2668ms 75.3759 Ops/s 79.2905 Ops/s $\color{#d91a1a}-4.94\%$
test_full_like 9.2420ms 8.1030ms 123.4110 Ops/s 138.2602 Ops/s $\textbf{\color{#d91a1a}-10.74\%}$
test_zeros_like 10.1242ms 6.6459ms 150.4681 Ops/s 352.7069 Ops/s $\textbf{\color{#d91a1a}-57.34\%}$
test_ones_like 10.2824ms 7.6406ms 130.8805 Ops/s 315.5730 Ops/s $\textbf{\color{#d91a1a}-58.53\%}$
test_clone 15.2007ms 9.3771ms 106.6424 Ops/s 198.1181 Ops/s $\textbf{\color{#d91a1a}-46.17\%}$
test_squeeze 59.8620μs 11.9897μs 83.4046 KOps/s 82.0893 KOps/s $\color{#35bf28}+1.60\%$
test_unsqueeze 0.3014ms 90.8841μs 11.0030 KOps/s 11.2200 KOps/s $\color{#d91a1a}-1.93\%$
test_split 0.3417ms 0.1878ms 5.3236 KOps/s 5.1849 KOps/s $\color{#35bf28}+2.67\%$
test_permute 0.3316ms 0.2173ms 4.6011 KOps/s 4.4835 KOps/s $\color{#35bf28}+2.62\%$
test_stack 26.3230ms 24.7907ms 40.3378 Ops/s 40.2489 Ops/s $\color{#35bf28}+0.22\%$
test_cat 29.1687ms 24.7041ms 40.4791 Ops/s 40.6100 Ops/s $\color{#d91a1a}-0.32\%$
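
The Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns, and the bolded entries line up with changes of at least 5% in magnitude; the summary counts above (26 improved, 11 worsened) match the number of bolded rows. A minimal Python sketch of that arithmetic, assuming this reading of the table is right (the function names and the 5% threshold are inferred for illustration, not taken from the benchmark tooling):

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput of this PR relative to repo HEAD."""
    return (ops / ops_head - 1.0) * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the bolded rows."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "neutral"


# Values copied from three rows of the CPU table (both columns in the same unit).
rows = {
    "test_plain_set_nested": (57.1091, 54.9051),  # KOps/s
    "test_membership": (1.3282, 1.1160),          # MOps/s
    "test_zeros_like": (150.4681, 352.7069),      # Ops/s
}

for name, (ops, ops_head) in rows.items():
    pct = relative_change(ops, ops_head)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Both columns of a row must be in the same unit before dividing; the rows picked here already are.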


github-actions bot commented Nov 7, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}36$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 37.1300μs 10.1540μs 98.4838 KOps/s 82.6698 KOps/s $\textbf{\color{#35bf28}+19.13\%}$
test_plain_set_stack_nested 41.5800μs 10.3152μs 96.9443 KOps/s 83.5646 KOps/s $\textbf{\color{#35bf28}+16.01\%}$
test_plain_set_nested_inplace 48.9310μs 11.0838μs 90.2217 KOps/s 77.4517 KOps/s $\textbf{\color{#35bf28}+16.49\%}$
test_plain_set_stack_nested_inplace 42.9910μs 11.0678μs 90.3519 KOps/s 77.5510 KOps/s $\textbf{\color{#35bf28}+16.51\%}$
test_items 24.9000μs 2.8635μs 349.2275 KOps/s 343.4556 KOps/s $\color{#35bf28}+1.68\%$
test_items_nested 0.4510ms 0.3207ms 3.1185 KOps/s 3.1451 KOps/s $\color{#d91a1a}-0.85\%$
test_items_nested_locked 0.3808ms 0.3225ms 3.1012 KOps/s 3.1227 KOps/s $\color{#d91a1a}-0.69\%$
test_items_nested_leaf 85.6020μs 58.7426μs 17.0234 KOps/s 17.2771 KOps/s $\color{#d91a1a}-1.47\%$
test_items_stack_nested 0.3965ms 0.3219ms 3.1068 KOps/s 3.1168 KOps/s $\color{#d91a1a}-0.32\%$
test_items_stack_nested_leaf 92.4420μs 60.1867μs 16.6150 KOps/s 16.8368 KOps/s $\color{#d91a1a}-1.32\%$
test_items_stack_nested_locked 0.3779ms 0.3256ms 3.0716 KOps/s 3.1281 KOps/s $\color{#d91a1a}-1.81\%$
test_keys 26.2000μs 3.4676μs 288.3848 KOps/s 290.2968 KOps/s $\color{#d91a1a}-0.66\%$
test_keys_nested 0.1176ms 70.0270μs 14.2802 KOps/s 14.2248 KOps/s $\color{#35bf28}+0.39\%$
test_keys_nested_locked 0.7405ms 75.5559μs 13.2352 KOps/s 13.2188 KOps/s $\color{#35bf28}+0.12\%$
test_keys_nested_leaf 99.6120μs 61.3006μs 16.3131 KOps/s 16.2690 KOps/s $\color{#35bf28}+0.27\%$
test_keys_stack_nested 0.1134ms 71.0699μs 14.0707 KOps/s 14.0244 KOps/s $\color{#35bf28}+0.33\%$
test_keys_stack_nested_leaf 96.7220μs 62.0121μs 16.1259 KOps/s 16.0758 KOps/s $\color{#35bf28}+0.31\%$
test_keys_stack_nested_locked 0.1227ms 76.6528μs 13.0458 KOps/s 13.1293 KOps/s $\color{#d91a1a}-0.64\%$
test_values 6.2935μs 0.8451μs 1.1833 MOps/s 1.1646 MOps/s $\color{#35bf28}+1.60\%$
test_values_nested 61.5010μs 31.2146μs 32.0362 KOps/s 32.0366 KOps/s $-0.00\%$
test_values_nested_locked 62.3510μs 32.7966μs 30.4909 KOps/s 30.5560 KOps/s $\color{#d91a1a}-0.21\%$
test_values_nested_leaf 71.3010μs 33.5293μs 29.8247 KOps/s 29.7877 KOps/s $\color{#35bf28}+0.12\%$
test_values_stack_nested 77.5020μs 31.9632μs 31.2860 KOps/s 31.4777 KOps/s $\color{#d91a1a}-0.61\%$
test_values_stack_nested_leaf 66.4510μs 34.5583μs 28.9366 KOps/s 28.8912 KOps/s $\color{#35bf28}+0.16\%$
test_values_stack_nested_locked 56.4610μs 33.7518μs 29.6280 KOps/s 29.9184 KOps/s $\color{#d91a1a}-0.97\%$
test_membership 2.0345μs 0.5080μs 1.9684 MOps/s 1.9090 MOps/s $\color{#35bf28}+3.11\%$
test_membership_nested 16.8100μs 1.9138μs 522.5085 KOps/s 525.3668 KOps/s $\color{#d91a1a}-0.54\%$
test_membership_nested_leaf 13.6167μs 1.8755μs 533.1919 KOps/s 515.1189 KOps/s $\color{#35bf28}+3.51\%$
test_membership_stacked_nested 26.6410μs 2.0037μs 499.0777 KOps/s 510.1405 KOps/s $\color{#d91a1a}-2.17\%$
test_membership_stacked_nested_leaf 38.0110μs 1.9767μs 505.8899 KOps/s 507.8603 KOps/s $\color{#d91a1a}-0.39\%$
test_membership_nested_last 48.1210μs 2.8252μs 353.9510 KOps/s 354.8827 KOps/s $\color{#d91a1a}-0.26\%$
test_membership_nested_leaf_last 30.5910μs 2.8074μs 356.2017 KOps/s 353.1765 KOps/s $\color{#35bf28}+0.86\%$
test_membership_stacked_nested_last 25.2110μs 2.8360μs 352.6042 KOps/s 355.0950 KOps/s $\color{#d91a1a}-0.70\%$
test_membership_stacked_nested_leaf_last 32.0110μs 2.8324μs 353.0626 KOps/s 358.2391 KOps/s $\color{#d91a1a}-1.44\%$
test_nested_getleaf 38.8410μs 6.0235μs 166.0168 KOps/s 168.2635 KOps/s $\color{#d91a1a}-1.34\%$
test_nested_get 35.6100μs 5.7137μs 175.0191 KOps/s 176.6083 KOps/s $\color{#d91a1a}-0.90\%$
test_stacked_getleaf 53.0610μs 6.0254μs 165.9649 KOps/s 166.8888 KOps/s $\color{#d91a1a}-0.55\%$
test_stacked_get 29.6810μs 5.7180μs 174.8878 KOps/s 174.7048 KOps/s $\color{#35bf28}+0.10\%$
test_nested_getitemleaf 54.9310μs 6.0895μs 164.2161 KOps/s 164.3046 KOps/s $\color{#d91a1a}-0.05\%$
test_nested_getitem 36.5500μs 5.7808μs 172.9870 KOps/s 172.7248 KOps/s $\color{#35bf28}+0.15\%$
test_stacked_getitemleaf 30.8210μs 6.0852μs 164.3343 KOps/s 163.8908 KOps/s $\color{#35bf28}+0.27\%$
test_stacked_getitem 31.9200μs 5.7687μs 173.3498 KOps/s 173.8754 KOps/s $\color{#d91a1a}-0.30\%$
test_lock_nested 0.7240ms 0.3652ms 2.7380 KOps/s 2.7185 KOps/s $\color{#35bf28}+0.72\%$
test_lock_stack_nested 0.4087ms 0.3377ms 2.9609 KOps/s 2.9897 KOps/s $\color{#d91a1a}-0.96\%$
test_unlock_nested 0.6183ms 0.3067ms 3.2610 KOps/s 3.2708 KOps/s $\color{#d91a1a}-0.30\%$
test_unlock_stack_nested 0.3320ms 0.2767ms 3.6140 KOps/s 3.6656 KOps/s $\color{#d91a1a}-1.41\%$
test_flatten_speed 94.7010μs 72.2409μs 13.8426 KOps/s 13.9028 KOps/s $\color{#d91a1a}-0.43\%$
test_unflatten_speed 0.3447ms 0.2883ms 3.4690 KOps/s 3.4622 KOps/s $\color{#35bf28}+0.20\%$
test_common_ops 1.4889ms 0.5495ms 1.8197 KOps/s 1.5870 KOps/s $\textbf{\color{#35bf28}+14.66\%}$
test_creation 97.1220μs 1.4983μs 667.4240 KOps/s 674.7508 KOps/s $\color{#d91a1a}-1.09\%$
test_creation_empty 54.8810μs 6.6389μs 150.6279 KOps/s 98.2875 KOps/s $\textbf{\color{#35bf28}+53.25\%}$
test_creation_nested_1 32.9200μs 8.1066μs 123.3560 KOps/s 83.6683 KOps/s $\textbf{\color{#35bf28}+47.43\%}$
test_creation_nested_2 41.5300μs 10.6307μs 94.0674 KOps/s 70.1945 KOps/s $\textbf{\color{#35bf28}+34.01\%}$
test_clone 88.7010μs 9.4995μs 105.2691 KOps/s 96.8553 KOps/s $\textbf{\color{#35bf28}+8.69\%}$
test_getitem[int] 1.8014ms 10.4867μs 95.3587 KOps/s 93.5972 KOps/s $\color{#35bf28}+1.88\%$
test_getitem[slice_int] 0.1095ms 21.6504μs 46.1885 KOps/s 46.7051 KOps/s $\color{#d91a1a}-1.11\%$
test_getitem[range] 0.1357ms 37.2853μs 26.8202 KOps/s 27.4749 KOps/s $\color{#d91a1a}-2.38\%$
test_getitem[tuple] 0.1118ms 17.7983μs 56.1852 KOps/s 55.1522 KOps/s $\color{#35bf28}+1.87\%$
test_getitem[list] 0.2656ms 32.0100μs 31.2402 KOps/s 30.7690 KOps/s $\color{#35bf28}+1.53\%$
test_setitem_dim[int] 36.4700μs 17.4493μs 57.3088 KOps/s 53.1021 KOps/s $\textbf{\color{#35bf28}+7.92\%}$
test_setitem_dim[slice_int] 73.4810μs 38.8271μs 25.7552 KOps/s 26.1031 KOps/s $\color{#d91a1a}-1.33\%$
test_setitem_dim[range] 85.4920μs 52.5906μs 19.0148 KOps/s 19.3731 KOps/s $\color{#d91a1a}-1.85\%$
test_setitem_dim[tuple] 51.8410μs 30.2823μs 33.0226 KOps/s 31.5778 KOps/s $\color{#35bf28}+4.58\%$
test_setitem 95.7020μs 13.2720μs 75.3463 KOps/s 62.2823 KOps/s $\textbf{\color{#35bf28}+20.98\%}$
test_set 93.3610μs 12.9009μs 77.5141 KOps/s 63.9527 KOps/s $\textbf{\color{#35bf28}+21.21\%}$
test_set_shared 1.4942ms 0.1462ms 6.8414 KOps/s 6.7944 KOps/s $\color{#35bf28}+0.69\%$
test_update 0.5450ms 15.0275μs 66.5448 KOps/s 50.9896 KOps/s $\textbf{\color{#35bf28}+30.51\%}$
test_update_nested 0.1066ms 19.7470μs 50.6407 KOps/s 41.1250 KOps/s $\textbf{\color{#35bf28}+23.14\%}$
test_update__nested 1.1549ms 22.7880μs 43.8827 KOps/s 40.7182 KOps/s $\textbf{\color{#35bf28}+7.77\%}$
test_set_nested 85.9820μs 14.1755μs 70.5442 KOps/s 58.8379 KOps/s $\textbf{\color{#35bf28}+19.90\%}$
test_set_nested_new 95.4510μs 16.5061μs 60.5836 KOps/s 51.9305 KOps/s $\textbf{\color{#35bf28}+16.66\%}$
test_select 0.1061ms 27.2825μs 36.6535 KOps/s 33.1195 KOps/s $\textbf{\color{#35bf28}+10.67\%}$
test_select_nested 0.1196ms 42.0928μs 23.7570 KOps/s 23.9031 KOps/s $\color{#d91a1a}-0.61\%$
test_exclude_nested 95.6620μs 59.8920μs 16.6967 KOps/s 16.7785 KOps/s $\color{#d91a1a}-0.49\%$
test_empty[True] 1.0404ms 0.2558ms 3.9091 KOps/s 3.9341 KOps/s $\color{#d91a1a}-0.63\%$
test_empty[False] 3.8981μs 0.7414μs 1.3488 MOps/s 1.3226 MOps/s $\color{#35bf28}+1.98\%$
test_to 84.4910μs 54.0794μs 18.4913 KOps/s 17.9370 KOps/s $\color{#35bf28}+3.09\%$
test_to_nonblocking 93.8720μs 45.5589μs 21.9496 KOps/s 21.9437 KOps/s $\color{#35bf28}+0.03\%$
test_unbind_speed 0.9401ms 0.2296ms 4.3554 KOps/s 4.2813 KOps/s $\color{#35bf28}+1.73\%$
test_unbind_speed_stack0 0.2943ms 0.2302ms 4.3437 KOps/s 4.2951 KOps/s $\color{#35bf28}+1.13\%$
test_unbind_speed_stack1 93.8058ms 0.6508ms 1.5366 KOps/s 1.5246 KOps/s $\color{#35bf28}+0.79\%$
test_split 95.3257ms 1.6632ms 601.2618 Ops/s 600.2201 Ops/s $\color{#35bf28}+0.17\%$
test_chunk 95.3769ms 1.6853ms 593.3770 Ops/s 594.6535 Ops/s $\color{#d91a1a}-0.21\%$
test_consolidate[False-None] 97.8973ms 2.8971ms 345.1685 Ops/s 339.4132 Ops/s $\color{#35bf28}+1.70\%$
test_consolidate[default-None] 1.7535ms 1.6279ms 614.2922 Ops/s 616.4519 Ops/s $\color{#d91a1a}-0.35\%$
test_consolidate[reduce-overhead-None] 1.7815ms 1.6680ms 599.5164 Ops/s 603.1582 Ops/s $\color{#d91a1a}-0.60\%$
test_consolidate_njt[False-None] 6.7728ms 6.6280ms 150.8759 Ops/s 151.2445 Ops/s $\color{#d91a1a}-0.24\%$
test_to[False-False-None] 1.7594ms 1.6646ms 600.7350 Ops/s 597.9667 Ops/s $\color{#35bf28}+0.46\%$
test_to[True-False-None] 1.5971ms 1.2957ms 771.7695 Ops/s 757.3622 Ops/s $\color{#35bf28}+1.90\%$
test_to[within-False-None] 4.4048ms 4.1000ms 243.9047 Ops/s 247.5184 Ops/s $\color{#d91a1a}-1.46\%$
test_to[True-default-None] 5.2470ms 4.9655ms 201.3911 Ops/s 190.0147 Ops/s $\textbf{\color{#35bf28}+5.99\%}$
test_to_njt[False-False-None] 7.0795ms 6.9390ms 144.1134 Ops/s 144.9614 Ops/s $\color{#d91a1a}-0.58\%$
test_to_njt[True-False-None] 5.6519ms 5.5272ms 180.9222 Ops/s 184.3105 Ops/s $\color{#d91a1a}-1.84\%$
test_to_njt[within-False-None] 12.3351ms 12.1874ms 82.0521 Ops/s 83.3147 Ops/s $\color{#d91a1a}-1.52\%$
test_creation[device0] 0.5350ms 78.6010μs 12.7225 KOps/s 12.7221 KOps/s $+0.00\%$
test_creation_from_tensor 0.6241ms 82.3343μs 12.1456 KOps/s 12.0314 KOps/s $\color{#35bf28}+0.95\%$
test_add_one[memmap_tensor0] 0.2292ms 6.4518μs 154.9951 KOps/s 153.1174 KOps/s $\color{#35bf28}+1.23\%$
test_contiguous[memmap_tensor0] 1.9780μs 0.4402μs 2.2716 MOps/s 2.2627 MOps/s $\color{#35bf28}+0.39\%$
test_stack[memmap_tensor0] 36.3500μs 4.4547μs 224.4828 KOps/s 224.6867 KOps/s $\color{#d91a1a}-0.09\%$
test_memmaptd_index 2.3179ms 0.2506ms 3.9910 KOps/s 3.9761 KOps/s $\color{#35bf28}+0.37\%$
test_memmaptd_index_astensor 0.5678ms 0.3086ms 3.2401 KOps/s 3.2515 KOps/s $\color{#d91a1a}-0.35\%$
test_memmaptd_index_op 0.9763ms 0.5547ms 1.8027 KOps/s 1.6278 KOps/s $\textbf{\color{#35bf28}+10.75\%}$
test_serialize_model 0.1309s 0.1302s 7.6784 Ops/s 7.6538 Ops/s $\color{#35bf28}+0.32\%$
test_serialize_model_pickle 1.3493s 1.2149s 0.8231 Ops/s 0.8169 Ops/s $\color{#35bf28}+0.76\%$
test_serialize_weights 0.4544s 0.1763s 5.6736 Ops/s 7.7178 Ops/s $\textbf{\color{#d91a1a}-26.49\%}$
test_serialize_weights_returnearly 0.3487s 52.6934ms 18.9777 Ops/s 13.9779 Ops/s $\textbf{\color{#35bf28}+35.77\%}$
test_serialize_weights_pickle 1.3800s 1.2185s 0.8207 Ops/s 0.8140 Ops/s $\color{#35bf28}+0.82\%$
test_reshape_pytree 56.7110μs 22.3069μs 44.8292 KOps/s 43.2765 KOps/s $\color{#35bf28}+3.59\%$
test_reshape_td 53.2410μs 26.2464μs 38.1005 KOps/s 36.2379 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_view_pytree 54.4510μs 22.2482μs 44.9474 KOps/s 43.8982 KOps/s $\color{#35bf28}+2.39\%$
test_view_td 66.8710μs 29.3839μs 34.0322 KOps/s 31.1152 KOps/s $\textbf{\color{#35bf28}+9.37\%}$
test_unbind_pytree 58.5500μs 28.1056μs 35.5801 KOps/s 35.7272 KOps/s $\color{#d91a1a}-0.41\%$
test_unbind_td 0.9231ms 35.0534μs 28.5279 KOps/s 27.9881 KOps/s $\color{#35bf28}+1.93\%$
test_split_pytree 60.8710μs 30.3566μs 32.9418 KOps/s 32.3344 KOps/s $\color{#35bf28}+1.88\%$
test_split_td 1.1677ms 40.2031μs 24.8737 KOps/s 24.1434 KOps/s $\color{#35bf28}+3.02\%$
test_add_pytree 72.6310μs 33.3765μs 29.9612 KOps/s 29.6214 KOps/s $\color{#35bf28}+1.15\%$
test_add_td 80.5710μs 44.0590μs 22.6969 KOps/s 20.1705 KOps/s $\textbf{\color{#35bf28}+12.52\%}$
test_compile_add_one_nested[tensordict-compile] 0.1728ms 0.1177ms 8.4956 KOps/s 8.2172 KOps/s $\color{#35bf28}+3.39\%$
test_compile_add_one_nested[tensordict-eager] 0.2207ms 0.1230ms 8.1321 KOps/s 8.0534 KOps/s $\color{#35bf28}+0.98\%$
test_compile_add_one_nested[pytree-compile] 0.1428ms 96.5013μs 10.3626 KOps/s 9.9300 KOps/s $\color{#35bf28}+4.36\%$
test_compile_add_one_nested[pytree-eager] 1.7804ms 0.1501ms 6.6614 KOps/s 6.5352 KOps/s $\color{#35bf28}+1.93\%$
test_compile_copy_nested[tensordict-compile] 75.4420μs 22.8651μs 43.7348 KOps/s 44.4778 KOps/s $\color{#d91a1a}-1.67\%$
test_compile_copy_nested[tensordict-eager] 57.6310μs 26.5754μs 37.6288 KOps/s 36.9817 KOps/s $\color{#35bf28}+1.75\%$
test_compile_copy_nested[pytree-compile] 0.2430ms 64.1766μs 15.5820 KOps/s 15.3427 KOps/s $\color{#35bf28}+1.56\%$
test_compile_copy_nested[pytree-eager] 86.1610μs 49.3798μs 20.2512 KOps/s 19.9673 KOps/s $\color{#35bf28}+1.42\%$
test_compile_add_one_flat[tensordict-compile] 0.1834ms 0.1426ms 7.0140 KOps/s 7.0097 KOps/s $\color{#35bf28}+0.06\%$
test_compile_add_one_flat[tensordict-eager] 0.3008ms 0.2066ms 4.8396 KOps/s 4.8676 KOps/s $\color{#d91a1a}-0.58\%$
test_compile_add_one_flat[tensorclass-compile] 0.1568ms 97.8617μs 10.2185 KOps/s 10.2613 KOps/s $\color{#d91a1a}-0.42\%$
test_compile_add_one_flat[tensorclass-eager] 0.1200ms 50.5119μs 19.7973 KOps/s 19.7777 KOps/s $\color{#35bf28}+0.10\%$
test_compile_add_one_flat[pytree-compile] 0.1929ms 0.1434ms 6.9730 KOps/s 7.0134 KOps/s $\color{#d91a1a}-0.58\%$
test_compile_add_one_flat[pytree-eager] 0.5265ms 0.4809ms 2.0796 KOps/s 2.0032 KOps/s $\color{#35bf28}+3.82\%$
test_compile_add_self_flat[tensordict-eager] 0.3758ms 0.2469ms 4.0510 KOps/s 4.0598 KOps/s $\color{#d91a1a}-0.22\%$
test_compile_add_self_flat[tensordict-compile] 0.1805ms 0.1436ms 6.9643 KOps/s 6.7403 KOps/s $\color{#35bf28}+3.32\%$
test_compile_add_self_flat[tensorclass-eager] 0.1495ms 60.7910μs 16.4498 KOps/s 16.0604 KOps/s $\color{#35bf28}+2.42\%$
test_compile_add_self_flat[tensorclass-compile] 0.1498ms 98.3143μs 10.1715 KOps/s 9.9001 KOps/s $\color{#35bf28}+2.74\%$
test_compile_add_self_flat[pytree-eager] 0.4533ms 0.4147ms 2.4111 KOps/s 2.3532 KOps/s $\color{#35bf28}+2.46\%$
test_compile_add_self_flat[pytree-compile] 0.1949ms 0.1393ms 7.1787 KOps/s 6.8380 KOps/s $\color{#35bf28}+4.98\%$
test_compile_copy_flat[tensordict-compile] 70.1310μs 19.2506μs 51.9464 KOps/s 53.2355 KOps/s $\color{#d91a1a}-2.42\%$
test_compile_copy_flat[tensordict-eager] 0.1157ms 26.5667μs 37.6410 KOps/s 37.2649 KOps/s $\color{#35bf28}+1.01\%$
test_compile_copy_flat[pytree-compile] 0.1034ms 69.4352μs 14.4019 KOps/s 14.3772 KOps/s $\color{#35bf28}+0.17\%$
test_compile_copy_flat[pytree-eager] 87.7110μs 51.7849μs 19.3107 KOps/s 19.1791 KOps/s $\color{#35bf28}+0.69\%$
test_compile_assign_and_add[tensordict-compile] 1.6395ms 0.4442ms 2.2513 KOps/s 2.2252 KOps/s $\color{#35bf28}+1.18\%$
test_compile_assign_and_add[tensordict-eager] 2.7127ms 2.5634ms 390.1074 Ops/s 386.3010 Ops/s $\color{#35bf28}+0.99\%$
test_compile_assign_and_add[pytree-compile] 1.6266ms 0.4371ms 2.2879 KOps/s 2.2642 KOps/s $\color{#35bf28}+1.05\%$
test_compile_assign_and_add[pytree-eager] 2.6914ms 2.6146ms 382.4646 Ops/s 379.6278 Ops/s $\color{#35bf28}+0.75\%$
test_compile_indexing[tensor-tensordict-compile] 0.4053ms 0.1136ms 8.8057 KOps/s 8.8124 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_indexing[tensor-tensordict-eager] 0.5571ms 77.9440μs 12.8297 KOps/s 12.6303 KOps/s $\color{#35bf28}+1.58\%$
test_compile_indexing[tensor-tensorclass-compile] 0.3616ms 0.1058ms 9.4482 KOps/s 9.3877 KOps/s $\color{#35bf28}+0.64\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1100ms 68.3832μs 14.6235 KOps/s 14.0572 KOps/s $\color{#35bf28}+4.03\%$
test_compile_indexing[tensor-pytree-compile] 0.2043ms 0.1060ms 9.4349 KOps/s 9.4480 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_indexing[tensor-pytree-eager] 0.1088ms 67.5844μs 14.7963 KOps/s 14.6422 KOps/s $\color{#35bf28}+1.05\%$
test_compile_indexing[slice-tensordict-compile] 0.1466ms 99.6690μs 10.0332 KOps/s 9.6933 KOps/s $\color{#35bf28}+3.51\%$
test_compile_indexing[slice-tensordict-eager] 0.1437ms 17.5855μs 56.8649 KOps/s 44.9052 KOps/s $\textbf{\color{#35bf28}+26.63\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1276ms 95.8894μs 10.4287 KOps/s 10.3342 KOps/s $\color{#35bf28}+0.91\%$
test_compile_indexing[slice-tensorclass-eager] 46.9010μs 16.0691μs 62.2312 KOps/s 60.0822 KOps/s $\color{#35bf28}+3.58\%$
test_compile_indexing[slice-pytree-compile] 0.1583ms 96.8449μs 10.3258 KOps/s 10.2100 KOps/s $\color{#35bf28}+1.13\%$
test_compile_indexing[slice-pytree-eager] 46.9210μs 16.0302μs 62.3821 KOps/s 60.1940 KOps/s $\color{#35bf28}+3.64\%$
test_compile_indexing[int-tensordict-compile] 0.1804ms 0.1025ms 9.7560 KOps/s 9.7759 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[int-tensordict-eager] 0.5841ms 17.0476μs 58.6594 KOps/s 54.9651 KOps/s $\textbf{\color{#35bf28}+6.72\%}$
test_compile_indexing[int-tensorclass-compile] 0.1403ms 98.4024μs 10.1624 KOps/s 9.9267 KOps/s $\color{#35bf28}+2.37\%$
test_compile_indexing[int-tensorclass-eager] 78.7110μs 15.7422μs 63.5236 KOps/s 60.9937 KOps/s $\color{#35bf28}+4.15\%$
test_compile_indexing[int-pytree-compile] 0.2247ms 97.0885μs 10.2999 KOps/s 10.2089 KOps/s $\color{#35bf28}+0.89\%$
test_compile_indexing[int-pytree-eager] 48.0610μs 15.9904μs 62.5376 KOps/s 60.4050 KOps/s $\color{#35bf28}+3.53\%$
test_mod_add[eager] 74.0310μs 29.8692μs 33.4793 KOps/s 29.9958 KOps/s $\textbf{\color{#35bf28}+11.61\%}$
test_mod_add[compile] 0.2141ms 75.1120μs 13.3134 KOps/s 12.5770 KOps/s $\textbf{\color{#35bf28}+5.86\%}$
test_mod_add[compile-overhead] 0.3192ms 0.1629ms 6.1395 KOps/s 5.8702 KOps/s $\color{#35bf28}+4.59\%$
test_mod_wrap[eager] 0.3270ms 0.2493ms 4.0117 KOps/s 4.0341 KOps/s $\color{#d91a1a}-0.56\%$
test_mod_wrap[compile] 1.5744ms 0.2793ms 3.5806 KOps/s 3.4650 KOps/s $\color{#35bf28}+3.34\%$
test_mod_wrap[compile-overhead] 8.1331ms 4.2472ms 235.4465 Ops/s 237.0145 Ops/s $\color{#d91a1a}-0.66\%$
test_mod_wrap_and_backward[eager] 1.5662ms 1.4427ms 693.1566 Ops/s 686.0125 Ops/s $\color{#35bf28}+1.04\%$
test_mod_wrap_and_backward[compile] 1.4883ms 1.3540ms 738.5606 Ops/s 724.5415 Ops/s $\color{#35bf28}+1.93\%$
test_mod_wrap_and_backward[compile-overhead] 1.5124ms 1.0274ms 973.3347 Ops/s 973.7381 Ops/s $\color{#d91a1a}-0.04\%$
test_seq_add[eager] 0.1455ms 93.9174μs 10.6477 KOps/s 9.8408 KOps/s $\textbf{\color{#35bf28}+8.20\%}$
test_seq_add[compile] 0.3245ms 85.2847μs 11.7254 KOps/s 11.5437 KOps/s $\color{#35bf28}+1.57\%$
test_seq_add[compile-overhead] 0.1769ms 0.1255ms 7.9662 KOps/s 7.8526 KOps/s $\color{#35bf28}+1.45\%$
test_seq_wrap[eager] 0.4595ms 0.3755ms 2.6629 KOps/s 2.5606 KOps/s $\color{#35bf28}+4.00\%$
test_seq_wrap[compile] 0.3653ms 0.2955ms 3.3844 KOps/s 3.3145 KOps/s $\color{#35bf28}+2.11\%$
test_seq_wrap[compile-overhead] 0.2667ms 0.2179ms 4.5897 KOps/s 4.5167 KOps/s $\color{#35bf28}+1.62\%$
test_func_call_runtime[False-eager] 0.8052ms 0.7404ms 1.3506 KOps/s 1.3104 KOps/s $\color{#35bf28}+3.07\%$
test_func_call_runtime[False-compile] 1.0692ms 0.7286ms 1.3725 KOps/s 1.3243 KOps/s $\color{#35bf28}+3.64\%$
test_func_call_runtime[False-compile-overhead] 0.4056ms 0.3558ms 2.8102 KOps/s 2.7845 KOps/s $\color{#35bf28}+0.92\%$
test_func_call_runtime[True-eager] 0.9626ms 0.8988ms 1.1126 KOps/s 1.0846 KOps/s $\color{#35bf28}+2.59\%$
test_func_call_runtime[True-compile] 0.8036ms 0.7466ms 1.3393 KOps/s 1.2944 KOps/s $\color{#35bf28}+3.47\%$
test_func_call_runtime[True-compile-overhead] 0.4269ms 0.3780ms 2.6455 KOps/s 2.6563 KOps/s $\color{#d91a1a}-0.41\%$
test_func_call_cm_runtime[False-eager] 0.8162ms 0.7340ms 1.3624 KOps/s 1.3233 KOps/s $\color{#35bf28}+2.95\%$
test_func_call_cm_runtime[False-compile] 0.9570ms 0.7352ms 1.3602 KOps/s 1.3229 KOps/s $\color{#35bf28}+2.82\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4467ms 0.3575ms 2.7972 KOps/s 2.7812 KOps/s $\color{#35bf28}+0.58\%$
test_func_call_cm_runtime[True-eager] 1.0974ms 0.9991ms 1.0009 KOps/s 974.5474 Ops/s $\color{#35bf28}+2.70\%$
test_func_call_cm_runtime[True-compile] 0.8318ms 0.7807ms 1.2809 KOps/s 1.2467 KOps/s $\color{#35bf28}+2.74\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4541ms 0.4041ms 2.4747 KOps/s 2.4590 KOps/s $\color{#35bf28}+0.64\%$
test_vmap_func_call_cm_runtime[eager] 2.5301ms 2.0502ms 487.7643 Ops/s 485.5176 Ops/s $\color{#35bf28}+0.46\%$
test_vmap_func_call_cm_runtime[compile] 0.8703ms 0.7963ms 1.2558 KOps/s 1.2217 KOps/s $\color{#35bf28}+2.79\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4951ms 0.4043ms 2.4734 KOps/s 2.4567 KOps/s $\color{#35bf28}+0.68\%$
test_distributed 2.6820ms 0.2700ms 3.7040 KOps/s 8.1718 KOps/s $\textbf{\color{#d91a1a}-54.67\%}$
test_tdmodule 42.3300μs 12.9181μs 77.4107 KOps/s 66.6035 KOps/s $\textbf{\color{#35bf28}+16.23\%}$
test_tdmodule_dispatch 59.7010μs 25.3322μs 39.4754 KOps/s 34.1782 KOps/s $\textbf{\color{#35bf28}+15.50\%}$
test_tdseq 33.1710μs 14.3635μs 69.6210 KOps/s 60.4271 KOps/s $\textbf{\color{#35bf28}+15.21\%}$
test_tdseq_dispatch 47.8610μs 27.9416μs 35.7889 KOps/s 30.2651 KOps/s $\textbf{\color{#35bf28}+18.25\%}$
test_instantiation_functorch 1.6609ms 1.5263ms 655.1624 Ops/s 650.6488 Ops/s $\color{#35bf28}+0.69\%$
test_exec_functorch 0.2189ms 0.1444ms 6.9253 KOps/s 6.7086 KOps/s $\color{#35bf28}+3.23\%$
test_exec_functional_call 0.2508ms 0.1381ms 7.2385 KOps/s 6.9436 KOps/s $\color{#35bf28}+4.25\%$
test_exec_td_decorator 0.3633ms 0.1820ms 5.4944 KOps/s 5.3079 KOps/s $\color{#35bf28}+3.51\%$
test_vmap_mlp_speed_decorator[True-True] 0.7951ms 0.6638ms 1.5064 KOps/s 1.4940 KOps/s $\color{#35bf28}+0.83\%$
test_vmap_mlp_speed_decorator[True-False] 0.8386ms 0.6613ms 1.5122 KOps/s 1.4918 KOps/s $\color{#35bf28}+1.37\%$
test_vmap_mlp_speed_decorator[False-True] 0.6909ms 0.5823ms 1.7175 KOps/s 1.7141 KOps/s $\color{#35bf28}+0.20\%$
test_vmap_mlp_speed_decorator[False-False] 0.6872ms 0.5822ms 1.7177 KOps/s 1.7138 KOps/s $\color{#35bf28}+0.23\%$
test_vmap_transformer_speed_decorator[True-True] 19.2951ms 19.0777ms 52.4171 Ops/s 52.8064 Ops/s $\color{#d91a1a}-0.74\%$
test_vmap_transformer_speed_decorator[True-False] 19.5875ms 19.0805ms 52.4095 Ops/s 52.7127 Ops/s $\color{#d91a1a}-0.58\%$
test_vmap_transformer_speed_decorator[False-True] 19.9009ms 19.0539ms 52.4826 Ops/s 53.1978 Ops/s $\color{#d91a1a}-1.34\%$
test_vmap_transformer_speed_decorator[False-False] 19.1782ms 19.0055ms 52.6165 Ops/s 53.0641 Ops/s $\color{#d91a1a}-0.84\%$
test_to_module_speed[True] 1.0707ms 0.9392ms 1.0648 KOps/s 1.0559 KOps/s $\color{#35bf28}+0.84\%$
test_to_module_speed[False] 1.4046ms 0.9128ms 1.0955 KOps/s 1.0888 KOps/s $\color{#35bf28}+0.61\%$
test_tc_init 59.6510μs 33.8333μs 29.5567 KOps/s 27.3859 KOps/s $\textbf{\color{#35bf28}+7.93\%}$
test_tc_init_nested 0.1609ms 67.2766μs 14.8640 KOps/s 13.8990 KOps/s $\textbf{\color{#35bf28}+6.94\%}$
test_tc_first_layer_tensor 3.9857μs 0.7016μs 1.4254 MOps/s 1.4384 MOps/s $\color{#d91a1a}-0.91\%$
test_tc_first_layer_nontensor 28.2600μs 2.3183μs 431.3436 KOps/s 434.0386 KOps/s $\color{#d91a1a}-0.62\%$
test_tc_second_layer_tensor 8.5600μs 1.4517μs 688.8510 KOps/s 696.5152 KOps/s $\color{#d91a1a}-1.10\%$
test_tc_second_layer_nontensor 30.4700μs 3.0618μs 326.6016 KOps/s 327.9481 KOps/s $\color{#d91a1a}-0.41\%$
test_unbind 0.2419s 10.0431ms 99.5708 Ops/s 152.2534 Ops/s $\textbf{\color{#d91a1a}-34.60\%}$
test_full_like 9.3498ms 9.0935ms 109.9684 Ops/s 107.9376 Ops/s $\color{#35bf28}+1.88\%$
test_zeros_like 9.2813ms 7.2239ms 138.4288 Ops/s 116.1026 Ops/s $\textbf{\color{#35bf28}+19.23\%}$
test_ones_like 5.4452ms 4.3172ms 231.6340 Ops/s 233.0551 Ops/s $\color{#d91a1a}-0.61\%$
test_clone 11.3027ms 9.0198ms 110.8673 Ops/s 159.0962 Ops/s $\textbf{\color{#d91a1a}-30.31\%}$
test_squeeze 53.9410μs 9.2749μs 107.8184 KOps/s 106.9090 KOps/s $\color{#35bf28}+0.85\%$
test_unsqueeze 0.1240ms 70.6326μs 14.1578 KOps/s 13.8374 KOps/s $\color{#35bf28}+2.32\%$
test_split 0.3890ms 0.1579ms 6.3321 KOps/s 6.1325 KOps/s $\color{#35bf28}+3.26\%$
test_permute 0.2364ms 0.1774ms 5.6375 KOps/s 5.6296 KOps/s $\color{#35bf28}+0.14\%$
test_stack 50.9813ms 50.6547ms 19.7415 Ops/s 19.7354 Ops/s $\color{#35bf28}+0.03\%$
test_cat 50.7971ms 50.4687ms 19.8143 Ops/s 23.6468 Ops/s $\textbf{\color{#d91a1a}-16.21\%}$
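
The Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below is a minimal illustration of that arithmetic under that assumption; it is not the benchmark bot's actual code, and the helper names `to_ops_per_second` and `relative_change` are hypothetical. Note that the two columns can use different units on the same row (for example KOps/s against Ops/s), so values are normalized first.

```python
# Assumed unit scale factors for the throughput columns in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_second(value: float, unit: str) -> float:
    """Convert a throughput reading such as (75.3463, "KOps/s") to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percentage change of the PR throughput relative to repo HEAD.

    Positive means the PR is faster than HEAD, negative means slower.
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# Row test_distributed: 3.7040 KOps/s on the PR vs 8.1718 KOps/s on HEAD.
distributed = relative_change(
    to_ops_per_second(3.7040, "KOps/s"),
    to_ops_per_second(8.1718, "KOps/s"),
)

# Row test_func_call_cm_runtime[True-eager]: mixed units on the same row.
mixed_units = relative_change(
    to_ops_per_second(1.0009, "KOps/s"),
    to_ops_per_second(974.5474, "Ops/s"),
)

print(f"test_distributed: {distributed:+.2f}%")
print(f"test_func_call_cm_runtime[True-eager]: {mixed_units:+.2f}%")
```

Because the table values are themselves rounded, a result recomputed this way can differ from the reported Change in the last digit.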

vmoens added a commit that referenced this pull request Nov 14, 2024
ghstack-source-id: 3ae83c4ef90a9377405aebbf1761ace1a39417b1
Pull Request resolved: #1078

(cherry picked from commit 84d31db)