[BugFix] Proper masks for padding with custom pad value #1185

Merged (1 commit) on Jan 15, 2025

@vmoens (Contributor) commented Jan 15, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 15, 2025
ghstack-source-id: 0580f89ce9bbaf5a13bab33f9c9b8f5a9e9df96f
Pull Request resolved: #1185
@facebook-github-bot added the CLA Signed label Jan 15, 2025
@vmoens merged commit 905d077 into gh/vmoens/45/base Jan 15, 2025
29 of 37 checks passed
@vmoens deleted the gh/vmoens/45/head branch January 15, 2025 15:53

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}17$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 46.0150μs 21.0085μs 47.5999 KOps/s 49.5006 KOps/s $\color{#d91a1a}-3.84\%$
test_plain_set_stack_nested 82.5130μs 21.3020μs 46.9440 KOps/s 49.0733 KOps/s $\color{#d91a1a}-4.34\%$
test_plain_set_nested_inplace 68.1370μs 23.0831μs 43.3217 KOps/s 46.0904 KOps/s $\textbf{\color{#d91a1a}-6.01\%}$
test_plain_set_stack_nested_inplace 75.6510μs 22.9213μs 43.6275 KOps/s 46.5680 KOps/s $\textbf{\color{#d91a1a}-6.31\%}$
test_items 49.7330μs 4.2052μs 237.7995 KOps/s 236.1933 KOps/s $\color{#35bf28}+0.68\%$
test_items_nested 0.5831ms 0.4013ms 2.4920 KOps/s 2.5599 KOps/s $\color{#d91a1a}-2.65\%$
test_items_nested_locked 0.7123ms 0.3992ms 2.5052 KOps/s 2.5563 KOps/s $\color{#d91a1a}-2.00\%$
test_items_nested_leaf 0.1672ms 78.4579μs 12.7457 KOps/s 12.9929 KOps/s $\color{#d91a1a}-1.90\%$
test_items_stack_nested 0.5683ms 0.4015ms 2.4905 KOps/s 2.5525 KOps/s $\color{#d91a1a}-2.43\%$
test_items_stack_nested_leaf 0.1575ms 81.4893μs 12.2716 KOps/s 13.0697 KOps/s $\textbf{\color{#d91a1a}-6.11\%}$
test_items_stack_nested_locked 0.6115ms 0.4001ms 2.4995 KOps/s 2.5230 KOps/s $\color{#d91a1a}-0.93\%$
test_keys 41.9180μs 3.4276μs 291.7536 KOps/s 276.7120 KOps/s $\textbf{\color{#35bf28}+5.44\%}$
test_keys_nested 0.2876ms 0.1633ms 6.1252 KOps/s 6.1345 KOps/s $\color{#d91a1a}-0.15\%$
test_keys_nested_locked 2.0667ms 0.1692ms 5.9096 KOps/s 5.8916 KOps/s $\color{#35bf28}+0.30\%$
test_keys_nested_leaf 0.2656ms 0.1443ms 6.9302 KOps/s 6.9882 KOps/s $\color{#d91a1a}-0.83\%$
test_keys_stack_nested 0.2612ms 0.1617ms 6.1836 KOps/s 6.1160 KOps/s $\color{#35bf28}+1.11\%$
test_keys_stack_nested_leaf 0.2549ms 0.1410ms 7.0924 KOps/s 7.0025 KOps/s $\color{#35bf28}+1.28\%$
test_keys_stack_nested_locked 0.2617ms 0.1683ms 5.9420 KOps/s 5.8847 KOps/s $\color{#35bf28}+0.97\%$
test_values 8.6722μs 1.0454μs 956.5683 KOps/s 953.2344 KOps/s $\color{#35bf28}+0.35\%$
test_values_nested 0.1143ms 63.0279μs 15.8660 KOps/s 16.2558 KOps/s $\color{#d91a1a}-2.40\%$
test_values_nested_locked 0.1185ms 63.0339μs 15.8645 KOps/s 16.2182 KOps/s $\color{#d91a1a}-2.18\%$
test_values_nested_leaf 0.1242ms 71.8792μs 13.9122 KOps/s 14.3820 KOps/s $\color{#d91a1a}-3.27\%$
test_values_stack_nested 0.1098ms 63.5127μs 15.7449 KOps/s 16.1471 KOps/s $\color{#d91a1a}-2.49\%$
test_values_stack_nested_leaf 0.1247ms 71.4445μs 13.9969 KOps/s 13.5777 KOps/s $\color{#35bf28}+3.09\%$
test_values_stack_nested_locked 0.1191ms 63.2642μs 15.8067 KOps/s 16.1012 KOps/s $\color{#d91a1a}-1.83\%$
test_membership 55.9310μs 0.8673μs 1.1531 MOps/s 1.1402 MOps/s $\color{#35bf28}+1.13\%$
test_membership_nested 39.8250μs 2.8262μs 353.8352 KOps/s 346.4926 KOps/s $\color{#35bf28}+2.12\%$
test_membership_nested_leaf 44.3520μs 2.8806μs 347.1464 KOps/s 347.1271 KOps/s $+0.01\%$
test_membership_stacked_nested 34.7250μs 2.8212μs 354.4608 KOps/s 350.7223 KOps/s $\color{#35bf28}+1.07\%$
test_membership_stacked_nested_leaf 40.8360μs 2.8554μs 350.2141 KOps/s 344.5498 KOps/s $\color{#35bf28}+1.64\%$
test_membership_nested_last 41.7670μs 4.2543μs 235.0573 KOps/s 229.1790 KOps/s $\color{#35bf28}+2.56\%$
test_membership_nested_leaf_last 43.1110μs 4.3173μs 231.6286 KOps/s 225.3108 KOps/s $\color{#35bf28}+2.80\%$
test_membership_stacked_nested_last 42.9500μs 5.0110μs 199.5618 KOps/s 231.3741 KOps/s $\textbf{\color{#d91a1a}-13.75\%}$
test_membership_stacked_nested_leaf_last 47.2770μs 5.0788μs 196.8974 KOps/s 227.0917 KOps/s $\textbf{\color{#d91a1a}-13.30\%}$
test_nested_getleaf 49.0820μs 10.5699μs 94.6081 KOps/s 93.7361 KOps/s $\color{#35bf28}+0.93\%$
test_nested_get 47.4580μs 9.9901μs 100.0990 KOps/s 98.7137 KOps/s $\color{#35bf28}+1.40\%$
test_stacked_getleaf 46.3170μs 10.4901μs 95.3283 KOps/s 95.5808 KOps/s $\color{#d91a1a}-0.26\%$
test_stacked_get 51.5160μs 9.9829μs 100.1714 KOps/s 99.6017 KOps/s $\color{#35bf28}+0.57\%$
test_nested_getitemleaf 52.6980μs 11.0708μs 90.3278 KOps/s 89.9414 KOps/s $\color{#35bf28}+0.43\%$
test_nested_getitem 47.8990μs 10.6348μs 94.0309 KOps/s 93.5253 KOps/s $\color{#35bf28}+0.54\%$
test_stacked_getitemleaf 55.5440μs 11.0171μs 90.7679 KOps/s 85.1366 KOps/s $\textbf{\color{#35bf28}+6.61\%}$
test_stacked_getitem 38.6920μs 10.5459μs 94.8233 KOps/s 87.1899 KOps/s $\textbf{\color{#35bf28}+8.75\%}$
test_lock_nested 4.6409ms 0.4507ms 2.2187 KOps/s 1.7701 KOps/s $\textbf{\color{#35bf28}+25.34\%}$
test_lock_stack_nested 0.8073ms 0.4162ms 2.4027 KOps/s 2.3864 KOps/s $\color{#35bf28}+0.69\%$
test_unlock_nested 1.0750ms 0.3751ms 2.6658 KOps/s 2.6520 KOps/s $\color{#35bf28}+0.52\%$
test_unlock_stack_nested 0.6126ms 0.3389ms 2.9505 KOps/s 2.9654 KOps/s $\color{#d91a1a}-0.50\%$
test_flatten_speed 0.1929ms 0.1015ms 9.8478 KOps/s 10.1419 KOps/s $\color{#d91a1a}-2.90\%$
test_unflatten_speed 0.9168ms 0.5194ms 1.9253 KOps/s 1.9650 KOps/s $\color{#d91a1a}-2.02\%$
test_common_ops 1.7686ms 0.7987ms 1.2521 KOps/s 1.2780 KOps/s $\color{#d91a1a}-2.03\%$
test_creation 56.3250μs 2.4601μs 406.4941 KOps/s 407.0420 KOps/s $\color{#d91a1a}-0.13\%$
test_creation_empty 52.3570μs 12.7853μs 78.2151 KOps/s 86.1559 KOps/s $\textbf{\color{#d91a1a}-9.22\%}$
test_creation_nested_1 43.9520μs 15.5759μs 64.2016 KOps/s 69.5487 KOps/s $\textbf{\color{#d91a1a}-7.69\%}$
test_creation_nested_2 57.6470μs 20.5014μs 48.7771 KOps/s 52.5012 KOps/s $\textbf{\color{#d91a1a}-7.09\%}$
test_clone 91.4300μs 13.3411μs 74.9565 KOps/s 76.0282 KOps/s $\color{#d91a1a}-1.41\%$
test_getitem[int] 1.3876ms 12.8388μs 77.8891 KOps/s 78.1949 KOps/s $\color{#d91a1a}-0.39\%$
test_getitem[slice_int] 0.1394ms 23.9976μs 41.6708 KOps/s 40.7639 KOps/s $\color{#35bf28}+2.22\%$
test_getitem[range] 0.2964ms 46.5292μs 21.4919 KOps/s 19.9627 KOps/s $\textbf{\color{#35bf28}+7.66\%}$
test_getitem[tuple] 0.1308ms 19.9856μs 50.0362 KOps/s 50.0060 KOps/s $\color{#35bf28}+0.06\%$
test_getitem[list] 0.2058ms 41.3968μs 24.1564 KOps/s 23.1713 KOps/s $\color{#35bf28}+4.25\%$
test_setitem_dim[int] 51.0150μs 25.2102μs 39.6665 KOps/s 38.6785 KOps/s $\color{#35bf28}+2.55\%$
test_setitem_dim[slice_int] 82.3530μs 50.1661μs 19.9338 KOps/s 19.2640 KOps/s $\color{#35bf28}+3.48\%$
test_setitem_dim[range] 0.1349ms 72.7233μs 13.7508 KOps/s 13.4753 KOps/s $\color{#35bf28}+2.04\%$
test_setitem_dim[tuple] 81.1410μs 40.4188μs 24.7410 KOps/s 24.8898 KOps/s $\color{#d91a1a}-0.60\%$
test_setitem 0.1412ms 20.9239μs 47.7921 KOps/s 49.1914 KOps/s $\color{#d91a1a}-2.84\%$
test_set 97.7720μs 20.1464μs 49.6367 KOps/s 50.5530 KOps/s $\color{#d91a1a}-1.81\%$
test_set_shared 9.0176ms 0.1735ms 5.7633 KOps/s 5.9606 KOps/s $\color{#d91a1a}-3.31\%$
test_update 0.3081ms 23.2935μs 42.9305 KOps/s 44.5555 KOps/s $\color{#d91a1a}-3.65\%$
test_update_nested 0.2023ms 33.3671μs 29.9696 KOps/s 30.7382 KOps/s $\color{#d91a1a}-2.50\%$
test_update__nested 0.3569ms 33.0831μs 30.2270 KOps/s 29.1636 KOps/s $\color{#35bf28}+3.65\%$
test_set_nested 0.1304ms 22.1240μs 45.1998 KOps/s 45.7239 KOps/s $\color{#d91a1a}-1.15\%$
test_set_nested_new 0.1406ms 26.7976μs 37.3168 KOps/s 38.4600 KOps/s $\color{#d91a1a}-2.97\%$
test_select 0.2425ms 43.1543μs 23.1727 KOps/s 23.3315 KOps/s $\color{#d91a1a}-0.68\%$
test_select_nested 0.1310ms 63.9141μs 15.6460 KOps/s 15.6835 KOps/s $\color{#d91a1a}-0.24\%$
test_exclude_nested 0.1760ms 82.4527μs 12.1282 KOps/s 12.3011 KOps/s $\color{#d91a1a}-1.41\%$
test_empty[True] 0.8286ms 0.4101ms 2.4384 KOps/s 2.4860 KOps/s $\color{#d91a1a}-1.91\%$
test_empty[False] 10.5423μs 1.3868μs 721.0973 KOps/s 736.1673 KOps/s $\color{#d91a1a}-2.05\%$
test_unbind_speed 0.3606ms 0.2685ms 3.7248 KOps/s 3.7375 KOps/s $\color{#d91a1a}-0.34\%$
test_unbind_speed_stack0 0.5842ms 0.2604ms 3.8408 KOps/s 3.7881 KOps/s $\color{#35bf28}+1.39\%$
test_unbind_speed_stack1 0.1164s 0.8025ms 1.2462 KOps/s 1.3515 KOps/s $\textbf{\color{#d91a1a}-7.79\%}$
test_split 1.7770ms 1.5710ms 636.5263 Ops/s 558.2888 Ops/s $\textbf{\color{#35bf28}+14.01\%}$
test_chunk 0.1159s 1.9591ms 510.4384 Ops/s 566.2080 Ops/s $\textbf{\color{#d91a1a}-9.85\%}$
test_consolidate_njt[False-None] 9.7298ms 8.2390ms 121.3745 Ops/s 120.7094 Ops/s $\color{#35bf28}+0.55\%$
test_creation[device0] 0.2776ms 90.5429μs 11.0445 KOps/s 10.4328 KOps/s $\textbf{\color{#35bf28}+5.86\%}$
test_creation_from_tensor 3.6522ms 93.3657μs 10.7106 KOps/s 10.6138 KOps/s $\color{#35bf28}+0.91\%$
test_add_one[memmap_tensor0] 0.1366ms 4.9204μs 203.2357 KOps/s 204.9534 KOps/s $\color{#d91a1a}-0.84\%$
test_contiguous[memmap_tensor0] 13.1750μs 0.5053μs 1.9792 MOps/s 1.9605 MOps/s $\color{#35bf28}+0.95\%$
test_stack[memmap_tensor0] 30.1460μs 3.4121μs 293.0782 KOps/s 296.0159 KOps/s $\color{#d91a1a}-0.99\%$
test_memmaptd_index 0.9522ms 0.2345ms 4.2638 KOps/s 4.3586 KOps/s $\color{#d91a1a}-2.18\%$
test_memmaptd_index_astensor 0.8008ms 0.3206ms 3.1189 KOps/s 3.1492 KOps/s $\color{#d91a1a}-0.96\%$
test_memmaptd_index_op 0.9889ms 0.6002ms 1.6662 KOps/s 1.7314 KOps/s $\color{#d91a1a}-3.77\%$
test_serialize_model 0.1239s 0.1131s 8.8381 Ops/s 8.5717 Ops/s $\color{#35bf28}+3.11\%$
test_serialize_model_pickle 0.4652s 0.3922s 2.5494 Ops/s 2.5302 Ops/s $\color{#35bf28}+0.76\%$
test_serialize_weights 0.1242s 0.1157s 8.6418 Ops/s 8.6640 Ops/s $\color{#d91a1a}-0.26\%$
test_serialize_weights_returnearly 0.2778s 0.1799s 5.5578 Ops/s 6.3170 Ops/s $\textbf{\color{#d91a1a}-12.02\%}$
test_serialize_weights_pickle 0.4452s 0.4106s 2.4352 Ops/s 1.1240 Ops/s $\textbf{\color{#35bf28}+116.66\%}$
test_serialize_weights_filesystem 0.1482s 0.1420s 7.0439 Ops/s 7.0488 Ops/s $\color{#d91a1a}-0.07\%$
test_serialize_model_filesystem 0.1527s 0.1486s 6.7301 Ops/s 6.8614 Ops/s $\color{#d91a1a}-1.91\%$
test_reshape_pytree 61.0540μs 26.3199μs 37.9940 KOps/s 38.0728 KOps/s $\color{#d91a1a}-0.21\%$
test_reshape_td 0.1078ms 32.1075μs 31.1454 KOps/s 31.2876 KOps/s $\color{#d91a1a}-0.45\%$
test_view_pytree 61.0140μs 25.9129μs 38.5908 KOps/s 38.3957 KOps/s $\color{#35bf28}+0.51\%$
test_view_td 93.7350μs 37.9609μs 26.3429 KOps/s 26.9222 KOps/s $\color{#d91a1a}-2.15\%$
test_unbind_pytree 83.5060μs 29.2090μs 34.2360 KOps/s 34.6768 KOps/s $\color{#d91a1a}-1.27\%$
test_unbind_td 0.3314ms 38.8822μs 25.7187 KOps/s 25.8764 KOps/s $\color{#d91a1a}-0.61\%$
test_split_pytree 92.8720μs 29.1982μs 34.2486 KOps/s 33.9707 KOps/s $\color{#35bf28}+0.82\%$
test_split_td 0.4972ms 44.4478μs 22.4983 KOps/s 22.2038 KOps/s $\color{#35bf28}+1.33\%$
test_add_pytree 83.6360μs 34.9884μs 28.5809 KOps/s 28.8340 KOps/s $\color{#d91a1a}-0.88\%$
test_add_td 0.1493ms 56.3495μs 17.7464 KOps/s 17.5377 KOps/s $\color{#35bf28}+1.19\%$
test_compile_add_one_nested[tensordict-compile] 0.1288ms 61.8692μs 16.1631 KOps/s 16.1862 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_one_nested[tensordict-eager] 0.4550ms 0.1722ms 5.8075 KOps/s 5.7937 KOps/s $\color{#35bf28}+0.24\%$
test_compile_add_one_nested[pytree-compile] 0.1005ms 45.1566μs 22.1451 KOps/s 22.3914 KOps/s $\color{#d91a1a}-1.10\%$
test_compile_add_one_nested[pytree-eager] 0.2139ms 0.1163ms 8.5983 KOps/s 8.5337 KOps/s $\color{#35bf28}+0.76\%$
test_compile_copy_nested[tensordict-compile] 68.6780μs 26.6209μs 37.5645 KOps/s 38.7658 KOps/s $\color{#d91a1a}-3.10\%$
test_compile_copy_nested[tensordict-eager] 0.1176ms 58.7174μs 17.0307 KOps/s 17.1987 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_copy_nested[pytree-compile] 0.1682ms 76.8531μs 13.0118 KOps/s 13.2115 KOps/s $\color{#d91a1a}-1.51\%$
test_compile_copy_nested[pytree-eager] 0.1559ms 65.8325μs 15.1901 KOps/s 15.1424 KOps/s $\color{#35bf28}+0.31\%$
test_compile_add_one_flat[tensordict-compile] 0.1810ms 0.1054ms 9.4882 KOps/s 9.5683 KOps/s $\color{#d91a1a}-0.84\%$
test_compile_add_one_flat[tensordict-eager] 0.4457ms 0.2146ms 4.6594 KOps/s 4.7043 KOps/s $\color{#d91a1a}-0.96\%$
test_compile_add_one_flat[tensorclass-compile] 0.1379ms 45.7560μs 21.8551 KOps/s 21.6204 KOps/s $\color{#35bf28}+1.09\%$
test_compile_add_one_flat[tensorclass-eager] 0.5542ms 65.9426μs 15.1647 KOps/s 15.2218 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_add_one_flat[pytree-compile] 0.1916ms 0.1023ms 9.7797 KOps/s 9.8306 KOps/s $\color{#d91a1a}-0.52\%$
test_compile_add_one_flat[pytree-eager] 0.4102ms 0.1974ms 5.0658 KOps/s 5.0237 KOps/s $\color{#35bf28}+0.84\%$
test_compile_add_self_flat[tensordict-eager] 0.4131ms 0.2290ms 4.3673 KOps/s 4.3736 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_self_flat[tensordict-compile] 0.2265ms 0.1061ms 9.4284 KOps/s 9.5469 KOps/s $\color{#d91a1a}-1.24\%$
test_compile_add_self_flat[tensorclass-eager] 0.1707ms 63.3245μs 15.7917 KOps/s 16.3713 KOps/s $\color{#d91a1a}-3.54\%$
test_compile_add_self_flat[tensorclass-compile] 0.3193ms 47.7311μs 20.9507 KOps/s 21.9257 KOps/s $\color{#d91a1a}-4.45\%$
test_compile_add_self_flat[pytree-eager] 0.2343ms 0.1559ms 6.4147 KOps/s 6.4497 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_add_self_flat[pytree-compile] 0.2331ms 0.1033ms 9.6798 KOps/s 9.8608 KOps/s $\color{#d91a1a}-1.84\%$
test_compile_copy_flat[tensordict-compile] 69.9100μs 22.3550μs 44.7327 KOps/s 46.9303 KOps/s $\color{#d91a1a}-4.68\%$
test_compile_copy_flat[tensordict-eager] 0.2869ms 67.5124μs 14.8121 KOps/s 15.2980 KOps/s $\color{#d91a1a}-3.18\%$
test_compile_copy_flat[pytree-compile] 0.1506ms 77.6085μs 12.8852 KOps/s 13.1535 KOps/s $\color{#d91a1a}-2.04\%$
test_compile_copy_flat[pytree-eager] 0.1324ms 66.3184μs 15.0788 KOps/s 15.1914 KOps/s $\color{#d91a1a}-0.74\%$
test_compile_assign_and_add[tensordict-compile] 0.3046ms 0.2042ms 4.8968 KOps/s 4.9074 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_assign_and_add[tensordict-eager] 1.5266ms 1.3371ms 747.8985 Ops/s 774.5594 Ops/s $\color{#d91a1a}-3.44\%$
test_compile_assign_and_add[pytree-compile] 0.2859ms 0.1997ms 5.0082 KOps/s 5.0065 KOps/s $\color{#35bf28}+0.03\%$
test_compile_assign_and_add[pytree-eager] 1.3579ms 0.7655ms 1.3063 KOps/s 1.2877 KOps/s $\color{#35bf28}+1.45\%$
test_compile_assign_and_add_stack[compile] 0.5494ms 0.4448ms 2.2484 KOps/s 2.2590 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_assign_and_add_stack[eager] 5.2138ms 2.6803ms 373.0994 Ops/s 371.6976 Ops/s $\color{#35bf28}+0.38\%$
test_compile_indexing[tensor-tensordict-compile] 81.0210μs 35.4650μs 28.1968 KOps/s 28.5007 KOps/s $\color{#d91a1a}-1.07\%$
test_compile_indexing[tensor-tensordict-eager] 0.5185ms 31.8056μs 31.4410 KOps/s 30.7241 KOps/s $\color{#35bf28}+2.33\%$
test_compile_indexing[tensor-tensorclass-compile] 78.5960μs 29.2247μs 34.2176 KOps/s 35.7381 KOps/s $\color{#d91a1a}-4.25\%$
test_compile_indexing[tensor-tensorclass-eager] 96.6600μs 22.8575μs 43.7493 KOps/s 44.5551 KOps/s $\color{#d91a1a}-1.81\%$
test_compile_indexing[tensor-pytree-compile] 0.1694ms 29.9139μs 33.4293 KOps/s 34.7054 KOps/s $\color{#d91a1a}-3.68\%$
test_compile_indexing[tensor-pytree-eager] 77.8350μs 22.6794μs 44.0929 KOps/s 44.7910 KOps/s $\color{#d91a1a}-1.56\%$
test_compile_indexing[slice-tensordict-compile] 0.1121ms 50.4940μs 19.8043 KOps/s 18.8860 KOps/s $\color{#35bf28}+4.86\%$
test_compile_indexing[slice-tensordict-eager] 0.6032ms 19.4896μs 51.3094 KOps/s 49.7320 KOps/s $\color{#35bf28}+3.17\%$
test_compile_indexing[slice-tensorclass-compile] 0.1599ms 43.4524μs 23.0137 KOps/s 22.8368 KOps/s $\color{#35bf28}+0.77\%$
test_compile_indexing[slice-tensorclass-eager] 0.1926ms 18.9412μs 52.7950 KOps/s 54.8608 KOps/s $\color{#d91a1a}-3.77\%$
test_compile_indexing[slice-pytree-compile] 0.1123ms 44.3471μs 22.5494 KOps/s 22.5929 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_indexing[slice-pytree-eager] 61.4340μs 18.3054μs 54.6288 KOps/s 54.7954 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_indexing[int-tensordict-compile] 0.1760ms 51.1564μs 19.5479 KOps/s 19.0155 KOps/s $\color{#35bf28}+2.80\%$
test_compile_indexing[int-tensordict-eager] 1.0827ms 19.5639μs 51.1146 KOps/s 51.0027 KOps/s $\color{#35bf28}+0.22\%$
test_compile_indexing[int-tensorclass-compile] 0.4071ms 44.5739μs 22.4347 KOps/s 22.5492 KOps/s $\color{#d91a1a}-0.51\%$
test_compile_indexing[int-tensorclass-eager] 0.1074ms 18.1178μs 55.1943 KOps/s 54.7130 KOps/s $\color{#35bf28}+0.88\%$
test_compile_indexing[int-pytree-compile] 0.1204ms 44.1631μs 22.6433 KOps/s 22.5915 KOps/s $\color{#35bf28}+0.23\%$
test_compile_indexing[int-pytree-eager] 0.6194ms 18.5438μs 53.9265 KOps/s 54.6683 KOps/s $\color{#d91a1a}-1.36\%$
test_mod_add[eager] 0.1131ms 35.2952μs 28.3324 KOps/s 28.8804 KOps/s $\color{#d91a1a}-1.90\%$
test_mod_add[compile] 0.1367ms 48.2519μs 20.7246 KOps/s 21.0344 KOps/s $\color{#d91a1a}-1.47\%$
test_mod_add[compile-overhead] 0.1227ms 47.0711μs 21.2444 KOps/s 21.0278 KOps/s $\color{#35bf28}+1.03\%$
test_mod_wrap[eager] 0.3490ms 0.2255ms 4.4354 KOps/s 4.4899 KOps/s $\color{#d91a1a}-1.21\%$
test_mod_wrap[compile] 0.3121ms 0.2061ms 4.8512 KOps/s 4.7742 KOps/s $\color{#35bf28}+1.61\%$
test_mod_wrap[compile-overhead] 0.4077ms 0.2079ms 4.8106 KOps/s 4.8838 KOps/s $\color{#d91a1a}-1.50\%$
test_mod_wrap_and_backward[eager] 16.1705ms 12.2468ms 81.6542 Ops/s 82.0330 Ops/s $\color{#d91a1a}-0.46\%$
test_mod_wrap_and_backward[compile] 14.2347ms 13.0018ms 76.9125 Ops/s 72.8814 Ops/s $\textbf{\color{#35bf28}+5.53\%}$
test_mod_wrap_and_backward[compile-overhead] 16.0613ms 13.0281ms 76.7571 Ops/s 71.8965 Ops/s $\textbf{\color{#35bf28}+6.76\%}$
test_seq_add[eager] 0.1913ms 0.1147ms 8.7153 KOps/s 8.4881 KOps/s $\color{#35bf28}+2.68\%$
test_seq_add[compile] 0.1398ms 62.4292μs 16.0181 KOps/s 15.9808 KOps/s $\color{#35bf28}+0.23\%$
test_seq_add[compile-overhead] 0.1232ms 61.1392μs 16.3561 KOps/s 16.4510 KOps/s $\color{#d91a1a}-0.58\%$
test_seq_wrap[eager] 0.7458ms 0.4547ms 2.1992 KOps/s 2.0770 KOps/s $\textbf{\color{#35bf28}+5.88\%}$
test_seq_wrap[compile] 0.3475ms 0.2271ms 4.4035 KOps/s 4.3745 KOps/s $\color{#35bf28}+0.66\%$
test_seq_wrap[compile-overhead] 0.3830ms 0.2254ms 4.4375 KOps/s 4.3684 KOps/s $\color{#35bf28}+1.58\%$
test_func_call_runtime[False-eager] 1.0185ms 0.5386ms 1.8567 KOps/s 1.9001 KOps/s $\color{#d91a1a}-2.28\%$
test_func_call_runtime[False-compile] 0.8285ms 0.4243ms 2.3569 KOps/s 2.3695 KOps/s $\color{#d91a1a}-0.53\%$
test_func_call_runtime[False-compile-overhead] 0.7870ms 0.4237ms 2.3599 KOps/s 2.3542 KOps/s $\color{#35bf28}+0.24\%$
test_func_call_runtime[True-eager] 1.6216ms 0.7542ms 1.3258 KOps/s 1.3375 KOps/s $\color{#d91a1a}-0.87\%$
test_func_call_runtime[True-compile] 0.5825ms 0.4634ms 2.1582 KOps/s 2.1351 KOps/s $\color{#35bf28}+1.08\%$
test_func_call_runtime[True-compile-overhead] 0.5856ms 0.4622ms 2.1633 KOps/s 2.1403 KOps/s $\color{#35bf28}+1.08\%$
test_func_call_cm_runtime[False-eager] 0.6431ms 0.5305ms 1.8849 KOps/s 1.9120 KOps/s $\color{#d91a1a}-1.41\%$
test_func_call_cm_runtime[False-compile] 0.7388ms 0.4197ms 2.3825 KOps/s 2.3845 KOps/s $\color{#d91a1a}-0.09\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8783ms 0.4223ms 2.3682 KOps/s 2.3775 KOps/s $\color{#d91a1a}-0.39\%$
test_func_call_cm_runtime[True-eager] 1.2038ms 0.8781ms 1.1388 KOps/s 1.1179 KOps/s $\color{#35bf28}+1.87\%$
test_func_call_cm_runtime[True-compile] 0.9016ms 0.4856ms 2.0593 KOps/s 2.0547 KOps/s $\color{#35bf28}+0.23\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6210ms 0.4879ms 2.0498 KOps/s 2.0061 KOps/s $\color{#35bf28}+2.18\%$
test_vmap_func_call_cm_runtime[eager] 2.8580ms 1.9313ms 517.7922 Ops/s 528.3363 Ops/s $\color{#d91a1a}-2.00\%$
test_vmap_func_call_cm_runtime[compile] 0.9895ms 0.5170ms 1.9341 KOps/s 1.9198 KOps/s $\color{#35bf28}+0.75\%$
test_vmap_func_call_cm_runtime[compile-overhead] 1.0842ms 0.5280ms 1.8939 KOps/s 1.9329 KOps/s $\color{#d91a1a}-2.02\%$
test_distributed 0.3601ms 0.1232ms 8.1157 KOps/s 7.7630 KOps/s $\color{#35bf28}+4.54\%$
test_tdmodule 0.1094ms 25.7210μs 38.8787 KOps/s 36.6700 KOps/s $\textbf{\color{#35bf28}+6.02\%}$
test_tdmodule_dispatch 78.8370μs 48.4898μs 20.6229 KOps/s 20.2295 KOps/s $\color{#35bf28}+1.94\%$
test_tdseq 48.3900μs 28.2696μs 35.3737 KOps/s 34.2147 KOps/s $\color{#35bf28}+3.39\%$
test_tdseq_dispatch 86.7620μs 54.2254μs 18.4415 KOps/s 18.4378 KOps/s $\color{#35bf28}+0.02\%$
test_instantiation_functorch 2.5960ms 1.5173ms 659.0593 Ops/s 667.2560 Ops/s $\color{#d91a1a}-1.23\%$
test_exec_functorch 0.3980ms 0.1756ms 5.6956 KOps/s 5.5660 KOps/s $\color{#35bf28}+2.33\%$
test_exec_functional_call 0.2851ms 0.1687ms 5.9268 KOps/s 5.8303 KOps/s $\color{#35bf28}+1.66\%$
test_exec_td_decorator 0.5106ms 0.2286ms 4.3739 KOps/s 4.2672 KOps/s $\color{#35bf28}+2.50\%$
test_vmap_mlp_speed_decorator[True-True] 2.1458ms 0.6695ms 1.4937 KOps/s 1.5345 KOps/s $\color{#d91a1a}-2.66\%$
test_vmap_mlp_speed_decorator[True-False] 0.8515ms 0.6651ms 1.5034 KOps/s 1.5297 KOps/s $\color{#d91a1a}-1.72\%$
test_vmap_mlp_speed_decorator[False-True] 0.8383ms 0.5422ms 1.8445 KOps/s 1.8985 KOps/s $\color{#d91a1a}-2.85\%$
test_vmap_mlp_speed_decorator[False-False] 0.8648ms 0.5389ms 1.8555 KOps/s 1.9034 KOps/s $\color{#d91a1a}-2.52\%$
test_to_module_speed[True] 1.5534ms 1.3350ms 749.0888 Ops/s 754.1608 Ops/s $\color{#d91a1a}-0.67\%$
test_to_module_speed[False] 1.7269ms 1.2997ms 769.4336 Ops/s 772.9486 Ops/s $\color{#d91a1a}-0.45\%$
test_tc_init 87.8330μs 45.9937μs 21.7421 KOps/s 21.1106 KOps/s $\color{#35bf28}+2.99\%$
test_tc_init_nested 0.2799ms 91.7538μs 10.8987 KOps/s 10.7793 KOps/s $\color{#35bf28}+1.11\%$
test_tc_first_layer_tensor 18.7250μs 1.5080μs 663.1437 KOps/s 632.7027 KOps/s $\color{#35bf28}+4.81\%$
test_tc_first_layer_nontensor 28.4730μs 4.5700μs 218.8185 KOps/s 213.7197 KOps/s $\color{#35bf28}+2.39\%$
test_tc_second_layer_tensor 43.9620μs 2.7990μs 357.2725 KOps/s 338.9781 KOps/s $\textbf{\color{#35bf28}+5.40\%}$
test_tc_second_layer_nontensor 35.1360μs 5.9999μs 166.6708 KOps/s 165.6052 KOps/s $\color{#35bf28}+0.64\%$
test_unbind 0.2434s 13.9177ms 71.8510 Ops/s 74.7209 Ops/s $\color{#d91a1a}-3.84\%$
test_full_like 20.7625ms 15.8492ms 63.0945 Ops/s 79.1131 Ops/s $\textbf{\color{#d91a1a}-20.25\%}$
test_zeros_like 13.7654ms 8.1500ms 122.7001 Ops/s 135.1131 Ops/s $\textbf{\color{#d91a1a}-9.19\%}$
test_ones_like 12.8388ms 8.5374ms 117.1311 Ops/s 135.3900 Ops/s $\textbf{\color{#d91a1a}-13.49\%}$
test_clone 13.4490ms 10.6682ms 93.7368 Ops/s 107.2054 Ops/s $\textbf{\color{#d91a1a}-12.56\%}$
test_squeeze 66.9850μs 12.0372μs 83.0758 KOps/s 85.0955 KOps/s $\color{#d91a1a}-2.37\%$
test_unsqueeze 0.3142ms 90.3824μs 11.0641 KOps/s 10.9628 KOps/s $\color{#35bf28}+0.92\%$
test_split 0.3447ms 0.1875ms 5.3325 KOps/s 5.0888 KOps/s $\color{#35bf28}+4.79\%$
test_permute 0.3125ms 0.1985ms 5.0387 KOps/s 5.0405 KOps/s $\color{#d91a1a}-0.03\%$
test_stack 31.4457ms 28.0184ms 35.6909 Ops/s 39.2796 Ops/s $\textbf{\color{#d91a1a}-9.14\%}$
test_cat 35.0638ms 27.3414ms 36.5745 Ops/s 40.7860 Ops/s $\textbf{\color{#d91a1a}-10.33\%}$
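
For readers checking individual rows, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below rests on that assumption: the formula is inferred from the rows above, not taken from the benchmark bot's source, and `percent_change` is a hypothetical helper name. Both inputs must be in the same unit (for example both in KOps/s). Because the table shows rounded values, a recomputed figure can differ from the printed one in the last digit.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent.

    Assumed formula (inferred from the table, not from the bot's code):
    (ops / ops_head - 1) * 100. Both arguments must share the same unit.
    """
    if ops_head <= 0:
        raise ValueError("ops_head must be positive")
    return (ops / ops_head - 1.0) * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from rows of the CPU table above.
rows = {
    "test_plain_set_nested": (47.5999, 49.5006),  # KOps/s
    "test_keys": (291.7536, 276.7120),            # KOps/s
    "test_lock_nested": (2.2187, 1.7701),         # KOps/s
    "test_full_like": (63.0945, 79.1131),         # Ops/s
}

for name, (ops, head) in rows.items():
    print(f"{name}: {percent_change(ops, head):+.2f}%")
```

A negative value means the PR build completed fewer operations per second than HEAD. Single-run differences of a few percent on shared CI runners are often within noise, so the larger swings are the ones worth re-running before drawing conclusions.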


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}44$. Worsened: $\large\color{#d91a1a}15$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 41.4310μs 10.9992μs 90.9154 KOps/s 75.4151 KOps/s $\textbf{\color{#35bf28}+20.55\%}$
test_plain_set_stack_nested 38.0010μs 11.3136μs 88.3893 KOps/s 73.7871 KOps/s $\textbf{\color{#35bf28}+19.79\%}$
test_plain_set_nested_inplace 58.6710μs 12.2312μs 81.7582 KOps/s 68.7226 KOps/s $\textbf{\color{#35bf28}+18.97\%}$
test_plain_set_stack_nested_inplace 35.5110μs 12.2871μs 81.3864 KOps/s 68.7982 KOps/s $\textbf{\color{#35bf28}+18.30\%}$
test_items 0.1762ms 2.8925μs 345.7197 KOps/s 340.2915 KOps/s $\color{#35bf28}+1.60\%$
test_items_nested 0.5574ms 0.3693ms 2.7075 KOps/s 2.7359 KOps/s $\color{#d91a1a}-1.04\%$
test_items_nested_locked 0.5735ms 0.3721ms 2.6875 KOps/s 2.7706 KOps/s $\color{#d91a1a}-3.00\%$
test_items_nested_leaf 0.2444ms 58.7061μs 17.0340 KOps/s 16.9734 KOps/s $\color{#35bf28}+0.36\%$
test_items_stack_nested 0.5604ms 0.3693ms 2.7075 KOps/s 2.7451 KOps/s $\color{#d91a1a}-1.37\%$
test_items_stack_nested_leaf 0.1568ms 58.4080μs 17.1209 KOps/s 17.0618 KOps/s $\color{#35bf28}+0.35\%$
test_items_stack_nested_locked 0.4421ms 0.3658ms 2.7337 KOps/s 2.7524 KOps/s $\color{#d91a1a}-0.68\%$
test_keys 32.4600μs 3.4643μs 288.6567 KOps/s 289.9186 KOps/s $\color{#d91a1a}-0.44\%$
test_keys_nested 0.1801ms 87.2976μs 11.4551 KOps/s 11.3935 KOps/s $\color{#35bf28}+0.54\%$
test_keys_nested_locked 0.7575ms 93.4704μs 10.6986 KOps/s 10.5761 KOps/s $\color{#35bf28}+1.16\%$
test_keys_nested_leaf 0.1239ms 78.5533μs 12.7302 KOps/s 12.7056 KOps/s $\color{#35bf28}+0.19\%$
test_keys_stack_nested 0.1431ms 88.5967μs 11.2871 KOps/s 11.3242 KOps/s $\color{#d91a1a}-0.33\%$
test_keys_stack_nested_leaf 0.1227ms 80.1693μs 12.4736 KOps/s 12.6800 KOps/s $\color{#d91a1a}-1.63\%$
test_keys_stack_nested_locked 0.1547ms 94.5318μs 10.5785 KOps/s 10.6400 KOps/s $\color{#d91a1a}-0.58\%$
test_values 4.9852μs 0.8489μs 1.1781 MOps/s 1.1794 MOps/s $\color{#d91a1a}-0.11\%$
test_values_nested 64.6110μs 37.7796μs 26.4693 KOps/s 26.2035 KOps/s $\color{#35bf28}+1.01\%$
test_values_nested_locked 66.6310μs 39.1765μs 25.5255 KOps/s 25.2670 KOps/s $\color{#35bf28}+1.02\%$
test_values_nested_leaf 70.6810μs 41.9586μs 23.8330 KOps/s 23.4916 KOps/s $\color{#35bf28}+1.45\%$
test_values_stack_nested 0.1448ms 38.0709μs 26.2668 KOps/s 26.4032 KOps/s $\color{#d91a1a}-0.52\%$
test_values_stack_nested_leaf 75.1710μs 42.2917μs 23.6453 KOps/s 23.6554 KOps/s $\color{#d91a1a}-0.04\%$
test_values_stack_nested_locked 78.3920μs 39.7936μs 25.1297 KOps/s 25.2648 KOps/s $\color{#d91a1a}-0.53\%$
test_membership 2.7406μs 0.5084μs 1.9670 MOps/s 1.9722 MOps/s $\color{#d91a1a}-0.26\%$
test_membership_nested 17.1700μs 1.9873μs 503.1910 KOps/s 501.9855 KOps/s $\color{#35bf28}+0.24\%$
test_membership_nested_leaf 19.1855μs 1.9831μs 504.2615 KOps/s 504.3797 KOps/s $\color{#d91a1a}-0.02\%$
test_membership_stacked_nested 37.8400μs 2.0777μs 481.3091 KOps/s 479.2067 KOps/s $\color{#35bf28}+0.44\%$
test_membership_stacked_nested_leaf 69.3510μs 2.0555μs 486.4967 KOps/s 485.8151 KOps/s $\color{#35bf28}+0.14\%$
test_membership_nested_last 41.2010μs 3.0811μs 324.5561 KOps/s 322.5163 KOps/s $\color{#35bf28}+0.63\%$
test_membership_nested_leaf_last 38.9610μs 3.1176μs 320.7583 KOps/s 320.3574 KOps/s $\color{#35bf28}+0.13\%$
test_membership_stacked_nested_last 43.5310μs 5.9595μs 167.7996 KOps/s 319.9951 KOps/s $\textbf{\color{#d91a1a}-47.56\%}$
test_membership_stacked_nested_leaf_last 41.1100μs 5.9093μs 169.2249 KOps/s 323.7048 KOps/s $\textbf{\color{#d91a1a}-47.72\%}$
test_nested_getleaf 42.9200μs 6.0842μs 164.3610 KOps/s 164.2049 KOps/s $\color{#35bf28}+0.10\%$
test_nested_get 37.9210μs 5.8463μs 171.0480 KOps/s 171.9833 KOps/s $\color{#d91a1a}-0.54\%$
test_stacked_getleaf 73.6610μs 6.1108μs 163.6445 KOps/s 163.3511 KOps/s $\color{#35bf28}+0.18\%$
test_stacked_get 39.1100μs 5.8326μs 171.4505 KOps/s 172.2528 KOps/s $\color{#d91a1a}-0.47\%$
test_nested_getitemleaf 33.4500μs 6.4230μs 155.6894 KOps/s 153.5065 KOps/s $\color{#35bf28}+1.42\%$
test_nested_getitem 62.4110μs 6.1041μs 163.8237 KOps/s 161.2509 KOps/s $\color{#35bf28}+1.60\%$
test_stacked_getitemleaf 30.1700μs 6.4137μs 155.9159 KOps/s 156.6211 KOps/s $\color{#d91a1a}-0.45\%$
test_stacked_getitem 53.3010μs 6.1195μs 163.4118 KOps/s 164.0252 KOps/s $\color{#d91a1a}-0.37\%$
test_lock_nested 9.0682ms 0.3802ms 2.6305 KOps/s 2.6194 KOps/s $\color{#35bf28}+0.42\%$
test_lock_stack_nested 0.4460ms 0.3393ms 2.9470 KOps/s 2.8901 KOps/s $\color{#35bf28}+1.97\%$
test_unlock_nested 0.6221ms 0.3132ms 3.1930 KOps/s 3.1609 KOps/s $\color{#35bf28}+1.01\%$
test_unlock_stack_nested 0.3336ms 0.2781ms 3.5954 KOps/s 3.5070 KOps/s $\color{#35bf28}+2.52\%$
test_flatten_speed 0.1206ms 74.6966μs 13.3875 KOps/s 13.2455 KOps/s $\color{#35bf28}+1.07\%$
test_unflatten_speed 0.5005ms 0.3170ms 3.1545 KOps/s 3.1428 KOps/s $\color{#35bf28}+0.37\%$
test_common_ops 1.6711ms 0.5643ms 1.7721 KOps/s 1.5141 KOps/s $\textbf{\color{#35bf28}+17.04\%}$
test_creation 0.1697ms 1.7410μs 574.3782 KOps/s 568.7501 KOps/s $\color{#35bf28}+0.99\%$
test_creation_empty 41.1310μs 6.6196μs 151.0656 KOps/s 96.8788 KOps/s $\textbf{\color{#35bf28}+55.93\%}$
test_creation_nested_1 32.2800μs 8.1986μs 121.9727 KOps/s 81.7785 KOps/s $\textbf{\color{#35bf28}+49.15\%}$
test_creation_nested_2 36.1210μs 11.0945μs 90.1350 KOps/s 67.4625 KOps/s $\textbf{\color{#35bf28}+33.61\%}$
test_clone 0.1701ms 10.0265μs 99.7361 KOps/s 99.0541 KOps/s $\color{#35bf28}+0.69\%$
test_getitem[int] 1.3798ms 10.6954μs 93.4978 KOps/s 93.5741 KOps/s $\color{#d91a1a}-0.08\%$
test_getitem[slice_int] 0.2038ms 20.7319μs 48.2349 KOps/s 48.8096 KOps/s $\color{#d91a1a}-1.18\%$
test_getitem[range] 0.1460ms 36.2525μs 27.5843 KOps/s 26.9204 KOps/s $\color{#35bf28}+2.47\%$
test_getitem[tuple] 0.1096ms 18.0826μs 55.3016 KOps/s 54.8481 KOps/s $\color{#35bf28}+0.83\%$
test_getitem[list] 0.2204ms 32.6562μs 30.6221 KOps/s 30.2562 KOps/s $\color{#35bf28}+1.21\%$
test_setitem_dim[int] 28.1010μs 18.8203μs 53.1340 KOps/s 51.5494 KOps/s $\color{#35bf28}+3.07\%$
test_setitem_dim[slice_int] 59.3610μs 37.5900μs 26.6028 KOps/s 25.3518 KOps/s $\color{#35bf28}+4.93\%$
test_setitem_dim[range] 0.1451ms 52.4939μs 19.0498 KOps/s 18.0261 KOps/s $\textbf{\color{#35bf28}+5.68\%}$
test_setitem_dim[tuple] 67.7010μs 31.1358μs 32.1174 KOps/s 29.9632 KOps/s $\textbf{\color{#35bf28}+7.19\%}$
test_setitem 99.3410μs 13.3235μs 75.0554 KOps/s 58.8323 KOps/s $\textbf{\color{#35bf28}+27.58\%}$
test_set 0.1289ms 12.9149μs 77.4301 KOps/s 61.3566 KOps/s $\textbf{\color{#35bf28}+26.20\%}$
test_set_shared 1.4161ms 0.1508ms 6.6315 KOps/s 6.5481 KOps/s $\color{#35bf28}+1.27\%$
test_update 0.8238ms 14.9109μs 67.0650 KOps/s 51.4220 KOps/s $\textbf{\color{#35bf28}+30.42\%}$
test_update_nested 0.1596ms 20.2695μs 49.3353 KOps/s 40.0178 KOps/s $\textbf{\color{#35bf28}+23.28\%}$
test_update__nested 1.2059ms 25.0171μs 39.9726 KOps/s 40.7990 KOps/s $\color{#d91a1a}-2.03\%$
test_set_nested 81.7210μs 14.1102μs 70.8705 KOps/s 61.0158 KOps/s $\textbf{\color{#35bf28}+16.15\%}$
test_set_nested_new 83.1920μs 16.8220μs 59.4461 KOps/s 53.4716 KOps/s $\textbf{\color{#35bf28}+11.17\%}$
test_select 0.1183ms 28.2974μs 35.3389 KOps/s 32.4458 KOps/s $\textbf{\color{#35bf28}+8.92\%}$
test_select_nested 75.1110μs 44.1641μs 22.6428 KOps/s 22.6823 KOps/s $\color{#d91a1a}-0.17\%$
test_exclude_nested 96.6410μs 64.3323μs 15.5443 KOps/s 15.8307 KOps/s $\color{#d91a1a}-1.81\%$
test_empty[True] 0.3323ms 0.2981ms 3.3544 KOps/s 3.3630 KOps/s $\color{#d91a1a}-0.25\%$
test_empty[False] 3.5270μs 0.8223μs 1.2162 MOps/s 1.2080 MOps/s $\color{#35bf28}+0.68\%$
test_to 86.4410μs 56.3966μs 17.7316 KOps/s 17.3436 KOps/s $\color{#35bf28}+2.24\%$
test_to_nonblocking 0.1966ms 48.0851μs 20.7964 KOps/s 20.9050 KOps/s $\color{#d91a1a}-0.52\%$
test_unbind_speed 1.7699ms 0.2332ms 4.2877 KOps/s 4.2746 KOps/s $\color{#35bf28}+0.31\%$
test_unbind_speed_stack0 0.3026ms 0.2341ms 4.2711 KOps/s 4.2339 KOps/s $\color{#35bf28}+0.88\%$
test_unbind_speed_stack1 92.5435ms 0.6601ms 1.5149 KOps/s 1.4998 KOps/s $\color{#35bf28}+1.01\%$
test_split 93.2348ms 1.5874ms 629.9486 Ops/s 585.6014 Ops/s $\textbf{\color{#35bf28}+7.57\%}$
test_chunk 95.4882ms 1.5921ms 628.0959 Ops/s 628.2792 Ops/s $\color{#d91a1a}-0.03\%$
test_consolidate[False-None] 96.1129ms 2.8978ms 345.0897 Ops/s 372.5370 Ops/s $\textbf{\color{#d91a1a}-7.37\%}$
test_consolidate[default-None] 2.1082ms 1.6783ms 595.8288 Ops/s 599.4027 Ops/s $\color{#d91a1a}-0.60\%$
test_consolidate[reduce-overhead-None] 2.1087ms 1.6973ms 589.1618 Ops/s 585.4651 Ops/s $\color{#35bf28}+0.63\%$
test_consolidate_njt[False-None] 7.0363ms 6.4948ms 153.9685 Ops/s 154.1966 Ops/s $\color{#d91a1a}-0.15\%$
test_to[False-False-None] 1.8711ms 1.7061ms 586.1337 Ops/s 578.4900 Ops/s $\color{#35bf28}+1.32\%$
test_to[True-False-None] 1.5499ms 1.3076ms 764.7463 Ops/s 785.1028 Ops/s $\color{#d91a1a}-2.59\%$
test_to[within-False-None] 4.3476ms 4.0976ms 244.0466 Ops/s 248.2087 Ops/s $\color{#d91a1a}-1.68\%$
test_to[True-default-None] 5.6986ms 5.3530ms 186.8103 Ops/s 193.0260 Ops/s $\color{#d91a1a}-3.22\%$
test_to_njt[False-False-None] 7.0204ms 6.8539ms 145.9025 Ops/s 146.3887 Ops/s $\color{#d91a1a}-0.33\%$
test_to_njt[True-False-None] 5.8694ms 5.4891ms 182.1800 Ops/s 182.6964 Ops/s $\color{#d91a1a}-0.28\%$
test_to_njt[within-False-None] 12.3338ms 12.1806ms 82.0977 Ops/s 82.5426 Ops/s $\color{#d91a1a}-0.54\%$
test_creation[device0] 0.3729ms 79.4100μs 12.5929 KOps/s 12.6256 KOps/s $\color{#d91a1a}-0.26\%$
test_creation_from_tensor 0.4928ms 82.2974μs 12.1511 KOps/s 11.9948 KOps/s $\color{#35bf28}+1.30\%$
test_add_one[memmap_tensor0] 0.4220ms 6.1668μs 162.1583 KOps/s 162.7507 KOps/s $\color{#d91a1a}-0.36\%$
test_contiguous[memmap_tensor0] 2.9950μs 0.4191μs 2.3861 MOps/s 2.4382 MOps/s $\color{#d91a1a}-2.14\%$
test_stack[memmap_tensor0] 28.7800μs 4.2193μs 237.0055 KOps/s 235.3079 KOps/s $\color{#35bf28}+0.72\%$
test_memmaptd_index 1.7260ms 0.2510ms 3.9840 KOps/s 3.9295 KOps/s $\color{#35bf28}+1.39\%$
test_memmaptd_index_astensor 0.9257ms 0.3176ms 3.1487 KOps/s 3.1786 KOps/s $\color{#d91a1a}-0.94\%$
test_memmaptd_index_op 1.0141ms 0.5470ms 1.8281 KOps/s 1.6269 KOps/s $\textbf{\color{#35bf28}+12.37\%}$
test_serialize_model 0.1313s 0.1304s 7.6659 Ops/s 7.5979 Ops/s $\color{#35bf28}+0.89\%$
test_serialize_model_pickle 1.3487s 1.1892s 0.8409 Ops/s 0.8234 Ops/s $\color{#35bf28}+2.13\%$
test_serialize_weights 0.1315s 0.1302s 7.6778 Ops/s 7.6223 Ops/s $\color{#35bf28}+0.73\%$
test_serialize_weights_returnearly 0.4409s 68.0247ms 14.7005 Ops/s 23.6801 Ops/s $\textbf{\color{#d91a1a}-37.92\%}$
test_serialize_weights_pickle 1.3778s 1.1978s 0.8349 Ops/s 0.8202 Ops/s $\color{#35bf28}+1.79\%$
test_reshape_pytree 73.4410μs 22.0602μs 45.3305 KOps/s 44.9672 KOps/s $\color{#35bf28}+0.81\%$
test_reshape_td 65.1510μs 28.7273μs 34.8101 KOps/s 36.0503 KOps/s $\color{#d91a1a}-3.44\%$
test_view_pytree 82.4110μs 23.4057μs 42.7246 KOps/s 45.6447 KOps/s $\textbf{\color{#d91a1a}-6.40\%}$
test_view_td 0.1843ms 33.2403μs 30.0840 KOps/s 30.8309 KOps/s $\color{#d91a1a}-2.42\%$
test_unbind_pytree 0.1877ms 28.1007μs 35.5863 KOps/s 36.2503 KOps/s $\color{#d91a1a}-1.83\%$
test_unbind_td 0.8355ms 35.7880μs 27.9423 KOps/s 27.6622 KOps/s $\color{#35bf28}+1.01\%$
test_split_pytree 68.1910μs 30.3431μs 32.9565 KOps/s 33.4827 KOps/s $\color{#d91a1a}-1.57\%$
test_split_td 1.0628ms 42.1616μs 23.7182 KOps/s 25.3323 KOps/s $\textbf{\color{#d91a1a}-6.37\%}$
test_add_pytree 0.1628ms 34.2571μs 29.1910 KOps/s 30.1979 KOps/s $\color{#d91a1a}-3.33\%$
test_add_td 0.2052ms 46.3508μs 21.5746 KOps/s 19.4523 KOps/s $\textbf{\color{#35bf28}+10.91\%}$
test_compile_add_one_nested[tensordict-compile] 0.2377ms 0.1193ms 8.3834 KOps/s 7.6477 KOps/s $\textbf{\color{#35bf28}+9.62\%}$
test_compile_add_one_nested[tensordict-eager] 0.2803ms 0.1327ms 7.5361 KOps/s 7.2824 KOps/s $\color{#35bf28}+3.48\%$
test_compile_add_one_nested[pytree-compile] 0.1451ms 94.2275μs 10.6126 KOps/s 10.2550 KOps/s $\color{#35bf28}+3.49\%$
test_compile_add_one_nested[pytree-eager] 1.7098ms 0.1476ms 6.7769 KOps/s 6.2800 KOps/s $\textbf{\color{#35bf28}+7.91\%}$
test_compile_copy_nested[tensordict-compile] 0.1569ms 22.6662μs 44.1185 KOps/s 41.9671 KOps/s $\textbf{\color{#35bf28}+5.13\%}$
test_compile_copy_nested[tensordict-eager] 82.3820μs 29.4154μs 33.9958 KOps/s 34.1579 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_copy_nested[pytree-compile] 0.4608ms 64.4105μs 15.5254 KOps/s 15.6496 KOps/s $\color{#d91a1a}-0.79\%$
test_compile_copy_nested[pytree-eager] 94.8010μs 49.3518μs 20.2627 KOps/s 20.3449 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_add_one_flat[tensordict-compile] 0.2008ms 0.1415ms 7.0659 KOps/s 7.0422 KOps/s $\color{#35bf28}+0.34\%$
test_compile_add_one_flat[tensordict-eager] 0.3148ms 0.2192ms 4.5618 KOps/s 4.5528 KOps/s $\color{#35bf28}+0.20\%$
test_compile_add_one_flat[tensorclass-compile] 0.2568ms 96.7418μs 10.3368 KOps/s 9.7367 KOps/s $\textbf{\color{#35bf28}+6.16\%}$
test_compile_add_one_flat[tensorclass-eager] 0.2050ms 55.6310μs 17.9756 KOps/s 16.9781 KOps/s $\textbf{\color{#35bf28}+5.88\%}$
test_compile_add_one_flat[pytree-compile] 0.2870ms 0.1346ms 7.4303 KOps/s 7.3241 KOps/s $\color{#35bf28}+1.45\%$
test_compile_add_one_flat[pytree-eager] 0.6156ms 0.4691ms 2.1318 KOps/s 1.9558 KOps/s $\textbf{\color{#35bf28}+9.00\%}$
test_compile_add_self_flat[tensordict-eager] 0.4581ms 0.2614ms 3.8249 KOps/s 3.7794 KOps/s $\color{#35bf28}+1.20\%$
test_compile_add_self_flat[tensordict-compile] 0.2869ms 0.1434ms 6.9753 KOps/s 6.8557 KOps/s $\color{#35bf28}+1.75\%$
test_compile_add_self_flat[tensorclass-eager] 0.2172ms 66.5657μs 15.0228 KOps/s 14.1344 KOps/s $\textbf{\color{#35bf28}+6.29\%}$
test_compile_add_self_flat[tensorclass-compile] 0.1349ms 97.9194μs 10.2125 KOps/s 9.9040 KOps/s $\color{#35bf28}+3.12\%$
test_compile_add_self_flat[pytree-eager] 0.5471ms 0.3984ms 2.5098 KOps/s 2.4301 KOps/s $\color{#35bf28}+3.28\%$
test_compile_add_self_flat[pytree-compile] 0.2575ms 0.1358ms 7.3639 KOps/s 7.1277 KOps/s $\color{#35bf28}+3.31\%$
test_compile_copy_flat[tensordict-compile] 0.1370ms 19.7696μs 50.5827 KOps/s 55.2607 KOps/s $\textbf{\color{#d91a1a}-8.47\%}$
test_compile_copy_flat[tensordict-eager] 67.1510μs 31.1572μs 32.0953 KOps/s 32.0279 KOps/s $\color{#35bf28}+0.21\%$
test_compile_copy_flat[pytree-compile] 0.1097ms 71.0372μs 14.0771 KOps/s 14.2830 KOps/s $\color{#d91a1a}-1.44\%$
test_compile_copy_flat[pytree-eager] 0.1768ms 51.3972μs 19.4563 KOps/s 19.3539 KOps/s $\color{#35bf28}+0.53\%$
test_compile_assign_and_add[tensordict-compile] 1.6044ms 0.3867ms 2.5858 KOps/s 2.2636 KOps/s $\textbf{\color{#35bf28}+14.24\%}$
test_compile_assign_and_add[tensordict-eager] 2.7781ms 2.5943ms 385.4648 Ops/s 386.0107 Ops/s $\color{#d91a1a}-0.14\%$
test_compile_assign_and_add[pytree-compile] 1.5486ms 0.3717ms 2.6901 KOps/s 2.2651 KOps/s $\textbf{\color{#35bf28}+18.76\%}$
test_compile_assign_and_add[pytree-eager] 2.7760ms 2.5849ms 386.8608 Ops/s 387.5334 Ops/s $\color{#d91a1a}-0.17\%$
test_compile_indexing[tensor-tensordict-compile] 0.5498ms 0.1169ms 8.5569 KOps/s 8.5266 KOps/s $\color{#35bf28}+0.36\%$
test_compile_indexing[tensor-tensordict-eager] 0.5672ms 79.1985μs 12.6265 KOps/s 11.9905 KOps/s $\textbf{\color{#35bf28}+5.30\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.6413ms 0.1056ms 9.4714 KOps/s 9.2425 KOps/s $\color{#35bf28}+2.48\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2474ms 70.7042μs 14.1434 KOps/s 13.9386 KOps/s $\color{#35bf28}+1.47\%$
test_compile_indexing[tensor-pytree-compile] 0.2859ms 0.1118ms 8.9455 KOps/s 9.1241 KOps/s $\color{#d91a1a}-1.96\%$
test_compile_indexing[tensor-pytree-eager] 0.2427ms 70.3664μs 14.2113 KOps/s 14.0967 KOps/s $\color{#35bf28}+0.81\%$
test_compile_indexing[slice-tensordict-compile] 0.2674ms 99.5632μs 10.0439 KOps/s 9.6402 KOps/s $\color{#35bf28}+4.19\%$
test_compile_indexing[slice-tensordict-eager] 0.1696ms 17.2115μs 58.1007 KOps/s 49.3295 KOps/s $\textbf{\color{#35bf28}+17.78\%}$
test_compile_indexing[slice-tensorclass-compile] 0.2219ms 96.9083μs 10.3190 KOps/s 10.2574 KOps/s $\color{#35bf28}+0.60\%$
test_compile_indexing[slice-tensorclass-eager] 0.1196ms 15.7886μs 63.3370 KOps/s 63.9805 KOps/s $\color{#d91a1a}-1.01\%$
test_compile_indexing[slice-pytree-compile] 0.2319ms 94.5386μs 10.5777 KOps/s 9.9028 KOps/s $\textbf{\color{#35bf28}+6.81\%}$
test_compile_indexing[slice-pytree-eager] 54.4610μs 15.6836μs 63.7610 KOps/s 63.7910 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_indexing[int-tensordict-compile] 0.2412ms 0.1000ms 9.9978 KOps/s 9.6518 KOps/s $\color{#35bf28}+3.59\%$
test_compile_indexing[int-tensordict-eager] 0.5671ms 16.8956μs 59.1869 KOps/s 57.3021 KOps/s $\color{#35bf28}+3.29\%$
test_compile_indexing[int-tensorclass-compile] 0.1575ms 95.3410μs 10.4887 KOps/s 10.0033 KOps/s $\color{#35bf28}+4.85\%$
test_compile_indexing[int-tensorclass-eager] 0.1652ms 15.6489μs 63.9023 KOps/s 63.4658 KOps/s $\color{#35bf28}+0.69\%$
test_compile_indexing[int-pytree-compile] 0.1942ms 94.3798μs 10.5955 KOps/s 10.3115 KOps/s $\color{#35bf28}+2.75\%$
test_compile_indexing[int-pytree-eager] 0.1634ms 15.9662μs 62.6322 KOps/s 63.4240 KOps/s $\color{#d91a1a}-1.25\%$
test_mod_add[eager] 0.1775ms 36.5889μs 27.3307 KOps/s 25.3299 KOps/s $\textbf{\color{#35bf28}+7.90\%}$
test_mod_add[compile] 0.1493ms 81.6171μs 12.2523 KOps/s 12.4111 KOps/s $\color{#d91a1a}-1.28\%$
test_mod_add[compile-overhead] 0.3265ms 0.1662ms 6.0162 KOps/s 5.7808 KOps/s $\color{#35bf28}+4.07\%$
test_mod_wrap[eager] 0.3302ms 0.2576ms 3.8819 KOps/s 3.9788 KOps/s $\color{#d91a1a}-2.43\%$
test_mod_wrap[compile] 0.3686ms 0.2906ms 3.4412 KOps/s 3.4041 KOps/s $\color{#35bf28}+1.09\%$
test_mod_wrap[compile-overhead] 7.3998ms 3.7227ms 268.6230 Ops/s 273.2742 Ops/s $\color{#d91a1a}-1.70\%$
test_mod_wrap_and_backward[eager] 1.5333ms 1.3527ms 739.2779 Ops/s 700.6771 Ops/s $\textbf{\color{#35bf28}+5.51\%}$
test_mod_wrap_and_backward[compile] 1.8437ms 1.3707ms 729.5306 Ops/s 787.7796 Ops/s $\textbf{\color{#d91a1a}-7.39\%}$
test_mod_wrap_and_backward[compile-overhead] 1.5006ms 1.0221ms 978.3997 Ops/s 1.0873 KOps/s $\textbf{\color{#d91a1a}-10.01\%}$
test_seq_add[eager] 0.2623ms 0.1130ms 8.8525 KOps/s 8.3688 KOps/s $\textbf{\color{#35bf28}+5.78\%}$
test_seq_add[compile] 0.3277ms 89.4615μs 11.1780 KOps/s 11.4572 KOps/s $\color{#d91a1a}-2.44\%$
test_seq_add[compile-overhead] 0.2855ms 0.1278ms 7.8245 KOps/s 7.7804 KOps/s $\color{#35bf28}+0.57\%$
test_seq_wrap[eager] 0.5394ms 0.4065ms 2.4601 KOps/s 2.3244 KOps/s $\textbf{\color{#35bf28}+5.84\%}$
test_seq_wrap[compile] 0.4441ms 0.2987ms 3.3476 KOps/s 3.3167 KOps/s $\color{#35bf28}+0.93\%$
test_seq_wrap[compile-overhead] 0.3103ms 0.2215ms 4.5154 KOps/s 4.4720 KOps/s $\color{#35bf28}+0.97\%$
test_func_call_runtime[False-eager] 0.8590ms 0.7106ms 1.4073 KOps/s 1.3746 KOps/s $\color{#35bf28}+2.38\%$
test_func_call_runtime[False-compile] 0.8399ms 0.7382ms 1.3547 KOps/s 1.3455 KOps/s $\color{#35bf28}+0.69\%$
test_func_call_runtime[False-compile-overhead] 0.4812ms 0.3578ms 2.7952 KOps/s 2.7760 KOps/s $\color{#35bf28}+0.69\%$
test_func_call_runtime[True-eager] 0.9560ms 0.8806ms 1.1356 KOps/s 1.1057 KOps/s $\color{#35bf28}+2.70\%$
test_func_call_runtime[True-compile] 0.8600ms 0.7594ms 1.3168 KOps/s 1.3182 KOps/s $\color{#d91a1a}-0.11\%$
test_func_call_runtime[True-compile-overhead] 0.4953ms 0.3795ms 2.6347 KOps/s 2.6247 KOps/s $\color{#35bf28}+0.38\%$
test_func_call_cm_runtime[False-eager] 0.8711ms 0.7089ms 1.4107 KOps/s 1.3717 KOps/s $\color{#35bf28}+2.84\%$
test_func_call_cm_runtime[False-compile] 0.8942ms 0.7448ms 1.3427 KOps/s 1.2772 KOps/s $\textbf{\color{#35bf28}+5.13\%}$
test_func_call_cm_runtime[False-compile-overhead] 0.5079ms 0.3596ms 2.7810 KOps/s 2.7505 KOps/s $\color{#35bf28}+1.11\%$
test_func_call_cm_runtime[True-eager] 1.1304ms 0.9795ms 1.0209 KOps/s 996.5943 Ops/s $\color{#35bf28}+2.44\%$
test_func_call_cm_runtime[True-compile] 0.9364ms 0.7896ms 1.2665 KOps/s 1.2126 KOps/s $\color{#35bf28}+4.45\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4762ms 0.4066ms 2.4595 KOps/s 2.3759 KOps/s $\color{#35bf28}+3.52\%$
test_vmap_func_call_cm_runtime[eager] 2.5162ms 2.0278ms 493.1444 Ops/s 484.4594 Ops/s $\color{#35bf28}+1.79\%$
test_vmap_func_call_cm_runtime[compile] 1.2808ms 0.8520ms 1.1737 KOps/s 1.2181 KOps/s $\color{#d91a1a}-3.65\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5248ms 0.4099ms 2.4399 KOps/s 2.4432 KOps/s $\color{#d91a1a}-0.13\%$
test_distributed 5.7550ms 0.2926ms 3.4181 KOps/s 7.8904 KOps/s $\textbf{\color{#d91a1a}-56.68\%}$
test_tdmodule 0.1227ms 19.4413μs 51.4368 KOps/s 47.0504 KOps/s $\textbf{\color{#35bf28}+9.32\%}$
test_tdmodule_dispatch 72.4410μs 34.5589μs 28.9361 KOps/s 25.7622 KOps/s $\textbf{\color{#35bf28}+12.32\%}$
test_tdseq 33.9110μs 19.5684μs 51.1027 KOps/s 44.0314 KOps/s $\textbf{\color{#35bf28}+16.06\%}$
test_tdseq_dispatch 57.4410μs 36.3246μs 27.5295 KOps/s 23.2164 KOps/s $\textbf{\color{#35bf28}+18.58\%}$
test_instantiation_functorch 1.6350ms 1.5503ms 645.0447 Ops/s 638.9040 Ops/s $\color{#35bf28}+0.96\%$
test_exec_functorch 0.2042ms 0.1398ms 7.1552 KOps/s 7.2119 KOps/s $\color{#d91a1a}-0.79\%$
test_exec_functional_call 0.2539ms 0.1291ms 7.7434 KOps/s 7.7231 KOps/s $\color{#35bf28}+0.26\%$
test_exec_td_decorator 0.3663ms 0.1775ms 5.6342 KOps/s 5.5852 KOps/s $\color{#35bf28}+0.88\%$
test_vmap_mlp_speed_decorator[True-True] 0.8076ms 0.6685ms 1.4959 KOps/s 1.4746 KOps/s $\color{#35bf28}+1.44\%$
test_vmap_mlp_speed_decorator[True-False] 0.8501ms 0.6679ms 1.4973 KOps/s 1.4728 KOps/s $\color{#35bf28}+1.66\%$
test_vmap_mlp_speed_decorator[False-True] 0.7269ms 0.5816ms 1.7193 KOps/s 1.6376 KOps/s $\color{#35bf28}+4.99\%$
test_vmap_mlp_speed_decorator[False-False] 0.7266ms 0.5825ms 1.7167 KOps/s 1.6333 KOps/s $\textbf{\color{#35bf28}+5.10\%}$
test_vmap_transformer_speed_decorator[True-True] 18.7432ms 18.6491ms 53.6220 Ops/s 53.1729 Ops/s $\color{#35bf28}+0.84\%$
test_vmap_transformer_speed_decorator[True-False] 18.9432ms 18.7436ms 53.3516 Ops/s 53.1372 Ops/s $\color{#35bf28}+0.40\%$
test_vmap_transformer_speed_decorator[False-True] 18.8008ms 18.6041ms 53.7516 Ops/s 53.4002 Ops/s $\color{#35bf28}+0.66\%$
test_vmap_transformer_speed_decorator[False-False] 18.6392ms 18.5735ms 53.8401 Ops/s 53.4868 Ops/s $\color{#35bf28}+0.66\%$
test_to_module_speed[True] 1.1066ms 0.9856ms 1.0146 KOps/s 1.0157 KOps/s $\color{#d91a1a}-0.11\%$
test_to_module_speed[False] 1.4400ms 0.9613ms 1.0403 KOps/s 1.0510 KOps/s $\color{#d91a1a}-1.02\%$
test_tc_init 0.1138ms 35.9557μs 27.8120 KOps/s 26.4954 KOps/s $\color{#35bf28}+4.97\%$
test_tc_init_nested 0.3063ms 71.6448μs 13.9577 KOps/s 12.8847 KOps/s $\textbf{\color{#35bf28}+8.33\%}$
test_tc_first_layer_tensor 20.9100μs 0.8067μs 1.2396 MOps/s 1.4661 MOps/s $\textbf{\color{#d91a1a}-15.45\%}$
test_tc_first_layer_nontensor 22.3900μs 2.2694μs 440.6385 KOps/s 443.7174 KOps/s $\color{#d91a1a}-0.69\%$
test_tc_second_layer_tensor 32.2352μs 1.3848μs 722.1416 KOps/s 696.1258 KOps/s $\color{#35bf28}+3.74\%$
test_tc_second_layer_nontensor 26.6810μs 3.0132μs 331.8716 KOps/s 331.3166 KOps/s $\color{#35bf28}+0.17\%$
test_unbind 0.2186s 9.8724ms 101.2930 Ops/s 142.7024 Ops/s $\textbf{\color{#d91a1a}-29.02\%}$
test_full_like 12.0812ms 9.2154ms 108.5135 Ops/s 106.7814 Ops/s $\color{#35bf28}+1.62\%$
test_zeros_like 9.3354ms 7.2564ms 137.8091 Ops/s 115.0167 Ops/s $\textbf{\color{#35bf28}+19.82\%}$
test_ones_like 5.2037ms 4.3280ms 231.0520 Ops/s 232.3958 Ops/s $\color{#d91a1a}-0.58\%$
test_clone 11.2816ms 8.9988ms 111.1255 Ops/s 158.8522 Ops/s $\textbf{\color{#d91a1a}-30.04\%}$
test_squeeze 0.1640ms 9.7360μs 102.7121 KOps/s 106.2243 KOps/s $\color{#d91a1a}-3.31\%$
test_unsqueeze 0.2258ms 76.4581μs 13.0791 KOps/s 13.6609 KOps/s $\color{#d91a1a}-4.26\%$
test_split 0.3699ms 0.1594ms 6.2750 KOps/s 6.2141 KOps/s $\color{#35bf28}+0.98\%$
test_permute 0.2996ms 0.1845ms 5.4187 KOps/s 5.7579 KOps/s $\textbf{\color{#d91a1a}-5.89\%}$
test_stack 53.3321ms 53.0161ms 18.8622 Ops/s 19.5305 Ops/s $\color{#d91a1a}-3.42\%$
test_cat 53.2035ms 52.8701ms 18.9143 Ops/s 19.9802 Ops/s $\textbf{\color{#d91a1a}-5.33\%}$
