[Feature] cat and stack_from_tensordict #1018

Merged 1 commit on Oct 1, 2024

vmoens (Contributor) commented Oct 1, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: cca23e89c8526b19b4389d15cf9c4e36a151ac15
Pull Request resolved: #1018
facebook-github-bot added the CLA Signed label Oct 1, 2024

github-actions bot commented Oct 1, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}25$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 48.7010μs 21.4302μs 46.6630 KOps/s 49.7184 KOps/s $\textbf{\color{#d91a1a}-6.15\%}$
test_plain_set_stack_nested 61.8550μs 21.7859μs 45.9012 KOps/s 50.2940 KOps/s $\textbf{\color{#d91a1a}-8.73\%}$
test_plain_set_nested_inplace 89.7980μs 23.3583μs 42.8113 KOps/s 46.3498 KOps/s $\textbf{\color{#d91a1a}-7.63\%}$
test_plain_set_stack_nested_inplace 0.3046ms 23.5510μs 42.4611 KOps/s 46.8031 KOps/s $\textbf{\color{#d91a1a}-9.28\%}$
test_items 87.4340μs 4.1342μs 241.8861 KOps/s 239.6965 KOps/s $\color{#35bf28}+0.91\%$
test_items_nested 0.4663ms 0.3684ms 2.7141 KOps/s 2.7593 KOps/s $\color{#d91a1a}-1.64\%$
test_items_nested_locked 0.5793ms 0.3680ms 2.7175 KOps/s 2.7585 KOps/s $\color{#d91a1a}-1.49\%$
test_items_nested_leaf 0.1409ms 69.6933μs 14.3486 KOps/s 14.5708 KOps/s $\color{#d91a1a}-1.53\%$
test_items_stack_nested 1.5018ms 0.3834ms 2.6083 KOps/s 2.7453 KOps/s $\color{#d91a1a}-4.99\%$
test_items_stack_nested_leaf 0.1451ms 70.8488μs 14.1146 KOps/s 14.0586 KOps/s $\color{#35bf28}+0.40\%$
test_items_stack_nested_locked 0.4547ms 0.3711ms 2.6950 KOps/s 2.7730 KOps/s $\color{#d91a1a}-2.81\%$
test_keys 27.3620μs 3.7780μs 264.6875 KOps/s 261.3771 KOps/s $\color{#35bf28}+1.27\%$
test_keys_nested 0.1786ms 0.1000ms 9.9967 KOps/s 10.1663 KOps/s $\color{#d91a1a}-1.67\%$
test_keys_nested_locked 0.6898ms 0.1049ms 9.5297 KOps/s 9.5916 KOps/s $\color{#d91a1a}-0.65\%$
test_keys_nested_leaf 0.4949ms 82.6317μs 12.1019 KOps/s 12.2854 KOps/s $\color{#d91a1a}-1.49\%$
test_keys_stack_nested 0.1878ms 99.6008μs 10.0401 KOps/s 10.0122 KOps/s $\color{#35bf28}+0.28\%$
test_keys_stack_nested_leaf 0.1514ms 83.0774μs 12.0370 KOps/s 12.0390 KOps/s $\color{#d91a1a}-0.02\%$
test_keys_stack_nested_locked 0.1942ms 0.1044ms 9.5798 KOps/s 9.5666 KOps/s $\color{#35bf28}+0.14\%$
test_values 12.2368μs 1.0300μs 970.8606 KOps/s 942.6081 KOps/s $\color{#35bf28}+3.00\%$
test_values_nested 0.1381ms 75.4346μs 13.2565 KOps/s 13.7625 KOps/s $\color{#d91a1a}-3.68\%$
test_values_nested_locked 0.1298ms 75.1117μs 13.3135 KOps/s 13.7213 KOps/s $\color{#d91a1a}-2.97\%$
test_values_nested_leaf 0.1117ms 62.8725μs 15.9052 KOps/s 16.1981 KOps/s $\color{#d91a1a}-1.81\%$
test_values_stack_nested 0.1372ms 75.9250μs 13.1709 KOps/s 13.6563 KOps/s $\color{#d91a1a}-3.55\%$
test_values_stack_nested_leaf 0.1288ms 62.3903μs 16.0281 KOps/s 16.3840 KOps/s $\color{#d91a1a}-2.17\%$
test_values_stack_nested_locked 0.1353ms 77.0692μs 12.9754 KOps/s 13.6332 KOps/s $\color{#d91a1a}-4.83\%$
test_membership 28.6930μs 0.8904μs 1.1230 MOps/s 1.3799 MOps/s $\textbf{\color{#d91a1a}-18.62\%}$
test_membership_nested 21.3690μs 2.7708μs 360.9013 KOps/s 363.3580 KOps/s $\color{#d91a1a}-0.68\%$
test_membership_nested_leaf 20.4780μs 2.7985μs 357.3297 KOps/s 363.7485 KOps/s $\color{#d91a1a}-1.76\%$
test_membership_stacked_nested 29.6860μs 2.7645μs 361.7293 KOps/s 361.2515 KOps/s $\color{#35bf28}+0.13\%$
test_membership_stacked_nested_leaf 22.4920μs 2.7888μs 358.5822 KOps/s 363.9825 KOps/s $\color{#d91a1a}-1.48\%$
test_membership_nested_last 57.2060μs 3.9377μs 253.9544 KOps/s 254.2762 KOps/s $\color{#d91a1a}-0.13\%$
test_membership_nested_leaf_last 22.7220μs 3.9680μs 252.0144 KOps/s 244.3395 KOps/s $\color{#35bf28}+3.14\%$
test_membership_stacked_nested_last 53.2500μs 4.0311μs 248.0729 KOps/s 249.6996 KOps/s $\color{#d91a1a}-0.65\%$
test_membership_stacked_nested_leaf_last 30.6170μs 3.9867μs 250.8324 KOps/s 250.0340 KOps/s $\color{#35bf28}+0.32\%$
test_nested_getleaf 37.0190μs 10.5568μs 94.7259 KOps/s 95.8009 KOps/s $\color{#d91a1a}-1.12\%$
test_nested_get 52.1880μs 10.2072μs 97.9703 KOps/s 100.6432 KOps/s $\color{#d91a1a}-2.66\%$
test_stacked_getleaf 50.4040μs 10.6432μs 93.9563 KOps/s 95.7127 KOps/s $\color{#d91a1a}-1.84\%$
test_stacked_get 55.4840μs 10.1934μs 98.1025 KOps/s 99.9141 KOps/s $\color{#d91a1a}-1.81\%$
test_nested_getitemleaf 65.3780μs 11.2549μs 88.8503 KOps/s 91.2985 KOps/s $\color{#d91a1a}-2.68\%$
test_nested_getitem 33.8840μs 10.2861μs 97.2184 KOps/s 97.9265 KOps/s $\color{#d91a1a}-0.72\%$
test_stacked_getitemleaf 54.8320μs 11.0261μs 90.6940 KOps/s 92.3414 KOps/s $\color{#d91a1a}-1.78\%$
test_stacked_getitem 31.3390μs 10.1930μs 98.1061 KOps/s 98.8031 KOps/s $\color{#d91a1a}-0.71\%$
test_lock_nested 84.5483ms 0.5732ms 1.7447 KOps/s 2.0384 KOps/s $\textbf{\color{#d91a1a}-14.41\%}$
test_lock_stack_nested 0.5308ms 0.4532ms 2.2064 KOps/s 2.1899 KOps/s $\color{#35bf28}+0.76\%$
test_unlock_nested 86.7829ms 0.4887ms 2.0461 KOps/s 2.4328 KOps/s $\textbf{\color{#d91a1a}-15.90\%}$
test_unlock_stack_nested 0.6394ms 0.3698ms 2.7040 KOps/s 2.6516 KOps/s $\color{#35bf28}+1.98\%$
test_flatten_speed 0.1421ms 89.1073μs 11.2224 KOps/s 11.5173 KOps/s $\color{#d91a1a}-2.56\%$
test_unflatten_speed 0.8188ms 0.4634ms 2.1578 KOps/s 2.1949 KOps/s $\color{#d91a1a}-1.69\%$
test_common_ops 4.1153ms 1.1337ms 882.0979 Ops/s 891.6723 Ops/s $\color{#d91a1a}-1.07\%$
test_creation 0.1037ms 2.1137μs 473.1094 KOps/s 483.0627 KOps/s $\color{#d91a1a}-2.06\%$
test_creation_empty 54.5820μs 19.8471μs 50.3851 KOps/s 60.6157 KOps/s $\textbf{\color{#d91a1a}-16.88\%}$
test_creation_nested_1 66.8540μs 23.1752μs 43.1496 KOps/s 51.0311 KOps/s $\textbf{\color{#d91a1a}-15.44\%}$
test_creation_nested_2 67.8670μs 27.4464μs 36.4346 KOps/s 41.9142 KOps/s $\textbf{\color{#d91a1a}-13.07\%}$
test_clone 0.1309ms 17.0206μs 58.7523 KOps/s 56.9559 KOps/s $\color{#35bf28}+3.15\%$
test_getitem[int] 1.2014ms 17.6537μs 56.6452 KOps/s 59.9245 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_getitem[slice_int] 0.1490ms 31.1670μs 32.0853 KOps/s 32.1699 KOps/s $\color{#d91a1a}-0.26\%$
test_getitem[range] 0.5438ms 61.0262μs 16.3864 KOps/s 16.2624 KOps/s $\color{#35bf28}+0.76\%$
test_getitem[tuple] 0.1710ms 25.4184μs 39.3415 KOps/s 39.6757 KOps/s $\color{#d91a1a}-0.84\%$
test_getitem[list] 0.1846ms 56.4736μs 17.7074 KOps/s 17.6079 KOps/s $\color{#35bf28}+0.57\%$
test_setitem_dim[int] 70.7720μs 34.2439μs 29.2023 KOps/s 29.7747 KOps/s $\color{#d91a1a}-1.92\%$
test_setitem_dim[slice_int] 0.1070ms 63.5883μs 15.7262 KOps/s 16.1464 KOps/s $\color{#d91a1a}-2.60\%$
test_setitem_dim[range] 0.1374ms 86.9064μs 11.5066 KOps/s 11.6887 KOps/s $\color{#d91a1a}-1.56\%$
test_setitem_dim[tuple] 96.3000μs 51.1013μs 19.5690 KOps/s 19.9993 KOps/s $\color{#d91a1a}-2.15\%$
test_setitem 0.3287ms 31.7581μs 31.4881 KOps/s 33.0395 KOps/s $\color{#d91a1a}-4.70\%$
test_set 78.7070μs 30.8676μs 32.3965 KOps/s 33.9413 KOps/s $\color{#d91a1a}-4.55\%$
test_set_shared 3.8020ms 0.2153ms 4.6449 KOps/s 4.6344 KOps/s $\color{#35bf28}+0.23\%$
test_update 0.2866ms 38.2153μs 26.1675 KOps/s 27.8239 KOps/s $\textbf{\color{#d91a1a}-5.95\%}$
test_update_nested 0.1147ms 49.7118μs 20.1159 KOps/s 21.7236 KOps/s $\textbf{\color{#d91a1a}-7.40\%}$
test_update__nested 0.1124ms 35.9779μs 27.7949 KOps/s 28.0122 KOps/s $\color{#d91a1a}-0.78\%$
test_set_nested 0.2396ms 34.0590μs 29.3608 KOps/s 31.7488 KOps/s $\textbf{\color{#d91a1a}-7.52\%}$
test_set_nested_new 0.2182ms 39.5952μs 25.2556 KOps/s 27.5809 KOps/s $\textbf{\color{#d91a1a}-8.43\%}$
test_select 0.2814ms 55.1338μs 18.1377 KOps/s 18.7545 KOps/s $\color{#d91a1a}-3.29\%$
test_select_nested 0.1298ms 59.9664μs 16.6760 KOps/s 16.9014 KOps/s $\color{#d91a1a}-1.33\%$
test_exclude_nested 0.1519ms 74.9006μs 13.3510 KOps/s 13.5600 KOps/s $\color{#d91a1a}-1.54\%$
test_empty[True] 0.4806ms 0.3194ms 3.1304 KOps/s 3.1993 KOps/s $\color{#d91a1a}-2.15\%$
test_empty[False] 12.7740μs 1.2214μs 818.7176 KOps/s 827.3167 KOps/s $\color{#d91a1a}-1.04\%$
test_unbind_speed 0.6849ms 0.3016ms 3.3161 KOps/s 3.2865 KOps/s $\color{#35bf28}+0.90\%$
test_unbind_speed_stack0 0.4356ms 0.2982ms 3.3530 KOps/s 3.3335 KOps/s $\color{#35bf28}+0.58\%$
test_unbind_speed_stack1 98.1605ms 0.8238ms 1.2139 KOps/s 1.3393 KOps/s $\textbf{\color{#d91a1a}-9.36\%}$
test_split 96.5449ms 2.1758ms 459.5925 Ops/s 455.7650 Ops/s $\color{#35bf28}+0.84\%$
test_chunk 2.5932ms 2.0038ms 499.0608 Ops/s 456.1051 Ops/s $\textbf{\color{#35bf28}+9.42\%}$
test_creation[device0] 0.2990ms 0.1186ms 8.4317 KOps/s 8.5487 KOps/s $\color{#d91a1a}-1.37\%$
test_creation_from_tensor 4.9279ms 0.1183ms 8.4552 KOps/s 8.4705 KOps/s $\color{#d91a1a}-0.18\%$
test_add_one[memmap_tensor0] 0.4890ms 7.3356μs 136.3219 KOps/s 126.9265 KOps/s $\textbf{\color{#35bf28}+7.40\%}$
test_contiguous[memmap_tensor0] 29.0740μs 1.8898μs 529.1581 KOps/s 518.6170 KOps/s $\color{#35bf28}+2.03\%$
test_stack[memmap_tensor0] 59.0310μs 5.6102μs 178.2461 KOps/s 172.9641 KOps/s $\color{#35bf28}+3.05\%$
test_memmaptd_index 1.1806ms 0.3994ms 2.5036 KOps/s 2.4804 KOps/s $\color{#35bf28}+0.94\%$
test_memmaptd_index_astensor 0.7367ms 0.4759ms 2.1013 KOps/s 2.0720 KOps/s $\color{#35bf28}+1.42\%$
test_memmaptd_index_op 1.7681ms 1.0456ms 956.4305 Ops/s 978.2566 Ops/s $\color{#d91a1a}-2.23\%$
test_serialize_model 0.1213s 0.1171s 8.5422 Ops/s 8.4625 Ops/s $\color{#35bf28}+0.94\%$
test_serialize_model_pickle 0.4714s 0.3961s 2.5243 Ops/s 2.5600 Ops/s $\color{#d91a1a}-1.39\%$
test_serialize_weights 0.1245s 0.1190s 8.4015 Ops/s 7.6227 Ops/s $\textbf{\color{#35bf28}+10.22\%}$
test_serialize_weights_returnearly 0.1776s 0.1660s 6.0228 Ops/s 6.2361 Ops/s $\color{#d91a1a}-3.42\%$
test_serialize_weights_pickle 1.1083s 0.7510s 1.3316 Ops/s 2.4468 Ops/s $\textbf{\color{#d91a1a}-45.58\%}$
test_serialize_weights_filesystem 0.1498s 0.1451s 6.8927 Ops/s 7.0432 Ops/s $\color{#d91a1a}-2.14\%$
test_serialize_model_filesystem 0.1521s 0.1442s 6.9343 Ops/s 6.1479 Ops/s $\textbf{\color{#35bf28}+12.79\%}$
test_reshape_pytree 91.9210μs 38.8830μs 25.7182 KOps/s 25.8881 KOps/s $\color{#d91a1a}-0.66\%$
test_reshape_td 0.1084ms 46.8752μs 21.3333 KOps/s 21.7073 KOps/s $\color{#d91a1a}-1.72\%$
test_view_pytree 0.1208ms 38.5732μs 25.9247 KOps/s 25.8974 KOps/s $\color{#35bf28}+0.11\%$
test_view_td 0.1148ms 52.9293μs 18.8931 KOps/s 18.7185 KOps/s $\color{#35bf28}+0.93\%$
test_unbind_pytree 94.5370μs 35.5156μs 28.1566 KOps/s 27.7672 KOps/s $\color{#35bf28}+1.40\%$
test_unbind_td 0.2896ms 44.5003μs 22.4717 KOps/s 21.6811 KOps/s $\color{#35bf28}+3.65\%$
test_split_pytree 80.5810μs 37.4955μs 26.6699 KOps/s 25.0599 KOps/s $\textbf{\color{#35bf28}+6.42\%}$
test_split_td 0.4922ms 56.8780μs 17.5815 KOps/s 17.3518 KOps/s $\color{#35bf28}+1.32\%$
test_add_pytree 92.0010μs 44.7709μs 22.3359 KOps/s 21.6238 KOps/s $\color{#35bf28}+3.29\%$
test_add_td 0.1633ms 83.4073μs 11.9894 KOps/s 12.3500 KOps/s $\color{#d91a1a}-2.92\%$
test_compile_add_one_nested[tensordict-compile] 0.1165ms 56.9763μs 17.5511 KOps/s 17.1629 KOps/s $\color{#35bf28}+2.26\%$
test_compile_add_one_nested[tensordict-eager] 0.2649ms 0.1804ms 5.5435 KOps/s 5.4819 KOps/s $\color{#35bf28}+1.12\%$
test_compile_add_one_nested[pytree-compile] 0.1128ms 56.2096μs 17.7905 KOps/s 17.1382 KOps/s $\color{#35bf28}+3.81\%$
test_compile_add_one_nested[pytree-eager] 0.2459ms 0.1399ms 7.1472 KOps/s 6.9879 KOps/s $\color{#35bf28}+2.28\%$
test_compile_copy_nested[tensordict-compile] 55.4630μs 21.4153μs 46.6956 KOps/s 46.3824 KOps/s $\color{#35bf28}+0.68\%$
test_compile_copy_nested[tensordict-eager] 0.1808ms 67.7967μs 14.7500 KOps/s 15.0945 KOps/s $\color{#d91a1a}-2.28\%$
test_compile_copy_nested[pytree-compile] 0.1471ms 77.6554μs 12.8774 KOps/s 13.5435 KOps/s $\color{#d91a1a}-4.92\%$
test_compile_copy_nested[pytree-eager] 0.1251ms 68.8625μs 14.5217 KOps/s 14.9039 KOps/s $\color{#d91a1a}-2.56\%$
test_compile_add_one_flat[tensordict-compile] 0.7938ms 0.1777ms 5.6285 KOps/s 5.7677 KOps/s $\color{#d91a1a}-2.41\%$
test_compile_add_one_flat[tensordict-eager] 0.3145ms 0.1892ms 5.2845 KOps/s 5.2843 KOps/s $+0.00\%$
test_compile_add_one_flat[tensorclass-compile] 0.1085ms 46.1496μs 21.6687 KOps/s 20.3652 KOps/s $\textbf{\color{#35bf28}+6.40\%}$
test_compile_add_one_flat[tensorclass-eager] 0.1296ms 69.5586μs 14.3764 KOps/s 14.0617 KOps/s $\color{#35bf28}+2.24\%$
test_compile_add_one_flat[pytree-compile] 0.8425ms 0.1795ms 5.5721 KOps/s 5.7841 KOps/s $\color{#d91a1a}-3.66\%$
test_compile_add_one_flat[pytree-eager] 0.4926ms 0.2814ms 3.5534 KOps/s 3.3804 KOps/s $\textbf{\color{#35bf28}+5.12\%}$
test_compile_add_self_flat[tensordict-eager] 0.2993ms 0.2037ms 4.9100 KOps/s 4.9934 KOps/s $\color{#d91a1a}-1.67\%$
test_compile_add_self_flat[tensordict-compile] 0.8146ms 0.1746ms 5.7260 KOps/s 5.7817 KOps/s $\color{#d91a1a}-0.96\%$
test_compile_add_self_flat[tensorclass-eager] 0.1957ms 63.0746μs 15.8542 KOps/s 15.5257 KOps/s $\color{#35bf28}+2.12\%$
test_compile_add_self_flat[tensorclass-compile] 0.1023ms 48.2997μs 20.7041 KOps/s 20.0314 KOps/s $\color{#35bf28}+3.36\%$
test_compile_add_self_flat[pytree-eager] 0.4975ms 0.2300ms 4.3487 KOps/s 4.2061 KOps/s $\color{#35bf28}+3.39\%$
test_compile_add_self_flat[pytree-compile] 0.3072ms 0.1759ms 5.6857 KOps/s 5.6968 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_copy_flat[tensordict-compile] 0.1829ms 0.1016ms 9.8379 KOps/s 9.7294 KOps/s $\color{#35bf28}+1.12\%$
test_compile_copy_flat[tensordict-eager] 0.1284ms 56.4223μs 17.7235 KOps/s 17.7666 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_copy_flat[pytree-compile] 0.1727ms 79.6088μs 12.5614 KOps/s 13.0275 KOps/s $\color{#d91a1a}-3.58\%$
test_compile_copy_flat[pytree-eager] 0.1369ms 70.1573μs 14.2537 KOps/s 14.6235 KOps/s $\color{#d91a1a}-2.53\%$
test_compile_assign_and_add[tensordict-compile] 0.2994ms 0.1976ms 5.0595 KOps/s 4.8149 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_compile_assign_and_add[tensordict-eager] 2.1374ms 1.6211ms 616.8691 Ops/s 608.3224 Ops/s $\color{#35bf28}+1.40\%$
test_compile_assign_and_add[pytree-compile] 0.2864ms 0.1942ms 5.1506 KOps/s 5.1447 KOps/s $\color{#35bf28}+0.11\%$
test_compile_assign_and_add[pytree-eager] 1.3337ms 1.0788ms 926.9296 Ops/s 879.6870 Ops/s $\textbf{\color{#35bf28}+5.37\%}$
test_compile_assign_and_add_stack[compile] 0.6082ms 0.4237ms 2.3599 KOps/s 2.4410 KOps/s $\color{#d91a1a}-3.32\%$
test_compile_assign_and_add_stack[eager] 6.8858ms 4.0500ms 246.9122 Ops/s 267.1281 Ops/s $\textbf{\color{#d91a1a}-7.57\%}$
test_compile_indexing[tensor-tensordict-compile] 94.4360μs 35.0427μs 28.5366 KOps/s 28.8577 KOps/s $\color{#d91a1a}-1.11\%$
test_compile_indexing[tensor-tensordict-eager] 0.8825ms 50.5602μs 19.7784 KOps/s 19.8080 KOps/s $\color{#d91a1a}-0.15\%$
test_compile_indexing[tensor-tensorclass-compile] 75.3710μs 30.4831μs 32.8050 KOps/s 33.5962 KOps/s $\color{#d91a1a}-2.35\%$
test_compile_indexing[tensor-tensorclass-eager] 86.1510μs 28.8749μs 34.6322 KOps/s 34.9295 KOps/s $\color{#d91a1a}-0.85\%$
test_compile_indexing[tensor-pytree-compile] 82.9650μs 30.2901μs 33.0141 KOps/s 33.2763 KOps/s $\color{#d91a1a}-0.79\%$
test_compile_indexing[tensor-pytree-eager] 0.1587ms 28.8892μs 34.6150 KOps/s 34.4802 KOps/s $\color{#35bf28}+0.39\%$
test_compile_indexing[slice-tensordict-compile] 0.1348ms 73.9982μs 13.5138 KOps/s 13.5080 KOps/s $\color{#35bf28}+0.04\%$
test_compile_indexing[slice-tensordict-eager] 0.5724ms 27.6006μs 36.2311 KOps/s 35.4004 KOps/s $\color{#35bf28}+2.35\%$
test_compile_indexing[slice-tensorclass-compile] 0.6292ms 67.3832μs 14.8405 KOps/s 14.5975 KOps/s $\color{#35bf28}+1.66\%$
test_compile_indexing[slice-tensorclass-eager] 0.3001ms 23.7461μs 42.1122 KOps/s 43.7815 KOps/s $\color{#d91a1a}-3.81\%$
test_compile_indexing[slice-pytree-compile] 0.1621ms 67.3442μs 14.8491 KOps/s 14.6960 KOps/s $\color{#35bf28}+1.04\%$
test_compile_indexing[slice-pytree-eager] 96.7810μs 23.0495μs 43.3848 KOps/s 43.4828 KOps/s $\color{#d91a1a}-0.23\%$
test_compile_indexing[int-tensordict-compile] 0.1763ms 73.6633μs 13.5753 KOps/s 13.2308 KOps/s $\color{#35bf28}+2.60\%$
test_compile_indexing[int-tensordict-eager] 0.8957ms 27.5342μs 36.3185 KOps/s 35.5266 KOps/s $\color{#35bf28}+2.23\%$
test_compile_indexing[int-tensorclass-compile] 0.1507ms 66.5958μs 15.0160 KOps/s 14.7475 KOps/s $\color{#35bf28}+1.82\%$
test_compile_indexing[int-tensorclass-eager] 0.2154ms 22.9136μs 43.6422 KOps/s 44.6715 KOps/s $\color{#d91a1a}-2.30\%$
test_compile_indexing[int-pytree-compile] 0.2078ms 67.2536μs 14.8691 KOps/s 14.5991 KOps/s $\color{#35bf28}+1.85\%$
test_compile_indexing[int-pytree-eager] 0.1190ms 22.8794μs 43.7074 KOps/s 44.3790 KOps/s $\color{#d91a1a}-1.51\%$
test_mod_add[eager] 86.5410μs 26.3390μs 37.9665 KOps/s 38.6339 KOps/s $\color{#d91a1a}-1.73\%$
test_mod_add[compile] 0.1060ms 39.0108μs 25.6340 KOps/s 25.0828 KOps/s $\color{#35bf28}+2.20\%$
test_mod_add[compile-overhead] 0.1099ms 39.6850μs 25.1984 KOps/s 24.8038 KOps/s $\color{#35bf28}+1.59\%$
test_mod_wrap[eager] 0.3636ms 0.2149ms 4.6543 KOps/s 4.7113 KOps/s $\color{#d91a1a}-1.21\%$
test_mod_wrap[compile] 0.4526ms 0.2349ms 4.2568 KOps/s 4.1369 KOps/s $\color{#35bf28}+2.90\%$
test_mod_wrap[compile-overhead] 0.4520ms 0.2314ms 4.3209 KOps/s 4.2238 KOps/s $\color{#35bf28}+2.30\%$
test_mod_wrap_and_backward[eager] 12.6014ms 10.9257ms 91.5274 Ops/s 91.3515 Ops/s $\color{#35bf28}+0.19\%$
test_mod_wrap_and_backward[compile] 12.4757ms 10.9841ms 91.0410 Ops/s 90.5205 Ops/s $\color{#35bf28}+0.58\%$
test_mod_wrap_and_backward[compile-overhead] 12.1475ms 10.9332ms 91.4646 Ops/s 90.4662 Ops/s $\color{#35bf28}+1.10\%$
test_seq_add[eager] 0.2128ms 95.4036μs 10.4818 KOps/s 10.7872 KOps/s $\color{#d91a1a}-2.83\%$
test_seq_add[compile] 0.1233ms 63.6547μs 15.7098 KOps/s 14.8474 KOps/s $\textbf{\color{#35bf28}+5.81\%}$
test_seq_add[compile-overhead] 0.1562ms 64.1231μs 15.5950 KOps/s 15.1664 KOps/s $\color{#35bf28}+2.83\%$
test_seq_wrap[eager] 0.6666ms 0.3971ms 2.5181 KOps/s 2.5588 KOps/s $\color{#d91a1a}-1.59\%$
test_seq_wrap[compile] 1.3402ms 0.2723ms 3.6719 KOps/s 3.5651 KOps/s $\color{#35bf28}+3.00\%$
test_seq_wrap[compile-overhead] 1.3919ms 0.2696ms 3.7092 KOps/s 3.6319 KOps/s $\color{#35bf28}+2.13\%$
test_func_call_runtime[False-eager] 1.2203ms 0.5454ms 1.8334 KOps/s 1.9016 KOps/s $\color{#d91a1a}-3.59\%$
test_func_call_runtime[False-compile] 0.9141ms 0.5038ms 1.9848 KOps/s 1.9706 KOps/s $\color{#35bf28}+0.72\%$
test_func_call_runtime[False-compile-overhead] 0.8772ms 0.5065ms 1.9742 KOps/s 1.9602 KOps/s $\color{#35bf28}+0.71\%$
test_func_call_runtime[True-eager] 0.9019ms 0.7636ms 1.3096 KOps/s 1.3215 KOps/s $\color{#d91a1a}-0.90\%$
test_func_call_runtime[True-compile] 0.7195ms 0.5106ms 1.9586 KOps/s 1.8936 KOps/s $\color{#35bf28}+3.43\%$
test_func_call_runtime[True-compile-overhead] 0.6100ms 0.5112ms 1.9563 KOps/s 1.8933 KOps/s $\color{#35bf28}+3.33\%$
test_func_call_cm_runtime[False-eager] 0.6896ms 0.5341ms 1.8723 KOps/s 1.8766 KOps/s $\color{#d91a1a}-0.23\%$
test_func_call_cm_runtime[False-compile] 0.6148ms 0.5044ms 1.9826 KOps/s 1.9473 KOps/s $\color{#35bf28}+1.81\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8629ms 0.5025ms 1.9899 KOps/s 1.9499 KOps/s $\color{#35bf28}+2.05\%$
test_func_call_cm_runtime[True-eager] 1.4586ms 0.8933ms 1.1195 KOps/s 1.1142 KOps/s $\color{#35bf28}+0.48\%$
test_func_call_cm_runtime[True-compile] 1.0925ms 0.7556ms 1.3234 KOps/s 1.3156 KOps/s $\color{#35bf28}+0.59\%$
test_func_call_cm_runtime[True-compile-overhead] 1.3258ms 0.7673ms 1.3032 KOps/s 1.3032 KOps/s $-0.00\%$
test_vmap_func_call_cm_runtime[eager] 2.5871ms 1.8840ms 530.7729 Ops/s 516.8353 Ops/s $\color{#35bf28}+2.70\%$
test_vmap_func_call_cm_runtime[compile] 2.8659ms 1.9313ms 517.7857 Ops/s 503.7640 Ops/s $\color{#35bf28}+2.78\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.6722ms 1.9314ms 517.7506 Ops/s 501.4666 Ops/s $\color{#35bf28}+3.25\%$
test_distributed 0.4114ms 0.1257ms 7.9544 KOps/s 7.6890 KOps/s $\color{#35bf28}+3.45\%$
test_tdmodule 0.1234ms 19.2535μs 51.9385 KOps/s 54.8969 KOps/s $\textbf{\color{#d91a1a}-5.39\%}$
test_tdmodule_dispatch 54.8020μs 38.2316μs 26.1563 KOps/s 28.1267 KOps/s $\textbf{\color{#d91a1a}-7.01\%}$
test_tdseq 45.8360μs 22.5036μs 44.4372 KOps/s 47.7666 KOps/s $\textbf{\color{#d91a1a}-6.97\%}$
test_tdseq_dispatch 68.2470μs 44.3059μs 22.5704 KOps/s 24.2214 KOps/s $\textbf{\color{#d91a1a}-6.82\%}$
test_instantiation_functorch 1.8546ms 1.5951ms 626.9168 Ops/s 610.4782 Ops/s $\color{#35bf28}+2.69\%$
test_instantiation_td 2.0838ms 1.1735ms 852.1705 Ops/s 821.0503 Ops/s $\color{#35bf28}+3.79\%$
test_exec_functorch 0.4162ms 0.1880ms 5.3190 KOps/s 5.4469 KOps/s $\color{#d91a1a}-2.35\%$
test_exec_functional_call 0.3087ms 0.1754ms 5.7001 KOps/s 5.6040 KOps/s $\color{#35bf28}+1.71\%$
test_exec_td 0.3267ms 0.1748ms 5.7210 KOps/s 5.6556 KOps/s $\color{#35bf28}+1.16\%$
test_exec_td_decorator 0.3990ms 0.2281ms 4.3838 KOps/s 4.2963 KOps/s $\color{#35bf28}+2.04\%$
test_vmap_mlp_speed[True-True] 1.1563ms 0.6705ms 1.4913 KOps/s 1.5182 KOps/s $\color{#d91a1a}-1.77\%$
test_vmap_mlp_speed[True-False] 0.9666ms 0.6524ms 1.5327 KOps/s 1.5308 KOps/s $\color{#35bf28}+0.12\%$
test_vmap_mlp_speed[False-True] 0.8042ms 0.5051ms 1.9799 KOps/s 1.9651 KOps/s $\color{#35bf28}+0.75\%$
test_vmap_mlp_speed[False-False] 0.8306ms 0.5014ms 1.9944 KOps/s 1.9623 KOps/s $\color{#35bf28}+1.63\%$
test_vmap_mlp_speed_decorator[True-True] 1.5963ms 0.6315ms 1.5835 KOps/s 1.5905 KOps/s $\color{#d91a1a}-0.44\%$
test_vmap_mlp_speed_decorator[True-False] 0.8359ms 0.6287ms 1.5905 KOps/s 1.5816 KOps/s $\color{#35bf28}+0.56\%$
test_vmap_mlp_speed_decorator[False-True] 0.8727ms 0.5226ms 1.9134 KOps/s 1.9094 KOps/s $\color{#35bf28}+0.21\%$
test_vmap_mlp_speed_decorator[False-False] 0.7067ms 0.5198ms 1.9239 KOps/s 1.9111 KOps/s $\color{#35bf28}+0.67\%$
test_to_module_speed[True] 2.2080ms 1.3150ms 760.4332 Ops/s 742.7936 Ops/s $\color{#35bf28}+2.37\%$
test_to_module_speed[False] 4.7380ms 1.3135ms 761.3222 Ops/s 792.1188 Ops/s $\color{#d91a1a}-3.89\%$
test_tc_init 85.0690μs 44.6382μs 22.4023 KOps/s 23.2027 KOps/s $\color{#d91a1a}-3.45\%$
test_tc_init_nested 0.1655ms 90.9373μs 10.9966 KOps/s 11.9189 KOps/s $\textbf{\color{#d91a1a}-7.74\%}$
test_tc_first_layer_tensor 17.8330μs 1.5339μs 651.9493 KOps/s 661.2843 KOps/s $\color{#d91a1a}-1.41\%$
test_tc_first_layer_nontensor 26.4100μs 4.7747μs 209.4373 KOps/s 213.7106 KOps/s $\color{#d91a1a}-2.00\%$
test_tc_second_layer_tensor 35.4730μs 2.8187μs 354.7775 KOps/s 358.2922 KOps/s $\color{#d91a1a}-0.98\%$
test_tc_second_layer_nontensor 46.8340μs 6.0650μs 164.8817 KOps/s 163.7959 KOps/s $\color{#35bf28}+0.66\%$
test_unbind 0.4799s 14.3111ms 69.8759 Ops/s 73.5983 Ops/s $\textbf{\color{#d91a1a}-5.06\%}$
test_full_like 9.5711ms 7.7253ms 129.4447 Ops/s 122.3186 Ops/s $\textbf{\color{#35bf28}+5.83\%}$
test_zeros_like 3.4183ms 2.9608ms 337.7515 Ops/s 335.6319 Ops/s $\color{#35bf28}+0.63\%$
test_ones_like 4.0680ms 3.5039ms 285.3937 Ops/s 293.5554 Ops/s $\color{#d91a1a}-2.78\%$
test_clone 6.6494ms 5.2647ms 189.9451 Ops/s 187.1486 Ops/s $\color{#35bf28}+1.49\%$
test_squeeze 64.1490μs 12.7999μs 78.1259 KOps/s 76.3937 KOps/s $\color{#35bf28}+2.27\%$
test_unsqueeze 0.3406ms 92.3647μs 10.8266 KOps/s 10.2915 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_split 0.4489ms 0.1932ms 5.1766 KOps/s 5.0183 KOps/s $\color{#35bf28}+3.15\%$
test_permute 0.4416ms 0.2239ms 4.4666 KOps/s 4.3733 KOps/s $\color{#35bf28}+2.13\%$
test_stack 28.4453ms 25.7008ms 38.9093 Ops/s 41.0316 Ops/s $\textbf{\color{#d91a1a}-5.17\%}$
test_cat 28.2167ms 25.5638ms 39.1178 Ops/s 41.0104 Ops/s $\color{#d91a1a}-4.61\%$
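
For readers checking the numbers, the Change column is consistent with the relative difference between the Ops and Ops on Repo HEAD columns. The sketch below reproduces a few rows from the table above. The helper names and the 5% significance threshold are assumptions: the threshold is inferred from which rows are bold and from the Improved/Worsened totals, not taken from the benchmark tooling itself, and whether the boundary is inclusive is not known.

```python
# Throughput units used in the benchmark tables, expressed in operations per second.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}

# Assumed threshold in percent, inferred from the bold rows; not a documented value.
SIGNIFICANCE_THRESHOLD = 5.0


def parse_throughput(cell: str) -> float:
    """Convert a cell such as '46.6630 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: str, ops_on_head: str) -> float:
    """Percent change of the PR throughput relative to the repo HEAD throughput."""
    new = parse_throughput(ops)
    base = parse_throughput(ops_on_head)
    return (new - base) / base * 100.0


def classify(change_pct: float, threshold: float = SIGNIFICANCE_THRESHOLD) -> str:
    """Label a change as improved, worsened, or unchanged (within the threshold)."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


if __name__ == "__main__":
    rows = [
        ("test_plain_set_nested", "46.6630 KOps/s", "49.7184 KOps/s"),
        ("test_items", "241.8861 KOps/s", "239.6965 KOps/s"),
        ("test_membership", "1.1230 MOps/s", "1.3799 MOps/s"),
    ]
    for name, ops, head in rows:
        change = relative_change(ops, head)
        print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Converting both cells to a common unit first matters because the two columns do not always share a unit (one can be reported in Ops/s and the other in KOps/s).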

github-actions bot commented Oct 1, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}41$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1213ms 14.8673μs 67.2616 KOps/s 73.3138 KOps/s $\textbf{\color{#d91a1a}-8.26\%}$
test_plain_set_stack_nested 55.1520μs 15.0362μs 66.5061 KOps/s 72.5745 KOps/s $\textbf{\color{#d91a1a}-8.36\%}$
test_plain_set_nested_inplace 49.8010μs 15.7793μs 63.3742 KOps/s 68.5555 KOps/s $\textbf{\color{#d91a1a}-7.56\%}$
test_plain_set_stack_nested_inplace 42.1010μs 15.7044μs 63.6766 KOps/s 68.5066 KOps/s $\textbf{\color{#d91a1a}-7.05\%}$
test_items 45.8120μs 2.8745μs 347.8912 KOps/s 342.8157 KOps/s $\color{#35bf28}+1.48\%$
test_items_nested 0.3667ms 0.3250ms 3.0773 KOps/s 3.0499 KOps/s $\color{#35bf28}+0.90\%$
test_items_nested_locked 0.3832ms 0.3263ms 3.0644 KOps/s 3.0400 KOps/s $\color{#35bf28}+0.80\%$
test_items_nested_leaf 0.1038ms 55.8010μs 17.9208 KOps/s 18.0546 KOps/s $\color{#d91a1a}-0.74\%$
test_items_stack_nested 0.3812ms 0.3306ms 3.0248 KOps/s 3.0923 KOps/s $\color{#d91a1a}-2.18\%$
test_items_stack_nested_leaf 0.1032ms 56.9466μs 17.5603 KOps/s 18.0189 KOps/s $\color{#d91a1a}-2.55\%$
test_items_stack_nested_locked 0.3803ms 0.3318ms 3.0137 KOps/s 3.0306 KOps/s $\color{#d91a1a}-0.56\%$
test_keys 0.6144ms 3.4597μs 289.0456 KOps/s 289.8978 KOps/s $\color{#d91a1a}-0.29\%$
test_keys_nested 86.6120μs 54.8196μs 18.2416 KOps/s 18.0265 KOps/s $\color{#35bf28}+1.19\%$
test_keys_nested_locked 0.9080ms 61.9596μs 16.1396 KOps/s 16.0503 KOps/s $\color{#35bf28}+0.56\%$
test_keys_nested_leaf 71.9920μs 46.8137μs 21.3612 KOps/s 21.8148 KOps/s $\color{#d91a1a}-2.08\%$
test_keys_stack_nested 83.0320μs 57.1960μs 17.4837 KOps/s 17.7858 KOps/s $\color{#d91a1a}-1.70\%$
test_keys_stack_nested_leaf 77.9920μs 48.8450μs 20.4729 KOps/s 21.2171 KOps/s $\color{#d91a1a}-3.51\%$
test_keys_stack_nested_locked 0.1006ms 62.2073μs 16.0753 KOps/s 16.1696 KOps/s $\color{#d91a1a}-0.58\%$
test_values 4.3935μs 0.8332μs 1.2002 MOps/s 1.1921 MOps/s $\color{#35bf28}+0.68\%$
test_values_nested 66.5210μs 40.5280μs 24.6743 KOps/s 24.6813 KOps/s $\color{#d91a1a}-0.03\%$
test_values_nested_locked 71.1120μs 42.4091μs 23.5798 KOps/s 23.5891 KOps/s $\color{#d91a1a}-0.04\%$
test_values_nested_leaf 59.2810μs 35.2105μs 28.4006 KOps/s 28.3368 KOps/s $\color{#35bf28}+0.23\%$
test_values_stack_nested 73.9520μs 41.7290μs 23.9641 KOps/s 24.5690 KOps/s $\color{#d91a1a}-2.46\%$
test_values_stack_nested_leaf 66.4510μs 35.6551μs 28.0465 KOps/s 28.3092 KOps/s $\color{#d91a1a}-0.93\%$
test_values_stack_nested_locked 74.0010μs 43.2940μs 23.0979 KOps/s 23.5970 KOps/s $\color{#d91a1a}-2.12\%$
test_membership 1.9096μs 0.5013μs 1.9947 MOps/s 2.0016 MOps/s $\color{#d91a1a}-0.35\%$
test_membership_nested 15.1550μs 1.8521μs 539.9325 KOps/s 541.8824 KOps/s $\color{#d91a1a}-0.36\%$
test_membership_nested_leaf 10.2503μs 1.8262μs 547.5773 KOps/s 549.7523 KOps/s $\color{#d91a1a}-0.40\%$
test_membership_stacked_nested 38.9010μs 1.9329μs 517.3617 KOps/s 523.5516 KOps/s $\color{#d91a1a}-1.18\%$
test_membership_stacked_nested_leaf 26.2810μs 1.9738μs 506.6437 KOps/s 522.0070 KOps/s $\color{#d91a1a}-2.94\%$
test_membership_nested_last 0.1031ms 2.7699μs 361.0180 KOps/s 360.5066 KOps/s $\color{#35bf28}+0.14\%$
test_membership_nested_leaf_last 43.2210μs 2.8093μs 355.9598 KOps/s 352.8688 KOps/s $\color{#35bf28}+0.88\%$
test_membership_stacked_nested_last 26.9910μs 3.1812μs 314.3427 KOps/s 360.5459 KOps/s $\textbf{\color{#d91a1a}-12.81\%}$
test_membership_stacked_nested_leaf_last 30.0400μs 3.1343μs 319.0518 KOps/s 362.6589 KOps/s $\textbf{\color{#d91a1a}-12.02\%}$
test_nested_getleaf 28.5410μs 6.2037μs 161.1948 KOps/s 164.8019 KOps/s $\color{#d91a1a}-2.19\%$
test_nested_get 31.7710μs 5.7803μs 173.0009 KOps/s 173.3818 KOps/s $\color{#d91a1a}-0.22\%$
test_stacked_getleaf 38.4710μs 5.9944μs 166.8227 KOps/s 165.3175 KOps/s $\color{#35bf28}+0.91\%$
test_stacked_get 35.0510μs 5.7343μs 174.3901 KOps/s 176.1150 KOps/s $\color{#d91a1a}-0.98\%$
test_nested_getitemleaf 30.7810μs 6.0980μs 163.9877 KOps/s 161.9671 KOps/s $\color{#35bf28}+1.25\%$
test_nested_getitem 26.6910μs 5.8019μs 172.3577 KOps/s 173.7914 KOps/s $\color{#d91a1a}-0.82\%$
test_stacked_getitemleaf 26.1210μs 6.1597μs 162.3455 KOps/s 163.7292 KOps/s $\color{#d91a1a}-0.85\%$
test_stacked_getitem 34.9810μs 5.6505μs 176.9763 KOps/s 174.4092 KOps/s $\color{#35bf28}+1.47\%$
test_lock_nested 7.8729ms 0.4189ms 2.3871 KOps/s 2.3923 KOps/s $\color{#d91a1a}-0.22\%$
test_lock_stack_nested 0.4133ms 0.3784ms 2.6429 KOps/s 2.6540 KOps/s $\color{#d91a1a}-0.42\%$
test_unlock_nested 0.7682ms 0.3519ms 2.8415 KOps/s 2.8508 KOps/s $\color{#d91a1a}-0.33\%$
test_unlock_stack_nested 0.3473ms 0.3184ms 3.1403 KOps/s 3.1579 KOps/s $\color{#d91a1a}-0.56\%$
test_flatten_speed 0.1490ms 69.2264μs 14.4454 KOps/s 14.3801 KOps/s $\color{#35bf28}+0.45\%$
test_unflatten_speed 0.3306ms 0.2863ms 3.4931 KOps/s 3.5552 KOps/s $\color{#d91a1a}-1.75\%$
test_common_ops 1.6980ms 1.3106ms 763.0274 Ops/s 809.3252 Ops/s $\textbf{\color{#d91a1a}-5.72\%}$
test_creation 30.0210μs 1.4616μs 684.1803 KOps/s 656.9638 KOps/s $\color{#35bf28}+4.14\%$
test_creation_empty 44.6220μs 17.5217μs 57.0719 KOps/s 68.1152 KOps/s $\textbf{\color{#d91a1a}-16.21\%}$
test_creation_nested_1 46.1620μs 19.0014μs 52.6278 KOps/s 60.9656 KOps/s $\textbf{\color{#d91a1a}-13.68\%}$
test_creation_nested_2 50.4920μs 21.8460μs 45.7750 KOps/s 50.8475 KOps/s $\textbf{\color{#d91a1a}-9.98\%}$
test_clone 59.6610μs 28.2635μs 35.3813 KOps/s 34.5782 KOps/s $\color{#35bf28}+2.32\%$
test_getitem[int] 1.2285ms 15.4457μs 64.7431 KOps/s 63.1939 KOps/s $\color{#35bf28}+2.45\%$
test_getitem[slice_int] 0.1327ms 26.8479μs 37.2469 KOps/s 36.2453 KOps/s $\color{#35bf28}+2.76\%$
test_getitem[range] 0.1601ms 0.1094ms 9.1411 KOps/s 9.1896 KOps/s $\color{#d91a1a}-0.53\%$
test_getitem[tuple] 0.1261ms 22.9348μs 43.6018 KOps/s 43.6408 KOps/s $\color{#d91a1a}-0.09\%$
test_getitem[list] 0.2051ms 0.1030ms 9.7064 KOps/s 10.0884 KOps/s $\color{#d91a1a}-3.79\%$
test_setitem_dim[int] 71.1020μs 47.1718μs 21.1991 KOps/s 22.5315 KOps/s $\textbf{\color{#d91a1a}-5.91\%}$
test_setitem_dim[slice_int] 95.6020μs 69.3186μs 14.4261 KOps/s 15.0648 KOps/s $\color{#d91a1a}-4.24\%$
test_setitem_dim[range] 0.1786ms 0.1346ms 7.4318 KOps/s 7.9055 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_setitem_dim[tuple] 87.6620μs 63.7220μs 15.6932 KOps/s 16.6035 KOps/s $\textbf{\color{#d91a1a}-5.48\%}$
test_setitem 84.3210μs 44.0474μs 22.7028 KOps/s 24.4729 KOps/s $\textbf{\color{#d91a1a}-7.23\%}$
test_set 78.2020μs 42.4142μs 23.5770 KOps/s 24.9899 KOps/s $\textbf{\color{#d91a1a}-5.65\%}$
test_set_shared 0.3508ms 50.4587μs 19.8182 KOps/s 20.1166 KOps/s $\color{#d91a1a}-1.48\%$
test_update 87.5920μs 53.0281μs 18.8579 KOps/s 20.7422 KOps/s $\textbf{\color{#d91a1a}-9.08\%}$
test_update_nested 97.1420μs 60.0035μs 16.6657 KOps/s 18.3103 KOps/s $\textbf{\color{#d91a1a}-8.98\%}$
test_update__nested 91.9410μs 57.6631μs 17.3421 KOps/s 16.9776 KOps/s $\color{#35bf28}+2.15\%$
test_set_nested 83.8310μs 45.9876μs 21.7450 KOps/s 23.4391 KOps/s $\textbf{\color{#d91a1a}-7.23\%}$
test_set_nested_new 83.2720μs 49.4748μs 20.2123 KOps/s 21.8793 KOps/s $\textbf{\color{#d91a1a}-7.62\%}$
test_select 95.6020μs 61.6227μs 16.2278 KOps/s 16.6286 KOps/s $\color{#d91a1a}-2.41\%$
test_select_nested 69.0320μs 42.5654μs 23.4932 KOps/s 23.7056 KOps/s $\color{#d91a1a}-0.90\%$
test_exclude_nested 83.6520μs 59.9503μs 16.6805 KOps/s 16.9234 KOps/s $\color{#d91a1a}-1.44\%$
test_empty[True] 0.2927ms 0.2415ms 4.1414 KOps/s 4.0567 KOps/s $\color{#35bf28}+2.09\%$
test_empty[False] 2.8440μs 0.7417μs 1.3482 MOps/s 1.3666 MOps/s $\color{#d91a1a}-1.34\%$
test_to 52.3610μs 25.1168μs 39.8140 KOps/s 39.6858 KOps/s $\color{#35bf28}+0.32\%$
test_to_nonblocking 64.6510μs 23.3157μs 42.8895 KOps/s 41.3202 KOps/s $\color{#35bf28}+3.80\%$
test_unbind_speed 0.3436ms 0.2715ms 3.6827 KOps/s 3.6610 KOps/s $\color{#35bf28}+0.59\%$
test_unbind_speed_stack0 0.3190ms 0.2687ms 3.7215 KOps/s 3.6667 KOps/s $\color{#35bf28}+1.50\%$
test_unbind_speed_stack1 0.1132s 0.7082ms 1.4119 KOps/s 1.4098 KOps/s $\color{#35bf28}+0.15\%$
test_split 0.1013s 2.1571ms 463.5791 Ops/s 471.2691 Ops/s $\color{#d91a1a}-1.63\%$
test_chunk 0.1107s 2.2085ms 452.7863 Ops/s 469.6869 Ops/s $\color{#d91a1a}-3.60\%$
test_creation[device0] 0.3408ms 0.1262ms 7.9243 KOps/s 7.8923 KOps/s $\color{#35bf28}+0.41\%$
test_creation_from_tensor 0.3955ms 0.1307ms 7.6512 KOps/s 7.7589 KOps/s $\color{#d91a1a}-1.39\%$
test_add_one[memmap_tensor0] 0.1877ms 8.2734μs 120.8690 KOps/s 115.4691 KOps/s $\color{#35bf28}+4.68\%$
test_contiguous[memmap_tensor0] 18.4500μs 2.1148μs 472.8590 KOps/s 472.9309 KOps/s $\color{#d91a1a}-0.02\%$
test_stack[memmap_tensor0] 34.8210μs 6.5621μs 152.3899 KOps/s 155.4620 KOps/s $\color{#d91a1a}-1.98\%$
test_memmaptd_index 1.1841ms 0.4070ms 2.4568 KOps/s 2.4278 KOps/s $\color{#35bf28}+1.19\%$
test_memmaptd_index_astensor 0.9249ms 0.4644ms 2.1535 KOps/s 2.1254 KOps/s $\color{#35bf28}+1.32\%$
test_memmaptd_index_op 1.3995ms 1.0158ms 984.4084 Ops/s 1.0087 KOps/s $\color{#d91a1a}-2.41\%$
test_serialize_model 0.1313s 0.1302s 7.6817 Ops/s 7.7047 Ops/s $\color{#d91a1a}-0.30\%$
test_serialize_model_pickle 1.3493s 1.2135s 0.8241 Ops/s 0.8247 Ops/s $\color{#d91a1a}-0.07\%$
test_serialize_weights 0.1321s 0.1299s 7.6985 Ops/s 7.7281 Ops/s $\color{#d91a1a}-0.38\%$
test_serialize_weights_returnearly 0.2642s 57.6154ms 17.3565 Ops/s 18.1949 Ops/s $\color{#d91a1a}-4.61\%$
test_serialize_weights_pickle 1.3492s 1.2134s 0.8241 Ops/s 0.8177 Ops/s $\color{#35bf28}+0.78\%$
test_reshape_pytree 70.2520μs 34.8289μs 28.7118 KOps/s 28.4656 KOps/s $\color{#35bf28}+0.87\%$
test_reshape_td 82.4920μs 39.9833μs 25.0105 KOps/s 24.3199 KOps/s $\color{#35bf28}+2.84\%$
test_view_pytree 67.6920μs 34.5070μs 28.9796 KOps/s 28.0614 KOps/s $\color{#35bf28}+3.27\%$
test_view_td 76.2420μs 44.2837μs 22.5817 KOps/s 21.4039 KOps/s $\textbf{\color{#35bf28}+5.50\%}$
test_unbind_pytree 60.2820μs 32.9021μs 30.3932 KOps/s 29.7436 KOps/s $\color{#35bf28}+2.18\%$
test_unbind_td 0.3814ms 41.8977μs 23.8677 KOps/s 23.1024 KOps/s $\color{#35bf28}+3.31\%$
test_split_pytree 0.5667ms 45.2348μs 22.1069 KOps/s 22.1647 KOps/s $\color{#d91a1a}-0.26\%$
test_split_td 0.1828ms 55.1095μs 18.1457 KOps/s 17.3006 KOps/s $\color{#35bf28}+4.88\%$
test_add_pytree 0.1147ms 56.5002μs 17.6990 KOps/s 17.6811 KOps/s $\color{#35bf28}+0.10\%$
test_add_td 0.1475ms 95.6428μs 10.4556 KOps/s 10.7110 KOps/s $\color{#d91a1a}-2.38\%$
test_compile_add_one_nested[tensordict-compile] 0.4088ms 0.2080ms 4.8066 KOps/s 4.6493 KOps/s $\color{#35bf28}+3.38\%$
test_compile_add_one_nested[tensordict-eager] 0.2038ms 0.1519ms 6.5838 KOps/s 6.6875 KOps/s $\color{#d91a1a}-1.55\%$
test_compile_add_one_nested[pytree-compile] 0.1941ms 0.1486ms 6.7298 KOps/s 7.0192 KOps/s $\color{#d91a1a}-4.12\%$
test_compile_add_one_nested[pytree-eager] 0.2474ms 0.1824ms 5.4822 KOps/s 5.5045 KOps/s $\color{#d91a1a}-0.41\%$
test_compile_copy_nested[tensordict-compile] 0.1186ms 22.0943μs 45.2605 KOps/s 46.0033 KOps/s $\color{#d91a1a}-1.61\%$
test_compile_copy_nested[tensordict-eager] 90.8630μs 43.6967μs 22.8850 KOps/s 23.1298 KOps/s $\color{#d91a1a}-1.06\%$
test_compile_copy_nested[pytree-compile] 0.2181ms 64.8892μs 15.4109 KOps/s 15.4761 KOps/s $\color{#d91a1a}-0.42\%$
test_compile_copy_nested[pytree-eager] 92.1820μs 49.3454μs 20.2653 KOps/s 20.2476 KOps/s $\color{#35bf28}+0.09\%$
test_compile_add_one_flat[tensordict-compile] 0.4113ms 0.3127ms 3.1980 KOps/s 3.1905 KOps/s $\color{#35bf28}+0.23\%$
test_compile_add_one_flat[tensordict-eager] 0.2909ms 0.2104ms 4.7524 KOps/s 4.7470 KOps/s $\color{#35bf28}+0.11\%$
test_compile_add_one_flat[tensorclass-compile] 0.1816ms 0.1259ms 7.9438 KOps/s 7.8600 KOps/s $\color{#35bf28}+1.07\%$
test_compile_add_one_flat[tensorclass-eager] 0.1257ms 62.2082μs 16.0750 KOps/s 15.8090 KOps/s $\color{#35bf28}+1.68\%$
test_compile_add_one_flat[pytree-compile] 0.3584ms 0.3132ms 3.1931 KOps/s 3.1907 KOps/s $\color{#35bf28}+0.07\%$
test_compile_add_one_flat[pytree-eager] 0.6718ms 0.6071ms 1.6473 KOps/s 1.4940 KOps/s $\textbf{\color{#35bf28}+10.26\%}$
test_compile_add_self_flat[tensordict-eager] 0.3084ms 0.2484ms 4.0254 KOps/s 3.9958 KOps/s $\color{#35bf28}+0.74\%$
test_compile_add_self_flat[tensordict-compile] 0.3670ms 0.3111ms 3.2140 KOps/s 3.1615 KOps/s $\color{#35bf28}+1.66\%$
test_compile_add_self_flat[tensorclass-eager] 0.1298ms 71.5136μs 13.9834 KOps/s 13.7746 KOps/s $\color{#35bf28}+1.52\%$
test_compile_add_self_flat[tensorclass-compile] 0.1800ms 0.1266ms 7.9013 KOps/s 7.5074 KOps/s $\textbf{\color{#35bf28}+5.25\%}$
test_compile_add_self_flat[pytree-eager] 0.6099ms 0.5240ms 1.9083 KOps/s 1.9135 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_add_self_flat[pytree-compile] 0.3791ms 0.3118ms 3.2077 KOps/s 3.1731 KOps/s $\color{#35bf28}+1.09\%$
test_compile_copy_flat[tensordict-compile] 0.1312ms 17.6615μs 56.6204 KOps/s 53.5681 KOps/s $\textbf{\color{#35bf28}+5.70\%}$
test_compile_copy_flat[tensordict-eager] 58.8910μs 27.8284μs 35.9345 KOps/s 37.3761 KOps/s $\color{#d91a1a}-3.86\%$
test_compile_copy_flat[pytree-compile] 0.1072ms 70.3045μs 14.2238 KOps/s 14.0850 KOps/s $\color{#35bf28}+0.99\%$
test_compile_copy_flat[pytree-eager] 79.8120μs 51.2608μs 19.5081 KOps/s 19.1473 KOps/s $\color{#35bf28}+1.88\%$
test_compile_assign_and_add[tensordict-compile] 2.3801ms 0.8360ms 1.1962 KOps/s 1.1352 KOps/s $\textbf{\color{#35bf28}+5.37\%}$
test_compile_assign_and_add[tensordict-eager] 3.5843ms 3.2887ms 304.0701 Ops/s 306.1376 Ops/s $\color{#d91a1a}-0.68\%$
test_compile_assign_and_add[pytree-compile] 2.2860ms 0.8024ms 1.2463 KOps/s 1.1461 KOps/s $\textbf{\color{#35bf28}+8.74\%}$
test_compile_assign_and_add[pytree-eager] 3.7384ms 3.3201ms 301.1959 Ops/s 297.5126 Ops/s $\color{#35bf28}+1.24\%$
test_compile_indexing[tensor-tensordict-compile] 0.1578ms 0.1083ms 9.2299 KOps/s 9.1731 KOps/s $\color{#35bf28}+0.62\%$
test_compile_indexing[tensor-tensordict-eager] 0.4637ms 64.4062μs 15.5265 KOps/s 16.6831 KOps/s $\textbf{\color{#d91a1a}-6.93\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.5172ms 0.1088ms 9.1913 KOps/s 9.6870 KOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_compile_indexing[tensor-tensorclass-eager] 91.9920μs 46.9089μs 21.3179 KOps/s 23.1493 KOps/s $\textbf{\color{#d91a1a}-7.91\%}$
test_compile_indexing[tensor-pytree-compile] 0.5124ms 0.1083ms 9.2294 KOps/s 9.6370 KOps/s $\color{#d91a1a}-4.23\%$
test_compile_indexing[tensor-pytree-eager] 0.4470ms 46.5518μs 21.4814 KOps/s 23.3507 KOps/s $\textbf{\color{#d91a1a}-8.01\%}$
test_compile_indexing[slice-tensordict-compile] 0.1892ms 0.1438ms 6.9561 KOps/s 7.3161 KOps/s $\color{#d91a1a}-4.92\%$
test_compile_indexing[slice-tensordict-eager] 0.4312ms 24.7834μs 40.3496 KOps/s 40.7114 KOps/s $\color{#d91a1a}-0.89\%$
test_compile_indexing[slice-tensorclass-compile] 0.5220ms 0.1366ms 7.3222 KOps/s 7.7456 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_compile_indexing[slice-tensorclass-eager] 68.7520μs 22.1470μs 45.1529 KOps/s 47.9409 KOps/s $\textbf{\color{#d91a1a}-5.82\%}$
test_compile_indexing[slice-pytree-compile] 0.5480ms 0.1379ms 7.2497 KOps/s 7.6612 KOps/s $\textbf{\color{#d91a1a}-5.37\%}$
test_compile_indexing[slice-pytree-eager] 0.4126ms 22.5120μs 44.4208 KOps/s 48.2666 KOps/s $\textbf{\color{#d91a1a}-7.97\%}$
test_compile_indexing[int-tensordict-compile] 0.5417ms 0.1458ms 6.8565 KOps/s 7.2837 KOps/s $\textbf{\color{#d91a1a}-5.86\%}$
test_compile_indexing[int-tensordict-eager] 0.4902ms 26.4110μs 37.8630 KOps/s 40.5402 KOps/s $\textbf{\color{#d91a1a}-6.60\%}$
test_compile_indexing[int-tensorclass-compile] 0.5302ms 0.1374ms 7.2771 KOps/s 7.6610 KOps/s $\textbf{\color{#d91a1a}-5.01\%}$
test_compile_indexing[int-tensorclass-eager] 0.4247ms 22.1126μs 45.2231 KOps/s 48.9250 KOps/s $\textbf{\color{#d91a1a}-7.57\%}$
test_compile_indexing[int-pytree-compile] 0.5315ms 0.1383ms 7.2287 KOps/s 7.6588 KOps/s $\textbf{\color{#d91a1a}-5.62\%}$
test_compile_indexing[int-pytree-eager] 0.4144ms 22.3120μs 44.8189 KOps/s 48.4806 KOps/s $\textbf{\color{#d91a1a}-7.55\%}$
test_mod_add[eager] 0.4439ms 35.1290μs 28.4665 KOps/s 31.2000 KOps/s $\textbf{\color{#d91a1a}-8.76\%}$
test_mod_add[compile] 0.4818ms 75.7698μs 13.1979 KOps/s 14.0515 KOps/s $\textbf{\color{#d91a1a}-6.08\%}$
test_mod_add[compile-overhead] 0.2553ms 0.1327ms 7.5368 KOps/s 6.6097 KOps/s $\textbf{\color{#35bf28}+14.03\%}$
test_mod_wrap[eager] 0.3404ms 0.2580ms 3.8764 KOps/s 4.0474 KOps/s $\color{#d91a1a}-4.22\%$
test_mod_wrap[compile] 1.6928ms 0.3011ms 3.3208 KOps/s 3.3322 KOps/s $\color{#d91a1a}-0.34\%$
test_mod_wrap[compile-overhead] 7.7536ms 4.0386ms 247.6116 Ops/s 257.5535 Ops/s $\color{#d91a1a}-3.86\%$
test_mod_wrap_and_backward[eager] 1.5784ms 1.3734ms 728.1126 Ops/s 688.6230 Ops/s $\textbf{\color{#35bf28}+5.73\%}$
test_mod_wrap_and_backward[compile] 1.5543ms 1.3094ms 763.7364 Ops/s 693.5126 Ops/s $\textbf{\color{#35bf28}+10.13\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3257ms 0.9066ms 1.1030 KOps/s 988.2407 Ops/s $\textbf{\color{#35bf28}+11.61\%}$
test_seq_add[eager] 0.1397ms 96.1850μs 10.3966 KOps/s 10.1680 KOps/s $\color{#35bf28}+2.25\%$
test_seq_add[compile] 0.1623ms 79.1596μs 12.6327 KOps/s 12.2947 KOps/s $\color{#35bf28}+2.75\%$
test_seq_add[compile-overhead] 0.1594ms 0.1135ms 8.8144 KOps/s 8.7245 KOps/s $\color{#35bf28}+1.03\%$
test_seq_wrap[eager] 0.4417ms 0.3797ms 2.6339 KOps/s 2.5973 KOps/s $\color{#35bf28}+1.41\%$
test_seq_wrap[compile] 0.3668ms 0.3055ms 3.2737 KOps/s 3.1441 KOps/s $\color{#35bf28}+4.12\%$
test_seq_wrap[compile-overhead] 0.3154ms 0.2169ms 4.6106 KOps/s 4.5515 KOps/s $\color{#35bf28}+1.30\%$
test_func_call_runtime[False-eager] 0.8552ms 0.7419ms 1.3480 KOps/s 1.3207 KOps/s $\color{#35bf28}+2.06\%$
test_func_call_runtime[False-compile] 0.8398ms 0.7711ms 1.2968 KOps/s 1.2556 KOps/s $\color{#35bf28}+3.28\%$
test_func_call_runtime[False-compile-overhead] 0.4033ms 0.3533ms 2.8307 KOps/s 2.8140 KOps/s $\color{#35bf28}+0.59\%$
test_func_call_runtime[True-eager] 0.9838ms 0.8980ms 1.1135 KOps/s 1.0983 KOps/s $\color{#35bf28}+1.38\%$
test_func_call_runtime[True-compile] 0.8456ms 0.7893ms 1.2669 KOps/s 1.2259 KOps/s $\color{#35bf28}+3.35\%$
test_func_call_runtime[True-compile-overhead] 0.4702ms 0.3759ms 2.6603 KOps/s 2.6448 KOps/s $\color{#35bf28}+0.58\%$
test_func_call_cm_runtime[False-eager] 0.8932ms 0.7459ms 1.3406 KOps/s 1.3372 KOps/s $\color{#35bf28}+0.26\%$
test_func_call_cm_runtime[False-compile] 0.8469ms 0.7710ms 1.2971 KOps/s 1.2023 KOps/s $\textbf{\color{#35bf28}+7.88\%}$
test_func_call_cm_runtime[False-compile-overhead] 0.4070ms 0.3546ms 2.8201 KOps/s 2.7372 KOps/s $\color{#35bf28}+3.03\%$
test_func_call_cm_runtime[True-eager] 1.0927ms 0.9857ms 1.0145 KOps/s 988.3589 Ops/s $\color{#35bf28}+2.65\%$
test_func_call_cm_runtime[True-compile] 0.9730ms 0.8243ms 1.2131 KOps/s 1.1889 KOps/s $\color{#35bf28}+2.04\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4419ms 0.3986ms 2.5090 KOps/s 2.4623 KOps/s $\color{#35bf28}+1.90\%$
test_vmap_func_call_cm_runtime[eager] 2.5274ms 2.0615ms 485.0769 Ops/s 478.0223 Ops/s $\color{#35bf28}+1.48\%$
test_vmap_func_call_cm_runtime[compile] 0.9247ms 0.8375ms 1.1941 KOps/s 1.1531 KOps/s $\color{#35bf28}+3.55\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4463ms 0.4024ms 2.4850 KOps/s 2.4731 KOps/s $\color{#35bf28}+0.48\%$
test_distributed 3.5849ms 0.1326ms 7.5418 KOps/s 8.8073 KOps/s $\textbf{\color{#d91a1a}-14.37\%}$
test_tdmodule 76.2120μs 16.4837μs 60.6660 KOps/s 68.1156 KOps/s $\textbf{\color{#d91a1a}-10.94\%}$
test_tdmodule_dispatch 70.6920μs 32.4636μs 30.8038 KOps/s 34.9408 KOps/s $\textbf{\color{#d91a1a}-11.84\%}$
test_tdseq 31.2110μs 16.8256μs 59.4331 KOps/s 64.8716 KOps/s $\textbf{\color{#d91a1a}-8.38\%}$
test_tdseq_dispatch 72.1220μs 35.5455μs 28.1330 KOps/s 31.8824 KOps/s $\textbf{\color{#d91a1a}-11.76\%}$
test_instantiation_functorch 1.9926ms 1.8591ms 537.8905 Ops/s 536.1681 Ops/s $\color{#35bf28}+0.32\%$
test_instantiation_td 1.8119ms 1.1908ms 839.7750 Ops/s 842.2252 Ops/s $\color{#d91a1a}-0.29\%$
test_exec_functorch 0.2864ms 0.2166ms 4.6168 KOps/s 4.7910 KOps/s $\color{#d91a1a}-3.64\%$
test_exec_functional_call 0.2908ms 0.2192ms 4.5615 KOps/s 4.6516 KOps/s $\color{#d91a1a}-1.94\%$
test_exec_td 0.2776ms 0.2252ms 4.4412 KOps/s 4.2887 KOps/s $\color{#35bf28}+3.55\%$
test_exec_td_decorator 1.0263ms 0.2676ms 3.7367 KOps/s 3.8451 KOps/s $\color{#d91a1a}-2.82\%$
test_vmap_mlp_speed[True-True] 0.8226ms 0.6873ms 1.4549 KOps/s 1.4561 KOps/s $\color{#d91a1a}-0.08\%$
test_vmap_mlp_speed[True-False] 0.7456ms 0.6860ms 1.4578 KOps/s 1.4653 KOps/s $\color{#d91a1a}-0.51\%$
test_vmap_mlp_speed[False-True] 0.6679ms 0.5744ms 1.7409 KOps/s 1.7481 KOps/s $\color{#d91a1a}-0.42\%$
test_vmap_mlp_speed[False-False] 0.6612ms 0.5737ms 1.7432 KOps/s 1.7403 KOps/s $\color{#35bf28}+0.16\%$
test_vmap_mlp_speed_decorator[True-True] 0.8137ms 0.6726ms 1.4867 KOps/s 1.5088 KOps/s $\color{#d91a1a}-1.46\%$
test_vmap_mlp_speed_decorator[True-False] 0.8873ms 0.6702ms 1.4921 KOps/s 1.4962 KOps/s $\color{#d91a1a}-0.28\%$
test_vmap_mlp_speed_decorator[False-True] 0.7042ms 0.5862ms 1.7058 KOps/s 1.6836 KOps/s $\color{#35bf28}+1.32\%$
test_vmap_mlp_speed_decorator[False-False] 0.7042ms 0.5869ms 1.7038 KOps/s 1.6936 KOps/s $\color{#35bf28}+0.61\%$
test_vmap_transformer_speed[True-True] 8.3549ms 8.2778ms 120.8045 Ops/s 119.8472 Ops/s $\color{#35bf28}+0.80\%$
test_vmap_transformer_speed[True-False] 8.3414ms 8.2508ms 121.2001 Ops/s 119.6688 Ops/s $\color{#35bf28}+1.28\%$
test_vmap_transformer_speed[False-True] 8.1266ms 8.0696ms 123.9221 Ops/s 122.9985 Ops/s $\color{#35bf28}+0.75\%$
test_vmap_transformer_speed[False-False] 8.1451ms 8.0942ms 123.5458 Ops/s 122.2167 Ops/s $\color{#35bf28}+1.09\%$
test_vmap_transformer_speed_decorator[True-True] 19.5168ms 19.4188ms 51.4964 Ops/s 51.3787 Ops/s $\color{#35bf28}+0.23\%$
test_vmap_transformer_speed_decorator[True-False] 19.4986ms 19.4006ms 51.5447 Ops/s 51.2134 Ops/s $\color{#35bf28}+0.65\%$
test_vmap_transformer_speed_decorator[False-True] 19.3808ms 19.2978ms 51.8193 Ops/s 51.3231 Ops/s $\color{#35bf28}+0.97\%$
test_vmap_transformer_speed_decorator[False-False] 19.3687ms 19.2904ms 51.8394 Ops/s 50.2932 Ops/s $\color{#35bf28}+3.07\%$
test_to_module_speed[True] 1.4373ms 0.9400ms 1.0638 KOps/s 1.0736 KOps/s $\color{#d91a1a}-0.91\%$
test_to_module_speed[False] 1.2770ms 0.9028ms 1.1076 KOps/s 1.1140 KOps/s $\color{#d91a1a}-0.58\%$
test_tc_init 68.1620μs 35.6984μs 28.0124 KOps/s 29.3356 KOps/s $\color{#d91a1a}-4.51\%$
test_tc_init_nested 0.1657ms 73.3576μs 13.6319 KOps/s 14.4319 KOps/s $\textbf{\color{#d91a1a}-5.54\%}$
test_tc_first_layer_tensor 4.3444μs 0.6770μs 1.4772 MOps/s 1.4869 MOps/s $\color{#d91a1a}-0.66\%$
test_tc_first_layer_nontensor 37.4510μs 2.2377μs 446.8884 KOps/s 450.5686 KOps/s $\color{#d91a1a}-0.82\%$
test_tc_second_layer_tensor 12.3550μs 1.4228μs 702.8345 KOps/s 736.8392 KOps/s $\color{#d91a1a}-4.61\%$
test_tc_second_layer_nontensor 24.6310μs 2.9695μs 336.7514 KOps/s 343.9336 KOps/s $\color{#d91a1a}-2.09\%$
test_unbind 0.1991s 11.9868ms 83.4250 Ops/s 86.6661 Ops/s $\color{#d91a1a}-3.74\%$
test_full_like 0.6567ms 0.5760ms 1.7361 KOps/s 1.7457 KOps/s $\color{#d91a1a}-0.55\%$
test_zeros_like 0.2763ms 0.1979ms 5.0522 KOps/s 5.0545 KOps/s $\color{#d91a1a}-0.05\%$
test_ones_like 0.2345ms 0.1978ms 5.0546 KOps/s 5.0582 KOps/s $\color{#d91a1a}-0.07\%$
test_clone 0.4494ms 0.4145ms 2.4124 KOps/s 2.4137 KOps/s $\color{#d91a1a}-0.05\%$
test_squeeze 52.0810μs 9.6796μs 103.3102 KOps/s 105.9371 KOps/s $\color{#d91a1a}-2.48\%$
test_unsqueeze 0.2946ms 75.7585μs 13.1998 KOps/s 13.2889 KOps/s $\color{#d91a1a}-0.67\%$
test_split 0.2559ms 0.1571ms 6.3671 KOps/s 6.4187 KOps/s $\color{#d91a1a}-0.80\%$
test_permute 0.2215ms 0.1728ms 5.7870 KOps/s 5.7344 KOps/s $\color{#35bf28}+0.92\%$
test_stack 1.2538ms 0.8590ms 1.1641 KOps/s 1.1914 KOps/s $\color{#d91a1a}-2.29\%$
test_cat 1.3520ms 1.2315ms 812.0506 Ops/s 811.6497 Ops/s $\color{#35bf28}+0.05\%$

@vmoens vmoens merged commit ab43118 into gh/vmoens/22/base Oct 1, 2024
50 of 51 checks passed
vmoens added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: cca23e89c8526b19b4389d15cf9c4e36a151ac15
Pull Request resolved: #1018
@vmoens vmoens deleted the gh/vmoens/22/head branch October 1, 2024 12:54
@vmoens vmoens added the enhancement New feature or request label Oct 1, 2024
2 participants