[Feature] cat and stack_from_tensordict #1018
Merged
vmoens added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: cca23e89c8526b19b4389d15cf9c4e36a151ac15
Pull Request resolved: #1018
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 48.7010μs | 21.4302μs | 46.6630 KOps/s | 49.7184 KOps/s | |
test_plain_set_stack_nested | 61.8550μs | 21.7859μs | 45.9012 KOps/s | 50.2940 KOps/s | |
test_plain_set_nested_inplace | 89.7980μs | 23.3583μs | 42.8113 KOps/s | 46.3498 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3046ms | 23.5510μs | 42.4611 KOps/s | 46.8031 KOps/s | |
test_items | 87.4340μs | 4.1342μs | 241.8861 KOps/s | 239.6965 KOps/s | |
test_items_nested | 0.4663ms | 0.3684ms | 2.7141 KOps/s | 2.7593 KOps/s | |
test_items_nested_locked | 0.5793ms | 0.3680ms | 2.7175 KOps/s | 2.7585 KOps/s | |
test_items_nested_leaf | 0.1409ms | 69.6933μs | 14.3486 KOps/s | 14.5708 KOps/s | |
test_items_stack_nested | 1.5018ms | 0.3834ms | 2.6083 KOps/s | 2.7453 KOps/s | |
test_items_stack_nested_leaf | 0.1451ms | 70.8488μs | 14.1146 KOps/s | 14.0586 KOps/s | |
test_items_stack_nested_locked | 0.4547ms | 0.3711ms | 2.6950 KOps/s | 2.7730 KOps/s | |
test_keys | 27.3620μs | 3.7780μs | 264.6875 KOps/s | 261.3771 KOps/s | |
test_keys_nested | 0.1786ms | 0.1000ms | 9.9967 KOps/s | 10.1663 KOps/s | |
test_keys_nested_locked | 0.6898ms | 0.1049ms | 9.5297 KOps/s | 9.5916 KOps/s | |
test_keys_nested_leaf | 0.4949ms | 82.6317μs | 12.1019 KOps/s | 12.2854 KOps/s | |
test_keys_stack_nested | 0.1878ms | 99.6008μs | 10.0401 KOps/s | 10.0122 KOps/s | |
test_keys_stack_nested_leaf | 0.1514ms | 83.0774μs | 12.0370 KOps/s | 12.0390 KOps/s | |
test_keys_stack_nested_locked | 0.1942ms | 0.1044ms | 9.5798 KOps/s | 9.5666 KOps/s | |
test_values | 12.2368μs | 1.0300μs | 970.8606 KOps/s | 942.6081 KOps/s | |
test_values_nested | 0.1381ms | 75.4346μs | 13.2565 KOps/s | 13.7625 KOps/s | |
test_values_nested_locked | 0.1298ms | 75.1117μs | 13.3135 KOps/s | 13.7213 KOps/s | |
test_values_nested_leaf | 0.1117ms | 62.8725μs | 15.9052 KOps/s | 16.1981 KOps/s | |
test_values_stack_nested | 0.1372ms | 75.9250μs | 13.1709 KOps/s | 13.6563 KOps/s | |
test_values_stack_nested_leaf | 0.1288ms | 62.3903μs | 16.0281 KOps/s | 16.3840 KOps/s | |
test_values_stack_nested_locked | 0.1353ms | 77.0692μs | 12.9754 KOps/s | 13.6332 KOps/s | |
test_membership | 28.6930μs | 0.8904μs | 1.1230 MOps/s | 1.3799 MOps/s | |
test_membership_nested | 21.3690μs | 2.7708μs | 360.9013 KOps/s | 363.3580 KOps/s | |
test_membership_nested_leaf | 20.4780μs | 2.7985μs | 357.3297 KOps/s | 363.7485 KOps/s | |
test_membership_stacked_nested | 29.6860μs | 2.7645μs | 361.7293 KOps/s | 361.2515 KOps/s | |
test_membership_stacked_nested_leaf | 22.4920μs | 2.7888μs | 358.5822 KOps/s | 363.9825 KOps/s | |
test_membership_nested_last | 57.2060μs | 3.9377μs | 253.9544 KOps/s | 254.2762 KOps/s | |
test_membership_nested_leaf_last | 22.7220μs | 3.9680μs | 252.0144 KOps/s | 244.3395 KOps/s | |
test_membership_stacked_nested_last | 53.2500μs | 4.0311μs | 248.0729 KOps/s | 249.6996 KOps/s | |
test_membership_stacked_nested_leaf_last | 30.6170μs | 3.9867μs | 250.8324 KOps/s | 250.0340 KOps/s | |
test_nested_getleaf | 37.0190μs | 10.5568μs | 94.7259 KOps/s | 95.8009 KOps/s | |
test_nested_get | 52.1880μs | 10.2072μs | 97.9703 KOps/s | 100.6432 KOps/s | |
test_stacked_getleaf | 50.4040μs | 10.6432μs | 93.9563 KOps/s | 95.7127 KOps/s | |
test_stacked_get | 55.4840μs | 10.1934μs | 98.1025 KOps/s | 99.9141 KOps/s | |
test_nested_getitemleaf | 65.3780μs | 11.2549μs | 88.8503 KOps/s | 91.2985 KOps/s | |
test_nested_getitem | 33.8840μs | 10.2861μs | 97.2184 KOps/s | 97.9265 KOps/s | |
test_stacked_getitemleaf | 54.8320μs | 11.0261μs | 90.6940 KOps/s | 92.3414 KOps/s | |
test_stacked_getitem | 31.3390μs | 10.1930μs | 98.1061 KOps/s | 98.8031 KOps/s | |
test_lock_nested | 84.5483ms | 0.5732ms | 1.7447 KOps/s | 2.0384 KOps/s | |
test_lock_stack_nested | 0.5308ms | 0.4532ms | 2.2064 KOps/s | 2.1899 KOps/s | |
test_unlock_nested | 86.7829ms | 0.4887ms | 2.0461 KOps/s | 2.4328 KOps/s | |
test_unlock_stack_nested | 0.6394ms | 0.3698ms | 2.7040 KOps/s | 2.6516 KOps/s | |
test_flatten_speed | 0.1421ms | 89.1073μs | 11.2224 KOps/s | 11.5173 KOps/s | |
test_unflatten_speed | 0.8188ms | 0.4634ms | 2.1578 KOps/s | 2.1949 KOps/s | |
test_common_ops | 4.1153ms | 1.1337ms | 882.0979 Ops/s | 891.6723 Ops/s | |
test_creation | 0.1037ms | 2.1137μs | 473.1094 KOps/s | 483.0627 KOps/s | |
test_creation_empty | 54.5820μs | 19.8471μs | 50.3851 KOps/s | 60.6157 KOps/s | |
test_creation_nested_1 | 66.8540μs | 23.1752μs | 43.1496 KOps/s | 51.0311 KOps/s | |
test_creation_nested_2 | 67.8670μs | 27.4464μs | 36.4346 KOps/s | 41.9142 KOps/s | |
test_clone | 0.1309ms | 17.0206μs | 58.7523 KOps/s | 56.9559 KOps/s | |
test_getitem[int] | 1.2014ms | 17.6537μs | 56.6452 KOps/s | 59.9245 KOps/s | |
test_getitem[slice_int] | 0.1490ms | 31.1670μs | 32.0853 KOps/s | 32.1699 KOps/s | |
test_getitem[range] | 0.5438ms | 61.0262μs | 16.3864 KOps/s | 16.2624 KOps/s | |
test_getitem[tuple] | 0.1710ms | 25.4184μs | 39.3415 KOps/s | 39.6757 KOps/s | |
test_getitem[list] | 0.1846ms | 56.4736μs | 17.7074 KOps/s | 17.6079 KOps/s | |
test_setitem_dim[int] | 70.7720μs | 34.2439μs | 29.2023 KOps/s | 29.7747 KOps/s | |
test_setitem_dim[slice_int] | 0.1070ms | 63.5883μs | 15.7262 KOps/s | 16.1464 KOps/s | |
test_setitem_dim[range] | 0.1374ms | 86.9064μs | 11.5066 KOps/s | 11.6887 KOps/s | |
test_setitem_dim[tuple] | 96.3000μs | 51.1013μs | 19.5690 KOps/s | 19.9993 KOps/s | |
test_setitem | 0.3287ms | 31.7581μs | 31.4881 KOps/s | 33.0395 KOps/s | |
test_set | 78.7070μs | 30.8676μs | 32.3965 KOps/s | 33.9413 KOps/s | |
test_set_shared | 3.8020ms | 0.2153ms | 4.6449 KOps/s | 4.6344 KOps/s | |
test_update | 0.2866ms | 38.2153μs | 26.1675 KOps/s | 27.8239 KOps/s | |
test_update_nested | 0.1147ms | 49.7118μs | 20.1159 KOps/s | 21.7236 KOps/s | |
test_update__nested | 0.1124ms | 35.9779μs | 27.7949 KOps/s | 28.0122 KOps/s | |
test_set_nested | 0.2396ms | 34.0590μs | 29.3608 KOps/s | 31.7488 KOps/s | |
test_set_nested_new | 0.2182ms | 39.5952μs | 25.2556 KOps/s | 27.5809 KOps/s | |
test_select | 0.2814ms | 55.1338μs | 18.1377 KOps/s | 18.7545 KOps/s | |
test_select_nested | 0.1298ms | 59.9664μs | 16.6760 KOps/s | 16.9014 KOps/s | |
test_exclude_nested | 0.1519ms | 74.9006μs | 13.3510 KOps/s | 13.5600 KOps/s | |
test_empty[True] | 0.4806ms | 0.3194ms | 3.1304 KOps/s | 3.1993 KOps/s | |
test_empty[False] | 12.7740μs | 1.2214μs | 818.7176 KOps/s | 827.3167 KOps/s | |
test_unbind_speed | 0.6849ms | 0.3016ms | 3.3161 KOps/s | 3.2865 KOps/s | |
test_unbind_speed_stack0 | 0.4356ms | 0.2982ms | 3.3530 KOps/s | 3.3335 KOps/s | |
test_unbind_speed_stack1 | 98.1605ms | 0.8238ms | 1.2139 KOps/s | 1.3393 KOps/s | |
test_split | 96.5449ms | 2.1758ms | 459.5925 Ops/s | 455.7650 Ops/s | |
test_chunk | 2.5932ms | 2.0038ms | 499.0608 Ops/s | 456.1051 Ops/s | |
test_creation[device0] | 0.2990ms | 0.1186ms | 8.4317 KOps/s | 8.5487 KOps/s | |
test_creation_from_tensor | 4.9279ms | 0.1183ms | 8.4552 KOps/s | 8.4705 KOps/s | |
test_add_one[memmap_tensor0] | 0.4890ms | 7.3356μs | 136.3219 KOps/s | 126.9265 KOps/s | |
test_contiguous[memmap_tensor0] | 29.0740μs | 1.8898μs | 529.1581 KOps/s | 518.6170 KOps/s | |
test_stack[memmap_tensor0] | 59.0310μs | 5.6102μs | 178.2461 KOps/s | 172.9641 KOps/s | |
test_memmaptd_index | 1.1806ms | 0.3994ms | 2.5036 KOps/s | 2.4804 KOps/s | |
test_memmaptd_index_astensor | 0.7367ms | 0.4759ms | 2.1013 KOps/s | 2.0720 KOps/s | |
test_memmaptd_index_op | 1.7681ms | 1.0456ms | 956.4305 Ops/s | 978.2566 Ops/s | |
test_serialize_model | 0.1213s | 0.1171s | 8.5422 Ops/s | 8.4625 Ops/s | |
test_serialize_model_pickle | 0.4714s | 0.3961s | 2.5243 Ops/s | 2.5600 Ops/s | |
test_serialize_weights | 0.1245s | 0.1190s | 8.4015 Ops/s | 7.6227 Ops/s | |
test_serialize_weights_returnearly | 0.1776s | 0.1660s | 6.0228 Ops/s | 6.2361 Ops/s | |
test_serialize_weights_pickle | 1.1083s | 0.7510s | 1.3316 Ops/s | 2.4468 Ops/s | |
test_serialize_weights_filesystem | 0.1498s | 0.1451s | 6.8927 Ops/s | 7.0432 Ops/s | |
test_serialize_model_filesystem | 0.1521s | 0.1442s | 6.9343 Ops/s | 6.1479 Ops/s | |
test_reshape_pytree | 91.9210μs | 38.8830μs | 25.7182 KOps/s | 25.8881 KOps/s | |
test_reshape_td | 0.1084ms | 46.8752μs | 21.3333 KOps/s | 21.7073 KOps/s | |
test_view_pytree | 0.1208ms | 38.5732μs | 25.9247 KOps/s | 25.8974 KOps/s | |
test_view_td | 0.1148ms | 52.9293μs | 18.8931 KOps/s | 18.7185 KOps/s | |
test_unbind_pytree | 94.5370μs | 35.5156μs | 28.1566 KOps/s | 27.7672 KOps/s | |
test_unbind_td | 0.2896ms | 44.5003μs | 22.4717 KOps/s | 21.6811 KOps/s | |
test_split_pytree | 80.5810μs | 37.4955μs | 26.6699 KOps/s | 25.0599 KOps/s | |
test_split_td | 0.4922ms | 56.8780μs | 17.5815 KOps/s | 17.3518 KOps/s | |
test_add_pytree | 92.0010μs | 44.7709μs | 22.3359 KOps/s | 21.6238 KOps/s | |
test_add_td | 0.1633ms | 83.4073μs | 11.9894 KOps/s | 12.3500 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1165ms | 56.9763μs | 17.5511 KOps/s | 17.1629 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2649ms | 0.1804ms | 5.5435 KOps/s | 5.4819 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1128ms | 56.2096μs | 17.7905 KOps/s | 17.1382 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2459ms | 0.1399ms | 7.1472 KOps/s | 6.9879 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 55.4630μs | 21.4153μs | 46.6956 KOps/s | 46.3824 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1808ms | 67.7967μs | 14.7500 KOps/s | 15.0945 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1471ms | 77.6554μs | 12.8774 KOps/s | 13.5435 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1251ms | 68.8625μs | 14.5217 KOps/s | 14.9039 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.7938ms | 0.1777ms | 5.6285 KOps/s | 5.7677 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3145ms | 0.1892ms | 5.2845 KOps/s | 5.2843 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1085ms | 46.1496μs | 21.6687 KOps/s | 20.3652 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1296ms | 69.5586μs | 14.3764 KOps/s | 14.0617 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.8425ms | 0.1795ms | 5.5721 KOps/s | 5.7841 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4926ms | 0.2814ms | 3.5534 KOps/s | 3.3804 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.2993ms | 0.2037ms | 4.9100 KOps/s | 4.9934 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.8146ms | 0.1746ms | 5.7260 KOps/s | 5.7817 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1957ms | 63.0746μs | 15.8542 KOps/s | 15.5257 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1023ms | 48.2997μs | 20.7041 KOps/s | 20.0314 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4975ms | 0.2300ms | 4.3487 KOps/s | 4.2061 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3072ms | 0.1759ms | 5.6857 KOps/s | 5.6968 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1829ms | 0.1016ms | 9.8379 KOps/s | 9.7294 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1284ms | 56.4223μs | 17.7235 KOps/s | 17.7666 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1727ms | 79.6088μs | 12.5614 KOps/s | 13.0275 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1369ms | 70.1573μs | 14.2537 KOps/s | 14.6235 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2994ms | 0.1976ms | 5.0595 KOps/s | 4.8149 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.1374ms | 1.6211ms | 616.8691 Ops/s | 608.3224 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2864ms | 0.1942ms | 5.1506 KOps/s | 5.1447 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3337ms | 1.0788ms | 926.9296 Ops/s | 879.6870 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.6082ms | 0.4237ms | 2.3599 KOps/s | 2.4410 KOps/s | |
test_compile_assign_and_add_stack[eager] | 6.8858ms | 4.0500ms | 246.9122 Ops/s | 267.1281 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 94.4360μs | 35.0427μs | 28.5366 KOps/s | 28.8577 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.8825ms | 50.5602μs | 19.7784 KOps/s | 19.8080 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 75.3710μs | 30.4831μs | 32.8050 KOps/s | 33.5962 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 86.1510μs | 28.8749μs | 34.6322 KOps/s | 34.9295 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 82.9650μs | 30.2901μs | 33.0141 KOps/s | 33.2763 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1587ms | 28.8892μs | 34.6150 KOps/s | 34.4802 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1348ms | 73.9982μs | 13.5138 KOps/s | 13.5080 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5724ms | 27.6006μs | 36.2311 KOps/s | 35.4004 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.6292ms | 67.3832μs | 14.8405 KOps/s | 14.5975 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.3001ms | 23.7461μs | 42.1122 KOps/s | 43.7815 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1621ms | 67.3442μs | 14.8491 KOps/s | 14.6960 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 96.7810μs | 23.0495μs | 43.3848 KOps/s | 43.4828 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1763ms | 73.6633μs | 13.5753 KOps/s | 13.2308 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8957ms | 27.5342μs | 36.3185 KOps/s | 35.5266 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1507ms | 66.5958μs | 15.0160 KOps/s | 14.7475 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.2154ms | 22.9136μs | 43.6422 KOps/s | 44.6715 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2078ms | 67.2536μs | 14.8691 KOps/s | 14.5991 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1190ms | 22.8794μs | 43.7074 KOps/s | 44.3790 KOps/s | |
test_mod_add[eager] | 86.5410μs | 26.3390μs | 37.9665 KOps/s | 38.6339 KOps/s | |
test_mod_add[compile] | 0.1060ms | 39.0108μs | 25.6340 KOps/s | 25.0828 KOps/s | |
test_mod_add[compile-overhead] | 0.1099ms | 39.6850μs | 25.1984 KOps/s | 24.8038 KOps/s | |
test_mod_wrap[eager] | 0.3636ms | 0.2149ms | 4.6543 KOps/s | 4.7113 KOps/s | |
test_mod_wrap[compile] | 0.4526ms | 0.2349ms | 4.2568 KOps/s | 4.1369 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4520ms | 0.2314ms | 4.3209 KOps/s | 4.2238 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.6014ms | 10.9257ms | 91.5274 Ops/s | 91.3515 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.4757ms | 10.9841ms | 91.0410 Ops/s | 90.5205 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.1475ms | 10.9332ms | 91.4646 Ops/s | 90.4662 Ops/s | |
test_seq_add[eager] | 0.2128ms | 95.4036μs | 10.4818 KOps/s | 10.7872 KOps/s | |
test_seq_add[compile] | 0.1233ms | 63.6547μs | 15.7098 KOps/s | 14.8474 KOps/s | |
test_seq_add[compile-overhead] | 0.1562ms | 64.1231μs | 15.5950 KOps/s | 15.1664 KOps/s | |
test_seq_wrap[eager] | 0.6666ms | 0.3971ms | 2.5181 KOps/s | 2.5588 KOps/s | |
test_seq_wrap[compile] | 1.3402ms | 0.2723ms | 3.6719 KOps/s | 3.5651 KOps/s | |
test_seq_wrap[compile-overhead] | 1.3919ms | 0.2696ms | 3.7092 KOps/s | 3.6319 KOps/s | |
test_func_call_runtime[False-eager] | 1.2203ms | 0.5454ms | 1.8334 KOps/s | 1.9016 KOps/s | |
test_func_call_runtime[False-compile] | 0.9141ms | 0.5038ms | 1.9848 KOps/s | 1.9706 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8772ms | 0.5065ms | 1.9742 KOps/s | 1.9602 KOps/s | |
test_func_call_runtime[True-eager] | 0.9019ms | 0.7636ms | 1.3096 KOps/s | 1.3215 KOps/s | |
test_func_call_runtime[True-compile] | 0.7195ms | 0.5106ms | 1.9586 KOps/s | 1.8936 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6100ms | 0.5112ms | 1.9563 KOps/s | 1.8933 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.6896ms | 0.5341ms | 1.8723 KOps/s | 1.8766 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6148ms | 0.5044ms | 1.9826 KOps/s | 1.9473 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.8629ms | 0.5025ms | 1.9899 KOps/s | 1.9499 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4586ms | 0.8933ms | 1.1195 KOps/s | 1.1142 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0925ms | 0.7556ms | 1.3234 KOps/s | 1.3156 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.3258ms | 0.7673ms | 1.3032 KOps/s | 1.3032 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5871ms | 1.8840ms | 530.7729 Ops/s | 516.8353 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.8659ms | 1.9313ms | 517.7857 Ops/s | 503.7640 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.6722ms | 1.9314ms | 517.7506 Ops/s | 501.4666 Ops/s | |
test_distributed | 0.4114ms | 0.1257ms | 7.9544 KOps/s | 7.6890 KOps/s | |
test_tdmodule | 0.1234ms | 19.2535μs | 51.9385 KOps/s | 54.8969 KOps/s | |
test_tdmodule_dispatch | 54.8020μs | 38.2316μs | 26.1563 KOps/s | 28.1267 KOps/s | |
test_tdseq | 45.8360μs | 22.5036μs | 44.4372 KOps/s | 47.7666 KOps/s | |
test_tdseq_dispatch | 68.2470μs | 44.3059μs | 22.5704 KOps/s | 24.2214 KOps/s | |
test_instantiation_functorch | 1.8546ms | 1.5951ms | 626.9168 Ops/s | 610.4782 Ops/s | |
test_instantiation_td | 2.0838ms | 1.1735ms | 852.1705 Ops/s | 821.0503 Ops/s | |
test_exec_functorch | 0.4162ms | 0.1880ms | 5.3190 KOps/s | 5.4469 KOps/s | |
test_exec_functional_call | 0.3087ms | 0.1754ms | 5.7001 KOps/s | 5.6040 KOps/s | |
test_exec_td | 0.3267ms | 0.1748ms | 5.7210 KOps/s | 5.6556 KOps/s | |
test_exec_td_decorator | 0.3990ms | 0.2281ms | 4.3838 KOps/s | 4.2963 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1563ms | 0.6705ms | 1.4913 KOps/s | 1.5182 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.9666ms | 0.6524ms | 1.5327 KOps/s | 1.5308 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.8042ms | 0.5051ms | 1.9799 KOps/s | 1.9651 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.8306ms | 0.5014ms | 1.9944 KOps/s | 1.9623 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.5963ms | 0.6315ms | 1.5835 KOps/s | 1.5905 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8359ms | 0.6287ms | 1.5905 KOps/s | 1.5816 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8727ms | 0.5226ms | 1.9134 KOps/s | 1.9094 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7067ms | 0.5198ms | 1.9239 KOps/s | 1.9111 KOps/s | |
test_to_module_speed[True] | 2.2080ms | 1.3150ms | 760.4332 Ops/s | 742.7936 Ops/s | |
test_to_module_speed[False] | 4.7380ms | 1.3135ms | 761.3222 Ops/s | 792.1188 Ops/s | |
test_tc_init | 85.0690μs | 44.6382μs | 22.4023 KOps/s | 23.2027 KOps/s | |
test_tc_init_nested | 0.1655ms | 90.9373μs | 10.9966 KOps/s | 11.9189 KOps/s | |
test_tc_first_layer_tensor | 17.8330μs | 1.5339μs | 651.9493 KOps/s | 661.2843 KOps/s | |
test_tc_first_layer_nontensor | 26.4100μs | 4.7747μs | 209.4373 KOps/s | 213.7106 KOps/s | |
test_tc_second_layer_tensor | 35.4730μs | 2.8187μs | 354.7775 KOps/s | 358.2922 KOps/s | |
test_tc_second_layer_nontensor | 46.8340μs | 6.0650μs | 164.8817 KOps/s | 163.7959 KOps/s | |
test_unbind | 0.4799s | 14.3111ms | 69.8759 Ops/s | 73.5983 Ops/s | |
test_full_like | 9.5711ms | 7.7253ms | 129.4447 Ops/s | 122.3186 Ops/s | |
test_zeros_like | 3.4183ms | 2.9608ms | 337.7515 Ops/s | 335.6319 Ops/s | |
test_ones_like | 4.0680ms | 3.5039ms | 285.3937 Ops/s | 293.5554 Ops/s | |
test_clone | 6.6494ms | 5.2647ms | 189.9451 Ops/s | 187.1486 Ops/s | |
test_squeeze | 64.1490μs | 12.7999μs | 78.1259 KOps/s | 76.3937 KOps/s | |
test_unsqueeze | 0.3406ms | 92.3647μs | 10.8266 KOps/s | 10.2915 KOps/s | |
test_split | 0.4489ms | 0.1932ms | 5.1766 KOps/s | 5.0183 KOps/s | |
test_permute | 0.4416ms | 0.2239ms | 4.4666 KOps/s | 4.3733 KOps/s | |
test_stack | 28.4453ms | 25.7008ms | 38.9093 Ops/s | 41.0316 Ops/s | |
test_cat | 28.2167ms | 25.5638ms | 39.1178 Ops/s | 41.0104 Ops/s |
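
The Change column is empty in the table above, so a reader has to compare Ops with Ops on Repo HEAD by hand. A minimal sketch of that arithmetic follows, assuming Ops is roughly the reciprocal of Mean and that the change of interest is the relative difference between the two Ops columns. The helper names (`parse_quantity`, `parse_row`, `relative_change`) are hypothetical and not part of the benchmark tooling.

```python
# Unit multipliers for the time and throughput suffixes used in the table.
TIME_UNITS = {"s": 1.0, "ms": 1e-3, "\u03bcs": 1e-6}
OPS_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_quantity(text, units):
    """Convert a cell such as '0.3046ms' or '46.6630 KOps/s' to base units."""
    # Treat the micro sign (U+00B5) and the Greek mu (U+03BC) as the same prefix.
    text = text.strip().replace("\u00b5", "\u03bc")
    # Try longer suffixes first so 'ms' is not mistaken for 's'.
    for suffix in sorted(units, key=len, reverse=True):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * units[suffix]
    raise ValueError(f"unrecognised unit in {text!r}")


def parse_row(row):
    """Split one pipe-separated table row into a name and numeric columns."""
    cells = [cell.strip() for cell in row.split("|")]
    name, max_t, mean_t, ops, ops_head = cells[:5]
    return {
        "name": name,
        "max_s": parse_quantity(max_t, TIME_UNITS),
        "mean_s": parse_quantity(mean_t, TIME_UNITS),
        "ops": parse_quantity(ops, OPS_UNITS),
        "ops_head": parse_quantity(ops_head, OPS_UNITS),
    }


def relative_change(row):
    """Relative difference of this run's Ops against Ops on Repo HEAD."""
    return (row["ops"] - row["ops_head"]) / row["ops_head"]


row = parse_row(
    "test_plain_set_nested | 48.7010μs | 21.4302μs | "
    "46.6630 KOps/s | 49.7184 KOps/s | |"
)
print(f"{row['name']}: 1/mean = {1 / row['mean_s']:.0f} ops/s, "
      f"reported = {row['ops']:.0f} ops/s, "
      f"change vs HEAD = {relative_change(row):+.1%}")
```

For the first row this puts the run roughly 6% below Repo HEAD. Differences of this size on microsecond-scale benchmarks may well be run-to-run noise, so the sketch is only meant for reading the table, not for judging regressions.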

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1213ms | 14.8673μs | 67.2616 KOps/s | 73.3138 KOps/s | |
test_plain_set_stack_nested | 55.1520μs | 15.0362μs | 66.5061 KOps/s | 72.5745 KOps/s | |
test_plain_set_nested_inplace | 49.8010μs | 15.7793μs | 63.3742 KOps/s | 68.5555 KOps/s | |
test_plain_set_stack_nested_inplace | 42.1010μs | 15.7044μs | 63.6766 KOps/s | 68.5066 KOps/s | |
test_items | 45.8120μs | 2.8745μs | 347.8912 KOps/s | 342.8157 KOps/s | |
test_items_nested | 0.3667ms | 0.3250ms | 3.0773 KOps/s | 3.0499 KOps/s | |
test_items_nested_locked | 0.3832ms | 0.3263ms | 3.0644 KOps/s | 3.0400 KOps/s | |
test_items_nested_leaf | 0.1038ms | 55.8010μs | 17.9208 KOps/s | 18.0546 KOps/s | |
test_items_stack_nested | 0.3812ms | 0.3306ms | 3.0248 KOps/s | 3.0923 KOps/s | |
test_items_stack_nested_leaf | 0.1032ms | 56.9466μs | 17.5603 KOps/s | 18.0189 KOps/s | |
test_items_stack_nested_locked | 0.3803ms | 0.3318ms | 3.0137 KOps/s | 3.0306 KOps/s | |
test_keys | 0.6144ms | 3.4597μs | 289.0456 KOps/s | 289.8978 KOps/s | |
test_keys_nested | 86.6120μs | 54.8196μs | 18.2416 KOps/s | 18.0265 KOps/s | |
test_keys_nested_locked | 0.9080ms | 61.9596μs | 16.1396 KOps/s | 16.0503 KOps/s | |
test_keys_nested_leaf | 71.9920μs | 46.8137μs | 21.3612 KOps/s | 21.8148 KOps/s | |
test_keys_stack_nested | 83.0320μs | 57.1960μs | 17.4837 KOps/s | 17.7858 KOps/s | |
test_keys_stack_nested_leaf | 77.9920μs | 48.8450μs | 20.4729 KOps/s | 21.2171 KOps/s | |
test_keys_stack_nested_locked | 0.1006ms | 62.2073μs | 16.0753 KOps/s | 16.1696 KOps/s | |
test_values | 4.3935μs | 0.8332μs | 1.2002 MOps/s | 1.1921 MOps/s | |
test_values_nested | 66.5210μs | 40.5280μs | 24.6743 KOps/s | 24.6813 KOps/s | |
test_values_nested_locked | 71.1120μs | 42.4091μs | 23.5798 KOps/s | 23.5891 KOps/s | |
test_values_nested_leaf | 59.2810μs | 35.2105μs | 28.4006 KOps/s | 28.3368 KOps/s | |
test_values_stack_nested | 73.9520μs | 41.7290μs | 23.9641 KOps/s | 24.5690 KOps/s | |
test_values_stack_nested_leaf | 66.4510μs | 35.6551μs | 28.0465 KOps/s | 28.3092 KOps/s | |
test_values_stack_nested_locked | 74.0010μs | 43.2940μs | 23.0979 KOps/s | 23.5970 KOps/s | |
test_membership | 1.9096μs | 0.5013μs | 1.9947 MOps/s | 2.0016 MOps/s | |
test_membership_nested | 15.1550μs | 1.8521μs | 539.9325 KOps/s | 541.8824 KOps/s | |
test_membership_nested_leaf | 10.2503μs | 1.8262μs | 547.5773 KOps/s | 549.7523 KOps/s | |
test_membership_stacked_nested | 38.9010μs | 1.9329μs | 517.3617 KOps/s | 523.5516 KOps/s | |
test_membership_stacked_nested_leaf | 26.2810μs | 1.9738μs | 506.6437 KOps/s | 522.0070 KOps/s | |
test_membership_nested_last | 0.1031ms | 2.7699μs | 361.0180 KOps/s | 360.5066 KOps/s | |
test_membership_nested_leaf_last | 43.2210μs | 2.8093μs | 355.9598 KOps/s | 352.8688 KOps/s | |
test_membership_stacked_nested_last | 26.9910μs | 3.1812μs | 314.3427 KOps/s | 360.5459 KOps/s | |
test_membership_stacked_nested_leaf_last | 30.0400μs | 3.1343μs | 319.0518 KOps/s | 362.6589 KOps/s | |
test_nested_getleaf | 28.5410μs | 6.2037μs | 161.1948 KOps/s | 164.8019 KOps/s | |
test_nested_get | 31.7710μs | 5.7803μs | 173.0009 KOps/s | 173.3818 KOps/s | |
test_stacked_getleaf | 38.4710μs | 5.9944μs | 166.8227 KOps/s | 165.3175 KOps/s | |
test_stacked_get | 35.0510μs | 5.7343μs | 174.3901 KOps/s | 176.1150 KOps/s | |
test_nested_getitemleaf | 30.7810μs | 6.0980μs | 163.9877 KOps/s | 161.9671 KOps/s | |
test_nested_getitem | 26.6910μs | 5.8019μs | 172.3577 KOps/s | 173.7914 KOps/s | |
test_stacked_getitemleaf | 26.1210μs | 6.1597μs | 162.3455 KOps/s | 163.7292 KOps/s | |
test_stacked_getitem | 34.9810μs | 5.6505μs | 176.9763 KOps/s | 174.4092 KOps/s | |
test_lock_nested | 7.8729ms | 0.4189ms | 2.3871 KOps/s | 2.3923 KOps/s | |
test_lock_stack_nested | 0.4133ms | 0.3784ms | 2.6429 KOps/s | 2.6540 KOps/s | |
test_unlock_nested | 0.7682ms | 0.3519ms | 2.8415 KOps/s | 2.8508 KOps/s | |
test_unlock_stack_nested | 0.3473ms | 0.3184ms | 3.1403 KOps/s | 3.1579 KOps/s | |
test_flatten_speed | 0.1490ms | 69.2264μs | 14.4454 KOps/s | 14.3801 KOps/s | |
test_unflatten_speed | 0.3306ms | 0.2863ms | 3.4931 KOps/s | 3.5552 KOps/s | |
test_common_ops | 1.6980ms | 1.3106ms | 763.0274 Ops/s | 809.3252 Ops/s | |
test_creation | 30.0210μs | 1.4616μs | 684.1803 KOps/s | 656.9638 KOps/s | |
test_creation_empty | 44.6220μs | 17.5217μs | 57.0719 KOps/s | 68.1152 KOps/s | |
test_creation_nested_1 | 46.1620μs | 19.0014μs | 52.6278 KOps/s | 60.9656 KOps/s | |
test_creation_nested_2 | 50.4920μs | 21.8460μs | 45.7750 KOps/s | 50.8475 KOps/s | |
test_clone | 59.6610μs | 28.2635μs | 35.3813 KOps/s | 34.5782 KOps/s | |
test_getitem[int] | 1.2285ms | 15.4457μs | 64.7431 KOps/s | 63.1939 KOps/s | |
test_getitem[slice_int] | 0.1327ms | 26.8479μs | 37.2469 KOps/s | 36.2453 KOps/s | |
test_getitem[range] | 0.1601ms | 0.1094ms | 9.1411 KOps/s | 9.1896 KOps/s | |
test_getitem[tuple] | 0.1261ms | 22.9348μs | 43.6018 KOps/s | 43.6408 KOps/s | |
test_getitem[list] | 0.2051ms | 0.1030ms | 9.7064 KOps/s | 10.0884 KOps/s | |
test_setitem_dim[int] | 71.1020μs | 47.1718μs | 21.1991 KOps/s | 22.5315 KOps/s | |
test_setitem_dim[slice_int] | 95.6020μs | 69.3186μs | 14.4261 KOps/s | 15.0648 KOps/s | |
test_setitem_dim[range] | 0.1786ms | 0.1346ms | 7.4318 KOps/s | 7.9055 KOps/s | |
test_setitem_dim[tuple] | 87.6620μs | 63.7220μs | 15.6932 KOps/s | 16.6035 KOps/s | |
test_setitem | 84.3210μs | 44.0474μs | 22.7028 KOps/s | 24.4729 KOps/s | |
test_set | 78.2020μs | 42.4142μs | 23.5770 KOps/s | 24.9899 KOps/s | |
test_set_shared | 0.3508ms | 50.4587μs | 19.8182 KOps/s | 20.1166 KOps/s | |
test_update | 87.5920μs | 53.0281μs | 18.8579 KOps/s | 20.7422 KOps/s | |
test_update_nested | 97.1420μs | 60.0035μs | 16.6657 KOps/s | 18.3103 KOps/s | |
test_update__nested | 91.9410μs | 57.6631μs | 17.3421 KOps/s | 16.9776 KOps/s | |
test_set_nested | 83.8310μs | 45.9876μs | 21.7450 KOps/s | 23.4391 KOps/s | |
test_set_nested_new | 83.2720μs | 49.4748μs | 20.2123 KOps/s | 21.8793 KOps/s | |
test_select | 95.6020μs | 61.6227μs | 16.2278 KOps/s | 16.6286 KOps/s | |
test_select_nested | 69.0320μs | 42.5654μs | 23.4932 KOps/s | 23.7056 KOps/s | |
test_exclude_nested | 83.6520μs | 59.9503μs | 16.6805 KOps/s | 16.9234 KOps/s | |
test_empty[True] | 0.2927ms | 0.2415ms | 4.1414 KOps/s | 4.0567 KOps/s | |
test_empty[False] | 2.8440μs | 0.7417μs | 1.3482 MOps/s | 1.3666 MOps/s | |
test_to | 52.3610μs | 25.1168μs | 39.8140 KOps/s | 39.6858 KOps/s | |
test_to_nonblocking | 64.6510μs | 23.3157μs | 42.8895 KOps/s | 41.3202 KOps/s | |
test_unbind_speed | 0.3436ms | 0.2715ms | 3.6827 KOps/s | 3.6610 KOps/s | |
test_unbind_speed_stack0 | 0.3190ms | 0.2687ms | 3.7215 KOps/s | 3.6667 KOps/s | |
test_unbind_speed_stack1 | 0.1132s | 0.7082ms | 1.4119 KOps/s | 1.4098 KOps/s | |
test_split | 0.1013s | 2.1571ms | 463.5791 Ops/s | 471.2691 Ops/s | |
test_chunk | 0.1107s | 2.2085ms | 452.7863 Ops/s | 469.6869 Ops/s | |
test_creation[device0] | 0.3408ms | 0.1262ms | 7.9243 KOps/s | 7.8923 KOps/s | |
test_creation_from_tensor | 0.3955ms | 0.1307ms | 7.6512 KOps/s | 7.7589 KOps/s | |
test_add_one[memmap_tensor0] | 0.1877ms | 8.2734μs | 120.8690 KOps/s | 115.4691 KOps/s | |
test_contiguous[memmap_tensor0] | 18.4500μs | 2.1148μs | 472.8590 KOps/s | 472.9309 KOps/s | |
test_stack[memmap_tensor0] | 34.8210μs | 6.5621μs | 152.3899 KOps/s | 155.4620 KOps/s | |
test_memmaptd_index | 1.1841ms | 0.4070ms | 2.4568 KOps/s | 2.4278 KOps/s | |
test_memmaptd_index_astensor | 0.9249ms | 0.4644ms | 2.1535 KOps/s | 2.1254 KOps/s | |
test_memmaptd_index_op | 1.3995ms | 1.0158ms | 984.4084 Ops/s | 1.0087 KOps/s | |
test_serialize_model | 0.1313s | 0.1302s | 7.6817 Ops/s | 7.7047 Ops/s | |
test_serialize_model_pickle | 1.3493s | 1.2135s | 0.8241 Ops/s | 0.8247 Ops/s | |
test_serialize_weights | 0.1321s | 0.1299s | 7.6985 Ops/s | 7.7281 Ops/s | |
test_serialize_weights_returnearly | 0.2642s | 57.6154ms | 17.3565 Ops/s | 18.1949 Ops/s | |
test_serialize_weights_pickle | 1.3492s | 1.2134s | 0.8241 Ops/s | 0.8177 Ops/s | |
test_reshape_pytree | 70.2520μs | 34.8289μs | 28.7118 KOps/s | 28.4656 KOps/s | |
test_reshape_td | 82.4920μs | 39.9833μs | 25.0105 KOps/s | 24.3199 KOps/s | |
test_view_pytree | 67.6920μs | 34.5070μs | 28.9796 KOps/s | 28.0614 KOps/s | |
test_view_td | 76.2420μs | 44.2837μs | 22.5817 KOps/s | 21.4039 KOps/s | |
test_unbind_pytree | 60.2820μs | 32.9021μs | 30.3932 KOps/s | 29.7436 KOps/s | |
test_unbind_td | 0.3814ms | 41.8977μs | 23.8677 KOps/s | 23.1024 KOps/s | |
test_split_pytree | 0.5667ms | 45.2348μs | 22.1069 KOps/s | 22.1647 KOps/s | |
test_split_td | 0.1828ms | 55.1095μs | 18.1457 KOps/s | 17.3006 KOps/s | |
test_add_pytree | 0.1147ms | 56.5002μs | 17.6990 KOps/s | 17.6811 KOps/s | |
test_add_td | 0.1475ms | 95.6428μs | 10.4556 KOps/s | 10.7110 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4088ms | 0.2080ms | 4.8066 KOps/s | 4.6493 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2038ms | 0.1519ms | 6.5838 KOps/s | 6.6875 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1941ms | 0.1486ms | 6.7298 KOps/s | 7.0192 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2474ms | 0.1824ms | 5.4822 KOps/s | 5.5045 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1186ms | 22.0943μs | 45.2605 KOps/s | 46.0033 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 90.8630μs | 43.6967μs | 22.8850 KOps/s | 23.1298 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2181ms | 64.8892μs | 15.4109 KOps/s | 15.4761 KOps/s | |
test_compile_copy_nested[pytree-eager] | 92.1820μs | 49.3454μs | 20.2653 KOps/s | 20.2476 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4113ms | 0.3127ms | 3.1980 KOps/s | 3.1905 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2909ms | 0.2104ms | 4.7524 KOps/s | 4.7470 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1816ms | 0.1259ms | 7.9438 KOps/s | 7.8600 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1257ms | 62.2082μs | 16.0750 KOps/s | 15.8090 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3584ms | 0.3132ms | 3.1931 KOps/s | 3.1907 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6718ms | 0.6071ms | 1.6473 KOps/s | 1.4940 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3084ms | 0.2484ms | 4.0254 KOps/s | 3.9958 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3670ms | 0.3111ms | 3.2140 KOps/s | 3.1615 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1298ms | 71.5136μs | 13.9834 KOps/s | 13.7746 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1800ms | 0.1266ms | 7.9013 KOps/s | 7.5074 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6099ms | 0.5240ms | 1.9083 KOps/s | 1.9135 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3791ms | 0.3118ms | 3.2077 KOps/s | 3.1731 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1312ms | 17.6615μs | 56.6204 KOps/s | 53.5681 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 58.8910μs | 27.8284μs | 35.9345 KOps/s | 37.3761 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1072ms | 70.3045μs | 14.2238 KOps/s | 14.0850 KOps/s | |
test_compile_copy_flat[pytree-eager] | 79.8120μs | 51.2608μs | 19.5081 KOps/s | 19.1473 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3801ms | 0.8360ms | 1.1962 KOps/s | 1.1352 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.5843ms | 3.2887ms | 304.0701 Ops/s | 306.1376 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.2860ms | 0.8024ms | 1.2463 KOps/s | 1.1461 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.7384ms | 3.3201ms | 301.1959 Ops/s | 297.5126 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1578ms | 0.1083ms | 9.2299 KOps/s | 9.1731 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.4637ms | 64.4062μs | 15.5265 KOps/s | 16.6831 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.5172ms | 0.1088ms | 9.1913 KOps/s | 9.6870 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 91.9920μs | 46.9089μs | 21.3179 KOps/s | 23.1493 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.5124ms | 0.1083ms | 9.2294 KOps/s | 9.6370 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.4470ms | 46.5518μs | 21.4814 KOps/s | 23.3507 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1892ms | 0.1438ms | 6.9561 KOps/s | 7.3161 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4312ms | 24.7834μs | 40.3496 KOps/s | 40.7114 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.5220ms | 0.1366ms | 7.3222 KOps/s | 7.7456 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 68.7520μs | 22.1470μs | 45.1529 KOps/s | 47.9409 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.5480ms | 0.1379ms | 7.2497 KOps/s | 7.6612 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.4126ms | 22.5120μs | 44.4208 KOps/s | 48.2666 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.5417ms | 0.1458ms | 6.8565 KOps/s | 7.2837 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4902ms | 26.4110μs | 37.8630 KOps/s | 40.5402 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.5302ms | 0.1374ms | 7.2771 KOps/s | 7.6610 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.4247ms | 22.1126μs | 45.2231 KOps/s | 48.9250 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.5315ms | 0.1383ms | 7.2287 KOps/s | 7.6588 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.4144ms | 22.3120μs | 44.8189 KOps/s | 48.4806 KOps/s | |
test_mod_add[eager] | 0.4439ms | 35.1290μs | 28.4665 KOps/s | 31.2000 KOps/s | |
test_mod_add[compile] | 0.4818ms | 75.7698μs | 13.1979 KOps/s | 14.0515 KOps/s | |
test_mod_add[compile-overhead] | 0.2553ms | 0.1327ms | 7.5368 KOps/s | 6.6097 KOps/s | |
test_mod_wrap[eager] | 0.3404ms | 0.2580ms | 3.8764 KOps/s | 4.0474 KOps/s | |
test_mod_wrap[compile] | 1.6928ms | 0.3011ms | 3.3208 KOps/s | 3.3322 KOps/s | |
test_mod_wrap[compile-overhead] | 7.7536ms | 4.0386ms | 247.6116 Ops/s | 257.5535 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5784ms | 1.3734ms | 728.1126 Ops/s | 688.6230 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5543ms | 1.3094ms | 763.7364 Ops/s | 693.5126 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3257ms | 0.9066ms | 1.1030 KOps/s | 988.2407 Ops/s | |
test_seq_add[eager] | 0.1397ms | 96.1850μs | 10.3966 KOps/s | 10.1680 KOps/s | |
test_seq_add[compile] | 0.1623ms | 79.1596μs | 12.6327 KOps/s | 12.2947 KOps/s | |
test_seq_add[compile-overhead] | 0.1594ms | 0.1135ms | 8.8144 KOps/s | 8.7245 KOps/s | |
test_seq_wrap[eager] | 0.4417ms | 0.3797ms | 2.6339 KOps/s | 2.5973 KOps/s | |
test_seq_wrap[compile] | 0.3668ms | 0.3055ms | 3.2737 KOps/s | 3.1441 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3154ms | 0.2169ms | 4.6106 KOps/s | 4.5515 KOps/s | |
test_func_call_runtime[False-eager] | 0.8552ms | 0.7419ms | 1.3480 KOps/s | 1.3207 KOps/s | |
test_func_call_runtime[False-compile] | 0.8398ms | 0.7711ms | 1.2968 KOps/s | 1.2556 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4033ms | 0.3533ms | 2.8307 KOps/s | 2.8140 KOps/s | |
test_func_call_runtime[True-eager] | 0.9838ms | 0.8980ms | 1.1135 KOps/s | 1.0983 KOps/s | |
test_func_call_runtime[True-compile] | 0.8456ms | 0.7893ms | 1.2669 KOps/s | 1.2259 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4702ms | 0.3759ms | 2.6603 KOps/s | 2.6448 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8932ms | 0.7459ms | 1.3406 KOps/s | 1.3372 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8469ms | 0.7710ms | 1.2971 KOps/s | 1.2023 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4070ms | 0.3546ms | 2.8201 KOps/s | 2.7372 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0927ms | 0.9857ms | 1.0145 KOps/s | 988.3589 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.9730ms | 0.8243ms | 1.2131 KOps/s | 1.1889 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4419ms | 0.3986ms | 2.5090 KOps/s | 2.4623 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5274ms | 2.0615ms | 485.0769 Ops/s | 478.0223 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9247ms | 0.8375ms | 1.1941 KOps/s | 1.1531 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4463ms | 0.4024ms | 2.4850 KOps/s | 2.4731 KOps/s | |
test_distributed | 3.5849ms | 0.1326ms | 7.5418 KOps/s | 8.8073 KOps/s | |
test_tdmodule | 76.2120μs | 16.4837μs | 60.6660 KOps/s | 68.1156 KOps/s | |
test_tdmodule_dispatch | 70.6920μs | 32.4636μs | 30.8038 KOps/s | 34.9408 KOps/s | |
test_tdseq | 31.2110μs | 16.8256μs | 59.4331 KOps/s | 64.8716 KOps/s | |
test_tdseq_dispatch | 72.1220μs | 35.5455μs | 28.1330 KOps/s | 31.8824 KOps/s | |
test_instantiation_functorch | 1.9926ms | 1.8591ms | 537.8905 Ops/s | 536.1681 Ops/s | |
test_instantiation_td | 1.8119ms | 1.1908ms | 839.7750 Ops/s | 842.2252 Ops/s | |
test_exec_functorch | 0.2864ms | 0.2166ms | 4.6168 KOps/s | 4.7910 KOps/s | |
test_exec_functional_call | 0.2908ms | 0.2192ms | 4.5615 KOps/s | 4.6516 KOps/s | |
test_exec_td | 0.2776ms | 0.2252ms | 4.4412 KOps/s | 4.2887 KOps/s | |
test_exec_td_decorator | 1.0263ms | 0.2676ms | 3.7367 KOps/s | 3.8451 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8226ms | 0.6873ms | 1.4549 KOps/s | 1.4561 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7456ms | 0.6860ms | 1.4578 KOps/s | 1.4653 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6679ms | 0.5744ms | 1.7409 KOps/s | 1.7481 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6612ms | 0.5737ms | 1.7432 KOps/s | 1.7403 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8137ms | 0.6726ms | 1.4867 KOps/s | 1.5088 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8873ms | 0.6702ms | 1.4921 KOps/s | 1.4962 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7042ms | 0.5862ms | 1.7058 KOps/s | 1.6836 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7042ms | 0.5869ms | 1.7038 KOps/s | 1.6936 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.3549ms | 8.2778ms | 120.8045 Ops/s | 119.8472 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.3414ms | 8.2508ms | 121.2001 Ops/s | 119.6688 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.1266ms | 8.0696ms | 123.9221 Ops/s | 122.9985 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.1451ms | 8.0942ms | 123.5458 Ops/s | 122.2167 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.5168ms | 19.4188ms | 51.4964 Ops/s | 51.3787 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.4986ms | 19.4006ms | 51.5447 Ops/s | 51.2134 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.3808ms | 19.2978ms | 51.8193 Ops/s | 51.3231 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.3687ms | 19.2904ms | 51.8394 Ops/s | 50.2932 Ops/s | |
test_to_module_speed[True] | 1.4373ms | 0.9400ms | 1.0638 KOps/s | 1.0736 KOps/s | |
test_to_module_speed[False] | 1.2770ms | 0.9028ms | 1.1076 KOps/s | 1.1140 KOps/s | |
test_tc_init | 68.1620μs | 35.6984μs | 28.0124 KOps/s | 29.3356 KOps/s | |
test_tc_init_nested | 0.1657ms | 73.3576μs | 13.6319 KOps/s | 14.4319 KOps/s | |
test_tc_first_layer_tensor | 4.3444μs | 0.6770μs | 1.4772 MOps/s | 1.4869 MOps/s | |
test_tc_first_layer_nontensor | 37.4510μs | 2.2377μs | 446.8884 KOps/s | 450.5686 KOps/s | |
test_tc_second_layer_tensor | 12.3550μs | 1.4228μs | 702.8345 KOps/s | 736.8392 KOps/s | |
test_tc_second_layer_nontensor | 24.6310μs | 2.9695μs | 336.7514 KOps/s | 343.9336 KOps/s | |
test_unbind | 0.1991s | 11.9868ms | 83.4250 Ops/s | 86.6661 Ops/s | |
test_full_like | 0.6567ms | 0.5760ms | 1.7361 KOps/s | 1.7457 KOps/s | |
test_zeros_like | 0.2763ms | 0.1979ms | 5.0522 KOps/s | 5.0545 KOps/s | |
test_ones_like | 0.2345ms | 0.1978ms | 5.0546 KOps/s | 5.0582 KOps/s | |
test_clone | 0.4494ms | 0.4145ms | 2.4124 KOps/s | 2.4137 KOps/s | |
test_squeeze | 52.0810μs | 9.6796μs | 103.3102 KOps/s | 105.9371 KOps/s | |
test_unsqueeze | 0.2946ms | 75.7585μs | 13.1998 KOps/s | 13.2889 KOps/s | |
test_split | 0.2559ms | 0.1571ms | 6.3671 KOps/s | 6.4187 KOps/s | |
test_permute | 0.2215ms | 0.1728ms | 5.7870 KOps/s | 5.7344 KOps/s | |
test_stack | 1.2538ms | 0.8590ms | 1.1641 KOps/s | 1.1914 KOps/s | |
test_cat | 1.3520ms | 1.2315ms | 812.0506 Ops/s | 811.6497 Ops/s |
Labels: CLA Signed, enhancement
Stack from ghstack (oldest at bottom):