[Performance] Faster __setitem__
#985
Merged
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 43.7020μs | 19.7683μs | 50.5861 KOps/s | 50.6269 KOps/s | |
test_plain_set_stack_nested | 76.2530μs | 19.6018μs | 51.0158 KOps/s | 49.5060 KOps/s | |
test_plain_set_nested_inplace | 72.2440μs | 21.0585μs | 47.4868 KOps/s | 46.7433 KOps/s | |
test_plain_set_stack_nested_inplace | 54.9220μs | 21.0145μs | 47.5863 KOps/s | 46.8539 KOps/s | |
test_items | 37.0490μs | 4.1491μs | 241.0170 KOps/s | 239.4892 KOps/s | |
test_items_nested | 0.5584ms | 0.3280ms | 3.0491 KOps/s | 3.0665 KOps/s | |
test_items_nested_locked | 0.7369ms | 0.3297ms | 3.0329 KOps/s | 3.0506 KOps/s | |
test_items_nested_leaf | 0.1506ms | 83.6913μs | 11.9487 KOps/s | 11.8463 KOps/s | |
test_items_stack_nested | 0.5518ms | 0.3336ms | 2.9974 KOps/s | 3.0575 KOps/s | |
test_items_stack_nested_leaf | 0.1621ms | 84.9211μs | 11.7756 KOps/s | 12.2768 KOps/s | |
test_items_stack_nested_locked | 0.7023ms | 0.3349ms | 2.9863 KOps/s | 3.0266 KOps/s | |
test_keys | 26.8600μs | 3.5421μs | 282.3166 KOps/s | 285.9806 KOps/s | |
test_keys_nested | 0.2006ms | 95.4475μs | 10.4770 KOps/s | 10.3481 KOps/s | |
test_keys_nested_locked | 0.6447ms | 0.1029ms | 9.7190 KOps/s | 9.8357 KOps/s | |
test_keys_nested_leaf | 0.1448ms | 82.2726μs | 12.1547 KOps/s | 12.6615 KOps/s | |
test_keys_stack_nested | 0.1700ms | 96.9058μs | 10.3193 KOps/s | 10.5638 KOps/s | |
test_keys_stack_nested_leaf | 0.1592ms | 80.1452μs | 12.4773 KOps/s | 12.6525 KOps/s | |
test_keys_stack_nested_locked | 0.2175ms | 0.1010ms | 9.8963 KOps/s | 10.1344 KOps/s | |
test_values | 6.8248μs | 1.0805μs | 925.5293 KOps/s | 930.5236 KOps/s | |
test_values_nested | 92.0120μs | 47.8231μs | 20.9104 KOps/s | 20.8680 KOps/s | |
test_values_nested_locked | 0.1085ms | 47.2998μs | 21.1418 KOps/s | 20.7587 KOps/s | |
test_values_nested_leaf | 82.0930μs | 42.5775μs | 23.4866 KOps/s | 23.4490 KOps/s | |
test_values_stack_nested | 93.2440μs | 47.9519μs | 20.8543 KOps/s | 20.8655 KOps/s | |
test_values_stack_nested_leaf | 83.4450μs | 42.0633μs | 23.7737 KOps/s | 23.8307 KOps/s | |
test_values_stack_nested_locked | 0.1006ms | 48.0192μs | 20.8250 KOps/s | 20.6665 KOps/s | |
test_membership | 5.7607μs | 0.6778μs | 1.4753 MOps/s | 1.1994 MOps/s | |
test_membership_nested | 27.9120μs | 2.5064μs | 398.9718 KOps/s | 372.8840 KOps/s | |
test_membership_nested_leaf | 32.9020μs | 2.5525μs | 391.7796 KOps/s | 372.0231 KOps/s | |
test_membership_stacked_nested | 32.5710μs | 2.5301μs | 395.2409 KOps/s | 371.1607 KOps/s | |
test_membership_stacked_nested_leaf | 20.2480μs | 2.5478μs | 392.4901 KOps/s | 370.5554 KOps/s | |
test_membership_nested_last | 32.5400μs | 3.6900μs | 271.0017 KOps/s | 244.9960 KOps/s | |
test_membership_nested_leaf_last | 28.8840μs | 3.7914μs | 263.7570 KOps/s | 259.2550 KOps/s | |
test_membership_stacked_nested_last | 36.1470μs | 4.2987μs | 232.6302 KOps/s | 74.3801 KOps/s | |
test_membership_stacked_nested_leaf_last | 24.5760μs | 4.3835μs | 228.1285 KOps/s | 76.8339 KOps/s | |
test_nested_getleaf | 57.0770μs | 10.4257μs | 95.9165 KOps/s | 95.3602 KOps/s | |
test_nested_get | 37.5700μs | 10.0169μs | 99.8313 KOps/s | 99.0250 KOps/s | |
test_stacked_getleaf | 34.0730μs | 10.4747μs | 95.4677 KOps/s | 94.0921 KOps/s | |
test_stacked_get | 33.7230μs | 10.1589μs | 98.4359 KOps/s | 99.8489 KOps/s | |
test_nested_getitemleaf | 53.9540μs | 10.8826μs | 91.8897 KOps/s | 89.4438 KOps/s | |
test_nested_getitem | 37.6700μs | 10.1090μs | 98.9220 KOps/s | 98.0792 KOps/s | |
test_stacked_getitemleaf | 53.2500μs | 10.7499μs | 93.0241 KOps/s | 92.8758 KOps/s | |
test_stacked_getitem | 39.9550μs | 9.9541μs | 100.4613 KOps/s | 95.9691 KOps/s | |
test_lock_nested | 91.0414ms | 0.5740ms | 1.7422 KOps/s | 2.1243 KOps/s | |
test_lock_stack_nested | 0.5270ms | 0.4421ms | 2.2618 KOps/s | 2.3285 KOps/s | |
test_unlock_nested | 0.1008s | 0.5022ms | 1.9911 KOps/s | 2.5056 KOps/s | |
test_unlock_stack_nested | 0.4507ms | 0.3652ms | 2.7383 KOps/s | 2.8395 KOps/s | |
test_flatten_speed | 0.2557ms | 0.1041ms | 9.6058 KOps/s | 9.6144 KOps/s | |
test_unflatten_speed | 0.8302ms | 0.4571ms | 2.1878 KOps/s | 2.1921 KOps/s | |
test_common_ops | 1.8974ms | 1.0786ms | 927.1560 Ops/s | 933.7538 Ops/s | |
test_creation | 30.5260μs | 2.0389μs | 490.4577 KOps/s | 481.7022 KOps/s | |
test_creation_empty | 55.2340μs | 15.8990μs | 62.8970 KOps/s | 58.2018 KOps/s | |
test_creation_nested_1 | 1.2741ms | 19.9306μs | 50.1740 KOps/s | 48.7585 KOps/s | |
test_creation_nested_2 | 66.2530μs | 23.1977μs | 43.1078 KOps/s | 41.0878 KOps/s | |
test_clone | 0.1559ms | 16.7753μs | 59.6115 KOps/s | 61.0164 KOps/s | |
test_getitem[int] | 0.7285ms | 16.2275μs | 61.6239 KOps/s | 60.2766 KOps/s | |
test_getitem[slice_int] | 0.1346ms | 29.4544μs | 33.9508 KOps/s | 33.7445 KOps/s | |
test_getitem[range] | 0.1687ms | 58.2448μs | 17.1689 KOps/s | 17.4578 KOps/s | |
test_getitem[tuple] | 0.1387ms | 24.8195μs | 40.2909 KOps/s | 40.9777 KOps/s | |
test_getitem[list] | 0.7665ms | 51.7614μs | 19.3194 KOps/s | 18.8465 KOps/s | |
test_setitem_dim[int] | 72.0640μs | 33.0241μs | 30.2809 KOps/s | 25.1177 KOps/s | |
test_setitem_dim[slice_int] | 0.1124ms | 60.2953μs | 16.5850 KOps/s | 14.7726 KOps/s | |
test_setitem_dim[range] | 0.1366ms | 83.2132μs | 12.0173 KOps/s | 10.9030 KOps/s | |
test_setitem_dim[tuple] | 83.6660μs | 48.9521μs | 20.4281 KOps/s | 18.0067 KOps/s | |
test_setitem | 77.9850μs | 28.5089μs | 35.0768 KOps/s | 35.5112 KOps/s | |
test_set | 0.2151ms | 28.0594μs | 35.6386 KOps/s | 36.1285 KOps/s | |
test_set_shared | 1.4122ms | 0.2119ms | 4.7192 KOps/s | 4.7274 KOps/s | |
test_update | 0.1726ms | 34.4990μs | 28.9864 KOps/s | 29.1837 KOps/s | |
test_update_nested | 1.0129ms | 43.1853μs | 23.1560 KOps/s | 22.4208 KOps/s | |
test_update__nested | 0.2120ms | 35.0952μs | 28.4939 KOps/s | 29.8335 KOps/s | |
test_set_nested | 0.2822ms | 31.0555μs | 32.2004 KOps/s | 33.3184 KOps/s | |
test_set_nested_new | 0.3777ms | 36.5346μs | 27.3713 KOps/s | 28.3331 KOps/s | |
test_select | 0.3679ms | 52.8382μs | 18.9257 KOps/s | 18.7745 KOps/s | |
test_select_nested | 0.1210ms | 61.3582μs | 16.2977 KOps/s | 16.6508 KOps/s | |
test_exclude_nested | 0.1613ms | 76.8307μs | 13.0156 KOps/s | 13.1507 KOps/s | |
test_empty[True] | 0.6906ms | 0.3150ms | 3.1748 KOps/s | 3.1720 KOps/s | |
test_empty[False] | 14.8078μs | 1.2220μs | 818.2985 KOps/s | 833.2730 KOps/s | |
test_unbind_speed | 0.6202ms | 0.2965ms | 3.3724 KOps/s | 3.4029 KOps/s | |
test_unbind_speed_stack0 | 0.4847ms | 0.2913ms | 3.4328 KOps/s | 3.5675 KOps/s | |
test_unbind_speed_stack1 | 89.9919ms | 0.7899ms | 1.2660 KOps/s | 1.4022 KOps/s | |
test_split | 96.2916ms | 2.2939ms | 435.9463 Ops/s | 457.6720 Ops/s | |
test_chunk | 3.0589ms | 2.0084ms | 497.9130 Ops/s | 460.5361 Ops/s | |
test_creation[device0] | 0.2819ms | 0.1164ms | 8.5875 KOps/s | 8.5743 KOps/s | |
test_creation_from_tensor | 5.0547ms | 0.1179ms | 8.4838 KOps/s | 8.5626 KOps/s | |
test_add_one[memmap_tensor0] | 0.4798ms | 7.6876μs | 130.0798 KOps/s | 138.1945 KOps/s | |
test_contiguous[memmap_tensor0] | 24.6760μs | 1.9042μs | 525.1608 KOps/s | 523.8325 KOps/s | |
test_stack[memmap_tensor0] | 0.1213ms | 5.7894μs | 172.7286 KOps/s | 173.1243 KOps/s | |
test_memmaptd_index | 1.1458ms | 0.4001ms | 2.4993 KOps/s | 2.5228 KOps/s | |
test_memmaptd_index_astensor | 0.9639ms | 0.4786ms | 2.0893 KOps/s | 2.1116 KOps/s | |
test_memmaptd_index_op | 1.6073ms | 1.0002ms | 999.7709 Ops/s | 1.0081 KOps/s | |
test_serialize_model | 0.1240s | 0.1150s | 8.6974 Ops/s | 8.3434 Ops/s | |
test_serialize_model_pickle | 0.4555s | 0.3919s | 2.5517 Ops/s | 2.5097 Ops/s | |
test_serialize_weights | 0.1235s | 0.1160s | 8.6181 Ops/s | 8.6203 Ops/s | |
test_serialize_weights_returnearly | 0.1729s | 0.1607s | 6.2246 Ops/s | 6.3347 Ops/s | |
test_serialize_weights_pickle | 0.6776s | 0.4570s | 2.1882 Ops/s | 2.1882 Ops/s | |
test_serialize_weights_filesystem | 0.2333s | 0.1550s | 6.4515 Ops/s | 7.2057 Ops/s | |
test_serialize_model_filesystem | 0.1537s | 0.1452s | 6.8864 Ops/s | 6.6616 Ops/s | |
test_reshape_pytree | 0.1028ms | 38.1067μs | 26.2421 KOps/s | 26.4418 KOps/s | |
test_reshape_td | 0.1102ms | 44.6893μs | 22.3767 KOps/s | 21.8034 KOps/s | |
test_view_pytree | 93.9950μs | 37.8452μs | 26.4234 KOps/s | 26.3711 KOps/s | |
test_view_td | 0.1185ms | 50.6842μs | 19.7300 KOps/s | 19.4391 KOps/s | |
test_unbind_pytree | 75.5810μs | 35.2141μs | 28.3977 KOps/s | 28.4953 KOps/s | |
test_unbind_td | 0.3120ms | 44.2090μs | 22.6198 KOps/s | 22.9988 KOps/s | |
test_split_pytree | 92.9230μs | 37.4009μs | 26.7373 KOps/s | 26.7680 KOps/s | |
test_split_td | 0.2405ms | 56.8919μs | 17.5772 KOps/s | 17.7197 KOps/s | |
test_add_pytree | 0.1169ms | 45.6239μs | 21.9184 KOps/s | 22.9156 KOps/s | |
test_add_td | 0.1760ms | 81.2177μs | 12.3126 KOps/s | 12.5959 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1225ms | 58.0385μs | 17.2300 KOps/s | 17.7245 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3325ms | 0.1878ms | 5.3247 KOps/s | 5.3464 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1060ms | 55.5640μs | 17.9973 KOps/s | 17.7761 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3032ms | 0.1421ms | 7.0374 KOps/s | 7.1743 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 60.7730μs | 20.6559μs | 48.4123 KOps/s | 47.8863 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1251ms | 66.0618μs | 15.1373 KOps/s | 14.9693 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1316ms | 74.2453μs | 13.4689 KOps/s | 13.3731 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1376ms | 66.9531μs | 14.9358 KOps/s | 14.6277 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3532ms | 0.1723ms | 5.8031 KOps/s | 5.8266 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3371ms | 0.1871ms | 5.3434 KOps/s | 5.3584 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1024ms | 47.0718μs | 21.2441 KOps/s | 21.3388 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5379ms | 71.4407μs | 13.9976 KOps/s | 14.2984 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3159ms | 0.1779ms | 5.6203 KOps/s | 5.8571 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5540ms | 0.2896ms | 3.4532 KOps/s | 3.4723 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3287ms | 0.2000ms | 5.0004 KOps/s | 5.0066 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.5155ms | 0.1778ms | 5.6250 KOps/s | 5.8474 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1851ms | 63.6767μs | 15.7043 KOps/s | 16.0319 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1219ms | 47.0242μs | 21.2656 KOps/s | 21.2468 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4792ms | 0.2343ms | 4.2675 KOps/s | 4.3083 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2750ms | 0.1747ms | 5.7239 KOps/s | 5.6695 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2264ms | 0.1021ms | 9.7922 KOps/s | 9.9412 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1218ms | 56.8835μs | 17.5798 KOps/s | 17.5285 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1700ms | 75.2417μs | 13.2905 KOps/s | 13.2194 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1393ms | 67.5849μs | 14.7962 KOps/s | 14.6240 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2942ms | 0.1951ms | 5.1265 KOps/s | 5.1406 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.1643ms | 1.6214ms | 616.7544 Ops/s | 618.6675 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4128ms | 0.1927ms | 5.1885 KOps/s | 5.1875 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3258ms | 1.1007ms | 908.4744 Ops/s | 913.4392 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.4973ms | 0.4183ms | 2.3908 KOps/s | 2.4503 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.9023ms | 3.6946ms | 270.6636 Ops/s | 271.9167 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 95.2070μs | 35.2333μs | 28.3823 KOps/s | 28.8158 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.2297ms | 47.6594μs | 20.9822 KOps/s | 20.8870 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 92.3020μs | 29.7358μs | 33.6295 KOps/s | 34.5760 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 85.4990μs | 28.3632μs | 35.2570 KOps/s | 33.7551 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 93.2640μs | 29.5073μs | 33.8899 KOps/s | 34.0600 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 90.1580μs | 28.2352μs | 35.4167 KOps/s | 34.1744 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1885ms | 74.8908μs | 13.3528 KOps/s | 13.6885 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3281ms | 26.8651μs | 37.2230 KOps/s | 36.5481 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1613ms | 69.4304μs | 14.4029 KOps/s | 15.0892 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 67.7060μs | 22.8573μs | 43.7498 KOps/s | 43.5615 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1351ms | 68.3876μs | 14.6225 KOps/s | 14.8741 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 60.7230μs | 22.4759μs | 44.4920 KOps/s | 44.4639 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1480ms | 74.8348μs | 13.3628 KOps/s | 13.8673 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9333ms | 26.7319μs | 37.4085 KOps/s | 37.1529 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1724ms | 68.1718μs | 14.6688 KOps/s | 14.7858 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 67.3550μs | 22.6425μs | 44.1647 KOps/s | 44.0897 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1483ms | 68.1262μs | 14.6786 KOps/s | 15.2317 KOps/s | |
test_compile_indexing[int-pytree-eager] | 69.9310μs | 22.7739μs | 43.9098 KOps/s | 44.5802 KOps/s | |
test_mod_add[eager] | 0.1392ms | 23.5788μs | 42.4110 KOps/s | 42.9368 KOps/s | |
test_mod_add[compile] | 0.1080ms | 39.8781μs | 25.0764 KOps/s | 26.2093 KOps/s | |
test_mod_add[compile-overhead] | 0.1126ms | 39.6392μs | 25.2275 KOps/s | 26.5229 KOps/s | |
test_mod_wrap[eager] | 0.3899ms | 0.2097ms | 4.7679 KOps/s | 4.9419 KOps/s | |
test_mod_wrap[compile] | 0.3578ms | 0.2365ms | 4.2287 KOps/s | 4.4274 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3353ms | 0.2303ms | 4.3423 KOps/s | 4.3930 KOps/s | |
test_mod_wrap_and_backward[eager] | 17.0589ms | 13.2452ms | 75.4992 Ops/s | 95.4156 Ops/s | |
test_mod_wrap_and_backward[compile] | 14.6692ms | 12.6984ms | 78.7501 Ops/s | 88.1438 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 16.2839ms | 13.1753ms | 75.8995 Ops/s | 89.5743 Ops/s | |
test_seq_add[eager] | 0.1699ms | 86.7643μs | 11.5255 KOps/s | 11.8740 KOps/s | |
test_seq_add[compile] | 0.4530ms | 65.4491μs | 15.2790 KOps/s | 16.0141 KOps/s | |
test_seq_add[compile-overhead] | 0.1509ms | 63.9494μs | 15.6374 KOps/s | 16.2485 KOps/s | |
test_seq_wrap[eager] | 0.7135ms | 0.3885ms | 2.5743 KOps/s | 2.7509 KOps/s | |
test_seq_wrap[compile] | 0.4061ms | 0.2685ms | 3.7246 KOps/s | 3.7475 KOps/s | |
test_seq_wrap[compile-overhead] | 0.5014ms | 0.2703ms | 3.6991 KOps/s | 3.7787 KOps/s | |
test_func_call_runtime[False-eager] | 0.7605ms | 0.5357ms | 1.8667 KOps/s | 1.9159 KOps/s | |
test_func_call_runtime[False-compile] | 1.0858ms | 0.5054ms | 1.9785 KOps/s | 2.0283 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6568ms | 0.4971ms | 2.0116 KOps/s | 2.0248 KOps/s | |
test_func_call_runtime[True-eager] | 0.9463ms | 0.7511ms | 1.3314 KOps/s | 1.3481 KOps/s | |
test_func_call_runtime[True-compile] | 0.8524ms | 0.5082ms | 1.9678 KOps/s | 1.9989 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.9151ms | 0.5139ms | 1.9459 KOps/s | 1.9624 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9241ms | 0.5268ms | 1.8983 KOps/s | 1.9167 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.0504ms | 0.5004ms | 1.9985 KOps/s | 2.0258 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.9132ms | 0.5114ms | 1.9555 KOps/s | 2.0314 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1230ms | 0.8860ms | 1.1286 KOps/s | 1.1512 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9970ms | 0.7536ms | 1.3269 KOps/s | 1.3641 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0377ms | 0.7467ms | 1.3393 KOps/s | 1.3536 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5112ms | 1.8657ms | 536.0051 Ops/s | 542.5710 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 3.0246ms | 1.9200ms | 520.8380 Ops/s | 531.6222 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.9175ms | 1.9168ms | 521.6929 Ops/s | 532.6480 Ops/s | |
test_distributed | 0.2698ms | 0.1226ms | 8.1566 KOps/s | 7.9376 KOps/s | |
test_tdmodule | 48.6010μs | 16.7812μs | 59.5906 KOps/s | 61.0870 KOps/s | |
test_tdmodule_dispatch | 81.2820μs | 34.8125μs | 28.7253 KOps/s | 29.3828 KOps/s | |
test_tdseq | 50.5440μs | 19.2436μs | 51.9652 KOps/s | 52.5417 KOps/s | |
test_tdseq_dispatch | 71.2730μs | 39.1104μs | 25.5687 KOps/s | 24.8141 KOps/s | |
test_instantiation_functorch | 2.3729ms | 1.6161ms | 618.7666 Ops/s | 640.6210 Ops/s | |
test_instantiation_td | 2.1783ms | 1.2021ms | 831.8704 Ops/s | 871.9390 Ops/s | |
test_exec_functorch | 0.3428ms | 0.1882ms | 5.3123 KOps/s | 5.4767 KOps/s | |
test_exec_functional_call | 0.3437ms | 0.1782ms | 5.6125 KOps/s | 5.9174 KOps/s | |
test_exec_td | 0.3008ms | 0.1716ms | 5.8272 KOps/s | 5.9905 KOps/s | |
test_exec_td_decorator | 1.0254ms | 0.2259ms | 4.4259 KOps/s | 4.5637 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8972ms | 0.6387ms | 1.5657 KOps/s | 1.5878 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.9511ms | 0.6391ms | 1.5648 KOps/s | 1.5931 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7235ms | 0.4976ms | 2.0097 KOps/s | 2.0594 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6860ms | 0.4949ms | 2.0208 KOps/s | 1.9782 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2535ms | 0.6144ms | 1.6275 KOps/s | 1.6381 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8317ms | 0.6145ms | 1.6273 KOps/s | 1.6387 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6535ms | 0.5118ms | 1.9540 KOps/s | 1.9836 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9030ms | 0.5175ms | 1.9324 KOps/s | 1.9881 KOps/s | |
test_to_module_speed[True] | 1.7244ms | 1.2943ms | 772.6091 Ops/s | 781.0626 Ops/s | |
test_to_module_speed[False] | 2.3073ms | 1.2616ms | 792.6664 Ops/s | 800.3475 Ops/s | |
test_tc_init | 0.1259ms | 42.7392μs | 23.3977 KOps/s | 23.1744 KOps/s | |
test_tc_init_nested | 0.1434ms | 85.6149μs | 11.6802 KOps/s | 11.6706 KOps/s | |
test_tc_first_layer_tensor | 26.8100μs | 1.5271μs | 654.8248 KOps/s | 663.7510 KOps/s | |
test_tc_first_layer_nontensor | 45.9760μs | 4.6892μs | 213.2567 KOps/s | 214.8955 KOps/s | |
test_tc_second_layer_tensor | 28.6730μs | 2.8100μs | 355.8693 KOps/s | 357.3217 KOps/s | |
test_tc_second_layer_nontensor | 44.2120μs | 6.0245μs | 165.9882 KOps/s | 168.8504 KOps/s | |
test_unbind | 0.4663s | 12.8796ms | 77.6423 Ops/s | 75.6468 Ops/s | |
test_full_like | 8.1793ms | 7.1215ms | 140.4203 Ops/s | 79.5894 Ops/s | |
test_zeros_like | 12.9668ms | 6.1465ms | 162.6942 Ops/s | 140.3174 Ops/s | |
test_ones_like | 15.1183ms | 7.5548ms | 132.3658 Ops/s | 136.7912 Ops/s | |
test_clone | 16.9625ms | 9.0780ms | 110.1570 Ops/s | 112.9248 Ops/s | |
test_squeeze | 73.4180μs | 12.1198μs | 82.5094 KOps/s | 79.3407 KOps/s | |
test_unsqueeze | 0.1742ms | 90.4131μs | 11.0603 KOps/s | 10.7636 KOps/s | |
test_split | 0.6670ms | 0.1921ms | 5.2046 KOps/s | 5.1297 KOps/s | |
test_permute | 0.3894ms | 0.2174ms | 4.6000 KOps/s | 4.4904 KOps/s | |
test_stack | 32.9162ms | 24.3187ms | 41.1206 Ops/s | 41.1479 Ops/s | |
test_cat | 24.3639ms | 23.8909ms | 41.8570 Ops/s | 41.2670 Ops/s |
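
The Change column is empty in this extract. As a minimal sketch, assuming Change is meant to be the relative difference between Ops (this PR) and Ops on Repo HEAD (that definition is an assumption, it is not stated here), the `test_setitem_dim` rows above can be compared like this. The helper name `relative_change_pct` is hypothetical.

```python
def relative_change_pct(ops_pr: float, ops_head: float) -> float:
    """Percent change of PR throughput relative to repo HEAD (positive = faster)."""
    if ops_head <= 0:
        raise ValueError("ops_head must be positive")
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) in KOps/s, copied from the table above.
setitem_rows = {
    "test_setitem_dim[int]": (30.2809, 25.1177),
    "test_setitem_dim[slice_int]": (16.5850, 14.7726),
    "test_setitem_dim[range]": (12.0173, 10.9030),
    "test_setitem_dim[tuple]": (20.4281, 18.0067),
}

# Assumed definition of "Change": relative throughput difference in percent.
changes = {
    name: relative_change_pct(pr, head)
    for name, (pr, head) in setitem_rows.items()
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

With these numbers the four `test_setitem_dim` rows come out between roughly +10% and +21%. Single benchmark runs are noisy, so small differences elsewhere in the table should be read with care.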

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1200ms | 14.8297μs | 67.4322 KOps/s | 68.0727 KOps/s | |
test_plain_set_stack_nested | 43.7110μs | 15.0159μs | 66.5963 KOps/s | 67.0584 KOps/s | |
test_plain_set_nested_inplace | 43.5610μs | 15.9875μs | 62.5488 KOps/s | 63.8392 KOps/s | |
test_plain_set_stack_nested_inplace | 45.9110μs | 15.8191μs | 63.2147 KOps/s | 63.6555 KOps/s | |
test_items | 29.3700μs | 2.8406μs | 352.0380 KOps/s | 354.0333 KOps/s | |
test_items_nested | 0.3581ms | 0.3104ms | 3.2212 KOps/s | 3.1928 KOps/s | |
test_items_nested_locked | 0.4019ms | 0.3132ms | 3.1929 KOps/s | 3.1753 KOps/s | |
test_items_nested_leaf | 95.1420μs | 63.0990μs | 15.8481 KOps/s | 15.8570 KOps/s | |
test_items_stack_nested | 0.3665ms | 0.3124ms | 3.2013 KOps/s | 3.0847 KOps/s | |
test_items_stack_nested_leaf | 0.1084ms | 63.9036μs | 15.6486 KOps/s | 15.3273 KOps/s | |
test_items_stack_nested_locked | 0.3736ms | 0.3164ms | 3.1601 KOps/s | 3.1526 KOps/s | |
test_keys | 31.6900μs | 3.4071μs | 293.5025 KOps/s | 293.9711 KOps/s | |
test_keys_nested | 83.3010μs | 53.4223μs | 18.7188 KOps/s | 18.0417 KOps/s | |
test_keys_nested_locked | 2.8405ms | 60.5622μs | 16.5120 KOps/s | 16.4814 KOps/s | |
test_keys_nested_leaf | 80.0120μs | 45.3362μs | 22.0574 KOps/s | 21.3614 KOps/s | |
test_keys_stack_nested | 84.0710μs | 55.5249μs | 18.0099 KOps/s | 18.0620 KOps/s | |
test_keys_stack_nested_leaf | 73.8610μs | 46.7891μs | 21.3725 KOps/s | 20.9663 KOps/s | |
test_keys_stack_nested_locked | 0.1037ms | 59.9268μs | 16.6870 KOps/s | 16.6378 KOps/s | |
test_values | 5.4200μs | 0.8199μs | 1.2197 MOps/s | 1.2171 MOps/s | |
test_values_nested | 57.1410μs | 27.3287μs | 36.5916 KOps/s | 36.4285 KOps/s | |
test_values_nested_locked | 65.0910μs | 29.3882μs | 34.0273 KOps/s | 34.0803 KOps/s | |
test_values_nested_leaf | 58.8710μs | 24.0374μs | 41.6019 KOps/s | 41.7791 KOps/s | |
test_values_stack_nested | 64.6310μs | 27.8314μs | 35.9306 KOps/s | 35.1986 KOps/s | |
test_values_stack_nested_leaf | 53.5210μs | 24.5542μs | 40.7262 KOps/s | 39.9720 KOps/s | |
test_values_stack_nested_locked | 60.3410μs | 29.7774μs | 33.5825 KOps/s | 32.9828 KOps/s | |
test_membership | 1.6580μs | 0.4726μs | 2.1158 MOps/s | 2.1271 MOps/s | |
test_membership_nested | 19.4950μs | 1.7397μs | 574.8278 KOps/s | 568.0451 KOps/s | |
test_membership_nested_leaf | 11.3467μs | 1.7152μs | 583.0190 KOps/s | 580.2504 KOps/s | |
test_membership_stacked_nested | 51.0810μs | 1.7979μs | 556.2179 KOps/s | 555.7376 KOps/s | |
test_membership_stacked_nested_leaf | 31.0800μs | 1.8131μs | 551.5328 KOps/s | 563.6648 KOps/s | |
test_membership_nested_last | 33.4510μs | 2.6618μs | 375.6795 KOps/s | 389.3696 KOps/s | |
test_membership_nested_leaf_last | 33.3500μs | 2.6124μs | 382.7946 KOps/s | 387.2158 KOps/s | |
test_membership_stacked_nested_last | 31.1710μs | 2.7343μs | 365.7180 KOps/s | 389.5338 KOps/s | |
test_membership_stacked_nested_leaf_last | 41.3410μs | 2.6101μs | 383.1219 KOps/s | 389.5223 KOps/s | |
test_nested_getleaf | 35.7410μs | 6.0888μs | 164.2347 KOps/s | 165.7582 KOps/s | |
test_nested_get | 35.2400μs | 5.7033μs | 175.3363 KOps/s | 174.4859 KOps/s | |
test_stacked_getleaf | 30.0110μs | 6.0642μs | 164.9027 KOps/s | 165.5381 KOps/s | |
test_stacked_get | 33.3210μs | 5.6299μs | 177.6237 KOps/s | 177.8825 KOps/s | |
test_nested_getitemleaf | 26.8200μs | 6.1165μs | 163.4929 KOps/s | 163.2545 KOps/s | |
test_nested_getitem | 33.7510μs | 5.7559μs | 173.7337 KOps/s | 174.1686 KOps/s | |
test_stacked_getitemleaf | 28.8100μs | 6.0445μs | 165.4406 KOps/s | 166.2338 KOps/s | |
test_stacked_getitem | 37.3700μs | 5.6866μs | 175.8509 KOps/s | 176.2546 KOps/s | |
test_lock_nested | 7.2247ms | 0.4212ms | 2.3743 KOps/s | 2.3945 KOps/s | |
test_lock_stack_nested | 0.4751ms | 0.3761ms | 2.6592 KOps/s | 2.6376 KOps/s | |
test_unlock_nested | 0.7917ms | 0.3546ms | 2.8200 KOps/s | 2.7994 KOps/s | |
test_unlock_stack_nested | 0.3626ms | 0.3152ms | 3.1729 KOps/s | 3.1329 KOps/s | |
test_flatten_speed | 0.3073ms | 77.9079μs | 12.8357 KOps/s | 12.4444 KOps/s | |
test_unflatten_speed | 0.3642ms | 0.2783ms | 3.5928 KOps/s | 3.5494 KOps/s | |
test_common_ops | 1.4881ms | 1.2798ms | 781.3582 Ops/s | 779.4800 Ops/s | |
test_creation | 25.9900μs | 1.4680μs | 681.1945 KOps/s | 671.9798 KOps/s | |
test_creation_empty | 45.5110μs | 17.2858μs | 57.8510 KOps/s | 58.3379 KOps/s | |
test_creation_nested_1 | 53.9710μs | 19.0512μs | 52.4901 KOps/s | 53.2756 KOps/s | |
test_creation_nested_2 | 45.0310μs | 21.3403μs | 46.8598 KOps/s | 46.9098 KOps/s | |
test_clone | 67.1010μs | 28.3235μs | 35.3064 KOps/s | 34.2664 KOps/s | |
test_getitem[int] | 1.2549ms | 15.7638μs | 63.4365 KOps/s | 63.1333 KOps/s | |
test_getitem[slice_int] | 0.1197ms | 27.8877μs | 35.8581 KOps/s | 35.7906 KOps/s | |
test_getitem[range] | 0.2536ms | 0.1099ms | 9.0976 KOps/s | 9.0866 KOps/s | |
test_getitem[tuple] | 0.1270ms | 23.3232μs | 42.8758 KOps/s | 41.9368 KOps/s | |
test_getitem[list] | 0.1911ms | 98.0441μs | 10.1995 KOps/s | 9.9929 KOps/s | |
test_setitem_dim[int] | 72.9410μs | 45.5708μs | 21.9439 KOps/s | 18.4211 KOps/s | |
test_setitem_dim[slice_int] | 0.1086ms | 68.1134μs | 14.6814 KOps/s | 12.6606 KOps/s | |
test_setitem_dim[range] | 0.1624ms | 0.1281ms | 7.8041 KOps/s | 7.1268 KOps/s | |
test_setitem_dim[tuple] | 86.7620μs | 61.1085μs | 16.3643 KOps/s | 14.0010 KOps/s | |
test_setitem | 77.7310μs | 41.9919μs | 23.8141 KOps/s | 23.2708 KOps/s | |
test_set | 75.5020μs | 41.0420μs | 24.3653 KOps/s | 24.2288 KOps/s | |
test_set_shared | 0.3714ms | 50.3981μs | 19.8420 KOps/s | 19.7111 KOps/s | |
test_update | 0.1913ms | 51.3668μs | 19.4678 KOps/s | 19.4173 KOps/s | |
test_update_nested | 98.6720μs | 58.6334μs | 17.0551 KOps/s | 17.2599 KOps/s | |
test_update__nested | 0.1038ms | 59.9276μs | 16.6868 KOps/s | 16.4635 KOps/s | |
test_set_nested | 99.1620μs | 44.3888μs | 22.5282 KOps/s | 22.3863 KOps/s | |
test_set_nested_new | 87.4320μs | 47.5158μs | 21.0456 KOps/s | 20.9705 KOps/s | |
test_select | 0.1020ms | 59.8102μs | 16.7196 KOps/s | 16.2310 KOps/s | |
test_select_nested | 71.2010μs | 41.4697μs | 24.1140 KOps/s | 23.8855 KOps/s | |
test_exclude_nested | 89.7910μs | 58.4211μs | 17.1171 KOps/s | 16.6145 KOps/s | |
test_empty[True] | 0.3040ms | 0.2412ms | 4.1461 KOps/s | 4.0308 KOps/s | |
test_empty[False] | 3.2811μs | 0.7493μs | 1.3346 MOps/s | 1.3510 MOps/s | |
test_to | 52.7010μs | 25.6785μs | 38.9432 KOps/s | 39.0194 KOps/s | |
test_to_nonblocking | 53.7810μs | 24.1654μs | 41.3814 KOps/s | 42.0600 KOps/s | |
test_unbind_speed | 0.3144ms | 0.2820ms | 3.5467 KOps/s | 3.5649 KOps/s | |
test_unbind_speed_stack0 | 0.3341ms | 0.2776ms | 3.6026 KOps/s | 3.6076 KOps/s | |
test_unbind_speed_stack1 | 93.0509ms | 0.7081ms | 1.4123 KOps/s | 1.5327 KOps/s | |
test_split | 93.7704ms | 2.1777ms | 459.1906 Ops/s | 453.1139 Ops/s | |
test_chunk | 96.0818ms | 2.1877ms | 457.0989 Ops/s | 451.9758 Ops/s | |
test_creation[device0] | 0.4467ms | 0.1262ms | 7.9237 KOps/s | 7.9181 KOps/s | |
test_creation_from_tensor | 0.3261ms | 0.1280ms | 7.8101 KOps/s | 7.7855 KOps/s | |
test_add_one[memmap_tensor0] | 0.2022ms | 8.8516μs | 112.9741 KOps/s | 110.8706 KOps/s | |
test_contiguous[memmap_tensor0] | 27.7310μs | 2.1796μs | 458.7959 KOps/s | 448.1820 KOps/s | |
test_stack[memmap_tensor0] | 36.5100μs | 6.8095μs | 146.8534 KOps/s | 140.9429 KOps/s | |
test_memmaptd_index | 1.0923ms | 0.4159ms | 2.4046 KOps/s | 2.3245 KOps/s | |
test_memmaptd_index_astensor | 0.7132ms | 0.4763ms | 2.0995 KOps/s | 2.0429 KOps/s | |
test_memmaptd_index_op | 1.4881ms | 1.0554ms | 947.5001 Ops/s | 947.6156 Ops/s | |
test_serialize_model | 0.1277s | 0.1266s | 7.9018 Ops/s | 7.8597 Ops/s | |
test_serialize_model_pickle | 1.3503s | 1.2126s | 0.8246 Ops/s | 0.8251 Ops/s | |
test_serialize_weights | 0.2205s | 0.1396s | 7.1624 Ops/s | 7.9129 Ops/s | |
test_serialize_weights_returnearly | 0.2226s | 56.0741ms | 17.8335 Ops/s | 21.8641 Ops/s | |
test_serialize_weights_pickle | 1.4042s | 1.2264s | 0.8154 Ops/s | 0.8240 Ops/s | |
test_reshape_pytree | 71.8510μs | 35.7140μs | 28.0002 KOps/s | 27.2454 KOps/s | |
test_reshape_td | 71.5110μs | 41.9991μs | 23.8100 KOps/s | 23.7589 KOps/s | |
test_view_pytree | 66.6210μs | 35.9774μs | 27.7952 KOps/s | 27.5350 KOps/s | |
test_view_td | 73.1410μs | 45.8877μs | 21.7923 KOps/s | 20.5247 KOps/s | |
test_unbind_pytree | 60.8510μs | 33.8813μs | 29.5148 KOps/s | 28.0654 KOps/s | |
test_unbind_td | 0.3632ms | 42.2218μs | 23.6844 KOps/s | 23.1290 KOps/s | |
test_split_pytree | 76.2520μs | 46.5610μs | 21.4772 KOps/s | 21.1238 KOps/s | |
test_split_td | 0.4642ms | 54.9684μs | 18.1923 KOps/s | 18.2511 KOps/s | |
test_add_pytree | 0.1001ms | 56.3626μs | 17.7422 KOps/s | 17.2559 KOps/s | |
test_add_td | 0.2440ms | 93.5768μs | 10.6864 KOps/s | 10.3785 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4086ms | 0.2083ms | 4.8006 KOps/s | 4.7066 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2501ms | 0.1565ms | 6.3910 KOps/s | 6.3037 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1991ms | 0.1455ms | 6.8749 KOps/s | 6.9415 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2600ms | 0.1848ms | 5.4115 KOps/s | 5.3577 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 59.6310μs | 21.3802μs | 46.7723 KOps/s | 48.6566 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 87.7010μs | 44.1272μs | 22.6618 KOps/s | 22.9463 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1079ms | 64.9244μs | 15.4025 KOps/s | 15.4863 KOps/s | |
test_compile_copy_nested[pytree-eager] | 93.6020μs | 50.0805μs | 19.9678 KOps/s | 20.0146 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4141ms | 0.3176ms | 3.1484 KOps/s | 3.1447 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2485ms | 0.2083ms | 4.8015 KOps/s | 4.7415 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1629ms | 0.1280ms | 7.8128 KOps/s | 7.8246 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1221ms | 59.1493μs | 16.9064 KOps/s | 16.7395 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3545ms | 0.3152ms | 3.1724 KOps/s | 3.1259 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6723ms | 0.6249ms | 1.6003 KOps/s | 1.5870 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.2948ms | 0.2469ms | 4.0505 KOps/s | 3.9497 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4753ms | 0.3185ms | 3.1395 KOps/s | 3.1136 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1450ms | 69.1453μs | 14.4623 KOps/s | 14.1283 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1722ms | 0.1293ms | 7.7325 KOps/s | 7.7454 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.7200ms | 0.5311ms | 1.8827 KOps/s | 1.8227 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3840ms | 0.3147ms | 3.1777 KOps/s | 3.1200 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 51.1510μs | 18.1007μs | 55.2465 KOps/s | 55.4944 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 53.9610μs | 26.5225μs | 37.7038 KOps/s | 36.9267 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1054ms | 69.4761μs | 14.3934 KOps/s | 14.4874 KOps/s | |
test_compile_copy_flat[pytree-eager] | 76.9920μs | 52.2613μs | 19.1346 KOps/s | 19.3573 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3278ms | 0.8027ms | 1.2458 KOps/s | 1.1465 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.4237ms | 3.1859ms | 313.8836 Ops/s | 307.6631 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3834ms | 0.8158ms | 1.2258 KOps/s | 1.1407 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.6649ms | 3.3238ms | 300.8584 Ops/s | 300.2477 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2715ms | 0.1155ms | 8.6556 KOps/s | 9.2715 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1963ms | 63.8147μs | 15.6704 KOps/s | 16.3007 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1708ms | 0.1061ms | 9.4241 KOps/s | 9.6166 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1325ms | 43.6920μs | 22.8875 KOps/s | 22.4554 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2551ms | 0.1054ms | 9.4916 KOps/s | 9.5987 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1821ms | 43.0657μs | 23.2203 KOps/s | 22.4559 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2031ms | 0.1390ms | 7.1918 KOps/s | 7.3276 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1603ms | 25.5399μs | 39.1544 KOps/s | 38.6532 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1776ms | 0.1331ms | 7.5155 KOps/s | 7.5635 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 86.6210μs | 22.0221μs | 45.4089 KOps/s | 46.4546 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1803ms | 0.1342ms | 7.4539 KOps/s | 7.5753 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 56.9610μs | 21.8017μs | 45.8680 KOps/s | 46.7900 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2024ms | 0.1403ms | 7.1297 KOps/s | 7.2605 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4629ms | 25.7291μs | 38.8665 KOps/s | 39.1629 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1751ms | 0.1337ms | 7.4812 KOps/s | 7.5263 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1153ms | 26.7475μs | 37.3867 KOps/s | 47.0430 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2608ms | 0.1399ms | 7.1461 KOps/s | 7.5690 KOps/s | |
test_compile_indexing[int-pytree-eager] | 68.7810μs | 22.6031μs | 44.2417 KOps/s | 46.8959 KOps/s | |
test_mod_add[eager] | 97.5420μs | 33.6794μs | 29.6917 KOps/s | 30.7492 KOps/s | |
test_mod_add[compile] | 0.1130ms | 70.7877μs | 14.1268 KOps/s | 14.2177 KOps/s | |
test_mod_add[compile-overhead] | 0.2601ms | 0.1358ms | 7.3660 KOps/s | 7.0786 KOps/s | |
test_mod_wrap[eager] | 0.3277ms | 0.2529ms | 3.9546 KOps/s | 4.0485 KOps/s | |
test_mod_wrap[compile] | 0.4492ms | 0.2858ms | 3.4993 KOps/s | 3.4290 KOps/s | |
test_mod_wrap[compile-overhead] | 7.8979ms | 4.1234ms | 242.5158 Ops/s | 244.5996 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5137ms | 1.3581ms | 736.3455 Ops/s | 685.7966 Ops/s | |
test_mod_wrap_and_backward[compile] | 2.7048ms | 1.3162ms | 759.7467 Ops/s | 695.9313 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3092ms | 0.8931ms | 1.1197 KOps/s | 1.0186 KOps/s | |
test_seq_add[eager] | 0.2085ms | 96.5223μs | 10.3603 KOps/s | 10.0200 KOps/s | |
test_seq_add[compile] | 0.1218ms | 80.4127μs | 12.4358 KOps/s | 12.3720 KOps/s | |
test_seq_add[compile-overhead] | 0.1666ms | 0.1139ms | 8.7797 KOps/s | 8.7718 KOps/s | |
test_seq_wrap[eager] | 0.5510ms | 0.3786ms | 2.6412 KOps/s | 2.5815 KOps/s | |
test_seq_wrap[compile] | 0.3833ms | 0.3022ms | 3.3094 KOps/s | 3.2511 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2783ms | 0.2096ms | 4.7705 KOps/s | 4.7795 KOps/s | |
test_func_call_runtime[False-eager] | 0.9397ms | 0.7598ms | 1.3161 KOps/s | 1.3315 KOps/s | |
test_func_call_runtime[False-compile] | 0.9082ms | 0.7791ms | 1.2835 KOps/s | 1.2526 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4103ms | 0.3523ms | 2.8386 KOps/s | 2.8469 KOps/s | |
test_func_call_runtime[True-eager] | 0.9739ms | 0.8807ms | 1.1355 KOps/s | 1.1040 KOps/s | |
test_func_call_runtime[True-compile] | 0.9724ms | 0.8164ms | 1.2248 KOps/s | 1.2077 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4554ms | 0.3840ms | 2.6038 KOps/s | 2.6163 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7776ms | 0.7175ms | 1.3938 KOps/s | 1.3460 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8474ms | 0.7865ms | 1.2714 KOps/s | 1.2515 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4489ms | 0.3515ms | 2.8448 KOps/s | 2.8350 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0614ms | 0.9815ms | 1.0189 KOps/s | 1.0014 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9661ms | 0.8446ms | 1.1840 KOps/s | 1.1692 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5303ms | 0.4074ms | 2.4548 KOps/s | 2.4347 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5700ms | 2.0598ms | 485.4781 Ops/s | 474.2609 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9110ms | 0.8565ms | 1.1676 KOps/s | 1.1501 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5292ms | 0.4140ms | 2.4152 KOps/s | 2.4089 KOps/s | |
test_distributed | 0.6137ms | 0.1169ms | 8.5565 KOps/s | 8.8338 KOps/s | |
test_tdmodule | 68.5310μs | 15.5449μs | 64.3298 KOps/s | 65.7818 KOps/s | |
test_tdmodule_dispatch | 68.6310μs | 31.8279μs | 31.4190 KOps/s | 32.6448 KOps/s | |
test_tdseq | 36.3810μs | 16.0948μs | 62.1317 KOps/s | 62.5192 KOps/s | |
test_tdseq_dispatch | 61.7010μs | 34.3316μs | 29.1277 KOps/s | 30.0180 KOps/s | |
test_instantiation_functorch | 1.9693ms | 1.8638ms | 536.5269 Ops/s | 538.8685 Ops/s | |
test_instantiation_td | 1.8129ms | 1.1990ms | 834.0047 Ops/s | 837.0280 Ops/s | |
test_exec_functorch | 0.2580ms | 0.2047ms | 4.8843 KOps/s | 4.6539 KOps/s | |
test_exec_functional_call | 0.2832ms | 0.2072ms | 4.8257 KOps/s | 4.6514 KOps/s | |
test_exec_td | 0.3626ms | 0.2135ms | 4.6844 KOps/s | 4.6170 KOps/s | |
test_exec_td_decorator | 0.9643ms | 0.2543ms | 3.9316 KOps/s | 3.8942 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7995ms | 0.6945ms | 1.4399 KOps/s | 1.4491 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8053ms | 0.6873ms | 1.4549 KOps/s | 1.4511 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6888ms | 0.5712ms | 1.7508 KOps/s | 1.6867 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6846ms | 0.5734ms | 1.7441 KOps/s | 1.7385 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.4529ms | 0.6726ms | 1.4867 KOps/s | 1.4687 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7990ms | 0.6733ms | 1.4852 KOps/s | 1.4831 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7174ms | 0.5863ms | 1.7056 KOps/s | 1.6901 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6833ms | 0.5863ms | 1.7056 KOps/s | 1.6968 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.4045ms | 8.3206ms | 120.1842 Ops/s | 118.4981 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.4758ms | 8.3267ms | 120.0952 Ops/s | 118.6619 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.3386ms | 8.1361ms | 122.9085 Ops/s | 121.1420 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.1961ms | 8.1112ms | 123.2869 Ops/s | 121.6334 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.5978ms | 19.4790ms | 51.3374 Ops/s | 50.8691 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.8876ms | 19.5089ms | 51.2586 Ops/s | 50.9203 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.4790ms | 19.3561ms | 51.6632 Ops/s | 51.4193 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.4759ms | 19.3681ms | 51.6313 Ops/s | 51.1614 Ops/s | |
test_to_module_speed[True] | 1.3960ms | 0.9220ms | 1.0846 KOps/s | 1.0824 KOps/s | |
test_to_module_speed[False] | 1.2655ms | 0.9226ms | 1.0838 KOps/s | 1.1019 KOps/s | |
test_tc_init | 69.9410μs | 35.5237μs | 28.1502 KOps/s | 28.9700 KOps/s | |
test_tc_init_nested | 0.1096ms | 75.3917μs | 13.2641 KOps/s | 14.2238 KOps/s | |
test_tc_first_layer_tensor | 3.6959μs | 0.6776μs | 1.4757 MOps/s | 1.4672 MOps/s | |
test_tc_first_layer_nontensor | 32.6400μs | 2.2157μs | 451.3147 KOps/s | 446.9932 KOps/s | |
test_tc_second_layer_tensor | 9.6500μs | 1.3662μs | 731.9514 KOps/s | 732.2887 KOps/s | |
test_tc_second_layer_nontensor | 25.2400μs | 2.9474μs | 339.2806 KOps/s | 342.0065 KOps/s | |
test_unbind | 0.1867s | 12.1442ms | 82.3441 Ops/s | 96.6823 Ops/s | |
test_full_like | 0.6500ms | 0.5738ms | 1.7429 KOps/s | 1.7426 KOps/s | |
test_zeros_like | 0.2896ms | 0.1979ms | 5.0542 KOps/s | 5.0553 KOps/s | |
test_ones_like | 0.2776ms | 0.1976ms | 5.0605 KOps/s | 5.0585 KOps/s | |
test_clone | 0.4560ms | 0.4141ms | 2.4148 KOps/s | 2.4134 KOps/s | |
test_squeeze | 62.1610μs | 9.8100μs | 101.9366 KOps/s | 102.9250 KOps/s | |
test_unsqueeze | 0.2462ms | 73.4990μs | 13.6056 KOps/s | 13.2795 KOps/s | |
test_split | 0.3827ms | 0.1568ms | 6.3756 KOps/s | 6.2518 KOps/s | |
test_permute | 0.2446ms | 0.1808ms | 5.5302 KOps/s | 5.6376 KOps/s | |
test_stack | 1.2534ms | 0.8706ms | 1.1486 KOps/s | 1.1641 KOps/s | |
test_cat | 1.2629ms | 1.2316ms | 811.9481 Ops/s | 812.1283 Ops/s |
Labels: CLA Signed, Performance
No description provided.