[BugFix] Regular swap_tensor for to_module in dynamo #963
Merged

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 52.8990μs | 21.3429μs | 46.8539 KOps/s | 45.7259 KOps/s | |
test_plain_set_stack_nested | 45.3350μs | 21.3829μs | 46.7663 KOps/s | 45.0683 KOps/s | |
test_plain_set_nested_inplace | 67.1050μs | 23.2705μs | 42.9728 KOps/s | 41.9658 KOps/s | |
test_plain_set_stack_nested_inplace | 64.1600μs | 23.2074μs | 43.0896 KOps/s | 41.9913 KOps/s | |
test_items | 25.0460μs | 2.7682μs | 361.2518 KOps/s | 371.8818 KOps/s | |
test_items_nested | 0.6327ms | 0.3361ms | 2.9755 KOps/s | 2.9764 KOps/s | |
test_items_nested_locked | 0.5326ms | 0.3367ms | 2.9697 KOps/s | 2.9597 KOps/s | |
test_items_nested_leaf | 0.1482ms | 84.2304μs | 11.8722 KOps/s | 11.8589 KOps/s | |
test_items_stack_nested | 1.9870ms | 0.3396ms | 2.9442 KOps/s | 2.9780 KOps/s | |
test_items_stack_nested_leaf | 0.1452ms | 84.1260μs | 11.8869 KOps/s | 11.6966 KOps/s | |
test_items_stack_nested_locked | 0.4309ms | 0.3391ms | 2.9488 KOps/s | 2.9742 KOps/s | |
test_keys | 59.2400μs | 4.0757μs | 245.3565 KOps/s | 255.9078 KOps/s | |
test_keys_nested | 0.2404ms | 0.1442ms | 6.9333 KOps/s | 6.8859 KOps/s | |
test_keys_nested_locked | 0.7538ms | 0.1492ms | 6.7014 KOps/s | 6.6375 KOps/s | |
test_keys_nested_leaf | 0.2852ms | 0.1235ms | 8.0939 KOps/s | 8.0642 KOps/s | |
test_keys_stack_nested | 0.2385ms | 0.1448ms | 6.9039 KOps/s | 6.9049 KOps/s | |
test_keys_stack_nested_leaf | 0.2407ms | 0.1223ms | 8.1744 KOps/s | 8.0590 KOps/s | |
test_keys_stack_nested_locked | 0.2474ms | 0.1506ms | 6.6384 KOps/s | 6.6433 KOps/s | |
test_values | 13.3175μs | 1.1714μs | 853.7099 KOps/s | 870.1549 KOps/s | |
test_values_nested | 98.9840μs | 50.9104μs | 19.6423 KOps/s | 19.5580 KOps/s | |
test_values_nested_locked | 0.1031ms | 50.5907μs | 19.7665 KOps/s | 19.6685 KOps/s | |
test_values_nested_leaf | 0.1312ms | 45.2924μs | 22.0788 KOps/s | 21.8222 KOps/s | |
test_values_stack_nested | 0.1134ms | 51.2622μs | 19.5076 KOps/s | 19.2333 KOps/s | |
test_values_stack_nested_leaf | 89.5900μs | 44.7655μs | 22.3386 KOps/s | 20.8290 KOps/s | |
test_values_stack_nested_locked | 0.1076ms | 51.6759μs | 19.3514 KOps/s | 19.3925 KOps/s | |
test_membership | 18.3950μs | 0.9102μs | 1.0987 MOps/s | 1.0967 MOps/s | |
test_membership_nested | 45.5150μs | 2.5798μs | 387.6334 KOps/s | 376.7829 KOps/s | |
test_membership_nested_leaf | 40.5760μs | 2.5870μs | 386.5487 KOps/s | 375.7577 KOps/s | |
test_membership_stacked_nested | 51.2030μs | 2.5918μs | 385.8385 KOps/s | 378.4230 KOps/s | |
test_membership_stacked_nested_leaf | 22.9420μs | 2.6355μs | 379.4396 KOps/s | 378.5577 KOps/s | |
test_membership_nested_last | 48.0900μs | 3.8118μs | 262.3458 KOps/s | 249.0574 KOps/s | |
test_membership_nested_leaf_last | 44.8940μs | 3.8304μs | 261.0707 KOps/s | 249.7584 KOps/s | |
test_membership_stacked_nested_last | 39.5740μs | 9.9563μs | 100.4393 KOps/s | 195.2041 KOps/s | |
test_membership_stacked_nested_leaf_last | 39.7440μs | 10.0876μs | 99.1318 KOps/s | 194.3506 KOps/s | |
test_nested_getleaf | 47.1420μs | 10.2952μs | 97.1325 KOps/s | 91.5819 KOps/s | |
test_nested_get | 53.0990μs | 9.8186μs | 101.8471 KOps/s | 97.9520 KOps/s | |
test_stacked_getleaf | 43.9320μs | 10.3310μs | 96.7963 KOps/s | 93.5424 KOps/s | |
test_stacked_get | 61.3950μs | 9.7943μs | 102.1005 KOps/s | 99.2298 KOps/s | |
test_nested_getitemleaf | 57.0160μs | 10.7712μs | 92.8399 KOps/s | 88.8141 KOps/s | |
test_nested_getitem | 54.5120μs | 9.8843μs | 101.1709 KOps/s | 96.8682 KOps/s | |
test_stacked_getitemleaf | 49.0520μs | 10.8139μs | 92.4736 KOps/s | 90.8416 KOps/s | |
test_stacked_getitem | 62.6970μs | 9.9999μs | 100.0015 KOps/s | 97.6384 KOps/s | |
test_lock_nested | 89.8646ms | 0.5869ms | 1.7040 KOps/s | 1.9761 KOps/s | |
test_lock_stack_nested | 0.8143ms | 0.4556ms | 2.1951 KOps/s | 2.1338 KOps/s | |
test_unlock_nested | 88.7128ms | 0.5066ms | 1.9740 KOps/s | 2.3627 KOps/s | |
test_unlock_stack_nested | 0.7285ms | 0.3722ms | 2.6867 KOps/s | 2.5938 KOps/s | |
test_flatten_speed | 0.6992ms | 0.1026ms | 9.7442 KOps/s | 9.6974 KOps/s | |
test_unflatten_speed | 0.7762ms | 0.4625ms | 2.1622 KOps/s | 2.1310 KOps/s | |
test_common_ops | 1.8129ms | 1.0993ms | 909.7025 Ops/s | 910.4937 Ops/s | |
test_creation | 17.9030μs | 2.0029μs | 499.2838 KOps/s | 489.7121 KOps/s | |
test_creation_empty | 77.8650μs | 16.9061μs | 59.1503 KOps/s | 57.4598 KOps/s | |
test_creation_nested_1 | 82.0710μs | 20.1184μs | 49.7057 KOps/s | 47.9874 KOps/s | |
test_creation_nested_2 | 60.2420μs | 24.4593μs | 40.8842 KOps/s | 40.1054 KOps/s | |
test_clone | 0.1481ms | 16.5200μs | 60.5327 KOps/s | 60.1003 KOps/s | |
test_getitem[int] | 1.1802ms | 16.2442μs | 61.5604 KOps/s | 61.1698 KOps/s | |
test_getitem[slice_int] | 0.1481ms | 31.2837μs | 31.9655 KOps/s | 31.2716 KOps/s | |
test_getitem[range] | 0.1922ms | 59.3641μs | 16.8452 KOps/s | 17.6668 KOps/s | |
test_getitem[tuple] | 0.1217ms | 24.9379μs | 40.0997 KOps/s | 39.5111 KOps/s | |
test_getitem[list] | 0.3228ms | 53.2107μs | 18.7932 KOps/s | 19.0577 KOps/s | |
test_setitem_dim[int] | 86.4410μs | 38.9560μs | 25.6700 KOps/s | 23.9370 KOps/s | |
test_setitem_dim[slice_int] | 0.1126ms | 69.4103μs | 14.4071 KOps/s | 13.7073 KOps/s | |
test_setitem_dim[range] | 0.1373ms | 91.5928μs | 10.9179 KOps/s | 10.7858 KOps/s | |
test_setitem_dim[tuple] | 0.1063ms | 57.0185μs | 17.5382 KOps/s | 16.7751 KOps/s | |
test_setitem | 0.1116ms | 28.5651μs | 35.0078 KOps/s | 33.4413 KOps/s | |
test_set | 97.4020μs | 27.8111μs | 35.9568 KOps/s | 35.6889 KOps/s | |
test_set_shared | 1.3587ms | 0.2129ms | 4.6981 KOps/s | 4.5675 KOps/s | |
test_update | 0.1968ms | 34.5673μs | 28.9290 KOps/s | 28.6907 KOps/s | |
test_update_nested | 0.1511ms | 44.1124μs | 22.6694 KOps/s | 22.1484 KOps/s | |
test_update__nested | 0.1387ms | 33.5314μs | 29.8228 KOps/s | 29.1982 KOps/s | |
test_set_nested | 0.1454ms | 30.4110μs | 32.8828 KOps/s | 32.1023 KOps/s | |
test_set_nested_new | 0.1594ms | 35.0907μs | 28.4976 KOps/s | 27.9653 KOps/s | |
test_select | 0.1283ms | 52.0372μs | 19.2170 KOps/s | 18.8702 KOps/s | |
test_select_nested | 0.1008ms | 58.6830μs | 17.0407 KOps/s | 16.8309 KOps/s | |
test_exclude_nested | 0.1731ms | 78.6119μs | 12.7207 KOps/s | 13.0042 KOps/s | |
test_empty[True] | 0.4140ms | 0.3303ms | 3.0279 KOps/s | 3.0922 KOps/s | |
test_empty[False] | 8.4282μs | 1.1776μs | 849.2126 KOps/s | 865.6728 KOps/s | |
test_unbind_speed | 0.6703ms | 0.3121ms | 3.2039 KOps/s | 3.1899 KOps/s | |
test_unbind_speed_stack0 | 0.5090ms | 0.3040ms | 3.2896 KOps/s | 3.2809 KOps/s | |
test_unbind_speed_stack1 | 87.8774ms | 0.7807ms | 1.2810 KOps/s | 1.3593 KOps/s | |
test_split | 90.1062ms | 2.1722ms | 460.3708 Ops/s | 450.1897 Ops/s | |
test_chunk | 85.0571ms | 2.1748ms | 459.8080 Ops/s | 447.6844 Ops/s | |
test_creation[device0] | 0.2533ms | 0.1176ms | 8.5007 KOps/s | 8.3858 KOps/s | |
test_creation_from_tensor | 4.0727ms | 0.1205ms | 8.3009 KOps/s | 8.1753 KOps/s | |
test_add_one[memmap_tensor0] | 0.1787ms | 7.6819μs | 130.1759 KOps/s | 131.6915 KOps/s | |
test_contiguous[memmap_tensor0] | 14.5270μs | 2.0347μs | 491.4794 KOps/s | 501.7790 KOps/s | |
test_stack[memmap_tensor0] | 53.5000μs | 5.5925μs | 178.8123 KOps/s | 176.3183 KOps/s | |
test_memmaptd_index | 1.4969ms | 0.4075ms | 2.4540 KOps/s | 2.3798 KOps/s | |
test_memmaptd_index_astensor | 1.0083ms | 0.4864ms | 2.0561 KOps/s | 1.9857 KOps/s | |
test_memmaptd_index_op | 1.3951ms | 1.0259ms | 974.7918 Ops/s | 958.2320 Ops/s | |
test_serialize_model | 0.1235s | 0.1196s | 8.3609 Ops/s | 8.5236 Ops/s | |
test_serialize_model_pickle | 0.4611s | 0.3971s | 2.5182 Ops/s | 2.5632 Ops/s | |
test_serialize_weights | 0.1946s | 0.1306s | 7.6567 Ops/s | 7.5826 Ops/s | |
test_serialize_weights_returnearly | 0.1845s | 0.1640s | 6.0991 Ops/s | 6.4555 Ops/s | |
test_serialize_weights_pickle | 0.4575s | 0.3921s | 2.5503 Ops/s | 2.5083 Ops/s | |
test_serialize_weights_filesystem | 0.1508s | 0.1422s | 7.0316 Ops/s | 6.8483 Ops/s | |
test_serialize_model_filesystem | 0.1601s | 0.1514s | 6.6054 Ops/s | 5.9967 Ops/s | |
test_reshape_pytree | 94.4670μs | 38.8932μs | 25.7114 KOps/s | 24.7864 KOps/s | |
test_reshape_td | 0.1093ms | 47.1222μs | 21.2214 KOps/s | 20.9551 KOps/s | |
test_view_pytree | 0.1003ms | 38.9796μs | 25.6544 KOps/s | 24.4970 KOps/s | |
test_view_td | 0.1200ms | 52.4400μs | 19.0694 KOps/s | 18.2499 KOps/s | |
test_unbind_pytree | 0.1029ms | 36.5133μs | 27.3873 KOps/s | 26.7535 KOps/s | |
test_unbind_td | 0.3519ms | 45.2990μs | 22.0756 KOps/s | 21.2806 KOps/s | |
test_split_pytree | 0.1097ms | 39.8242μs | 25.1103 KOps/s | 24.7661 KOps/s | |
test_split_td | 83.7081ms | 68.3730μs | 14.6257 KOps/s | 17.1323 KOps/s | |
test_add_pytree | 0.1040ms | 46.8158μs | 21.3603 KOps/s | 21.5038 KOps/s | |
test_add_td | 0.1673ms | 81.9377μs | 12.2044 KOps/s | 11.4896 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1037ms | 55.1291μs | 18.1392 KOps/s | 17.6865 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4672ms | 0.1932ms | 5.1747 KOps/s | 5.1258 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1784ms | 55.0168μs | 18.1763 KOps/s | 17.6352 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3565ms | 0.1452ms | 6.8858 KOps/s | 6.8339 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 71.1230μs | 20.0302μs | 49.9246 KOps/s | 48.2931 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1239ms | 63.5829μs | 15.7275 KOps/s | 15.3995 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1477ms | 79.9318μs | 12.5107 KOps/s | 12.5797 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1488ms | 71.7106μs | 13.9449 KOps/s | 13.7323 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2814ms | 0.1755ms | 5.6972 KOps/s | 5.6014 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3698ms | 0.1929ms | 5.1843 KOps/s | 4.5293 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1108ms | 38.9678μs | 25.6622 KOps/s | 25.6743 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.8892ms | 70.2737μs | 14.2301 KOps/s | 13.7787 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3934ms | 0.1739ms | 5.7503 KOps/s | 5.6505 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5105ms | 0.2981ms | 3.3550 KOps/s | 3.3925 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3790ms | 0.2089ms | 4.7876 KOps/s | 4.6919 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4347ms | 0.1903ms | 5.2561 KOps/s | 5.6525 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2951ms | 64.1373μs | 15.5915 KOps/s | 15.5210 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1069ms | 39.4877μs | 25.3244 KOps/s | 24.5068 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4146ms | 0.2423ms | 4.1271 KOps/s | 4.1561 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3467ms | 0.1733ms | 5.7714 KOps/s | 5.7444 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1835ms | 0.1070ms | 9.3433 KOps/s | 9.2240 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1281ms | 55.7324μs | 17.9429 KOps/s | 17.6892 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1985ms | 80.1923μs | 12.4700 KOps/s | 12.3218 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1513ms | 72.0100μs | 13.8870 KOps/s | 14.0788 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4001ms | 0.1905ms | 5.2480 KOps/s | 5.1867 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.8091ms | 1.6653ms | 600.4999 Ops/s | 587.4914 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3076ms | 0.1871ms | 5.3443 KOps/s | 5.1817 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.9969ms | 1.0976ms | 911.0473 Ops/s | 931.9512 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5003ms | 0.4059ms | 2.4640 KOps/s | 2.3676 KOps/s | |
test_compile_assign_and_add_stack[eager] | 5.7192ms | 3.7459ms | 266.9581 Ops/s | 253.1527 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1008ms | 32.7349μs | 30.5484 KOps/s | 32.0087 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.9875ms | 48.0976μs | 20.7910 KOps/s | 20.3503 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 81.9220μs | 28.2027μs | 35.4577 KOps/s | 34.8546 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 88.2750μs | 30.3390μs | 32.9609 KOps/s | 32.3904 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1173ms | 28.6889μs | 34.8567 KOps/s | 36.0824 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1063ms | 29.9811μs | 33.3544 KOps/s | 32.3200 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1526ms | 72.2247μs | 13.8457 KOps/s | 13.6999 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5476ms | 27.7027μs | 36.0976 KOps/s | 34.3188 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1312ms | 67.1201μs | 14.8987 KOps/s | 14.7747 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 92.0810μs | 25.0362μs | 39.9422 KOps/s | 40.6341 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2226ms | 67.1532μs | 14.8913 KOps/s | 14.7118 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1098ms | 24.5575μs | 40.7208 KOps/s | 40.7279 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1602ms | 71.6419μs | 13.9583 KOps/s | 13.7601 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.6481ms | 28.0222μs | 35.6860 KOps/s | 35.2337 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1973ms | 67.3394μs | 14.8501 KOps/s | 14.6064 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 92.7730μs | 24.3992μs | 40.9849 KOps/s | 41.2987 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1417ms | 66.5614μs | 15.0237 KOps/s | 14.5449 KOps/s | |
test_compile_indexing[int-pytree-eager] | 90.8700μs | 24.3900μs | 41.0005 KOps/s | 40.5892 KOps/s | |
test_mod_add[eager] | 83.0850μs | 25.2392μs | 39.6209 KOps/s | 40.3502 KOps/s | |
test_mod_add[compile] | 0.1048ms | 37.2743μs | 26.8281 KOps/s | 28.0613 KOps/s | |
test_mod_add[compile-overhead] | 93.7450μs | 37.5658μs | 26.6200 KOps/s | 27.0179 KOps/s | |
test_mod_wrap[eager] | 0.3111ms | 0.2069ms | 4.8335 KOps/s | 4.7108 KOps/s | |
test_mod_wrap[compile] | 1.4482ms | 0.2313ms | 4.3238 KOps/s | 4.2766 KOps/s | |
test_mod_wrap[compile-overhead] | 5.1297ms | 0.2285ms | 4.3773 KOps/s | 4.3940 KOps/s | |
test_mod_wrap_and_backward[eager] | 11.9454ms | 10.7764ms | 92.7953 Ops/s | 89.6313 Ops/s | |
test_mod_wrap_and_backward[compile] | 13.9356ms | 11.4604ms | 87.2571 Ops/s | 89.6927 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 15.6579ms | 11.8643ms | 84.2864 Ops/s | 85.0602 Ops/s | |
test_seq_add[eager] | 0.1766ms | 89.3399μs | 11.1932 KOps/s | 11.6263 KOps/s | |
test_seq_add[compile] | 0.1541ms | 61.1412μs | 16.3556 KOps/s | 15.9028 KOps/s | |
test_seq_add[compile-overhead] | 0.1778ms | 60.6940μs | 16.4761 KOps/s | 16.7851 KOps/s | |
test_seq_wrap[eager] | 0.6432ms | 0.3777ms | 2.6473 KOps/s | 2.5759 KOps/s | |
test_seq_wrap[compile] | 0.4565ms | 0.2653ms | 3.7691 KOps/s | 3.7678 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4850ms | 0.2640ms | 3.7875 KOps/s | 3.7821 KOps/s | |
test_func_call_runtime[False-eager] | 0.8267ms | 0.5306ms | 1.8847 KOps/s | 1.8967 KOps/s | |
test_func_call_runtime[False-compile] | 0.6032ms | 0.4969ms | 2.0125 KOps/s | 2.0243 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6673ms | 0.4930ms | 2.0283 KOps/s | 2.0250 KOps/s | |
test_func_call_runtime[True-eager] | 0.8573ms | 0.7530ms | 1.3280 KOps/s | 1.3279 KOps/s | |
test_func_call_runtime[True-compile] | 0.8030ms | 0.5096ms | 1.9622 KOps/s | 1.9633 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6172ms | 0.5096ms | 1.9622 KOps/s | 1.9495 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9087ms | 0.5364ms | 1.8644 KOps/s | 1.8560 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7908ms | 0.4927ms | 2.0296 KOps/s | 1.9828 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6330ms | 0.4983ms | 2.0067 KOps/s | 1.9828 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1814ms | 0.8853ms | 1.1295 KOps/s | 1.1222 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9570ms | 0.8297ms | 1.2053 KOps/s | 1.1812 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.7117ms | 0.8314ms | 1.2028 KOps/s | 1.1739 KOps/s | |
test_distributed | 0.2743ms | 0.1334ms | 7.4976 KOps/s | 7.3645 KOps/s | |
test_tdmodule | 88.7960μs | 17.3491μs | 57.6399 KOps/s | 56.7137 KOps/s | |
test_tdmodule_dispatch | 59.7110μs | 35.8498μs | 27.8942 KOps/s | 27.4685 KOps/s | |
test_tdseq | 33.9330μs | 18.1968μs | 54.9548 KOps/s | 52.9254 KOps/s | |
test_tdseq_dispatch | 79.5180μs | 38.6166μs | 25.8956 KOps/s | 25.2286 KOps/s | |
test_instantiation_functorch | 2.2317ms | 1.6572ms | 603.4452 Ops/s | 601.6416 Ops/s | |
test_instantiation_td | 1.8985ms | 1.1893ms | 840.7989 Ops/s | 808.7518 Ops/s | |
test_exec_functorch | 0.3079ms | 0.1793ms | 5.5770 KOps/s | 5.4024 KOps/s | |
test_exec_functional_call | 0.3270ms | 0.1721ms | 5.8104 KOps/s | 5.7318 KOps/s | |
test_exec_td | 0.2413ms | 0.1708ms | 5.8563 KOps/s | 5.6705 KOps/s | |
test_exec_td_decorator | 1.1585ms | 0.2301ms | 4.3463 KOps/s | 4.3294 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.0947ms | 0.6465ms | 1.5467 KOps/s | 1.6865 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.9708ms | 0.6433ms | 1.5545 KOps/s | 1.7195 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7260ms | 0.5036ms | 1.9856 KOps/s | 2.0868 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.8052ms | 0.5043ms | 1.9828 KOps/s | 2.0549 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8250ms | 0.6206ms | 1.6114 KOps/s | 1.5564 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1913ms | 0.6218ms | 1.6082 KOps/s | 1.5552 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9123ms | 0.5186ms | 1.9282 KOps/s | 1.8821 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7604ms | 0.5222ms | 1.9150 KOps/s | 1.8741 KOps/s | |
test_to_module_speed[True] | 1.6363ms | 1.3109ms | 762.8228 Ops/s | 742.2531 Ops/s | |
test_to_module_speed[False] | 1.9372ms | 1.2861ms | 777.5509 Ops/s | 756.5416 Ops/s | |
test_tc_init | 0.1300ms | 43.4165μs | 23.0327 KOps/s | 22.1187 KOps/s | |
test_tc_init_nested | 0.1826ms | 86.4483μs | 11.5676 KOps/s | 11.1302 KOps/s | |
test_tc_first_layer_tensor | 23.6040μs | 1.4476μs | 690.8198 KOps/s | 673.4738 KOps/s | |
test_tc_first_layer_nontensor | 29.0550μs | 4.2694μs | 234.2257 KOps/s | 228.5267 KOps/s | |
test_tc_second_layer_tensor | 37.7310μs | 2.7010μs | 370.2317 KOps/s | 363.1728 KOps/s | |
test_tc_second_layer_nontensor | 46.3160μs | 5.5144μs | 181.3418 KOps/s | 175.7239 KOps/s | |
test_unbind | 0.4621s | 13.8343ms | 72.2843 Ops/s | 68.1983 Ops/s | |
test_full_like | 8.5079ms | 7.1851ms | 139.1760 Ops/s | 115.2624 Ops/s | |
test_zeros_like | 12.5546ms | 6.4536ms | 154.9524 Ops/s | 122.2659 Ops/s | |
test_ones_like | 15.0306ms | 7.2844ms | 137.2804 Ops/s | 128.0940 Ops/s | |
test_clone | 15.2379ms | 8.9330ms | 111.9450 Ops/s | 105.4171 Ops/s | |
test_squeeze | 74.8090μs | 12.4904μs | 80.0616 KOps/s | 78.7499 KOps/s | |
test_unsqueeze | 0.1694ms | 91.8787μs | 10.8839 KOps/s | 10.4909 KOps/s | |
test_split | 0.3393ms | 0.1934ms | 5.1714 KOps/s | 4.9625 KOps/s | |
test_permute | 0.3638ms | 0.2190ms | 4.5659 KOps/s | 4.5421 KOps/s | |
test_stack | 31.8021ms | 24.1149ms | 41.4682 Ops/s | 39.7415 Ops/s | |
test_cat | 28.5375ms | 23.8357ms | 41.9539 Ops/s | 40.1934 Ops/s |
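
The Change column in the table above is empty in this capture. The following is a minimal sketch of how a relative change could be derived from the two throughput columns, assuming that "Ops" is the measurement for this PR and "Ops on Repo HEAD" is the baseline. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
# Hypothetical helpers: not part of the repository or its CI tooling.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a table cell such as '46.8539 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Relative throughput change of the PR against the baseline (assumed to be HEAD)."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base


# Values copied from the test_membership_stacked_nested_last row of the table above.
change = relative_change("100.4393 KOps/s", "195.2041 KOps/s")
print(f"{change:+.1%}")
```

A negative value means the PR run was slower than the baseline for that row. Single-run differences of a few percent are likely within measurement noise.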

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1237ms | 17.2943μs | 57.8225 KOps/s | 58.5832 KOps/s | |
test_plain_set_stack_nested | 42.3810μs | 17.4644μs | 57.2592 KOps/s | 58.3814 KOps/s | |
test_plain_set_nested_inplace | 38.9910μs | 18.4136μs | 54.3077 KOps/s | 54.9574 KOps/s | |
test_plain_set_stack_nested_inplace | 35.4000μs | 18.3611μs | 54.4630 KOps/s | 54.7384 KOps/s | |
test_items | 20.3300μs | 4.5960μs | 217.5803 KOps/s | 215.1025 KOps/s | |
test_items_nested | 0.4106ms | 0.3600ms | 2.7780 KOps/s | 2.6759 KOps/s | |
test_items_nested_locked | 0.4160ms | 0.3630ms | 2.7547 KOps/s | 2.7035 KOps/s | |
test_items_nested_leaf | 0.1002ms | 83.3096μs | 12.0034 KOps/s | 11.9599 KOps/s | |
test_items_stack_nested | 0.4113ms | 0.3678ms | 2.7189 KOps/s | 2.7200 KOps/s | |
test_items_stack_nested_leaf | 0.1066ms | 85.1307μs | 11.7466 KOps/s | 11.9673 KOps/s | |
test_items_stack_nested_locked | 0.3822ms | 0.3626ms | 2.7580 KOps/s | 2.6778 KOps/s | |
test_keys | 25.5300μs | 4.3820μs | 228.2056 KOps/s | 228.4246 KOps/s | |
test_keys_nested | 91.4910μs | 66.9538μs | 14.9357 KOps/s | 14.8107 KOps/s | |
test_keys_nested_locked | 2.1321ms | 73.0189μs | 13.6951 KOps/s | 13.8239 KOps/s | |
test_keys_nested_leaf | 75.6320μs | 57.6969μs | 17.3319 KOps/s | 17.5988 KOps/s | |
test_keys_stack_nested | 92.0220μs | 67.3893μs | 14.8392 KOps/s | 15.3647 KOps/s | |
test_keys_stack_nested_leaf | 79.7110μs | 56.9258μs | 17.5667 KOps/s | 17.7433 KOps/s | |
test_keys_stack_nested_locked | 93.5310μs | 72.6835μs | 13.7583 KOps/s | 14.0948 KOps/s | |
test_values | 7.1133μs | 1.7629μs | 567.2493 KOps/s | 567.2199 KOps/s | |
test_values_nested | 51.7300μs | 33.7525μs | 29.6275 KOps/s | 29.9421 KOps/s | |
test_values_nested_locked | 59.9800μs | 35.8390μs | 27.9025 KOps/s | 28.1448 KOps/s | |
test_values_nested_leaf | 46.9300μs | 29.9354μs | 33.4053 KOps/s | 33.6756 KOps/s | |
test_values_stack_nested | 54.7810μs | 34.1105μs | 29.3165 KOps/s | 29.3880 KOps/s | |
test_values_stack_nested_leaf | 50.7210μs | 30.3904μs | 32.9051 KOps/s | 32.9717 KOps/s | |
test_values_stack_nested_locked | 56.2110μs | 36.0153μs | 27.7660 KOps/s | 27.8380 KOps/s | |
test_membership | 1.6335μs | 0.5518μs | 1.8124 MOps/s | 1.8656 MOps/s | |
test_membership_nested | 8.1305μs | 1.9319μs | 517.6243 KOps/s | 523.3674 KOps/s | |
test_membership_nested_leaf | 10.3905μs | 1.9252μs | 519.4221 KOps/s | 526.5441 KOps/s | |
test_membership_stacked_nested | 26.0200μs | 1.9837μs | 504.0994 KOps/s | 503.3632 KOps/s | |
test_membership_stacked_nested_leaf | 17.0010μs | 1.9801μs | 505.0175 KOps/s | 500.3674 KOps/s | |
test_membership_nested_last | 16.6100μs | 2.8829μs | 346.8747 KOps/s | 348.1980 KOps/s | |
test_membership_nested_leaf_last | 26.1200μs | 2.9266μs | 341.6981 KOps/s | 350.6302 KOps/s | |
test_membership_stacked_nested_last | 27.6710μs | 2.8848μs | 346.6457 KOps/s | 110.2992 KOps/s | |
test_membership_stacked_nested_leaf_last | 18.1510μs | 2.8905μs | 345.9659 KOps/s | 110.7887 KOps/s | |
test_nested_getleaf | 20.2500μs | 7.9576μs | 125.6663 KOps/s | 128.6162 KOps/s | |
test_nested_get | 22.6000μs | 7.3593μs | 135.8820 KOps/s | 136.6255 KOps/s | |
test_stacked_getleaf | 29.3000μs | 7.8451μs | 127.4674 KOps/s | 127.4740 KOps/s | |
test_stacked_get | 20.4700μs | 7.3524μs | 136.0100 KOps/s | 135.4061 KOps/s | |
test_nested_getitemleaf | 24.6010μs | 8.1164μs | 123.2074 KOps/s | 124.0289 KOps/s | |
test_nested_getitem | 30.7510μs | 7.5905μs | 131.7439 KOps/s | 131.8793 KOps/s | |
test_stacked_getitemleaf | 25.6390μs | 8.1777μs | 122.2840 KOps/s | 123.1307 KOps/s | |
test_stacked_getitem | 25.4100μs | 7.6441μs | 130.8194 KOps/s | 132.1427 KOps/s | |
test_lock_nested | 4.9777ms | 0.4745ms | 2.1077 KOps/s | 2.0901 KOps/s | |
test_lock_stack_nested | 0.4632ms | 0.4323ms | 2.3129 KOps/s | 2.3209 KOps/s | |
test_unlock_nested | 0.8556ms | 0.3885ms | 2.5738 KOps/s | 2.4958 KOps/s | |
test_unlock_stack_nested | 0.4053ms | 0.3506ms | 2.8521 KOps/s | 2.8451 KOps/s | |
test_flatten_speed | 0.3382ms | 0.1050ms | 9.5215 KOps/s | 9.5983 KOps/s | |
test_unflatten_speed | 0.3509ms | 0.3201ms | 3.1244 KOps/s | 3.1788 KOps/s | |
test_common_ops | 1.5513ms | 1.3302ms | 751.7427 Ops/s | 742.4538 Ops/s | |
test_creation | 18.2910μs | 1.6596μs | 602.5670 KOps/s | 610.8341 KOps/s | |
test_creation_empty | 38.2510μs | 17.9817μs | 55.6122 KOps/s | 56.5993 KOps/s | |
test_creation_nested_1 | 38.4010μs | 19.6758μs | 50.8240 KOps/s | 51.8267 KOps/s | |
test_creation_nested_2 | 39.2890μs | 22.4497μs | 44.5440 KOps/s | 44.4420 KOps/s | |
test_clone | 58.9210μs | 29.4504μs | 33.9554 KOps/s | 31.5009 KOps/s | |
test_getitem[int] | 1.1056ms | 17.1599μs | 58.2753 KOps/s | 53.8802 KOps/s | |
test_getitem[slice_int] | 0.1435ms | 29.3129μs | 34.1147 KOps/s | 32.4379 KOps/s | |
test_getitem[range] | 0.2623ms | 0.1150ms | 8.6957 KOps/s | 8.5895 KOps/s | |
test_getitem[tuple] | 0.1473ms | 25.1460μs | 39.7677 KOps/s | 37.5233 KOps/s | |
test_getitem[list] | 0.2358ms | 0.1044ms | 9.5803 KOps/s | 9.2187 KOps/s | |
test_setitem_dim[int] | 77.8720μs | 58.1378μs | 17.2005 KOps/s | 16.8955 KOps/s | |
test_setitem_dim[slice_int] | 99.1820μs | 79.2129μs | 12.6242 KOps/s | 12.4344 KOps/s | |
test_setitem_dim[range] | 0.1730ms | 0.1446ms | 6.9149 KOps/s | 6.8471 KOps/s | |
test_setitem_dim[tuple] | 91.6520μs | 72.4911μs | 13.7948 KOps/s | 13.0739 KOps/s | |
test_setitem | 65.2120μs | 43.3995μs | 23.0417 KOps/s | 22.1611 KOps/s | |
test_set | 80.4610μs | 44.5132μs | 22.4653 KOps/s | 22.7660 KOps/s | |
test_set_shared | 0.4017ms | 54.2697μs | 18.4265 KOps/s | 18.0040 KOps/s | |
test_update | 79.0700μs | 53.3563μs | 18.7419 KOps/s | 18.9191 KOps/s | |
test_update_nested | 96.4220μs | 62.7929μs | 15.9254 KOps/s | 16.4067 KOps/s | |
test_update__nested | 0.4388ms | 62.3899μs | 16.0282 KOps/s | 15.2164 KOps/s | |
test_set_nested | 67.7710μs | 44.9032μs | 22.2701 KOps/s | 21.4332 KOps/s | |
test_set_nested_new | 76.7710μs | 48.5353μs | 20.6036 KOps/s | 20.0416 KOps/s | |
test_select | 94.6010μs | 65.4293μs | 15.2837 KOps/s | 15.3276 KOps/s | |
test_select_nested | 69.9610μs | 50.6681μs | 19.7363 KOps/s | 19.6539 KOps/s | |
test_exclude_nested | 91.2410μs | 67.8892μs | 14.7299 KOps/s | 14.6534 KOps/s | |
test_empty[True] | 0.3085ms | 0.2790ms | 3.5842 KOps/s | 3.5208 KOps/s | |
test_empty[False] | 2.1521μs | 0.8503μs | 1.1761 MOps/s | 1.1783 MOps/s | |
test_to | 46.9900μs | 27.1303μs | 36.8592 KOps/s | 34.1781 KOps/s | |
test_to_nonblocking | 53.6700μs | 26.8140μs | 37.2939 KOps/s | 36.0807 KOps/s | |
test_unbind_speed | 1.3012ms | 0.3005ms | 3.3278 KOps/s | 3.1602 KOps/s | |
test_unbind_speed_stack0 | 0.3255ms | 0.2989ms | 3.3451 KOps/s | 3.2659 KOps/s | |
test_unbind_speed_stack1 | 91.3225ms | 0.7697ms | 1.2992 KOps/s | 1.2992 KOps/s | |
test_split | 92.7893ms | 2.3916ms | 418.1246 Ops/s | 404.5953 Ops/s | |
test_chunk | 94.6936ms | 2.3876ms | 418.8345 Ops/s | 439.1620 Ops/s | |
test_creation[device0] | 0.1564ms | 0.1047ms | 9.5488 KOps/s | 9.2937 KOps/s | |
test_creation_from_tensor | 0.1626ms | 0.1026ms | 9.7489 KOps/s | 9.2561 KOps/s | |
test_add_one[memmap_tensor0] | 68.1810μs | 9.0044μs | 111.0569 KOps/s | 102.7802 KOps/s | |
test_contiguous[memmap_tensor0] | 27.5610μs | 2.2504μs | 444.3663 KOps/s | 436.8684 KOps/s | |
test_stack[memmap_tensor0] | 31.0210μs | 6.6265μs | 150.9092 KOps/s | 134.3627 KOps/s | |
test_memmaptd_index | 1.1766ms | 0.4386ms | 2.2799 KOps/s | 2.1264 KOps/s | |
test_memmaptd_index_astensor | 0.8933ms | 0.5041ms | 1.9836 KOps/s | 1.8463 KOps/s | |
test_memmaptd_index_op | 1.4696ms | 1.0706ms | 934.0844 Ops/s | 887.0782 Ops/s | |
test_serialize_model | 92.9156ms | 89.3193ms | 11.1958 Ops/s | 10.9331 Ops/s | |
test_serialize_model_pickle | 1.3489s | 1.2369s | 0.8085 Ops/s | 0.8071 Ops/s | |
test_serialize_weights | 0.1827s | 96.6039ms | 10.3516 Ops/s | 11.1200 Ops/s | |
test_serialize_weights_returnearly | 72.8317ms | 57.5462ms | 17.3773 Ops/s | 15.6554 Ops/s | |
test_serialize_weights_pickle | 1.3480s | 1.2363s | 0.8089 Ops/s | 0.8087 Ops/s | |
test_reshape_pytree | 62.6220μs | 38.1830μs | 26.1897 KOps/s | 25.2717 KOps/s | |
test_reshape_td | 75.7910μs | 43.4798μs | 22.9992 KOps/s | 21.4556 KOps/s | |
test_view_pytree | 68.7820μs | 37.0489μs | 26.9913 KOps/s | 25.7113 KOps/s | |
test_view_td | 0.2202ms | 51.9136μs | 19.2628 KOps/s | 18.8072 KOps/s | |
test_unbind_pytree | 0.1729ms | 38.3566μs | 26.0711 KOps/s | 24.9925 KOps/s | |
test_unbind_td | 0.4470ms | 45.8860μs | 21.7931 KOps/s | 20.9414 KOps/s | |
test_split_pytree | 85.5120μs | 52.6503μs | 18.9933 KOps/s | 18.7462 KOps/s | |
test_split_td | 0.4983ms | 60.9320μs | 16.4117 KOps/s | 13.3493 KOps/s | |
test_add_pytree | 86.9820μs | 58.9632μs | 16.9597 KOps/s | 16.4495 KOps/s | |
test_add_td | 0.1270ms | 95.8756μs | 10.4302 KOps/s | 10.0828 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4259ms | 0.2200ms | 4.5459 KOps/s | 4.4156 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2749ms | 0.1740ms | 5.7478 KOps/s | 5.5860 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1908ms | 0.1527ms | 6.5479 KOps/s | 6.3987 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2494ms | 0.1926ms | 5.1934 KOps/s | 4.9755 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 44.4710μs | 22.8714μs | 43.7228 KOps/s | 42.1349 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 67.7110μs | 48.2011μs | 20.7464 KOps/s | 20.0492 KOps/s | |
test_compile_copy_nested[pytree-compile] | 97.7710μs | 74.5264μs | 13.4181 KOps/s | 13.4334 KOps/s | |
test_compile_copy_nested[pytree-eager] | 86.0710μs | 59.3268μs | 16.8558 KOps/s | 16.9039 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4702ms | 0.3416ms | 2.9272 KOps/s | 2.8588 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2626ms | 0.2226ms | 4.4934 KOps/s | 4.4401 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1725ms | 0.1354ms | 7.3866 KOps/s | 6.8481 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1222ms | 62.6677μs | 15.9572 KOps/s | 14.5009 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3760ms | 0.3400ms | 2.9415 KOps/s | 2.8785 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6620ms | 0.6274ms | 1.5938 KOps/s | 1.4990 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3102ms | 0.2705ms | 3.6965 KOps/s | 3.6713 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3846ms | 0.3415ms | 2.9278 KOps/s | 2.8426 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1573ms | 75.6815μs | 13.2133 KOps/s | 12.7740 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1899ms | 0.1363ms | 7.3392 KOps/s | 7.1151 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6734ms | 0.5391ms | 1.8548 KOps/s | 1.7532 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3918ms | 0.3383ms | 2.9561 KOps/s | 2.8718 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 42.5520μs | 19.6235μs | 50.9593 KOps/s | 52.1299 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 51.2110μs | 31.9051μs | 31.3430 KOps/s | 31.8969 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1373ms | 77.3197μs | 12.9333 KOps/s | 13.0873 KOps/s | |
test_compile_copy_flat[pytree-eager] | 84.5410μs | 60.2520μs | 16.5970 KOps/s | 16.6735 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.4898ms | 0.8690ms | 1.1508 KOps/s | 1.0335 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.4991ms | 3.3061ms | 302.4748 Ops/s | 286.6844 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.4484ms | 0.8559ms | 1.1683 KOps/s | 1.0500 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.7007ms | 3.4201ms | 292.3884 Ops/s | 287.5291 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1641ms | 0.1231ms | 8.1233 KOps/s | 8.4024 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.3361ms | 67.5738μs | 14.7986 KOps/s | 15.2058 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2343ms | 0.1128ms | 8.8621 KOps/s | 9.0175 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2743ms | 48.5466μs | 20.5988 KOps/s | 21.1579 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.3168ms | 0.1158ms | 8.6368 KOps/s | 9.0911 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2591ms | 48.5709μs | 20.5885 KOps/s | 20.4244 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2049ms | 0.1511ms | 6.6161 KOps/s | 6.6971 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.2299ms | 27.8801μs | 35.8679 KOps/s | 35.3497 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1864ms | 0.1414ms | 7.0743 KOps/s | 7.1377 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.2185ms | 22.7720μs | 43.9137 KOps/s | 42.0613 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.3604ms | 0.1465ms | 6.8241 KOps/s | 7.1202 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.2289ms | 23.9092μs | 41.8249 KOps/s | 41.7351 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.3859ms | 0.1538ms | 6.5015 KOps/s | 6.6666 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4869ms | 28.6623μs | 34.8890 KOps/s | 34.6017 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2750ms | 0.1463ms | 6.8365 KOps/s | 7.0349 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.2391ms | 24.1294μs | 41.4432 KOps/s | 42.0446 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.3543ms | 0.1448ms | 6.9073 KOps/s | 7.1149 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.2349ms | 24.2976μs | 41.1563 KOps/s | 42.1640 KOps/s | |
test_mod_add[eager] | 0.2710ms | 34.5168μs | 28.9714 KOps/s | 29.2068 KOps/s | |
test_mod_add[compile] | 0.2835ms | 74.9761μs | 13.3376 KOps/s | 12.8031 KOps/s | |
test_mod_add[compile-overhead] | 0.2657ms | 0.1420ms | 7.0446 KOps/s | 6.3195 KOps/s | |
test_mod_wrap[eager] | 0.3214ms | 0.2454ms | 4.0747 KOps/s | 4.0008 KOps/s | |
test_mod_wrap[compile] | 1.2698ms | 0.3067ms | 3.2602 KOps/s | 3.1541 KOps/s | |
test_mod_wrap[compile-overhead] | 8.1347ms | 4.2533ms | 235.1142 Ops/s | 230.3351 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5727ms | 1.4349ms | 696.8991 Ops/s | 679.4357 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5874ms | 1.4579ms | 685.9392 Ops/s | 718.5640 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4720ms | 1.0206ms | 979.7826 Ops/s | 1.0723 KOps/s | |
test_seq_add[eager] | 0.1728ms | 0.1016ms | 9.8424 KOps/s | 9.6223 KOps/s | |
test_seq_add[compile] | 0.1312ms | 86.4792μs | 11.5635 KOps/s | 11.0169 KOps/s | |
test_seq_add[compile-overhead] | 0.1654ms | 0.1222ms | 8.1862 KOps/s | 7.9593 KOps/s | |
test_seq_wrap[eager] | 0.5286ms | 0.3918ms | 2.5524 KOps/s | 2.5040 KOps/s | |
test_seq_wrap[compile] | 0.4087ms | 0.3242ms | 3.0841 KOps/s | 2.9821 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2738ms | 0.2305ms | 4.3377 KOps/s | 4.2151 KOps/s | |
test_func_call_runtime[False-eager] | 0.9817ms | 0.7641ms | 1.3087 KOps/s | 1.3063 KOps/s | |
test_func_call_runtime[False-compile] | 0.9884ms | 0.8127ms | 1.2305 KOps/s | 1.1749 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4373ms | 0.3773ms | 2.6504 KOps/s | 2.6282 KOps/s | |
test_func_call_runtime[True-eager] | 1.0057ms | 0.9295ms | 1.0759 KOps/s | 1.0631 KOps/s | |
test_func_call_runtime[True-compile] | 1.0619ms | 0.8570ms | 1.1669 KOps/s | 1.1132 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4776ms | 0.4162ms | 2.4027 KOps/s | 2.3444 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8594ms | 0.7315ms | 1.3670 KOps/s | 1.3332 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9592ms | 0.8165ms | 1.2248 KOps/s | 1.1696 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4534ms | 0.3790ms | 2.6386 KOps/s | 2.6075 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1438ms | 1.0304ms | 970.5374 Ops/s | 938.8048 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.2651ms | 1.0068ms | 993.2344 Ops/s | 946.3250 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0591ms | 1.0012ms | 998.8075 Ops/s | 955.9910 Ops/s | |
test_distributed | 2.0876ms | 71.4565μs | 13.9945 KOps/s | 11.0647 KOps/s | |
test_tdmodule | 32.2910μs | 16.5266μs | 60.5084 KOps/s | 59.6254 KOps/s | |
test_tdmodule_dispatch | 52.3610μs | 33.9368μs | 29.4665 KOps/s | 30.0680 KOps/s | |
test_tdseq | 33.6300μs | 17.2947μs | 57.8213 KOps/s | 57.9233 KOps/s | |
test_tdseq_dispatch | 54.0300μs | 35.8357μs | 27.9051 KOps/s | 28.2777 KOps/s | |
test_instantiation_functorch | 2.1295ms | 2.0000ms | 500.0080 Ops/s | 482.3030 Ops/s | |
test_instantiation_td | 2.0121ms | 1.3182ms | 758.6190 Ops/s | 746.5488 Ops/s | |
test_exec_functorch | 0.2763ms | 0.2217ms | 4.5105 KOps/s | 4.4661 KOps/s | |
test_exec_functional_call | 0.2783ms | 0.2102ms | 4.7582 KOps/s | 4.6579 KOps/s | |
test_exec_td | 0.3825ms | 0.2188ms | 4.5704 KOps/s | 4.5293 KOps/s | |
test_exec_td_decorator | 0.6267ms | 0.2733ms | 3.6594 KOps/s | 3.6327 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8421ms | 0.7307ms | 1.3686 KOps/s | 1.5335 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7901ms | 0.7160ms | 1.3966 KOps/s | 1.5492 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7534ms | 0.5951ms | 1.6804 KOps/s | 1.7820 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7611ms | 0.6028ms | 1.6588 KOps/s | 1.7627 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.1742s | 0.8301ms | 1.2047 KOps/s | 1.4251 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8064ms | 0.7109ms | 1.4068 KOps/s | 1.4216 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7523ms | 0.6175ms | 1.6195 KOps/s | 1.6360 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7836ms | 0.6134ms | 1.6304 KOps/s | 1.6381 KOps/s | |
test_vmap_transformer_speed[True-True] | 9.9594ms | 9.1021ms | 109.8653 Ops/s | 114.9471 Ops/s | |
test_vmap_transformer_speed[True-False] | 9.3203ms | 9.0477ms | 110.5248 Ops/s | 115.3605 Ops/s | |
test_vmap_transformer_speed[False-True] | 9.1031ms | 8.8503ms | 112.9910 Ops/s | 116.4894 Ops/s | |
test_vmap_transformer_speed[False-False] | 9.1120ms | 8.8673ms | 112.7743 Ops/s | 117.2697 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 21.1100ms | 20.5138ms | 48.7477 Ops/s | 48.8006 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 21.0559ms | 20.4500ms | 48.8998 Ops/s | 48.6840 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.9249ms | 20.3376ms | 49.1701 Ops/s | 49.1437 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.8835ms | 20.3571ms | 49.1230 Ops/s | 49.0236 Ops/s | |
test_to_module_speed[True] | 1.7548ms | 1.1319ms | 883.4700 Ops/s | 872.2435 Ops/s | |
test_to_module_speed[False] | 1.5571ms | 1.1137ms | 897.9397 Ops/s | 883.4453 Ops/s | |
test_tc_init | 59.6610μs | 39.4136μs | 25.3719 KOps/s | 25.6462 KOps/s | |
test_tc_init_nested | 0.1133ms | 80.0176μs | 12.4973 KOps/s | 12.5123 KOps/s | |
test_tc_first_layer_tensor | 3.6385μs | 0.7901μs | 1.2656 MOps/s | 1.1341 MOps/s | |
test_tc_first_layer_nontensor | 25.1410μs | 2.5484μs | 392.3987 KOps/s | 397.7721 KOps/s | |
test_tc_second_layer_tensor | 6.5667μs | 1.5860μs | 630.5112 KOps/s | 629.0299 KOps/s | |
test_tc_second_layer_nontensor | 18.1110μs | 3.3809μs | 295.7783 KOps/s | 297.4998 KOps/s | |
test_unbind | 0.1859s | 12.8435ms | 77.8602 Ops/s | 84.0519 Ops/s | |
test_full_like | 0.6558ms | 0.5787ms | 1.7280 KOps/s | 1.7334 KOps/s | |
test_zeros_like | 0.2629ms | 0.1976ms | 5.0611 KOps/s | 5.0571 KOps/s | |
test_ones_like | 0.2311ms | 0.1975ms | 5.0632 KOps/s | 5.0602 KOps/s | |
test_clone | 0.4404ms | 0.4132ms | 2.4201 KOps/s | 2.4131 KOps/s | |
test_squeeze | 30.0910μs | 10.7051μs | 93.4137 KOps/s | 92.8977 KOps/s | |
test_unsqueeze | 0.2664ms | 78.9663μs | 12.6636 KOps/s | 11.8236 KOps/s | |
test_split | 0.4497ms | 0.1713ms | 5.8377 KOps/s | 5.6067 KOps/s | |
test_permute | 0.2366ms | 0.1874ms | 5.3354 KOps/s | 5.2635 KOps/s | |
test_stack | 1.2497ms | 0.9069ms | 1.1027 KOps/s | 1.0861 KOps/s | |
test_cat | 1.2520ms | 1.2319ms | 811.7707 Ops/s | 811.7974 Ops/s |
Labels
bug
Something isn't working
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.