[BugFix] densify non tensor stack is a no-op #1194
Merged
vmoens added a commit that referenced this pull request on Jan 30, 2025
ghstack-source-id: edbc22ce562cd918ce5dd5c0441e47cdadf7d88a Pull Request resolved: #1194
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 42.0780μs | 20.5420μs | 48.6807 KOps/s | 49.2307 KOps/s | |
test_plain_set_stack_nested | 51.7060μs | 20.6883μs | 48.3364 KOps/s | 48.1049 KOps/s | |
test_plain_set_nested_inplace | 57.9470μs | 22.4154μs | 44.6123 KOps/s | 44.5354 KOps/s | |
test_plain_set_stack_nested_inplace | 83.3650μs | 23.0036μs | 43.4714 KOps/s | 44.3241 KOps/s | |
test_items | 23.2940μs | 4.1756μs | 239.4883 KOps/s | 244.0561 KOps/s | |
test_items_nested | 0.4726ms | 0.3969ms | 2.5196 KOps/s | 2.4930 KOps/s | |
test_items_nested_locked | 0.8309ms | 0.3953ms | 2.5300 KOps/s | 2.4702 KOps/s | |
test_items_nested_leaf | 0.1156ms | 76.8433μs | 13.0135 KOps/s | 13.1258 KOps/s | |
test_items_stack_nested | 0.4694ms | 0.3999ms | 2.5007 KOps/s | 2.4987 KOps/s | |
test_items_stack_nested_leaf | 0.1341ms | 78.6739μs | 12.7107 KOps/s | 12.7460 KOps/s | |
test_items_stack_nested_locked | 0.7223ms | 0.4002ms | 2.4985 KOps/s | 2.4821 KOps/s | |
test_keys | 25.6580μs | 3.5251μs | 283.6810 KOps/s | 286.9621 KOps/s | |
test_keys_nested | 0.2790ms | 0.1659ms | 6.0289 KOps/s | 6.0971 KOps/s | |
test_keys_nested_locked | 1.8038ms | 0.1710ms | 5.8472 KOps/s | 5.8360 KOps/s | |
test_keys_nested_leaf | 0.2224ms | 0.1450ms | 6.8975 KOps/s | 6.9891 KOps/s | |
test_keys_stack_nested | 0.2996ms | 0.1661ms | 6.0192 KOps/s | 6.0634 KOps/s | |
test_keys_stack_nested_leaf | 0.2195ms | 0.1438ms | 6.9557 KOps/s | 6.9842 KOps/s | |
test_keys_stack_nested_locked | 0.3011ms | 0.1714ms | 5.8343 KOps/s | 5.8400 KOps/s | |
test_values | 5.4782μs | 1.0427μs | 959.0921 KOps/s | 967.2685 KOps/s | |
test_values_nested | 0.1273ms | 62.3202μs | 16.0462 KOps/s | 15.6764 KOps/s | |
test_values_nested_locked | 0.5786ms | 64.0450μs | 15.6140 KOps/s | 15.6876 KOps/s | |
test_values_nested_leaf | 0.1307ms | 70.8077μs | 14.1228 KOps/s | 13.9851 KOps/s | |
test_values_stack_nested | 0.1210ms | 62.3439μs | 16.0401 KOps/s | 15.6959 KOps/s | |
test_values_stack_nested_leaf | 0.1355ms | 71.0061μs | 14.0833 KOps/s | 13.9133 KOps/s | |
test_values_stack_nested_locked | 0.1206ms | 62.2792μs | 16.0567 KOps/s | 15.6869 KOps/s | |
test_membership | 1.9376μs | 0.6810μs | 1.4684 MOps/s | 1.4544 MOps/s | |
test_membership_nested | 23.1130μs | 2.9190μs | 342.5828 KOps/s | 345.9014 KOps/s | |
test_membership_nested_leaf | 25.5280μs | 2.9041μs | 344.3377 KOps/s | 349.8750 KOps/s | |
test_membership_stacked_nested | 25.6170μs | 2.9091μs | 343.7456 KOps/s | 341.9289 KOps/s | |
test_membership_stacked_nested_leaf | 0.1356ms | 2.9334μs | 340.9065 KOps/s | 343.4669 KOps/s | |
test_membership_nested_last | 25.8780μs | 4.2912μs | 233.0344 KOps/s | 236.0595 KOps/s | |
test_membership_nested_leaf_last | 29.2940μs | 4.3677μs | 228.9514 KOps/s | 230.5072 KOps/s | |
test_membership_stacked_nested_last | 19.8270μs | 4.3735μs | 228.6513 KOps/s | 233.4568 KOps/s | |
test_membership_stacked_nested_leaf_last | 21.2090μs | 4.3309μs | 230.8981 KOps/s | 231.5667 KOps/s | |
test_nested_getleaf | 33.3920μs | 10.5635μs | 94.6652 KOps/s | 93.6274 KOps/s | |
test_nested_get | 43.2910μs | 10.0405μs | 99.5965 KOps/s | 96.2331 KOps/s | |
test_stacked_getleaf | 35.3960μs | 10.5123μs | 95.1263 KOps/s | 93.7971 KOps/s | |
test_stacked_get | 41.3470μs | 9.9793μs | 100.2070 KOps/s | 97.9029 KOps/s | |
test_nested_getitemleaf | 40.1140μs | 11.2755μs | 88.6877 KOps/s | 89.0383 KOps/s | |
test_nested_getitem | 34.7550μs | 10.6676μs | 93.7422 KOps/s | 93.4886 KOps/s | |
test_stacked_getitemleaf | 34.5540μs | 11.2267μs | 89.0734 KOps/s | 89.3241 KOps/s | |
test_stacked_getitem | 30.9480μs | 10.6406μs | 93.9800 KOps/s | 93.6978 KOps/s | |
test_lock_nested | 0.7417ms | 0.4194ms | 2.3844 KOps/s | 2.4412 KOps/s | |
test_lock_stack_nested | 0.8430ms | 0.4229ms | 2.3648 KOps/s | 2.3534 KOps/s | |
test_unlock_nested | 0.4508ms | 0.3367ms | 2.9703 KOps/s | 2.9498 KOps/s | |
test_unlock_stack_nested | 0.6258ms | 0.3415ms | 2.9279 KOps/s | 2.9104 KOps/s | |
test_flatten_speed | 0.2010ms | 0.1009ms | 9.9125 KOps/s | 10.0051 KOps/s | |
test_unflatten_speed | 0.8228ms | 0.5240ms | 1.9086 KOps/s | 1.9043 KOps/s | |
test_common_ops | 0.9965ms | 0.7925ms | 1.2618 KOps/s | 1.2215 KOps/s | |
test_creation | 38.7520μs | 2.5680μs | 389.4121 KOps/s | 403.0307 KOps/s | |
test_creation_empty | 61.4740μs | 12.2275μs | 81.7826 KOps/s | 82.3257 KOps/s | |
test_creation_nested_1 | 40.2650μs | 15.1992μs | 65.7929 KOps/s | 65.2725 KOps/s | |
test_creation_nested_2 | 49.6130μs | 19.8607μs | 50.3507 KOps/s | 50.6994 KOps/s | |
test_clone | 67.5060μs | 13.2198μs | 75.6443 KOps/s | 74.0525 KOps/s | |
test_getitem[int] | 0.7760ms | 12.8779μs | 77.6524 KOps/s | 79.0666 KOps/s | |
test_getitem[slice_int] | 0.1298ms | 25.0378μs | 39.9396 KOps/s | 41.0050 KOps/s | |
test_getitem[range] | 0.1950ms | 50.9518μs | 19.6264 KOps/s | 20.5744 KOps/s | |
test_getitem[tuple] | 0.1240ms | 20.3900μs | 49.0435 KOps/s | 49.8198 KOps/s | |
test_getitem[list] | 0.1570ms | 46.0524μs | 21.7144 KOps/s | 23.0805 KOps/s | |
test_setitem_dim[int] | 60.9130μs | 25.7540μs | 38.8289 KOps/s | 39.8006 KOps/s | |
test_setitem_dim[slice_int] | 86.3410μs | 50.5667μs | 19.7758 KOps/s | 19.8073 KOps/s | |
test_setitem_dim[range] | 0.1197ms | 77.3776μs | 12.9236 KOps/s | 13.3828 KOps/s | |
test_setitem_dim[tuple] | 76.0920μs | 41.3501μs | 24.1837 KOps/s | 25.3214 KOps/s | |
test_setitem | 68.3170μs | 20.7443μs | 48.2061 KOps/s | 47.7801 KOps/s | |
test_set | 78.2550μs | 20.3021μs | 49.2560 KOps/s | 49.1484 KOps/s | |
test_set_shared | 4.4078ms | 0.1774ms | 5.6354 KOps/s | 5.4107 KOps/s | |
test_update | 0.1300ms | 23.6139μs | 42.3479 KOps/s | 43.3118 KOps/s | |
test_update_nested | 71.6840μs | 33.4229μs | 29.9196 KOps/s | 29.6130 KOps/s | |
test_update__nested | 0.4299ms | 33.1948μs | 30.1252 KOps/s | 30.0525 KOps/s | |
test_set_nested | 64.4600μs | 22.3898μs | 44.6632 KOps/s | 44.3388 KOps/s | |
test_set_nested_new | 94.7660μs | 27.5926μs | 36.2416 KOps/s | 36.1910 KOps/s | |
test_select | 90.0280μs | 44.9377μs | 22.2530 KOps/s | 23.3152 KOps/s | |
test_select_nested | 0.1141ms | 62.4450μs | 16.0141 KOps/s | 15.8480 KOps/s | |
test_exclude_nested | 0.1537ms | 80.6628μs | 12.3973 KOps/s | 12.4629 KOps/s | |
test_empty[True] | 0.7569ms | 0.4050ms | 2.4689 KOps/s | 2.4680 KOps/s | |
test_empty[False] | 11.3735μs | 1.3704μs | 729.7301 KOps/s | 711.5959 KOps/s | |
test_unbind_speed | 0.4464ms | 0.2701ms | 3.7020 KOps/s | 3.7080 KOps/s | |
test_unbind_speed_stack0 | 0.3464ms | 0.2696ms | 3.7095 KOps/s | 3.7233 KOps/s | |
test_unbind_speed_stack1 | 0.7407ms | 0.6625ms | 1.5094 KOps/s | 1.2489 KOps/s | |
test_split | 95.4970ms | 1.7320ms | 577.3639 Ops/s | 581.4807 Ops/s | |
test_chunk | 99.7360ms | 1.7560ms | 569.4829 Ops/s | 636.9851 Ops/s | |
test_consolidate_njt[False-None] | 8.5092ms | 8.1135ms | 123.2507 Ops/s | 110.8625 Ops/s | |
test_creation[device0] | 0.2205ms | 90.9459μs | 10.9955 KOps/s | 10.9841 KOps/s | |
test_creation_from_tensor | 3.8544ms | 94.4793μs | 10.5843 KOps/s | 10.4486 KOps/s | |
test_add_one[memmap_tensor0] | 77.3840μs | 4.7539μs | 210.3545 KOps/s | 195.1737 KOps/s | |
test_contiguous[memmap_tensor0] | 14.2870μs | 0.5133μs | 1.9483 MOps/s | 1.9715 MOps/s | |
test_stack[memmap_tensor0] | 29.0640μs | 3.3446μs | 298.9922 KOps/s | 296.1761 KOps/s | |
test_memmaptd_index | 0.9715ms | 0.2295ms | 4.3576 KOps/s | 4.4272 KOps/s | |
test_memmaptd_index_astensor | 0.4641ms | 0.3131ms | 3.1935 KOps/s | 3.2316 KOps/s | |
test_memmaptd_index_op | 1.4338ms | 0.5775ms | 1.7316 KOps/s | 1.6569 KOps/s | |
test_serialize_model | 0.1258s | 0.1182s | 8.4574 Ops/s | 8.6028 Ops/s | |
test_serialize_model_pickle | 0.4890s | 0.3891s | 2.5702 Ops/s | 2.4965 Ops/s | |
test_serialize_weights | 0.1326s | 0.1173s | 8.5285 Ops/s | 8.8824 Ops/s | |
test_serialize_weights_returnearly | 0.1714s | 0.1603s | 6.2373 Ops/s | 5.8087 Ops/s | |
test_serialize_weights_pickle | 0.5711s | 0.4401s | 2.2722 Ops/s | 2.5027 Ops/s | |
test_serialize_weights_filesystem | 0.2624s | 0.1572s | 6.3594 Ops/s | 7.1057 Ops/s | |
test_serialize_model_filesystem | 0.1530s | 0.1460s | 6.8477 Ops/s | 6.6708 Ops/s | |
test_reshape_pytree | 61.5440μs | 25.8539μs | 38.6790 KOps/s | 38.7824 KOps/s | |
test_reshape_td | 69.2190μs | 32.6355μs | 30.6414 KOps/s | 30.5798 KOps/s | |
test_view_pytree | 67.9270μs | 26.1116μs | 38.2972 KOps/s | 38.7207 KOps/s | |
test_view_td | 77.0030μs | 39.4637μs | 25.3398 KOps/s | 26.3183 KOps/s | |
test_unbind_pytree | 55.4430μs | 29.2128μs | 34.2316 KOps/s | 34.9967 KOps/s | |
test_unbind_td | 0.3286ms | 39.4631μs | 25.3401 KOps/s | 24.7071 KOps/s | |
test_split_pytree | 67.4660μs | 28.9304μs | 34.5657 KOps/s | 35.1847 KOps/s | |
test_split_td | 0.5152ms | 45.1425μs | 22.1521 KOps/s | 22.3423 KOps/s | |
test_add_pytree | 74.8990μs | 34.8435μs | 28.6998 KOps/s | 28.6610 KOps/s | |
test_add_td | 0.1260ms | 58.4817μs | 17.0994 KOps/s | 16.5785 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1412ms | 66.4022μs | 15.0598 KOps/s | 15.1203 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.7648ms | 0.1742ms | 5.7403 KOps/s | 5.7523 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1097ms | 45.5842μs | 21.9374 KOps/s | 21.9514 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2288ms | 0.1165ms | 8.5845 KOps/s | 8.5398 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 81.6290μs | 27.7596μs | 36.0236 KOps/s | 35.5155 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1130ms | 57.8835μs | 17.2761 KOps/s | 16.7840 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1383ms | 77.7572μs | 12.8605 KOps/s | 12.5843 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1291ms | 65.1778μs | 15.3426 KOps/s | 15.0238 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1750ms | 0.1077ms | 9.2886 KOps/s | 9.3925 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4338ms | 0.2159ms | 4.6311 KOps/s | 4.6506 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1132ms | 47.9392μs | 20.8597 KOps/s | 21.8745 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5695ms | 67.4615μs | 14.8233 KOps/s | 14.8253 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2062ms | 0.1008ms | 9.9251 KOps/s | 10.0399 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3632ms | 0.1975ms | 5.0620 KOps/s | 5.0209 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 1.4331ms | 0.2313ms | 4.3241 KOps/s | 4.2904 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3088ms | 0.1128ms | 8.8664 KOps/s | 9.2196 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1392ms | 63.7208μs | 15.6935 KOps/s | 16.3934 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 97.0110μs | 49.2443μs | 20.3069 KOps/s | 20.8323 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2693ms | 0.1542ms | 6.4857 KOps/s | 6.3786 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1751ms | 0.1012ms | 9.8832 KOps/s | 10.0383 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 75.5970μs | 21.0117μs | 47.5926 KOps/s | 46.4130 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1273ms | 67.3495μs | 14.8479 KOps/s | 14.6116 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1596ms | 80.2127μs | 12.4669 KOps/s | 12.3831 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1422ms | 66.1957μs | 15.1067 KOps/s | 14.8778 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2997ms | 0.2174ms | 4.6006 KOps/s | 4.6812 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.9480ms | 1.3488ms | 741.4193 Ops/s | 705.5486 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2951ms | 0.2121ms | 4.7149 KOps/s | 4.7631 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.0257ms | 0.8102ms | 1.2342 KOps/s | 1.2160 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5747ms | 0.4599ms | 2.1743 KOps/s | 2.2087 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.0417ms | 2.6440ms | 378.2192 Ops/s | 359.8874 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1032ms | 37.6479μs | 26.5619 KOps/s | 26.1251 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5718ms | 34.4948μs | 28.9899 KOps/s | 31.7348 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1257ms | 30.4221μs | 32.8709 KOps/s | 33.3916 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 88.0640μs | 22.8027μs | 43.8545 KOps/s | 44.8045 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 81.4020μs | 32.3403μs | 30.9212 KOps/s | 32.2578 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.5778ms | 22.8448μs | 43.7736 KOps/s | 44.1349 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1156ms | 52.7731μs | 18.9490 KOps/s | 18.2659 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.2661ms | 19.9069μs | 50.2338 KOps/s | 49.2297 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1216ms | 45.2485μs | 22.1002 KOps/s | 21.8410 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 77.8340μs | 18.5976μs | 53.7703 KOps/s | 54.9960 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1357ms | 45.5732μs | 21.9427 KOps/s | 21.5626 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 48.9410μs | 18.3903μs | 54.3766 KOps/s | 54.9960 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1321ms | 54.7549μs | 18.2632 KOps/s | 17.8311 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9556ms | 19.7961μs | 50.5150 KOps/s | 50.2486 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1054ms | 45.1024μs | 22.1718 KOps/s | 21.4364 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.2679ms | 18.3337μs | 54.5443 KOps/s | 54.4305 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1145ms | 45.1644μs | 22.1413 KOps/s | 21.4922 KOps/s | |
test_compile_indexing[int-pytree-eager] | 72.4750μs | 18.3455μs | 54.5093 KOps/s | 55.3277 KOps/s | |
test_mod_add[eager] | 0.1074ms | 35.2018μs | 28.4076 KOps/s | 28.2585 KOps/s | |
test_mod_add[compile] | 0.1196ms | 64.2159μs | 15.5725 KOps/s | 15.0718 KOps/s | |
test_mod_add[compile-overhead] | 0.1273ms | 63.1371μs | 15.8386 KOps/s | 15.3925 KOps/s | |
test_mod_wrap[eager] | 0.4162ms | 0.2136ms | 4.6823 KOps/s | 4.3908 KOps/s | |
test_mod_wrap[compile] | 1.3248ms | 0.2214ms | 4.5159 KOps/s | 4.2856 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4199ms | 0.2207ms | 4.5320 KOps/s | 4.3236 KOps/s | |
test_mod_wrap_and_backward[eager] | 20.3302ms | 13.3221ms | 75.0635 Ops/s | 75.9755 Ops/s | |
test_mod_wrap_and_backward[compile] | 15.4556ms | 11.9407ms | 83.7470 Ops/s | 87.3052 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 22.7734ms | 11.4755ms | 87.1425 Ops/s | 85.5330 Ops/s | |
test_seq_add[eager] | 0.1916ms | 0.1130ms | 8.8477 KOps/s | 8.2946 KOps/s | |
test_seq_add[compile] | 0.1600ms | 77.6522μs | 12.8779 KOps/s | 13.0122 KOps/s | |
test_seq_add[compile-overhead] | 0.1479ms | 74.0385μs | 13.5065 KOps/s | 13.1446 KOps/s | |
test_seq_wrap[eager] | 0.7111ms | 0.4271ms | 2.3416 KOps/s | 2.2181 KOps/s | |
test_seq_wrap[compile] | 0.5028ms | 0.2361ms | 4.2349 KOps/s | 4.1095 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3903ms | 0.2394ms | 4.1772 KOps/s | 4.1016 KOps/s | |
test_func_call_runtime[False-eager] | 0.9543ms | 0.5189ms | 1.9271 KOps/s | 1.8575 KOps/s | |
test_func_call_runtime[False-compile] | 0.8066ms | 0.4384ms | 2.2810 KOps/s | 2.2358 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5104ms | 0.4337ms | 2.3058 KOps/s | 2.2371 KOps/s | |
test_func_call_runtime[True-eager] | 0.8527ms | 0.7383ms | 1.3544 KOps/s | 1.3262 KOps/s | |
test_func_call_runtime[True-compile] | 0.5880ms | 0.4586ms | 2.1806 KOps/s | 2.1328 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6293ms | 0.4545ms | 2.2004 KOps/s | 2.1491 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7447ms | 0.5119ms | 1.9534 KOps/s | 1.8498 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6025ms | 0.4350ms | 2.2988 KOps/s | 2.2497 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5891ms | 0.4330ms | 2.3097 KOps/s | 2.2375 KOps/s | |
test_func_call_cm_runtime[True-eager] | 0.9717ms | 0.8687ms | 1.1511 KOps/s | 1.1221 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8911ms | 0.7562ms | 1.3223 KOps/s | 1.2591 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 3.0442ms | 0.8076ms | 1.2383 KOps/s | 1.2390 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.7040ms | 1.8649ms | 536.2168 Ops/s | 518.6968 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.0269ms | 0.5274ms | 1.8962 KOps/s | 1.8483 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.9171ms | 0.5226ms | 1.9134 KOps/s | 1.8624 KOps/s | |
test_distributed | 0.3177ms | 0.1261ms | 7.9280 KOps/s | 7.7388 KOps/s | |
test_tdmodule | 64.9210μs | 26.0683μs | 38.3608 KOps/s | 35.6740 KOps/s | |
test_tdmodule_dispatch | 91.2300μs | 50.6190μs | 19.7554 KOps/s | 19.9632 KOps/s | |
test_tdseq | 60.2120μs | 28.2641μs | 35.3805 KOps/s | 31.5195 KOps/s | |
test_tdseq_dispatch | 84.4470μs | 53.3203μs | 18.7546 KOps/s | 17.9543 KOps/s | |
test_instantiation_functorch | 2.3326ms | 1.5115ms | 661.6124 Ops/s | 658.2012 Ops/s | |
test_exec_functorch | 0.3255ms | 0.1737ms | 5.7563 KOps/s | 5.5484 KOps/s | |
test_exec_functional_call | 0.5928ms | 0.1651ms | 6.0579 KOps/s | 5.7023 KOps/s | |
test_exec_td_decorator | 0.4860ms | 0.2229ms | 4.4867 KOps/s | 4.2888 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7791ms | 0.6340ms | 1.5774 KOps/s | 1.4707 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0575ms | 0.6535ms | 1.5302 KOps/s | 1.5001 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7403ms | 0.5114ms | 1.9553 KOps/s | 1.8566 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8749ms | 0.5142ms | 1.9447 KOps/s | 1.8457 KOps/s | |
test_to_module_speed[True] | 2.1308ms | 1.3258ms | 754.2366 Ops/s | 756.3345 Ops/s | |
test_to_module_speed[False] | 1.4408ms | 1.2800ms | 781.2447 Ops/s | 778.8247 Ops/s | |
test_tc_init | 80.9310μs | 46.4482μs | 21.5294 KOps/s | 20.4397 KOps/s | |
test_tc_init_nested | 0.1599ms | 90.8019μs | 11.0130 KOps/s | 10.3856 KOps/s | |
test_tc_first_layer_tensor | 18.0330μs | 1.5135μs | 660.7041 KOps/s | 662.5785 KOps/s | |
test_tc_first_layer_nontensor | 21.6500μs | 4.7247μs | 211.6533 KOps/s | 205.0571 KOps/s | |
test_tc_second_layer_tensor | 46.3660μs | 2.7726μs | 360.6741 KOps/s | 358.9494 KOps/s | |
test_tc_second_layer_nontensor | 28.4830μs | 5.9542μs | 167.9493 KOps/s | 167.9377 KOps/s | |
test_unbind | 0.2235s | 12.5537ms | 79.6579 Ops/s | 69.3513 Ops/s | |
test_full_like | 9.2849ms | 7.2931ms | 137.1161 Ops/s | 146.0094 Ops/s | |
test_zeros_like | 7.5893ms | 4.5004ms | 222.2012 Ops/s | 367.1241 Ops/s | |
test_ones_like | 9.1951ms | 4.9202ms | 203.2440 Ops/s | 324.4400 Ops/s | |
test_clone | 6.4520ms | 4.7847ms | 208.9997 Ops/s | 207.3918 Ops/s | |
test_squeeze | 62.7570μs | 12.1171μs | 82.5279 KOps/s | 77.8473 KOps/s | |
test_unsqueeze | 0.2986ms | 91.6247μs | 10.9141 KOps/s | 11.0036 KOps/s | |
test_split | 0.3216ms | 0.1931ms | 5.1780 KOps/s | 5.1701 KOps/s | |
test_permute | 0.3451ms | 0.1987ms | 5.0325 KOps/s | 5.0453 KOps/s | |
test_stack | 29.3558ms | 25.0065ms | 39.9897 Ops/s | 39.8499 Ops/s | |
test_cat | 28.5185ms | 25.0511ms | 39.9185 Ops/s | 40.1779 Ops/s |
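
The Change column is empty in the table as captured here. As a rough aid for reading it, the sketch below computes a percent difference between the Ops and Ops on Repo HEAD columns for a single row. It assumes that "change" means the relative difference of the PR's throughput against HEAD, which the source does not confirm. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of tensordict or its CI tooling.

```python
# Unit multipliers for throughput strings such as "48.6807 KOps/s".
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text):
    """Convert a throughput string like '1.4684 MOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def relative_change(row):
    """Return (benchmark name, percent change of Ops versus Ops on Repo HEAD).

    Assumes the row layout used in the table:
    Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
    """
    cells = [cell.strip() for cell in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])
    head = parse_ops(cells[4])
    return name, (ops - head) / head * 100.0


if __name__ == "__main__":
    # Row copied verbatim from the table above.
    row = "test_zeros_like | 7.5893ms | 4.5004ms | 222.2012 Ops/s | 367.1241 Ops/s | |"
    name, change = relative_change(row)
    print(f"{name}: {change:+.1f}%")
```

Single-run differences of a few percent are within the spread visible across rows in this table, so only large gaps (such as the `test_zeros_like` and `test_ones_like` rows) stand out under this reading.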

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 37.9010μs | 13.3730μs | 74.7777 KOps/s | 76.1648 KOps/s | |
test_plain_set_stack_nested | 39.3810μs | 13.4625μs | 74.2802 KOps/s | 75.3731 KOps/s | |
test_plain_set_nested_inplace | 86.1520μs | 14.5447μs | 68.7537 KOps/s | 70.3086 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1750ms | 14.4241μs | 69.3285 KOps/s | 70.5856 KOps/s | |
test_items | 0.1608ms | 2.8790μs | 347.3475 KOps/s | 339.3105 KOps/s | |
test_items_nested | 0.5483ms | 0.3682ms | 2.7162 KOps/s | 2.6724 KOps/s | |
test_items_nested_locked | 0.5556ms | 0.3707ms | 2.6973 KOps/s | 2.7072 KOps/s | |
test_items_nested_leaf | 0.2480ms | 58.9803μs | 16.9548 KOps/s | 17.2259 KOps/s | |
test_items_stack_nested | 0.5555ms | 0.3726ms | 2.6835 KOps/s | 2.6823 KOps/s | |
test_items_stack_nested_leaf | 0.1138ms | 59.4790μs | 16.8127 KOps/s | 16.4950 KOps/s | |
test_items_stack_nested_locked | 0.4036ms | 0.3664ms | 2.7296 KOps/s | 2.6781 KOps/s | |
test_keys | 58.1610μs | 3.4736μs | 287.8863 KOps/s | 287.7451 KOps/s | |
test_keys_nested | 0.1240ms | 90.2279μs | 11.0830 KOps/s | 11.2132 KOps/s | |
test_keys_nested_locked | 0.6822ms | 95.8840μs | 10.4293 KOps/s | 10.5133 KOps/s | |
test_keys_nested_leaf | 0.1151ms | 80.1586μs | 12.4753 KOps/s | 12.4260 KOps/s | |
test_keys_stack_nested | 0.1471ms | 90.2327μs | 11.0825 KOps/s | 11.0851 KOps/s | |
test_keys_stack_nested_leaf | 0.1367ms | 81.0802μs | 12.3335 KOps/s | 12.2934 KOps/s | |
test_keys_stack_nested_locked | 0.1597ms | 97.6261μs | 10.2432 KOps/s | 10.5014 KOps/s | |
test_values | 6.0800μs | 0.8628μs | 1.1590 MOps/s | 1.1643 MOps/s | |
test_values_nested | 0.1059ms | 37.7457μs | 26.4931 KOps/s | 26.3681 KOps/s | |
test_values_nested_locked | 0.1874ms | 39.5196μs | 25.3039 KOps/s | 25.2691 KOps/s | |
test_values_nested_leaf | 0.2384ms | 42.0798μs | 23.7644 KOps/s | 23.2239 KOps/s | |
test_values_stack_nested | 0.2430ms | 38.0532μs | 26.2790 KOps/s | 26.2711 KOps/s | |
test_values_stack_nested_leaf | 0.1705ms | 42.9504μs | 23.2827 KOps/s | 23.1623 KOps/s | |
test_values_stack_nested_locked | 78.7810μs | 39.9118μs | 25.0552 KOps/s | 25.4543 KOps/s | |
test_membership | 2.1280μs | 0.5000μs | 2.0000 MOps/s | 1.9509 MOps/s | |
test_membership_nested | 40.9600μs | 2.0640μs | 484.4872 KOps/s | 478.0273 KOps/s | |
test_membership_nested_leaf | 17.2905μs | 2.0705μs | 482.9645 KOps/s | 489.9925 KOps/s | |
test_membership_stacked_nested | 35.6500μs | 2.1048μs | 475.0959 KOps/s | 483.3106 KOps/s | |
test_membership_stacked_nested_leaf | 29.5310μs | 2.1557μs | 463.8891 KOps/s | 484.0093 KOps/s | |
test_membership_nested_last | 40.0010μs | 3.1368μs | 318.7988 KOps/s | 323.4532 KOps/s | |
test_membership_nested_leaf_last | 29.3700μs | 3.1223μs | 320.2751 KOps/s | 321.0464 KOps/s | |
test_membership_stacked_nested_last | 31.6500μs | 8.3227μs | 120.1540 KOps/s | 217.7753 KOps/s | |
test_membership_stacked_nested_leaf_last | 30.4210μs | 8.3045μs | 120.4163 KOps/s | 216.9448 KOps/s | |
test_nested_getleaf | 34.0710μs | 6.2416μs | 160.2150 KOps/s | 162.6819 KOps/s | |
test_nested_get | 29.1100μs | 5.9581μs | 167.8396 KOps/s | 170.6432 KOps/s | |
test_stacked_getleaf | 46.3400μs | 6.2057μs | 161.1426 KOps/s | 161.7118 KOps/s | |
test_stacked_get | 41.3910μs | 5.9131μs | 169.1164 KOps/s | 170.7467 KOps/s | |
test_nested_getitemleaf | 27.2400μs | 6.4787μs | 154.3515 KOps/s | 155.1406 KOps/s | |
test_nested_getitem | 35.2710μs | 6.2954μs | 158.8455 KOps/s | 162.4313 KOps/s | |
test_stacked_getitemleaf | 44.6900μs | 6.5207μs | 153.3589 KOps/s | 156.1997 KOps/s | |
test_stacked_getitem | 54.8710μs | 6.2176μs | 160.8350 KOps/s | 165.3718 KOps/s | |
test_lock_nested | 9.4367ms | 0.3543ms | 2.8228 KOps/s | 2.8192 KOps/s | |
test_lock_stack_nested | 0.4445ms | 0.3426ms | 2.9186 KOps/s | 2.8419 KOps/s | |
test_unlock_nested | 0.4661ms | 0.2853ms | 3.5054 KOps/s | 3.4977 KOps/s | |
test_unlock_stack_nested | 0.3206ms | 0.2792ms | 3.5813 KOps/s | 3.5064 KOps/s | |
test_flatten_speed | 0.1110ms | 77.1485μs | 12.9620 KOps/s | 13.0805 KOps/s | |
test_unflatten_speed | 0.3957ms | 0.3250ms | 3.0772 KOps/s | 3.0678 KOps/s | |
test_common_ops | 0.8238ms | 0.6653ms | 1.5032 KOps/s | 1.5141 KOps/s | |
test_creation | 72.0810μs | 1.7839μs | 560.5747 KOps/s | 568.8532 KOps/s | |
test_creation_empty | 0.1864ms | 10.5017μs | 95.2228 KOps/s | 99.0776 KOps/s | |
test_creation_nested_1 | 0.1854ms | 12.1512μs | 82.2967 KOps/s | 85.4093 KOps/s | |
test_creation_nested_2 | 41.9000μs | 14.9647μs | 66.8239 KOps/s | 68.7652 KOps/s | |
test_clone | 46.5910μs | 11.1422μs | 89.7488 KOps/s | 93.6653 KOps/s | |
test_getitem[int] | 1.1185ms | 10.8868μs | 91.8543 KOps/s | 92.1324 KOps/s | |
test_getitem[slice_int] | 0.1818ms | 21.6237μs | 46.2455 KOps/s | 47.2226 KOps/s | |
test_getitem[range] | 0.1392ms | 38.0161μs | 26.3046 KOps/s | 26.5473 KOps/s | |
test_getitem[tuple] | 0.1931ms | 18.8045μs | 53.1787 KOps/s | 54.1508 KOps/s | |
test_getitem[list] | 0.2270ms | 33.8289μs | 29.5605 KOps/s | 30.0767 KOps/s | |
test_setitem_dim[int] | 0.1650ms | 20.6111μs | 48.5176 KOps/s | 51.1451 KOps/s | |
test_setitem_dim[slice_int] | 0.1611ms | 40.1178μs | 24.9266 KOps/s | 25.4888 KOps/s | |
test_setitem_dim[range] | 93.9610μs | 53.9018μs | 18.5523 KOps/s | 18.5587 KOps/s | |
test_setitem_dim[tuple] | 51.7010μs | 33.2861μs | 30.0426 KOps/s | 30.7522 KOps/s | |
test_setitem | 0.1856ms | 16.7265μs | 59.7855 KOps/s | 62.3861 KOps/s | |
test_set | 0.2072ms | 16.1304μs | 61.9948 KOps/s | 65.4916 KOps/s | |
test_set_shared | 0.6091ms | 0.1607ms | 6.2221 KOps/s | 6.1271 KOps/s | |
test_update | 0.2175ms | 20.4181μs | 48.9762 KOps/s | 51.3900 KOps/s | |
test_update_nested | 0.1723ms | 26.4281μs | 37.8385 KOps/s | 39.3734 KOps/s | |
test_update__nested | 0.4547ms | 26.7260μs | 37.4168 KOps/s | 38.3923 KOps/s | |
test_set_nested | 54.9510μs | 17.5691μs | 56.9180 KOps/s | 58.5710 KOps/s | |
test_set_nested_new | 56.2300μs | 19.9677μs | 50.0808 KOps/s | 51.5779 KOps/s | |
test_select | 0.1254ms | 32.2705μs | 30.9880 KOps/s | 32.2265 KOps/s | |
test_select_nested | 88.8410μs | 43.9159μs | 22.7708 KOps/s | 22.6620 KOps/s | |
test_exclude_nested | 94.0820μs | 64.2019μs | 15.5759 KOps/s | 15.7244 KOps/s | |
test_empty[True] | 0.3303ms | 0.2974ms | 3.3623 KOps/s | 3.3432 KOps/s | |
test_empty[False] | 10.3301μs | 0.8285μs | 1.2070 MOps/s | 1.1986 MOps/s | |
test_to | 87.2420μs | 60.5637μs | 16.5115 KOps/s | 17.9785 KOps/s | |
test_to_nonblocking | 0.2056ms | 51.2008μs | 19.5309 KOps/s | 20.5415 KOps/s | |
test_unbind_speed | 0.3118ms | 0.2463ms | 4.0597 KOps/s | 4.0782 KOps/s | |
test_unbind_speed_stack0 | 0.2927ms | 0.2395ms | 4.1754 KOps/s | 4.1572 KOps/s | |
test_unbind_speed_stack1 | 93.4356ms | 0.7359ms | 1.3588 KOps/s | 1.3762 KOps/s | |
test_split | 94.6099ms | 1.6175ms | 618.2533 Ops/s | 626.2152 Ops/s | |
test_chunk | 96.7445ms | 1.6281ms | 614.2177 Ops/s | 631.0836 Ops/s | |
test_consolidate[False-None] | 3.3253ms | 2.6537ms | 376.8254 Ops/s | 377.1785 Ops/s | |
test_consolidate[default-None] | 2.0754ms | 1.6584ms | 602.9897 Ops/s | 590.7521 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8693ms | 1.7056ms | 586.2940 Ops/s | 573.2517 Ops/s | |
test_consolidate_njt[False-None] | 6.8401ms | 6.6019ms | 151.4715 Ops/s | 153.1792 Ops/s | |
test_to[False-False-None] | 1.9431ms | 1.7050ms | 586.5048 Ops/s | 592.7012 Ops/s | |
test_to[True-False-None] | 1.5970ms | 1.3361ms | 748.4216 Ops/s | 772.5013 Ops/s | |
test_to[within-False-None] | 4.4048ms | 4.1906ms | 238.6317 Ops/s | 242.9627 Ops/s | |
test_to[True-default-None] | 5.8605ms | 5.3887ms | 185.5721 Ops/s | 188.5870 Ops/s | |
test_to_njt[False-False-None] | 7.2994ms | 6.9765ms | 143.3388 Ops/s | 146.3708 Ops/s | |
test_to_njt[True-False-None] | 5.7940ms | 5.5522ms | 180.1090 Ops/s | 174.0736 Ops/s | |
test_to_njt[within-False-None] | 12.6836ms | 12.2585ms | 81.5763 Ops/s | 81.1129 Ops/s | |
test_creation[device0] | 0.4545ms | 80.3571μs | 12.4445 KOps/s | 11.7125 KOps/s | |
test_creation_from_tensor | 0.5762ms | 84.3729μs | 11.8521 KOps/s | 11.3150 KOps/s | |
test_add_one[memmap_tensor0] | 0.3617ms | 6.8685μs | 145.5918 KOps/s | 149.3842 KOps/s | |
test_contiguous[memmap_tensor0] | 1.8350μs | 0.4209μs | 2.3760 MOps/s | 2.3528 MOps/s | |
test_stack[memmap_tensor0] | 39.8100μs | 4.5877μs | 217.9762 KOps/s | 223.7463 KOps/s | |
test_memmaptd_index | 1.4298ms | 0.2430ms | 4.1156 KOps/s | 3.6518 KOps/s | |
test_memmaptd_index_astensor | 0.4610ms | 0.3023ms | 3.3080 KOps/s | 3.0964 KOps/s | |
test_memmaptd_index_op | 0.7774ms | 0.6178ms | 1.6186 KOps/s | 1.6456 KOps/s | |
test_serialize_model | 0.4190s | 0.1718s | 5.8206 Ops/s | 7.6541 Ops/s | |
test_serialize_model_pickle | 1.3485s | 1.2106s | 0.8261 Ops/s | 0.8234 Ops/s | |
test_serialize_weights | 0.1318s | 0.1307s | 7.6503 Ops/s | 7.7204 Ops/s | |
test_serialize_weights_returnearly | 0.3200s | 55.4791ms | 18.0248 Ops/s | 15.2478 Ops/s | |
test_serialize_weights_pickle | 1.3761s | 1.2188s | 0.8205 Ops/s | 0.8429 Ops/s | |
test_reshape_pytree | 0.1156ms | 22.2442μs | 44.9556 KOps/s | 44.4980 KOps/s | |
test_reshape_td | 51.6800μs | 26.9477μs | 37.1090 KOps/s | 32.6088 KOps/s | |
test_view_pytree | 0.1809ms | 23.1739μs | 43.1520 KOps/s | 44.9611 KOps/s | |
test_view_td | 0.1652ms | 34.3912μs | 29.0772 KOps/s | 30.6487 KOps/s | |
test_unbind_pytree | 0.1452ms | 28.2018μs | 35.4587 KOps/s | 35.1284 KOps/s | |
test_unbind_td | 0.7291ms | 37.9994μs | 26.3162 KOps/s | 26.9564 KOps/s | |
test_split_pytree | 0.1665ms | 30.4372μs | 32.8546 KOps/s | 31.9632 KOps/s | |
test_split_td | 0.9081ms | 39.7476μs | 25.1588 KOps/s | 25.5536 KOps/s | |
test_add_pytree | 0.2314ms | 35.7060μs | 28.0065 KOps/s | 28.5543 KOps/s | |
test_add_td | 0.2242ms | 54.1243μs | 18.4760 KOps/s | 18.1073 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2736ms | 0.1250ms | 8.0029 KOps/s | 7.5217 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3411ms | 0.1366ms | 7.3207 KOps/s | 7.4908 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2728ms | 99.1319μs | 10.0876 KOps/s | 10.2505 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.3000ms | 0.1502ms | 6.6599 KOps/s | 6.7545 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1922ms | 25.2953μs | 39.5330 KOps/s | 34.8154 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1749ms | 29.4633μs | 33.9406 KOps/s | 34.1049 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3795ms | 64.7050μs | 15.4547 KOps/s | 15.1579 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1798ms | 49.8508μs | 20.0599 KOps/s | 19.9884 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2912ms | 0.1426ms | 7.0113 KOps/s | 7.1095 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3593ms | 0.2183ms | 4.5800 KOps/s | 4.7097 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2489ms | 97.8352μs | 10.2213 KOps/s | 10.3532 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2075ms | 56.0556μs | 17.8394 KOps/s | 17.8924 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2842ms | 0.1363ms | 7.3352 KOps/s | 7.4335 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6468ms | 0.4769ms | 2.0969 KOps/s | 2.1273 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3974ms | 0.2612ms | 3.8288 KOps/s | 3.8655 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2953ms | 0.1494ms | 6.6926 KOps/s | 6.7953 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2431ms | 69.1276μs | 14.4660 KOps/s | 14.1524 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2421ms | 99.2874μs | 10.0718 KOps/s | 10.1876 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5560ms | 0.3956ms | 2.5276 KOps/s | 2.5336 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2919ms | 0.1350ms | 7.4064 KOps/s | 7.2738 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2159ms | 25.5147μs | 39.1931 KOps/s | 54.9095 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 98.4310μs | 31.0415μs | 32.2149 KOps/s | 31.9662 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1484ms | 70.2484μs | 14.2352 KOps/s | 14.1743 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1463ms | 51.4979μs | 19.4183 KOps/s | 19.4710 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.5943ms | 0.3846ms | 2.6000 KOps/s | 2.1949 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.9982ms | 2.6702ms | 374.5070 Ops/s | 390.8065 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6111ms | 0.4352ms | 2.2978 KOps/s | 2.1444 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.9863ms | 2.6388ms | 378.9668 Ops/s | 390.2352 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.5690ms | 0.1127ms | 8.8761 KOps/s | 8.4126 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5936ms | 79.0110μs | 12.6565 KOps/s | 11.7907 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.4533ms | 0.1065ms | 9.3904 KOps/s | 9.6730 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2517ms | 67.4147μs | 14.8336 KOps/s | 14.3644 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2923ms | 0.1044ms | 9.5754 KOps/s | 9.5402 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2761ms | 69.7663μs | 14.3336 KOps/s | 15.0086 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2692ms | 99.9981μs | 10.0002 KOps/s | 9.9698 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.2167ms | 17.3341μs | 57.6896 KOps/s | 55.5642 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2465ms | 96.6857μs | 10.3428 KOps/s | 10.3754 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1458ms | 15.9856μs | 62.5562 KOps/s | 62.5715 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2477ms | 97.2665μs | 10.2810 KOps/s | 10.3661 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1564ms | 16.0099μs | 62.4612 KOps/s | 62.7867 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2510ms | 0.1017ms | 9.8307 KOps/s | 9.8898 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.6010ms | 17.6630μs | 56.6154 KOps/s | 56.4579 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2471ms | 96.9655μs | 10.3129 KOps/s | 10.0389 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.2082ms | 16.1653μs | 61.8610 KOps/s | 62.7000 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2686ms | 0.1011ms | 9.8921 KOps/s | 10.3439 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1056ms | 15.9462μs | 62.7107 KOps/s | 62.2687 KOps/s | |
test_mod_add[eager] | 0.1894ms | 39.8440μs | 25.0979 KOps/s | 25.3164 KOps/s | |
test_mod_add[compile] | 0.2980ms | 79.6275μs | 12.5585 KOps/s | 12.3349 KOps/s | |
test_mod_add[compile-overhead] | 0.3263ms | 0.1676ms | 5.9677 KOps/s | 5.6618 KOps/s | |
test_mod_wrap[eager] | 0.3980ms | 0.2509ms | 3.9849 KOps/s | 3.8250 KOps/s | |
test_mod_wrap[compile] | 0.4291ms | 0.2852ms | 3.5068 KOps/s | 3.4755 KOps/s | |
test_mod_wrap[compile-overhead] | 7.0727ms | 3.8169ms | 261.9905 Ops/s | 272.1282 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5558ms | 1.3710ms | 729.3822 Ops/s | 687.0618 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4967ms | 1.2658ms | 789.9848 Ops/s | 732.8737 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3895ms | 0.9205ms | 1.0864 KOps/s | 1.0804 KOps/s | |
test_seq_add[eager] | 0.5122ms | 0.1185ms | 8.4357 KOps/s | 7.8980 KOps/s | |
test_seq_add[compile] | 0.5177ms | 88.7147μs | 11.2721 KOps/s | 10.6848 KOps/s | |
test_seq_add[compile-overhead] | 0.2767ms | 0.1297ms | 7.7088 KOps/s | 7.6743 KOps/s | |
test_seq_wrap[eager] | 0.8431ms | 0.4281ms | 2.3361 KOps/s | 2.2899 KOps/s | |
test_seq_wrap[compile] | 0.4917ms | 0.3017ms | 3.3142 KOps/s | 3.1994 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3709ms | 0.2253ms | 4.4390 KOps/s | 4.4354 KOps/s | |
test_func_call_runtime[False-eager] | 1.1952ms | 0.7707ms | 1.2974 KOps/s | 1.2866 KOps/s | |
test_func_call_runtime[False-compile] | 1.1569ms | 0.7460ms | 1.3405 KOps/s | 1.3494 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5138ms | 0.3608ms | 2.7715 KOps/s | 2.7479 KOps/s | |
test_func_call_runtime[True-eager] | 1.3171ms | 0.9035ms | 1.1069 KOps/s | 1.0876 KOps/s | |
test_func_call_runtime[True-compile] | 0.9560ms | 0.7648ms | 1.3075 KOps/s | 1.2978 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.7883ms | 0.3830ms | 2.6109 KOps/s | 2.5933 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.1426ms | 0.7345ms | 1.3615 KOps/s | 1.3092 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.1418ms | 0.7493ms | 1.3345 KOps/s | 1.3086 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5119ms | 0.3631ms | 2.7544 KOps/s | 2.7319 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.3914ms | 1.0087ms | 991.4114 Ops/s | 984.4480 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.4964ms | 0.9919ms | 1.0082 KOps/s | 995.4480 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1505ms | 0.9898ms | 1.0103 KOps/s | 994.1787 Ops/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5546ms | 2.1143ms | 472.9779 Ops/s | 472.6084 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9973ms | 0.8119ms | 1.2317 KOps/s | 1.2218 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5648ms | 0.4135ms | 2.4182 KOps/s | 2.3999 KOps/s | |
test_distributed | 4.6896ms | 0.1772ms | 5.6435 KOps/s | 8.5193 KOps/s | |
test_tdmodule | 0.3465ms | 21.6575μs | 46.1734 KOps/s | 46.2880 KOps/s | |
test_tdmodule_dispatch | 68.8910μs | 38.8325μs | 25.7516 KOps/s | 26.1425 KOps/s | |
test_tdseq | 41.8600μs | 22.1019μs | 45.2449 KOps/s | 44.0489 KOps/s | |
test_tdseq_dispatch | 64.3600μs | 41.6625μs | 24.0024 KOps/s | 23.3701 KOps/s | |
test_instantiation_functorch | 1.7246ms | 1.5774ms | 633.9368 Ops/s | 639.9383 Ops/s | |
test_exec_functorch | 0.2894ms | 0.1496ms | 6.6830 KOps/s | 6.8115 KOps/s | |
test_exec_functional_call | 0.2520ms | 0.1431ms | 6.9871 KOps/s | 7.2643 KOps/s | |
test_exec_td_decorator | 0.3770ms | 0.1940ms | 5.1543 KOps/s | 5.3557 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8651ms | 0.6968ms | 1.4352 KOps/s | 1.4177 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8611ms | 0.6940ms | 1.4409 KOps/s | 1.3707 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7780ms | 0.6017ms | 1.6620 KOps/s | 1.6273 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7656ms | 0.6025ms | 1.6596 KOps/s | 1.5843 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.8726ms | 19.2928ms | 51.8328 Ops/s | 51.5442 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.2403ms | 19.3681ms | 51.6313 Ops/s | 51.8778 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.2899ms | 19.6614ms | 50.8611 Ops/s | 51.8772 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.7732ms | 19.1411ms | 52.2436 Ops/s | 51.7595 Ops/s | |
test_to_module_speed[True] | 1.4940ms | 0.9743ms | 1.0263 KOps/s | 1.0257 KOps/s | |
test_to_module_speed[False] | 1.1118ms | 0.9626ms | 1.0389 KOps/s | 1.0433 KOps/s | |
test_tc_init | 68.7610μs | 38.9512μs | 25.6732 KOps/s | 25.9773 KOps/s | |
test_tc_init_nested | 0.1715ms | 79.5040μs | 12.5780 KOps/s | 13.0459 KOps/s | |
test_tc_first_layer_tensor | 4.9001μs | 0.7060μs | 1.4165 MOps/s | 1.2096 MOps/s | |
test_tc_first_layer_nontensor | 26.3400μs | 2.2635μs | 441.7900 KOps/s | 436.8048 KOps/s | |
test_tc_second_layer_tensor | 29.2770μs | 1.4478μs | 690.6995 KOps/s | 662.6664 KOps/s | |
test_tc_second_layer_nontensor | 24.2300μs | 2.9936μs | 334.0425 KOps/s | 335.1573 KOps/s | |
test_unbind | 0.2182s | 12.2132ms | 81.8784 Ops/s | 140.5989 Ops/s | |
test_full_like | 9.7176ms | 9.2804ms | 107.7542 Ops/s | 105.8721 Ops/s | |
test_zeros_like | 5.3121ms | 4.3625ms | 229.2248 Ops/s | 232.7111 Ops/s | |
test_ones_like | 5.0148ms | 4.3792ms | 228.3547 Ops/s | 229.3277 Ops/s | |
test_clone | 7.1334ms | 6.5445ms | 152.7996 Ops/s | 107.1599 Ops/s | |
test_squeeze | 0.1519ms | 9.7750μs | 102.3014 KOps/s | 103.6555 KOps/s | |
test_unsqueeze | 0.2192ms | 73.0767μs | 13.6843 KOps/s | 13.8746 KOps/s | |
test_split | 0.3760ms | 0.1621ms | 6.1708 KOps/s | 6.0872 KOps/s | |
test_permute | 0.5704ms | 0.1769ms | 5.6520 KOps/s | 5.6087 KOps/s | |
test_stack | 51.1600ms | 50.7411ms | 19.7079 Ops/s | 19.5860 Ops/s | |
test_cat | 53.4453ms | 50.9004ms | 19.6462 Ops/s | 19.5119 Ops/s |
vmoens added a commit that referenced this pull request on Jan 30, 2025
ghstack-source-id: edbc22ce562cd918ce5dd5c0441e47cdadf7d88a Pull Request resolved: #1194
Labels: bug, CLA Signed
Stack from ghstack (oldest at bottom):