[BugFix] Fix grad tests #1070
Merged
vmoens added a commit that referenced this pull request on Nov 1, 2024:
ghstack-source-id: 32c9ee0690db01b54de2e9cc0d83b595b87ac527
Pull Request resolved: #1070
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 46.8480μs | 21.5240μs | 46.4598 KOps/s | 46.4350 KOps/s | |
test_plain_set_stack_nested | 76.1730μs | 21.4512μs | 46.6174 KOps/s | 45.9817 KOps/s | |
test_plain_set_nested_inplace | 56.1650μs | 23.4488μs | 42.6461 KOps/s | 42.5386 KOps/s | |
test_plain_set_stack_nested_inplace | 85.3890μs | 23.5006μs | 42.5522 KOps/s | 42.9428 KOps/s | |
test_items | 23.2940μs | 4.1982μs | 238.1982 KOps/s | 247.4973 KOps/s | |
test_items_nested | 0.5563ms | 0.3417ms | 2.9265 KOps/s | 2.9714 KOps/s | |
test_items_nested_locked | 0.6299ms | 0.3439ms | 2.9077 KOps/s | 2.9625 KOps/s | |
test_items_nested_leaf | 0.1405ms | 71.2034μs | 14.0443 KOps/s | 14.2746 KOps/s | |
test_items_stack_nested | 0.5657ms | 0.3427ms | 2.9181 KOps/s | 2.9731 KOps/s | |
test_items_stack_nested_leaf | 0.3963ms | 75.7714μs | 13.1976 KOps/s | 14.0145 KOps/s | |
test_items_stack_nested_locked | 0.4810ms | 0.3438ms | 2.9085 KOps/s | 2.9588 KOps/s | |
test_keys | 29.3050μs | 3.4760μs | 287.6897 KOps/s | 274.9849 KOps/s | |
test_keys_nested | 0.2567ms | 0.1376ms | 7.2660 KOps/s | 7.4336 KOps/s | |
test_keys_nested_locked | 1.8075ms | 0.1432ms | 6.9845 KOps/s | 7.1693 KOps/s | |
test_keys_nested_leaf | 0.4615ms | 0.1142ms | 8.7536 KOps/s | 8.6071 KOps/s | |
test_keys_stack_nested | 0.3753ms | 0.1367ms | 7.3129 KOps/s | 7.4709 KOps/s | |
test_keys_stack_nested_leaf | 0.2029ms | 0.1181ms | 8.4663 KOps/s | 8.6858 KOps/s | |
test_keys_stack_nested_locked | 0.2566ms | 0.1424ms | 7.0229 KOps/s | 7.2559 KOps/s | |
test_values | 9.2274μs | 1.0346μs | 966.5888 KOps/s | 943.8994 KOps/s | |
test_values_nested | 0.1058ms | 54.5981μs | 18.3156 KOps/s | 18.6436 KOps/s | |
test_values_nested_locked | 0.1081ms | 54.6727μs | 18.2907 KOps/s | 18.3575 KOps/s | |
test_values_nested_leaf | 0.1231ms | 59.6592μs | 16.7619 KOps/s | 16.8823 KOps/s | |
test_values_stack_nested | 0.1070ms | 55.7846μs | 17.9261 KOps/s | 18.3316 KOps/s | |
test_values_stack_nested_leaf | 0.1333ms | 60.6432μs | 16.4899 KOps/s | 17.0413 KOps/s | |
test_values_stack_nested_locked | 0.1081ms | 55.9728μs | 17.8658 KOps/s | 18.2625 KOps/s | |
test_membership | 3.4450μs | 0.7554μs | 1.3238 MOps/s | 1.1165 MOps/s | |
test_membership_nested | 13.3740μs | 2.7157μs | 368.2242 KOps/s | 364.9741 KOps/s | |
test_membership_nested_leaf | 38.2610μs | 2.7623μs | 362.0182 KOps/s | 359.9347 KOps/s | |
test_membership_stacked_nested | 25.7380μs | 2.7300μs | 366.3009 KOps/s | 373.7011 KOps/s | |
test_membership_stacked_nested_leaf | 61.8660μs | 2.7230μs | 367.2440 KOps/s | 369.8825 KOps/s | |
test_membership_nested_last | 24.9070μs | 3.9495μs | 253.1980 KOps/s | 250.1057 KOps/s | |
test_membership_nested_leaf_last | 36.4380μs | 4.0084μs | 249.4788 KOps/s | 252.6788 KOps/s | |
test_membership_stacked_nested_last | 34.6200μs | 3.9845μs | 250.9744 KOps/s | 254.7659 KOps/s | |
test_membership_stacked_nested_leaf_last | 23.8140μs | 4.0076μs | 249.5269 KOps/s | 248.6922 KOps/s | |
test_nested_getleaf | 54.2010μs | 10.5510μs | 94.7776 KOps/s | 96.0680 KOps/s | |
test_nested_get | 44.4130μs | 9.9591μs | 100.4106 KOps/s | 101.6572 KOps/s | |
test_stacked_getleaf | 35.7470μs | 10.5536μs | 94.7540 KOps/s | 97.2378 KOps/s | |
test_stacked_get | 48.3600μs | 10.0310μs | 99.6910 KOps/s | 104.2235 KOps/s | |
test_nested_getitemleaf | 35.4670μs | 10.9250μs | 91.5331 KOps/s | 93.1015 KOps/s | |
test_nested_getitem | 32.5920μs | 10.0828μs | 99.1785 KOps/s | 99.7952 KOps/s | |
test_stacked_getitemleaf | 70.5620μs | 10.9725μs | 91.1368 KOps/s | 93.3632 KOps/s | |
test_stacked_getitem | 50.3340μs | 10.2176μs | 97.8700 KOps/s | 99.4836 KOps/s | |
test_lock_nested | 2.0886ms | 0.4847ms | 2.0630 KOps/s | 2.0709 KOps/s | |
test_lock_stack_nested | 0.8653ms | 0.4562ms | 2.1922 KOps/s | 2.1685 KOps/s | |
test_unlock_nested | 1.3997ms | 0.4103ms | 2.4375 KOps/s | 2.4480 KOps/s | |
test_unlock_stack_nested | 0.5112ms | 0.3732ms | 2.6798 KOps/s | 2.6107 KOps/s | |
test_flatten_speed | 0.1790ms | 92.4962μs | 10.8113 KOps/s | 10.9742 KOps/s | |
test_unflatten_speed | 1.1064ms | 0.4837ms | 2.0675 KOps/s | 2.1038 KOps/s | |
test_common_ops | 4.4452ms | 1.1204ms | 892.5051 Ops/s | 907.4090 Ops/s | |
test_creation | 18.3340μs | 2.1152μs | 472.7614 KOps/s | 481.8038 KOps/s | |
test_creation_empty | 50.9350μs | 17.5272μs | 57.0543 KOps/s | 53.0915 KOps/s | |
test_creation_nested_1 | 1.1891ms | 20.4966μs | 48.7886 KOps/s | 45.7827 KOps/s | |
test_creation_nested_2 | 53.6610μs | 25.0646μs | 39.8969 KOps/s | 38.7982 KOps/s | |
test_clone | 58.2590μs | 17.2242μs | 58.0578 KOps/s | 56.9950 KOps/s | |
test_getitem[int] | 0.8801ms | 16.3673μs | 61.0973 KOps/s | 61.6970 KOps/s | |
test_getitem[slice_int] | 0.1326ms | 28.9862μs | 34.4992 KOps/s | 34.2909 KOps/s | |
test_getitem[range] | 0.1694ms | 57.9917μs | 17.2439 KOps/s | 17.8833 KOps/s | |
test_getitem[tuple] | 0.1328ms | 24.2363μs | 41.2604 KOps/s | 40.5848 KOps/s | |
test_getitem[list] | 0.1802ms | 52.0503μs | 19.2122 KOps/s | 19.4206 KOps/s | |
test_setitem_dim[int] | 76.0120μs | 33.0083μs | 30.2954 KOps/s | 30.1149 KOps/s | |
test_setitem_dim[slice_int] | 0.1237ms | 61.4410μs | 16.2758 KOps/s | 16.5656 KOps/s | |
test_setitem_dim[range] | 0.1309ms | 83.6847μs | 11.9496 KOps/s | 12.0310 KOps/s | |
test_setitem_dim[tuple] | 0.1183ms | 49.6369μs | 20.1463 KOps/s | 20.5079 KOps/s | |
test_setitem | 87.3830μs | 30.1809μs | 33.1336 KOps/s | 33.0817 KOps/s | |
test_set | 80.8610μs | 28.8243μs | 34.6930 KOps/s | 34.3518 KOps/s | |
test_set_shared | 1.1981ms | 0.2149ms | 4.6533 KOps/s | 4.7097 KOps/s | |
test_update | 0.1344ms | 35.1260μs | 28.4689 KOps/s | 27.1584 KOps/s | |
test_update_nested | 0.1006ms | 46.2401μs | 21.6263 KOps/s | 20.8084 KOps/s | |
test_update__nested | 1.0622ms | 41.2790μs | 24.2254 KOps/s | 24.8274 KOps/s | |
test_set_nested | 83.4460μs | 32.5525μs | 30.7196 KOps/s | 30.9164 KOps/s | |
test_set_nested_new | 0.1068ms | 37.6180μs | 26.5830 KOps/s | 26.3167 KOps/s | |
test_select | 0.1169ms | 55.3101μs | 18.0799 KOps/s | 18.3121 KOps/s | |
test_select_nested | 0.1276ms | 59.7541μs | 16.7353 KOps/s | 17.1201 KOps/s | |
test_exclude_nested | 0.3464ms | 73.7500μs | 13.5593 KOps/s | 13.8012 KOps/s | |
test_empty[True] | 0.5260ms | 0.3466ms | 2.8852 KOps/s | 2.9088 KOps/s | |
test_empty[False] | 10.7253μs | 1.2254μs | 816.0856 KOps/s | 798.7081 KOps/s | |
test_unbind_speed | 0.3948ms | 0.2996ms | 3.3380 KOps/s | 3.3087 KOps/s | |
test_unbind_speed_stack0 | 0.4866ms | 0.2986ms | 3.3494 KOps/s | 3.3190 KOps/s | |
test_unbind_speed_stack1 | 96.3187ms | 0.8114ms | 1.2325 KOps/s | 1.3284 KOps/s | |
test_split | 2.1038ms | 1.9327ms | 517.4117 Ops/s | 475.7924 Ops/s | |
test_chunk | 0.1006s | 2.1259ms | 470.3941 Ops/s | 475.6220 Ops/s | |
test_creation[device0] | 0.2233ms | 0.1144ms | 8.7376 KOps/s | 8.7319 KOps/s | |
test_creation_from_tensor | 4.1643ms | 0.1154ms | 8.6636 KOps/s | 8.6501 KOps/s | |
test_add_one[memmap_tensor0] | 0.1654ms | 6.9848μs | 143.1689 KOps/s | 142.4628 KOps/s | |
test_contiguous[memmap_tensor0] | 21.7700μs | 1.8523μs | 539.8775 KOps/s | 529.3851 KOps/s | |
test_stack[memmap_tensor0] | 43.7320μs | 5.5014μs | 181.7728 KOps/s | 185.7574 KOps/s | |
test_memmaptd_index | 1.0349ms | 0.4013ms | 2.4916 KOps/s | 2.5286 KOps/s | |
test_memmaptd_index_astensor | 1.1132ms | 0.4855ms | 2.0598 KOps/s | 2.1146 KOps/s | |
test_memmaptd_index_op | 1.6399ms | 1.0008ms | 999.2205 Ops/s | 1.0077 KOps/s | |
test_serialize_model | 0.1234s | 0.1156s | 8.6486 Ops/s | 8.4451 Ops/s | |
test_serialize_model_pickle | 0.4948s | 0.4008s | 2.4947 Ops/s | 2.5065 Ops/s | |
test_serialize_weights | 0.1213s | 0.1143s | 8.7463 Ops/s | 8.5979 Ops/s | |
test_serialize_weights_returnearly | 0.2577s | 0.1676s | 5.9649 Ops/s | 6.3577 Ops/s | |
test_serialize_weights_pickle | 1.1362s | 0.7144s | 1.3998 Ops/s | 2.5665 Ops/s | |
test_serialize_weights_filesystem | 0.1430s | 0.1374s | 7.2764 Ops/s | 7.0210 Ops/s | |
test_serialize_model_filesystem | 0.1471s | 0.1409s | 7.0959 Ops/s | 6.5831 Ops/s | |
test_reshape_pytree | 0.1007ms | 38.5572μs | 25.9355 KOps/s | 26.2654 KOps/s | |
test_reshape_td | 0.1239ms | 47.9271μs | 20.8650 KOps/s | 21.9726 KOps/s | |
test_view_pytree | 94.4760μs | 38.8837μs | 25.7177 KOps/s | 26.9008 KOps/s | |
test_view_td | 0.1209ms | 51.9249μs | 19.2586 KOps/s | 19.1144 KOps/s | |
test_unbind_pytree | 87.3230μs | 35.4407μs | 28.2162 KOps/s | 28.5815 KOps/s | |
test_unbind_td | 0.3161ms | 44.1219μs | 22.6645 KOps/s | 22.0345 KOps/s | |
test_split_pytree | 0.1056ms | 39.1471μs | 25.5447 KOps/s | 26.8898 KOps/s | |
test_split_td | 0.4904ms | 57.1419μs | 17.5003 KOps/s | 17.6477 KOps/s | |
test_add_pytree | 0.1136ms | 44.3265μs | 22.5599 KOps/s | 23.1676 KOps/s | |
test_add_td | 0.1623ms | 83.2436μs | 12.0129 KOps/s | 11.8494 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1595ms | 70.4273μs | 14.1990 KOps/s | 14.0815 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.6085ms | 0.1916ms | 5.2188 KOps/s | 5.3841 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1310ms | 53.4878μs | 18.6959 KOps/s | 18.4740 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2319ms | 0.1454ms | 6.8761 KOps/s | 7.0612 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 78.3260μs | 25.6084μs | 39.0496 KOps/s | 38.7030 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1448ms | 69.9414μs | 14.2977 KOps/s | 14.1616 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1742ms | 81.0140μs | 12.3435 KOps/s | 12.7865 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.4217ms | 71.1016μs | 14.0644 KOps/s | 14.8522 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2428ms | 0.1150ms | 8.6932 KOps/s | 8.7118 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3841ms | 0.2042ms | 4.8966 KOps/s | 4.7374 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1235ms | 53.7406μs | 18.6079 KOps/s | 18.7740 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4791ms | 68.4076μs | 14.6183 KOps/s | 14.2175 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3287ms | 0.1128ms | 8.8682 KOps/s | 9.0284 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6000ms | 0.2998ms | 3.3358 KOps/s | 3.3912 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4557ms | 0.2134ms | 4.6855 KOps/s | 4.5334 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.6354ms | 0.1194ms | 8.3744 KOps/s | 8.7776 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1289ms | 62.6269μs | 15.9676 KOps/s | 15.9413 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1203ms | 53.6732μs | 18.6313 KOps/s | 18.5316 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6359ms | 0.2441ms | 4.0965 KOps/s | 4.1166 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1924ms | 0.1115ms | 8.9670 KOps/s | 8.6660 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 58.8000μs | 20.4820μs | 48.8234 KOps/s | 48.0760 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1217ms | 58.8655μs | 16.9879 KOps/s | 16.6608 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.3039ms | 82.0635μs | 12.1857 KOps/s | 12.4763 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1307ms | 70.2084μs | 14.2433 KOps/s | 14.5847 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3146ms | 0.2144ms | 4.6633 KOps/s | 4.5661 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.8537ms | 1.7007ms | 587.9822 Ops/s | 580.2044 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4389ms | 0.2123ms | 4.7113 KOps/s | 4.7526 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.9536ms | 1.1497ms | 869.7607 Ops/s | 883.2997 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5951ms | 0.4595ms | 2.1761 KOps/s | 2.1544 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.0314ms | 3.7917ms | 263.7305 Ops/s | 254.6542 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1211ms | 43.7747μs | 22.8443 KOps/s | 22.7122 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6926ms | 48.0456μs | 20.8135 KOps/s | 20.3912 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 97.3020μs | 36.8535μs | 27.1345 KOps/s | 26.9057 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 90.2290μs | 29.2066μs | 34.2388 KOps/s | 35.6719 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 98.4640μs | 37.4412μs | 26.7086 KOps/s | 26.7146 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 99.7060μs | 29.4057μs | 34.0071 KOps/s | 33.7383 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1690ms | 77.6081μs | 12.8852 KOps/s | 13.2860 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.7701ms | 27.8531μs | 35.9027 KOps/s | 34.9217 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1367ms | 68.9944μs | 14.4939 KOps/s | 14.2701 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 82.4840μs | 23.5736μs | 42.4204 KOps/s | 44.5221 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1420ms | 70.4922μs | 14.1860 KOps/s | 14.4944 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 61.2850μs | 23.2202μs | 43.0659 KOps/s | 43.8560 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1695ms | 78.5072μs | 12.7377 KOps/s | 13.0969 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.7942ms | 27.4457μs | 36.4355 KOps/s | 34.3973 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1384ms | 69.2735μs | 14.4355 KOps/s | 14.3874 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 58.3690μs | 23.0690μs | 43.3482 KOps/s | 43.8986 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1380ms | 69.6749μs | 14.3524 KOps/s | 14.3996 KOps/s | |
test_compile_indexing[int-pytree-eager] | 61.1850μs | 23.1067μs | 43.2776 KOps/s | 44.0665 KOps/s | |
test_mod_add[eager] | 77.1640μs | 25.6534μs | 38.9813 KOps/s | 39.0677 KOps/s | |
test_mod_add[compile] | 95.4480μs | 44.3688μs | 22.5383 KOps/s | 22.7424 KOps/s | |
test_mod_add[compile-overhead] | 99.5260μs | 43.9101μs | 22.7738 KOps/s | 22.3969 KOps/s | |
test_mod_wrap[eager] | 0.3750ms | 0.2086ms | 4.7944 KOps/s | 4.8135 KOps/s | |
test_mod_wrap[compile] | 1.7125ms | 0.2018ms | 4.9542 KOps/s | 4.9993 KOps/s | |
test_mod_wrap[compile-overhead] | 1.8159ms | 0.2023ms | 4.9423 KOps/s | 5.0743 KOps/s | |
test_mod_wrap_and_backward[eager] | 17.1217ms | 12.0155ms | 83.2262 Ops/s | 90.9881 Ops/s | |
test_mod_wrap_and_backward[compile] | 16.6212ms | 12.8796ms | 77.6419 Ops/s | 79.8103 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 15.8745ms | 12.8821ms | 77.6271 Ops/s | 80.4245 Ops/s | |
test_seq_add[eager] | 0.1906ms | 86.8419μs | 11.5152 KOps/s | 10.9291 KOps/s | |
test_seq_add[compile] | 0.1199ms | 60.0116μs | 16.6634 KOps/s | 16.9535 KOps/s | |
test_seq_add[compile-overhead] | 0.1448ms | 58.5069μs | 17.0920 KOps/s | 16.8646 KOps/s | |
test_seq_wrap[eager] | 0.5671ms | 0.3811ms | 2.6242 KOps/s | 2.5515 KOps/s | |
test_seq_wrap[compile] | 0.4308ms | 0.2258ms | 4.4295 KOps/s | 4.4385 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3654ms | 0.2243ms | 4.4589 KOps/s | 4.4458 KOps/s | |
test_func_call_runtime[False-eager] | 1.1844ms | 0.5532ms | 1.8076 KOps/s | 1.8647 KOps/s | |
test_func_call_runtime[False-compile] | 0.5310ms | 0.4222ms | 2.3686 KOps/s | 2.3831 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6929ms | 0.4220ms | 2.3695 KOps/s | 2.4115 KOps/s | |
test_func_call_runtime[True-eager] | 1.2815ms | 0.7418ms | 1.3480 KOps/s | 1.3548 KOps/s | |
test_func_call_runtime[True-compile] | 0.6617ms | 0.4654ms | 2.1485 KOps/s | 2.1838 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 1.0553ms | 0.4645ms | 2.1531 KOps/s | 2.1762 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.6442ms | 0.5288ms | 1.8911 KOps/s | 1.8992 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5605ms | 0.4248ms | 2.3538 KOps/s | 2.3140 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 1.0260ms | 0.4231ms | 2.3637 KOps/s | 2.3588 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0694ms | 0.8817ms | 1.1342 KOps/s | 1.1370 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8997ms | 0.4925ms | 2.0304 KOps/s | 2.0511 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0618ms | 0.4914ms | 2.0348 KOps/s | 2.0478 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4006ms | 1.8821ms | 531.3302 Ops/s | 535.2767 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.0126ms | 0.5196ms | 1.9246 KOps/s | 1.9319 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.8156ms | 0.5195ms | 1.9248 KOps/s | 1.9536 KOps/s | |
test_distributed | 0.3016ms | 0.1233ms | 8.1127 KOps/s | 7.8813 KOps/s | |
test_tdmodule | 38.5320μs | 18.6475μs | 53.6265 KOps/s | 52.8986 KOps/s | |
test_tdmodule_dispatch | 71.7640μs | 36.3194μs | 27.5335 KOps/s | 26.9198 KOps/s | |
test_tdseq | 49.8830μs | 21.1926μs | 47.1862 KOps/s | 47.0911 KOps/s | |
test_tdseq_dispatch | 79.3780μs | 42.9392μs | 23.2888 KOps/s | 23.8802 KOps/s | |
test_instantiation_functorch | 1.7754ms | 1.5098ms | 662.3285 Ops/s | 660.0071 Ops/s | |
test_exec_functorch | 0.2761ms | 0.1737ms | 5.7583 KOps/s | 5.5578 KOps/s | |
test_exec_functional_call | 0.7675ms | 0.1802ms | 5.5493 KOps/s | 5.8280 KOps/s | |
test_exec_td_decorator | 0.4519ms | 0.2202ms | 4.5408 KOps/s | 4.3674 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8329ms | 0.6388ms | 1.5655 KOps/s | 1.5486 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 2.2804ms | 0.6691ms | 1.4946 KOps/s | 1.5857 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8482ms | 0.5242ms | 1.9076 KOps/s | 1.9413 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8392ms | 0.5226ms | 1.9135 KOps/s | 1.9510 KOps/s | |
test_to_module_speed[True] | 2.0865ms | 1.2946ms | 772.4651 Ops/s | 780.3704 Ops/s | |
test_to_module_speed[False] | 1.7920ms | 1.2604ms | 793.4191 Ops/s | 806.1265 Ops/s | |
test_tc_init | 89.7880μs | 44.0745μs | 22.6889 KOps/s | 22.1491 KOps/s | |
test_tc_init_nested | 0.2106ms | 90.5107μs | 11.0484 KOps/s | 10.8908 KOps/s | |
test_tc_first_layer_tensor | 36.0880μs | 1.5016μs | 665.9669 KOps/s | 672.4405 KOps/s | |
test_tc_first_layer_nontensor | 25.0870μs | 4.5891μs | 217.9082 KOps/s | 216.3166 KOps/s | |
test_tc_second_layer_tensor | 0.1249ms | 2.7853μs | 359.0277 KOps/s | 354.5113 KOps/s | |
test_tc_second_layer_nontensor | 51.7870μs | 5.9181μs | 168.9744 KOps/s | 167.7661 KOps/s | |
test_unbind | 0.2202s | 12.3553ms | 80.9371 Ops/s | 84.7679 Ops/s | |
test_full_like | 15.1504ms | 11.5606ms | 86.5009 Ops/s | 79.1494 Ops/s | |
test_zeros_like | 11.5870ms | 7.0475ms | 141.8935 Ops/s | 135.7073 Ops/s | |
test_ones_like | 11.8546ms | 7.4206ms | 134.7591 Ops/s | 131.4181 Ops/s | |
test_clone | 15.2521ms | 9.1795ms | 108.9388 Ops/s | 108.6388 Ops/s | |
test_squeeze | 65.2420μs | 11.2258μs | 89.0807 KOps/s | 86.8982 KOps/s | |
test_unsqueeze | 0.1790ms | 86.0210μs | 11.6251 KOps/s | 11.1378 KOps/s | |
test_split | 0.4620ms | 0.1806ms | 5.5374 KOps/s | 5.2613 KOps/s | |
test_permute | 0.3044ms | 0.2113ms | 4.7324 KOps/s | 4.7397 KOps/s | |
test_stack | 27.4906ms | 24.4559ms | 40.8900 Ops/s | 42.1135 Ops/s | |
test_cat | 28.8616ms | 24.4343ms | 40.9260 Ops/s | 41.8835 Ops/s |
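
The Change column in the table above is empty. As a rough aid for reading the results, the sketch below computes a relative change between the Ops and Ops on Repo HEAD columns for a few rows copied from that table. The definition used here, (Ops - HEAD) / HEAD, is an assumption and not necessarily what the benchmark bot reports, and the helper names `parse_ops` and `relative_change` are hypothetical.

```python
import re

# Unit multipliers for the throughput strings used in the table.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '46.4598 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised throughput: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Assumed definition: (PR throughput - HEAD throughput) / HEAD throughput."""
    head = parse_ops(ops_head)
    return (parse_ops(ops) - head) / head


# (name, Ops, Ops on Repo HEAD), copied from rows of the table above.
rows = [
    ("test_plain_set_nested", "46.4598 KOps/s", "46.4350 KOps/s"),
    ("test_membership", "1.3238 MOps/s", "1.1165 MOps/s"),
    ("test_memmaptd_index_op", "999.2205 Ops/s", "1.0077 KOps/s"),
    ("test_serialize_weights_pickle", "1.3998 Ops/s", "2.5665 Ops/s"),
]

for name, ops, ops_head in rows:
    # Positive values mean the PR run was faster than repo HEAD.
    print(f"{name}: {relative_change(ops, ops_head):+.1%}")
```

Parsing the unit suffix matters because the two columns of a single row can use different units, as in `test_memmaptd_index_op` (Ops/s versus KOps/s). Single-run differences of a few percent are likely within noise, so only the larger gaps are worth a second look.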

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 38.4410μs | 13.9759μs | 71.5516 KOps/s | 63.1882 KOps/s | |
test_plain_set_stack_nested | 41.6810μs | 14.1622μs | 70.6107 KOps/s | 62.1313 KOps/s | |
test_plain_set_nested_inplace | 42.2510μs | 15.1272μs | 66.1061 KOps/s | 58.8672 KOps/s | |
test_plain_set_stack_nested_inplace | 41.4820μs | 15.0126μs | 66.6108 KOps/s | 59.7874 KOps/s | |
test_items | 24.0210μs | 2.9030μs | 344.4725 KOps/s | 348.2919 KOps/s | |
test_items_nested | 0.3750ms | 0.3184ms | 3.1404 KOps/s | 3.1560 KOps/s | |
test_items_nested_locked | 0.3705ms | 0.3229ms | 3.0965 KOps/s | 3.1057 KOps/s | |
test_items_nested_leaf | 92.8430μs | 58.1292μs | 17.2031 KOps/s | 17.3273 KOps/s | |
test_items_stack_nested | 0.5093ms | 0.3273ms | 3.0551 KOps/s | 3.1381 KOps/s | |
test_items_stack_nested_leaf | 86.1130μs | 59.4366μs | 16.8246 KOps/s | 16.8418 KOps/s | |
test_items_stack_nested_locked | 0.3545ms | 0.3234ms | 3.0922 KOps/s | 3.1248 KOps/s | |
test_keys | 24.8600μs | 3.4875μs | 286.7384 KOps/s | 289.9349 KOps/s | |
test_keys_nested | 0.1042ms | 69.5062μs | 14.3872 KOps/s | 14.2259 KOps/s | |
test_keys_nested_locked | 2.3342ms | 75.4122μs | 13.2605 KOps/s | 13.0363 KOps/s | |
test_keys_nested_leaf | 86.3420μs | 60.5947μs | 16.5031 KOps/s | 16.2255 KOps/s | |
test_keys_stack_nested | 99.2830μs | 70.4693μs | 14.1906 KOps/s | 14.2070 KOps/s | |
test_keys_stack_nested_leaf | 94.7230μs | 61.4742μs | 16.2670 KOps/s | 16.2218 KOps/s | |
test_keys_stack_nested_locked | 0.1078ms | 75.8216μs | 13.1888 KOps/s | 13.2989 KOps/s | |
test_values | 5.5733μs | 0.8494μs | 1.1773 MOps/s | 1.1875 MOps/s | |
test_values_nested | 58.1120μs | 31.5019μs | 31.7441 KOps/s | 32.2411 KOps/s | |
test_values_nested_locked | 60.1920μs | 33.2284μs | 30.0947 KOps/s | 30.7626 KOps/s | |
test_values_nested_leaf | 58.6220μs | 33.8333μs | 29.5567 KOps/s | 30.0387 KOps/s | |
test_values_stack_nested | 65.7020μs | 32.2577μs | 31.0003 KOps/s | 31.7099 KOps/s | |
test_values_stack_nested_leaf | 60.2720μs | 34.5357μs | 28.9555 KOps/s | 29.3781 KOps/s | |
test_values_stack_nested_locked | 66.0420μs | 33.9265μs | 29.4755 KOps/s | 30.2765 KOps/s | |
test_membership | 1.9786μs | 0.5119μs | 1.9533 MOps/s | 1.9357 MOps/s | |
test_membership_nested | 16.2905μs | 1.8834μs | 530.9444 KOps/s | 527.8548 KOps/s | |
test_membership_nested_leaf | 18.1255μs | 1.8692μs | 534.9984 KOps/s | 524.5213 KOps/s | |
test_membership_stacked_nested | 20.3800μs | 1.9282μs | 518.6069 KOps/s | 509.3233 KOps/s | |
test_membership_stacked_nested_leaf | 46.6010μs | 1.9329μs | 517.3699 KOps/s | 517.0159 KOps/s | |
test_membership_nested_last | 30.3910μs | 2.8292μs | 353.4612 KOps/s | 354.4577 KOps/s | |
test_membership_nested_leaf_last | 23.5310μs | 2.8065μs | 356.3172 KOps/s | 359.1310 KOps/s | |
test_membership_stacked_nested_last | 35.7310μs | 2.8203μs | 354.5696 KOps/s | 127.9615 KOps/s | |
test_membership_stacked_nested_leaf_last | 23.7410μs | 2.8178μs | 354.8819 KOps/s | 127.0745 KOps/s | |
test_nested_getleaf | 32.0610μs | 5.9933μs | 166.8534 KOps/s | 164.9303 KOps/s | |
test_nested_get | 31.4910μs | 5.7123μs | 175.0593 KOps/s | 174.7290 KOps/s | |
test_stacked_getleaf | 41.8610μs | 6.0561μs | 165.1230 KOps/s | 166.9064 KOps/s | |
test_stacked_get | 34.5210μs | 5.6856μs | 175.8842 KOps/s | 175.6934 KOps/s | |
test_nested_getitemleaf | 28.8510μs | 6.0631μs | 164.9325 KOps/s | 162.8633 KOps/s | |
test_nested_getitem | 89.0530μs | 5.7960μs | 172.5336 KOps/s | 169.9594 KOps/s | |
test_stacked_getitemleaf | 32.9910μs | 6.0900μs | 164.2028 KOps/s | 164.4775 KOps/s | |
test_stacked_getitem | 34.9210μs | 5.7725μs | 173.2358 KOps/s | 174.3639 KOps/s | |
test_lock_nested | 3.0112ms | 0.4231ms | 2.3634 KOps/s | 2.3802 KOps/s | |
test_lock_stack_nested | 0.4185ms | 0.3866ms | 2.5865 KOps/s | 2.6950 KOps/s | |
test_unlock_nested | 0.7511ms | 0.3611ms | 2.7693 KOps/s | 2.7829 KOps/s | |
test_unlock_stack_nested | 0.3964ms | 0.3281ms | 3.0474 KOps/s | 3.2035 KOps/s | |
test_flatten_speed | 0.1258ms | 72.9550μs | 13.7071 KOps/s | 13.8638 KOps/s | |
test_unflatten_speed | 0.3293ms | 0.2912ms | 3.4344 KOps/s | 3.4404 KOps/s | |
test_common_ops | 1.5036ms | 1.2246ms | 816.6040 Ops/s | 778.8786 Ops/s | |
test_creation | 25.6800μs | 1.5104μs | 662.0545 KOps/s | 675.5259 KOps/s | |
test_creation_empty | 38.2910μs | 14.5365μs | 68.7922 KOps/s | 55.7267 KOps/s | |
test_creation_nested_1 | 43.3610μs | 16.0784μs | 62.1951 KOps/s | 50.6707 KOps/s | |
test_creation_nested_2 | 46.1810μs | 18.6277μs | 53.6834 KOps/s | 44.4357 KOps/s | |
test_clone | 79.5420μs | 29.6555μs | 33.7206 KOps/s | 34.1317 KOps/s | |
test_getitem[int] | 1.3555ms | 16.3372μs | 61.2100 KOps/s | 61.2563 KOps/s | |
test_getitem[slice_int] | 0.1290ms | 28.3967μs | 35.2153 KOps/s | 33.6857 KOps/s | |
test_getitem[range] | 0.2490ms | 0.1156ms | 8.6499 KOps/s | 8.7464 KOps/s | |
test_getitem[tuple] | 0.1339ms | 25.0818μs | 39.8695 KOps/s | 39.6118 KOps/s | |
test_getitem[list] | 0.2133ms | 0.1100ms | 9.0947 KOps/s | 9.4703 KOps/s | |
test_setitem_dim[int] | 72.1720μs | 48.8533μs | 20.4695 KOps/s | 22.2248 KOps/s | |
test_setitem_dim[slice_int] | 0.1709ms | 69.5042μs | 14.3876 KOps/s | 14.4527 KOps/s | |
test_setitem_dim[range] | 0.1795ms | 0.1321ms | 7.5723 KOps/s | 7.7035 KOps/s | |
test_setitem_dim[tuple] | 0.1052ms | 66.0485μs | 15.1404 KOps/s | 15.8726 KOps/s | |
test_setitem | 80.6620μs | 45.6104μs | 21.9248 KOps/s | 23.2212 KOps/s | |
test_set | 76.6520μs | 43.3857μs | 23.0491 KOps/s | 23.8752 KOps/s | |
test_set_shared | 0.3419ms | 51.6279μs | 19.3694 KOps/s | 19.6768 KOps/s | |
test_update | 85.9830μs | 52.2448μs | 19.1407 KOps/s | 18.8975 KOps/s | |
test_update_nested | 0.3238ms | 58.2942μs | 17.1544 KOps/s | 17.2123 KOps/s | |
test_update__nested | 0.1651ms | 58.9625μs | 16.9599 KOps/s | 16.3084 KOps/s | |
test_set_nested | 93.3420μs | 42.9159μs | 23.3014 KOps/s | 22.0836 KOps/s | |
test_set_nested_new | 92.8420μs | 49.1277μs | 20.3551 KOps/s | 20.8834 KOps/s | |
test_select | 92.0230μs | 63.0927μs | 15.8497 KOps/s | 16.1009 KOps/s | |
test_select_nested | 88.1730μs | 41.8663μs | 23.8856 KOps/s | 24.0036 KOps/s | |
test_exclude_nested | 94.2320μs | 58.9064μs | 16.9761 KOps/s | 16.9580 KOps/s | |
test_empty[True] | 0.3126ms | 0.2572ms | 3.8873 KOps/s | 3.9540 KOps/s | |
test_empty[False] | 4.6081μs | 0.7341μs | 1.3622 MOps/s | 1.3534 MOps/s | |
test_to | 74.3720μs | 48.5171μs | 20.6113 KOps/s | 18.7834 KOps/s | |
test_to_nonblocking | 77.9620μs | 47.0516μs | 21.2533 KOps/s | 19.6777 KOps/s | |
test_unbind_speed | 0.8745ms | 0.2784ms | 3.5920 KOps/s | 3.6979 KOps/s | |
test_unbind_speed_stack0 | 0.3157ms | 0.2755ms | 3.6297 KOps/s | 3.7517 KOps/s | |
test_unbind_speed_stack1 | 0.7117ms | 0.6477ms | 1.5440 KOps/s | 1.4541 KOps/s | |
test_split | 94.2153ms | 2.1776ms | 459.2241 Ops/s | 441.5715 Ops/s | |
test_chunk | 93.9559ms | 2.1619ms | 462.5481 Ops/s | 440.2176 Ops/s | |
test_to[False] | 6.1137ms | 6.0088ms | 166.4221 Ops/s | 155.3698 Ops/s | |
test_to[True] | 4.6700ms | 4.2906ms | 233.0675 Ops/s | 223.6646 Ops/s | |
test_to_njt[False] | 0.3490s | 0.2707s | 3.6947 Ops/s | 3.6274 Ops/s | |
test_to_njt[True] | 0.2594s | 0.2588s | 3.8637 Ops/s | 3.8275 Ops/s | |
test_creation[device0] | 0.4392ms | 0.1286ms | 7.7759 KOps/s | 7.6751 KOps/s | |
test_creation_from_tensor | 0.3828ms | 0.1307ms | 7.6521 KOps/s | 7.4640 KOps/s | |
test_add_one[memmap_tensor0] | 0.1924ms | 9.2945μs | 107.5908 KOps/s | 108.2603 KOps/s | |
test_contiguous[memmap_tensor0] | 37.6610μs | 2.2028μs | 453.9584 KOps/s | 450.2607 KOps/s | |
test_stack[memmap_tensor0] | 34.7010μs | 7.0107μs | 142.6398 KOps/s | 138.4942 KOps/s | |
test_memmaptd_index | 1.0777ms | 0.4247ms | 2.3545 KOps/s | 2.3167 KOps/s | |
test_memmaptd_index_astensor | 0.7423ms | 0.4845ms | 2.0642 KOps/s | 2.0394 KOps/s | |
test_memmaptd_index_op | 1.3924ms | 1.0045ms | 995.5223 Ops/s | 941.4065 Ops/s | |
test_serialize_model | 0.1314s | 0.1306s | 7.6557 Ops/s | 6.9985 Ops/s | |
test_serialize_model_pickle | 1.3767s | 1.2184s | 0.8207 Ops/s | 0.8415 Ops/s | |
test_serialize_weights | 0.1315s | 0.1299s | 7.7010 Ops/s | 7.7523 Ops/s | |
test_serialize_weights_returnearly | 0.2139s | 55.9393ms | 17.8765 Ops/s | 17.7538 Ops/s | |
test_serialize_weights_pickle | 1.3481s | 1.2113s | 0.8255 Ops/s | 0.8360 Ops/s | |
test_reshape_pytree | 84.1630μs | 35.2254μs | 28.3886 KOps/s | 27.0959 KOps/s | |
test_reshape_td | 83.4520μs | 39.8896μs | 25.0692 KOps/s | 22.9001 KOps/s | |
test_view_pytree | 0.1224ms | 34.9717μs | 28.5946 KOps/s | 27.8612 KOps/s | |
test_view_td | 0.1172ms | 44.3192μs | 22.5636 KOps/s | 21.5249 KOps/s | |
test_unbind_pytree | 0.1197ms | 34.1638μs | 29.2708 KOps/s | 29.0008 KOps/s | |
test_unbind_td | 0.3833ms | 41.9812μs | 23.8202 KOps/s | 22.5830 KOps/s | |
test_split_pytree | 0.1361ms | 45.3210μs | 22.0648 KOps/s | 20.1521 KOps/s | |
test_split_td | 94.4479ms | 64.5901μs | 15.4823 KOps/s | 14.5929 KOps/s | |
test_add_pytree | 0.1672ms | 55.6491μs | 17.9698 KOps/s | 17.8259 KOps/s | |
test_add_td | 0.1919ms | 90.8509μs | 11.0070 KOps/s | 10.6234 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.3017ms | 0.1640ms | 6.0987 KOps/s | 6.0588 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2831ms | 0.1497ms | 6.6782 KOps/s | 6.6939 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2764ms | 0.1570ms | 6.3687 KOps/s | 6.2628 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2859ms | 0.1824ms | 5.4834 KOps/s | 5.4050 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1304ms | 20.7896μs | 48.1009 KOps/s | 46.0479 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1402ms | 44.4991μs | 22.4724 KOps/s | 22.6342 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2098ms | 65.2083μs | 15.3355 KOps/s | 15.3490 KOps/s | |
test_compile_copy_nested[pytree-eager] | 95.1130μs | 50.0905μs | 19.9639 KOps/s | 19.9191 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4535ms | 0.3223ms | 3.1030 KOps/s | 3.1262 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3223ms | 0.2164ms | 4.6202 KOps/s | 4.6764 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2447ms | 0.1313ms | 7.6134 KOps/s | 7.4904 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1615ms | 58.6999μs | 17.0358 KOps/s | 16.5177 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3757ms | 0.3288ms | 3.0412 KOps/s | 2.9456 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7686ms | 0.6221ms | 1.6075 KOps/s | 1.5659 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3707ms | 0.2565ms | 3.8982 KOps/s | 3.8401 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4328ms | 0.3247ms | 3.0796 KOps/s | 3.0377 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1475ms | 70.5302μs | 14.1783 KOps/s | 13.7975 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2091ms | 0.1325ms | 7.5449 KOps/s | 7.0340 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6253ms | 0.5191ms | 1.9264 KOps/s | 1.8520 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3769ms | 0.3302ms | 3.0288 KOps/s | 2.9472 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 96.4920μs | 17.2927μs | 57.8280 KOps/s | 50.4917 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1136ms | 28.7791μs | 34.7474 KOps/s | 35.1231 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1565ms | 69.5917μs | 14.3695 KOps/s | 14.2745 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1359ms | 51.7062μs | 19.3400 KOps/s | 19.4599 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3619ms | 0.8238ms | 1.2139 KOps/s | 1.1270 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.6022ms | 3.2225ms | 310.3165 Ops/s | 311.3616 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3753ms | 0.8275ms | 1.2085 KOps/s | 1.1085 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.6804ms | 3.2549ms | 307.2302 Ops/s | 304.6299 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2350ms | 0.1218ms | 8.2111 KOps/s | 8.2158 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.4807ms | 60.3209μs | 16.5780 KOps/s | 16.7429 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1714ms | 0.1157ms | 8.6456 KOps/s | 8.6276 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.4633ms | 42.4108μs | 23.5789 KOps/s | 23.9639 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.4957ms | 0.1163ms | 8.5950 KOps/s | 8.5456 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.4263ms | 41.6872μs | 23.9882 KOps/s | 24.0590 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.5420ms | 0.1489ms | 6.7153 KOps/s | 6.5853 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1536ms | 24.5803μs | 40.6830 KOps/s | 38.9772 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2019ms | 0.1434ms | 6.9726 KOps/s | 6.9418 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.4278ms | 20.0932μs | 49.7680 KOps/s | 47.0945 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.5532ms | 0.1450ms | 6.8974 KOps/s | 6.9164 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 52.0610μs | 20.1034μs | 49.7429 KOps/s | 46.8465 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.5514ms | 0.1510ms | 6.6219 KOps/s | 6.5492 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4949ms | 24.4543μs | 40.8926 KOps/s | 39.3754 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.5322ms | 0.1448ms | 6.9067 KOps/s | 6.9121 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.4125ms | 19.9733μs | 50.0669 KOps/s | 46.8445 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2046ms | 0.1442ms | 6.9331 KOps/s | 6.8812 KOps/s | |
test_compile_indexing[int-pytree-eager] | 59.8220μs | 20.1865μs | 49.5381 KOps/s | 46.7855 KOps/s | |
test_mod_add[eager] | 77.9120μs | 30.6121μs | 32.6668 KOps/s | 30.9169 KOps/s | |
test_mod_add[compile] | 0.1324ms | 74.7454μs | 13.3787 KOps/s | 12.9022 KOps/s | |
test_mod_add[compile-overhead] | 0.3090ms | 0.1594ms | 6.2754 KOps/s | 5.8676 KOps/s | |
test_mod_wrap[eager] | 0.3251ms | 0.2442ms | 4.0958 KOps/s | 4.0449 KOps/s | |
test_mod_wrap[compile] | 0.3504ms | 0.2828ms | 3.5362 KOps/s | 3.3991 KOps/s | |
test_mod_wrap[compile-overhead] | 7.7753ms | 4.1375ms | 241.6933 Ops/s | 239.9657 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4913ms | 1.3682ms | 730.8953 Ops/s | 677.1931 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5499ms | 1.2643ms | 790.9703 Ops/s | 771.3147 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4034ms | 0.9207ms | 1.0861 KOps/s | 1.0750 KOps/s | |
test_seq_add[eager] | 0.1456ms | 97.9999μs | 10.2041 KOps/s | 9.9467 KOps/s | |
test_seq_add[compile] | 0.1486ms | 89.0201μs | 11.2334 KOps/s | 11.4613 KOps/s | |
test_seq_add[compile-overhead] | 0.1852ms | 0.1265ms | 7.9053 KOps/s | 7.8295 KOps/s | |
test_seq_wrap[eager] | 0.4748ms | 0.3809ms | 2.6256 KOps/s | 2.5402 KOps/s | |
test_seq_wrap[compile] | 0.3849ms | 0.3023ms | 3.3075 KOps/s | 3.2817 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2657ms | 0.2200ms | 4.5449 KOps/s | 4.4546 KOps/s | |
test_func_call_runtime[False-eager] | 0.8665ms | 0.7444ms | 1.3433 KOps/s | 1.3414 KOps/s | |
test_func_call_runtime[False-compile] | 0.8277ms | 0.7566ms | 1.3217 KOps/s | 1.3171 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4086ms | 0.3608ms | 2.7715 KOps/s | 2.7614 KOps/s | |
test_func_call_runtime[True-eager] | 0.9846ms | 0.9049ms | 1.1051 KOps/s | 1.1123 KOps/s | |
test_func_call_runtime[True-compile] | 0.9018ms | 0.7805ms | 1.2812 KOps/s | 1.2694 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4342ms | 0.3806ms | 2.6273 KOps/s | 2.5883 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8038ms | 0.7398ms | 1.3517 KOps/s | 1.3415 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8312ms | 0.7600ms | 1.3158 KOps/s | 1.3098 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4257ms | 0.3632ms | 2.7532 KOps/s | 2.7389 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1009ms | 0.9963ms | 1.0037 KOps/s | 994.4724 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.9966ms | 0.8129ms | 1.2302 KOps/s | 1.2265 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4528ms | 0.4059ms | 2.4638 KOps/s | 2.4115 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5024ms | 2.0661ms | 484.0043 Ops/s | 480.0987 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9743ms | 0.8245ms | 1.2129 KOps/s | 1.2071 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4581ms | 0.4121ms | 2.4268 KOps/s | 2.3967 KOps/s | |
test_distributed | 0.6102ms | 0.1180ms | 8.4746 KOps/s | 8.2038 KOps/s | |
test_tdmodule | 35.4810μs | 14.2923μs | 69.9676 KOps/s | 62.9707 KOps/s | |
test_tdmodule_dispatch | 46.5510μs | 27.5911μs | 36.2436 KOps/s | 31.7590 KOps/s | |
test_tdseq | 34.7610μs | 15.6775μs | 63.7855 KOps/s | 58.1298 KOps/s | |
test_tdseq_dispatch | 52.9710μs | 30.5953μs | 32.6848 KOps/s | 28.5339 KOps/s | |
test_instantiation_functorch | 1.9446ms | 1.8593ms | 537.8319 Ops/s | 527.6410 Ops/s | |
test_exec_functorch | 0.2554ms | 0.2109ms | 4.7415 KOps/s | 4.6622 KOps/s | |
test_exec_functional_call | 0.3480ms | 0.2139ms | 4.6751 KOps/s | 4.5991 KOps/s | |
test_exec_td_decorator | 0.4302ms | 0.2599ms | 3.8483 KOps/s | 3.8836 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7769ms | 0.6656ms | 1.5023 KOps/s | 1.4782 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7909ms | 0.6659ms | 1.5017 KOps/s | 1.4794 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6915ms | 0.5919ms | 1.6895 KOps/s | 1.6797 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7852ms | 0.5946ms | 1.6818 KOps/s | 1.6869 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.6043ms | 19.5287ms | 51.2066 Ops/s | 51.2504 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.6907ms | 19.5481ms | 51.1559 Ops/s | 51.1271 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.7437ms | 19.4038ms | 51.5364 Ops/s | 50.6473 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.7953ms | 19.4180ms | 51.4985 Ops/s | 51.6411 Ops/s | |
test_to_module_speed[True] | 1.2992ms | 0.9313ms | 1.0738 KOps/s | 1.0961 KOps/s | |
test_to_module_speed[False] | 1.3483ms | 0.9125ms | 1.0959 KOps/s | 1.1231 KOps/s | |
test_tc_init | 59.1620μs | 36.6761μs | 27.2657 KOps/s | 28.6109 KOps/s | |
test_tc_init_nested | 0.1002ms | 69.9197μs | 14.3021 KOps/s | 13.8911 KOps/s | |
test_tc_first_layer_tensor | 4.7300μs | 0.6831μs | 1.4639 MOps/s | 1.4151 MOps/s | |
test_tc_first_layer_nontensor | 28.2110μs | 2.4143μs | 414.2051 KOps/s | 431.9357 KOps/s | |
test_tc_second_layer_tensor | 7.2125μs | 1.4137μs | 707.3651 KOps/s | 710.8063 KOps/s | |
test_tc_second_layer_nontensor | 30.3010μs | 3.1530μs | 317.1589 KOps/s | 328.7131 KOps/s | |
test_unbind | 0.1914s | 11.6667ms | 85.7142 Ops/s | 93.9832 Ops/s | |
test_full_like | 0.6564ms | 0.5734ms | 1.7439 KOps/s | 1.7435 KOps/s | |
test_zeros_like | 0.2762ms | 0.1981ms | 5.0491 KOps/s | 5.0520 KOps/s | |
test_ones_like | 0.2329ms | 0.1979ms | 5.0533 KOps/s | 5.0564 KOps/s | |
test_clone | 0.4447ms | 0.4149ms | 2.4103 KOps/s | 2.4109 KOps/s | |
test_squeeze | 0.1132ms | 9.1726μs | 109.0201 KOps/s | 108.7719 KOps/s | |
test_unsqueeze | 0.2284ms | 69.6018μs | 14.3674 KOps/s | 14.2580 KOps/s | |
test_split | 0.4191ms | 0.1600ms | 6.2492 KOps/s | 6.1411 KOps/s | |
test_permute | 0.2237ms | 0.1701ms | 5.8801 KOps/s | 5.7037 KOps/s | |
test_stack | 1.2494ms | 0.8405ms | 1.1897 KOps/s | 1.1557 KOps/s | |
test_cat | 1.2613ms | 1.2314ms | 812.0793 Ops/s | 812.1496 Ops/s |
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Stack from ghstack (oldest at bottom):