[Refactor] better AddStateIndependentNormalScale #1028
Merged
vmoens added a commit that referenced this pull request on Oct 4, 2024
ghstack-source-id: b5911c0b4e023d3c8e20968732ff58da061f978b Pull Request resolved: #1028
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 58.3890μs | 24.2638μs | 41.2137 KOps/s | 41.6689 KOps/s | |
test_plain_set_stack_nested | 64.7910μs | 24.7675μs | 40.3755 KOps/s | 40.6260 KOps/s | |
test_plain_set_nested_inplace | 62.8180μs | 26.6703μs | 37.4949 KOps/s | 37.3632 KOps/s | |
test_plain_set_stack_nested_inplace | 74.2790μs | 26.6319μs | 37.5489 KOps/s | 37.3452 KOps/s | |
test_items | 32.4210μs | 4.1304μs | 242.1064 KOps/s | 240.6548 KOps/s | |
test_items_nested | 0.5777ms | 0.3855ms | 2.5938 KOps/s | 2.5576 KOps/s | |
test_items_nested_locked | 0.6880ms | 0.3840ms | 2.6040 KOps/s | 2.5077 KOps/s | |
test_items_nested_leaf | 0.1469ms | 80.5938μs | 12.4079 KOps/s | 12.3626 KOps/s | |
test_items_stack_nested | 0.7196ms | 0.3919ms | 2.5519 KOps/s | 2.4900 KOps/s | |
test_items_stack_nested_leaf | 0.1539ms | 82.1395μs | 12.1744 KOps/s | 11.8075 KOps/s | |
test_items_stack_nested_locked | 0.7396ms | 0.3926ms | 2.5473 KOps/s | 2.5183 KOps/s | |
test_keys | 39.6140μs | 3.6141μs | 276.6949 KOps/s | 283.7571 KOps/s | |
test_keys_nested | 0.1858ms | 0.1345ms | 7.4347 KOps/s | 7.2949 KOps/s | |
test_keys_nested_locked | 1.6189ms | 0.1409ms | 7.0967 KOps/s | 7.0873 KOps/s | |
test_keys_nested_leaf | 0.1883ms | 0.1182ms | 8.4624 KOps/s | 8.4178 KOps/s | |
test_keys_stack_nested | 0.2288ms | 0.1334ms | 7.4948 KOps/s | 7.4317 KOps/s | |
test_keys_stack_nested_leaf | 0.2017ms | 0.1168ms | 8.5624 KOps/s | 8.4908 KOps/s | |
test_keys_stack_nested_locked | 0.2321ms | 0.1400ms | 7.1434 KOps/s | 7.1055 KOps/s | |
test_values | 7.1414μs | 1.0585μs | 944.7608 KOps/s | 923.0691 KOps/s | |
test_values_nested | 0.1550ms | 94.2638μs | 10.6085 KOps/s | 10.7818 KOps/s | |
test_values_nested_locked | 0.1541ms | 93.1352μs | 10.7371 KOps/s | 10.7484 KOps/s | |
test_values_nested_leaf | 0.1367ms | 79.1001μs | 12.6422 KOps/s | 12.8107 KOps/s | |
test_values_stack_nested | 0.1542ms | 92.8300μs | 10.7724 KOps/s | 10.8777 KOps/s | |
test_values_stack_nested_leaf | 0.1424ms | 79.3704μs | 12.5992 KOps/s | 12.5925 KOps/s | |
test_values_stack_nested_locked | 0.2317ms | 93.4508μs | 10.7008 KOps/s | 10.8741 KOps/s | |
test_membership | 5.9697μs | 0.7420μs | 1.3477 MOps/s | 1.1605 MOps/s | |
test_membership_nested | 25.0260μs | 2.7282μs | 366.5428 KOps/s | 363.5107 KOps/s | |
test_membership_nested_leaf | 22.7930μs | 2.7539μs | 363.1259 KOps/s | 362.5998 KOps/s | |
test_membership_stacked_nested | 26.1390μs | 2.7345μs | 365.6998 KOps/s | 358.1913 KOps/s | |
test_membership_stacked_nested_leaf | 74.9500μs | 2.7474μs | 363.9764 KOps/s | 363.2722 KOps/s | |
test_membership_nested_last | 30.8670μs | 4.2279μs | 236.5242 KOps/s | 241.9288 KOps/s | |
test_membership_nested_leaf_last | 22.0120μs | 4.1882μs | 238.7687 KOps/s | 240.8186 KOps/s | |
test_membership_stacked_nested_last | 34.1740μs | 4.1383μs | 241.6465 KOps/s | 127.7012 KOps/s | |
test_membership_stacked_nested_leaf_last | 31.3380μs | 4.1802μs | 239.2213 KOps/s | 127.4901 KOps/s | |
test_nested_getleaf | 37.7600μs | 10.5795μs | 94.5227 KOps/s | 94.3793 KOps/s | |
test_nested_get | 57.4190μs | 10.0153μs | 99.8474 KOps/s | 99.4219 KOps/s | |
test_stacked_getleaf | 38.3310μs | 10.5478μs | 94.8069 KOps/s | 94.3094 KOps/s | |
test_stacked_get | 51.6770μs | 9.9661μs | 100.3405 KOps/s | 99.4117 KOps/s | |
test_nested_getitemleaf | 61.7060μs | 11.7445μs | 85.1460 KOps/s | 90.0718 KOps/s | |
test_nested_getitem | 68.4360μs | 10.1704μs | 98.3243 KOps/s | 96.3534 KOps/s | |
test_stacked_getitemleaf | 43.5520μs | 11.0220μs | 90.7279 KOps/s | 90.5853 KOps/s | |
test_stacked_getitem | 36.8290μs | 10.2397μs | 97.6593 KOps/s | 96.8239 KOps/s | |
test_lock_nested | 90.3454ms | 0.6184ms | 1.6170 KOps/s | 1.9623 KOps/s | |
test_lock_stack_nested | 0.6461ms | 0.4865ms | 2.0553 KOps/s | 2.1230 KOps/s | |
test_unlock_nested | 93.0569ms | 0.5362ms | 1.8650 KOps/s | 2.3497 KOps/s | |
test_unlock_stack_nested | 0.7374ms | 0.3976ms | 2.5152 KOps/s | 2.6081 KOps/s | |
test_flatten_speed | 0.1891ms | 0.1007ms | 9.9279 KOps/s | 9.9992 KOps/s | |
test_unflatten_speed | 0.7083ms | 0.5150ms | 1.9419 KOps/s | 1.9861 KOps/s | |
test_common_ops | 4.9984ms | 1.1679ms | 856.2506 Ops/s | 860.7456 Ops/s | |
test_creation | 21.8310μs | 2.1055μs | 474.9375 KOps/s | 491.0379 KOps/s | |
test_creation_empty | 64.7210μs | 17.6829μs | 56.5517 KOps/s | 53.4103 KOps/s | |
test_creation_nested_1 | 46.3560μs | 20.9900μs | 47.6418 KOps/s | 44.9470 KOps/s | |
test_creation_nested_2 | 77.0340μs | 25.0643μs | 39.8974 KOps/s | 37.9299 KOps/s | |
test_clone | 0.1165ms | 17.7673μs | 56.2832 KOps/s | 58.4988 KOps/s | |
test_getitem[int] | 1.2183ms | 16.5824μs | 60.3047 KOps/s | 59.6805 KOps/s | |
test_getitem[slice_int] | 0.1475ms | 32.0243μs | 31.2263 KOps/s | 30.3361 KOps/s | |
test_getitem[range] | 0.3414ms | 60.8627μs | 16.4304 KOps/s | 17.0029 KOps/s | |
test_getitem[tuple] | 0.1325ms | 25.4708μs | 39.2606 KOps/s | 38.2784 KOps/s | |
test_getitem[list] | 0.2636ms | 55.5404μs | 18.0049 KOps/s | 18.2941 KOps/s | |
test_setitem_dim[int] | 76.5630μs | 35.1950μs | 28.4131 KOps/s | 29.0655 KOps/s | |
test_setitem_dim[slice_int] | 0.1684ms | 65.0976μs | 15.3615 KOps/s | 15.9109 KOps/s | |
test_setitem_dim[range] | 0.1465ms | 86.4543μs | 11.5668 KOps/s | 11.4979 KOps/s | |
test_setitem_dim[tuple] | 0.1063ms | 52.1018μs | 19.1932 KOps/s | 19.2157 KOps/s | |
test_setitem | 0.1996ms | 31.5936μs | 31.6520 KOps/s | 32.1179 KOps/s | |
test_set | 0.1391ms | 30.3619μs | 32.9360 KOps/s | 33.2717 KOps/s | |
test_set_shared | 1.1496ms | 0.2220ms | 4.5053 KOps/s | 4.4746 KOps/s | |
test_update | 0.1643ms | 36.9923μs | 27.0327 KOps/s | 25.8051 KOps/s | |
test_update_nested | 0.1409ms | 50.1102μs | 19.9560 KOps/s | 19.9312 KOps/s | |
test_update__nested | 0.1262ms | 40.8231μs | 24.4959 KOps/s | 26.6550 KOps/s | |
test_set_nested | 95.2780μs | 32.6547μs | 30.6234 KOps/s | 30.4666 KOps/s | |
test_set_nested_new | 0.1222ms | 38.2067μs | 26.1734 KOps/s | 26.2415 KOps/s | |
test_select | 0.1387ms | 55.7309μs | 17.9434 KOps/s | 18.0212 KOps/s | |
test_select_nested | 0.1549ms | 60.6147μs | 16.4976 KOps/s | 16.8778 KOps/s | |
test_exclude_nested | 0.1483ms | 74.8158μs | 13.3662 KOps/s | 13.4155 KOps/s | |
test_empty[True] | 1.0592ms | 0.3547ms | 2.8192 KOps/s | 2.8422 KOps/s | |
test_empty[False] | 8.5685μs | 1.2222μs | 818.2222 KOps/s | 829.5272 KOps/s | |
test_unbind_speed | 0.4914ms | 0.3056ms | 3.2721 KOps/s | 3.2038 KOps/s | |
test_unbind_speed_stack0 | 0.6571ms | 0.3057ms | 3.2714 KOps/s | 3.4017 KOps/s | |
test_unbind_speed_stack1 | 93.3837ms | 0.8294ms | 1.2057 KOps/s | 1.3552 KOps/s | |
test_split | 3.1473ms | 2.0165ms | 495.9160 Ops/s | 453.1185 Ops/s | |
test_chunk | 94.6383ms | 2.1993ms | 454.6842 Ops/s | 450.6180 Ops/s | |
test_creation[device0] | 0.2543ms | 0.1209ms | 8.2723 KOps/s | 8.1614 KOps/s | |
test_creation_from_tensor | 3.1850ms | 0.1219ms | 8.2003 KOps/s | 8.3542 KOps/s | |
test_add_one[memmap_tensor0] | 0.2815ms | 7.9482μs | 125.8152 KOps/s | 137.3672 KOps/s | |
test_contiguous[memmap_tensor0] | 28.7240μs | 1.8921μs | 528.5096 KOps/s | 513.8749 KOps/s | |
test_stack[memmap_tensor0] | 49.6830μs | 5.8076μs | 172.1868 KOps/s | 171.2012 KOps/s | |
test_memmaptd_index | 99.2589ms | 0.5549ms | 1.8020 KOps/s | 2.3777 KOps/s | |
test_memmaptd_index_astensor | 1.0905ms | 0.5223ms | 1.9147 KOps/s | 1.9097 KOps/s | |
test_memmaptd_index_op | 1.5238ms | 1.0646ms | 939.3153 Ops/s | 920.8463 Ops/s | |
test_serialize_model | 0.1323s | 0.1217s | 8.2166 Ops/s | 8.3264 Ops/s | |
test_serialize_model_pickle | 0.4509s | 0.3951s | 2.5308 Ops/s | 2.5162 Ops/s | |
test_serialize_weights | 0.1250s | 0.1167s | 8.5685 Ops/s | 8.8310 Ops/s | |
test_serialize_weights_returnearly | 0.2490s | 0.1749s | 5.7187 Ops/s | 6.3351 Ops/s | |
test_serialize_weights_pickle | 1.0946s | 0.7389s | 1.3533 Ops/s | 2.5299 Ops/s | |
test_serialize_weights_filesystem | 0.1485s | 0.1438s | 6.9559 Ops/s | 6.9332 Ops/s | |
test_serialize_model_filesystem | 0.2359s | 0.1590s | 6.2905 Ops/s | 6.5340 Ops/s | |
test_reshape_pytree | 85.2090μs | 39.7971μs | 25.1275 KOps/s | 25.7610 KOps/s | |
test_reshape_td | 0.1052ms | 46.3455μs | 21.5771 KOps/s | 22.2916 KOps/s | |
test_view_pytree | 95.2580μs | 39.8337μs | 25.1044 KOps/s | 26.0319 KOps/s | |
test_view_td | 0.1507ms | 53.6176μs | 18.6506 KOps/s | 19.5714 KOps/s | |
test_unbind_pytree | 76.1620μs | 36.4780μs | 27.4138 KOps/s | 27.2173 KOps/s | |
test_unbind_td | 0.2928ms | 45.9057μs | 21.7838 KOps/s | 22.1765 KOps/s | |
test_split_pytree | 0.1065ms | 38.4387μs | 26.0154 KOps/s | 26.1896 KOps/s | |
test_split_td | 0.1938ms | 57.8514μs | 17.2857 KOps/s | 17.2344 KOps/s | |
test_add_pytree | 0.1858ms | 49.2759μs | 20.2939 KOps/s | 21.4968 KOps/s | |
test_add_td | 0.1770ms | 87.9896μs | 11.3650 KOps/s | 11.1680 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1627ms | 57.8996μs | 17.2713 KOps/s | 16.6227 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2602ms | 0.1970ms | 5.0753 KOps/s | 5.0000 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1319ms | 58.1415μs | 17.1994 KOps/s | 17.3122 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2597ms | 0.1441ms | 6.9415 KOps/s | 6.9196 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 84.6080μs | 22.9797μs | 43.5166 KOps/s | 41.7986 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1505ms | 74.7103μs | 13.3850 KOps/s | 13.4477 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1780ms | 75.8064μs | 13.1915 KOps/s | 13.1492 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1205ms | 68.5801μs | 14.5815 KOps/s | 14.4119 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2747ms | 0.1828ms | 5.4690 KOps/s | 5.4432 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4468ms | 0.2413ms | 4.1449 KOps/s | 4.1671 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1396ms | 47.7805μs | 20.9291 KOps/s | 20.3363 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1556ms | 78.1717μs | 12.7923 KOps/s | 12.9370 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2976ms | 0.1766ms | 5.6628 KOps/s | 5.6445 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5208ms | 0.2982ms | 3.3534 KOps/s | 3.4333 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.5121ms | 0.2755ms | 3.6303 KOps/s | 3.6499 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4616ms | 0.1873ms | 5.3378 KOps/s | 5.4219 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1778ms | 75.7233μs | 13.2060 KOps/s | 13.5323 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1328ms | 49.0834μs | 20.3735 KOps/s | 20.0738 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4422ms | 0.2416ms | 4.1395 KOps/s | 4.2614 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3063ms | 0.1788ms | 5.5942 KOps/s | 5.6960 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2257ms | 0.1125ms | 8.8907 KOps/s | 8.9498 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1810ms | 78.1776μs | 12.7914 KOps/s | 13.0020 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1485ms | 80.8335μs | 12.3711 KOps/s | 12.3885 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1428ms | 72.6062μs | 13.7729 KOps/s | 14.0801 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3247ms | 0.1940ms | 5.1539 KOps/s | 5.1122 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8221ms | 1.7444ms | 573.2729 Ops/s | 568.5812 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3798ms | 0.1933ms | 5.1726 KOps/s | 5.1580 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3598ms | 1.1162ms | 895.9330 Ops/s | 906.5866 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.6615ms | 0.4207ms | 2.3773 KOps/s | 2.4043 KOps/s | |
test_compile_assign_and_add_stack[eager] | 6.1645ms | 4.0814ms | 245.0139 Ops/s | 242.9475 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1247ms | 33.1638μs | 30.1534 KOps/s | 28.8330 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.0904ms | 50.4913μs | 19.8054 KOps/s | 20.3073 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1286ms | 30.3010μs | 33.0022 KOps/s | 33.0245 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1095ms | 29.1299μs | 34.3290 KOps/s | 34.1421 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1713ms | 30.4313μs | 32.8609 KOps/s | 33.5366 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1075ms | 29.2311μs | 34.2101 KOps/s | 34.1580 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1630ms | 73.7334μs | 13.5624 KOps/s | 13.7809 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5874ms | 26.6961μs | 37.4587 KOps/s | 35.1406 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1826ms | 69.2789μs | 14.4344 KOps/s | 14.5478 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 60.4730μs | 23.5330μs | 42.4934 KOps/s | 41.5428 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1424ms | 68.2797μs | 14.6456 KOps/s | 14.6505 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1110ms | 23.1394μs | 43.2163 KOps/s | 42.3667 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1951ms | 74.3403μs | 13.4516 KOps/s | 13.9065 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.7292ms | 26.9878μs | 37.0537 KOps/s | 35.9358 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1725ms | 70.6209μs | 14.1601 KOps/s | 14.6974 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1780ms | 23.2536μs | 43.0041 KOps/s | 42.4564 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2498ms | 70.5700μs | 14.1703 KOps/s | 14.5948 KOps/s | |
test_compile_indexing[int-pytree-eager] | 72.7160μs | 23.3207μs | 42.8803 KOps/s | 42.8341 KOps/s | |
test_mod_add[eager] | 86.3410μs | 24.6481μs | 40.5711 KOps/s | 38.0650 KOps/s | |
test_mod_add[compile] | 85.6810μs | 39.3445μs | 25.4165 KOps/s | 25.8943 KOps/s | |
test_mod_add[compile-overhead] | 0.1121ms | 40.1801μs | 24.8880 KOps/s | 25.7321 KOps/s | |
test_mod_wrap[eager] | 0.4207ms | 0.2152ms | 4.6458 KOps/s | 4.7840 KOps/s | |
test_mod_wrap[compile] | 0.3700ms | 0.2356ms | 4.2452 KOps/s | 4.2562 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4522ms | 0.2327ms | 4.2965 KOps/s | 4.3035 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.0851ms | 10.8222ms | 92.4027 Ops/s | 89.9961 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.4883ms | 11.0451ms | 90.5380 Ops/s | 88.1676 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 11.9943ms | 11.0348ms | 90.6223 Ops/s | 82.8575 Ops/s | |
test_seq_add[eager] | 0.2504ms | 91.2330μs | 10.9609 KOps/s | 10.7176 KOps/s | |
test_seq_add[compile] | 0.1674ms | 67.2862μs | 14.8619 KOps/s | 15.4722 KOps/s | |
test_seq_add[compile-overhead] | 0.1517ms | 65.3200μs | 15.3092 KOps/s | 15.6618 KOps/s | |
test_seq_wrap[eager] | 0.6657ms | 0.3939ms | 2.5387 KOps/s | 2.6178 KOps/s | |
test_seq_wrap[compile] | 1.2460ms | 0.2781ms | 3.5961 KOps/s | 3.7049 KOps/s | |
test_seq_wrap[compile-overhead] | 1.3426ms | 0.2771ms | 3.6091 KOps/s | 3.6439 KOps/s | |
test_func_call_runtime[False-eager] | 1.0252ms | 0.5429ms | 1.8418 KOps/s | 1.8864 KOps/s | |
test_func_call_runtime[False-compile] | 0.7468ms | 0.5251ms | 1.9043 KOps/s | 1.9416 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.7365ms | 0.5225ms | 1.9138 KOps/s | 1.9402 KOps/s | |
test_func_call_runtime[True-eager] | 1.6813ms | 0.7824ms | 1.2780 KOps/s | 1.3416 KOps/s | |
test_func_call_runtime[True-compile] | 1.1783ms | 0.5285ms | 1.8923 KOps/s | 1.8914 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.9523ms | 0.5363ms | 1.8645 KOps/s | 1.8975 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.2286ms | 0.5463ms | 1.8305 KOps/s | 1.8727 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.1377ms | 0.5205ms | 1.9214 KOps/s | 1.9306 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7198ms | 0.5182ms | 1.9297 KOps/s | 1.9435 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.5363ms | 0.9220ms | 1.0846 KOps/s | 1.1059 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0701ms | 0.7724ms | 1.2946 KOps/s | 1.3348 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1173ms | 0.7737ms | 1.2925 KOps/s | 1.3208 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5657ms | 1.9555ms | 511.3707 Ops/s | 515.4657 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.9231ms | 2.0063ms | 498.4410 Ops/s | 502.9212 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.7532ms | 2.0164ms | 495.9296 Ops/s | 501.0523 Ops/s | |
test_distributed | 0.3481ms | 0.1259ms | 7.9417 KOps/s | 7.7822 KOps/s | |
test_tdmodule | 31.6290μs | 17.3866μs | 57.5156 KOps/s | 54.6844 KOps/s | |
test_tdmodule_dispatch | 98.4840μs | 36.3408μs | 27.5173 KOps/s | 27.1136 KOps/s | |
test_tdseq | 57.9980μs | 20.5003μs | 48.7798 KOps/s | 45.4632 KOps/s | |
test_tdseq_dispatch | 96.2700μs | 41.7139μs | 23.9728 KOps/s | 23.4581 KOps/s | |
test_instantiation_functorch | 1.9711ms | 1.6217ms | 616.6324 Ops/s | 619.9585 Ops/s | |
test_instantiation_td | 3.4287ms | 1.2054ms | 829.5926 Ops/s | 832.1445 Ops/s | |
test_exec_functorch | 0.4531ms | 0.1949ms | 5.1301 KOps/s | 5.2333 KOps/s | |
test_exec_functional_call | 0.3506ms | 0.1790ms | 5.5851 KOps/s | 5.6597 KOps/s | |
test_exec_td | 0.4848ms | 0.2061ms | 4.8523 KOps/s | 4.9423 KOps/s | |
test_exec_td_decorator | 0.9682ms | 0.2397ms | 4.1711 KOps/s | 4.1967 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.4358ms | 0.6991ms | 1.4304 KOps/s | 1.4422 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.9654ms | 0.6956ms | 1.4376 KOps/s | 1.4499 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.8617ms | 0.5479ms | 1.8251 KOps/s | 1.8523 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.8984ms | 0.5420ms | 1.8450 KOps/s | 1.8507 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.6616ms | 0.6505ms | 1.5374 KOps/s | 1.5398 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0416ms | 0.6496ms | 1.5394 KOps/s | 1.5330 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8587ms | 0.5404ms | 1.8505 KOps/s | 1.8659 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.0426ms | 0.5408ms | 1.8490 KOps/s | 1.8575 KOps/s | |
test_to_module_speed[True] | 2.0379ms | 1.4165ms | 705.9527 Ops/s | 712.7198 Ops/s | |
test_to_module_speed[False] | 1.5458ms | 1.3870ms | 720.9640 Ops/s | 734.5983 Ops/s | |
test_tc_init | 92.9740μs | 44.8525μs | 22.2953 KOps/s | 21.3358 KOps/s | |
test_tc_init_nested | 0.1606ms | 86.6660μs | 11.5385 KOps/s | 10.5921 KOps/s | |
test_tc_first_layer_tensor | 26.4290μs | 1.5149μs | 660.1072 KOps/s | 648.7821 KOps/s | |
test_tc_first_layer_nontensor | 22.7830μs | 4.6929μs | 213.0878 KOps/s | 216.5014 KOps/s | |
test_tc_second_layer_tensor | 65.0900μs | 2.7741μs | 360.4756 KOps/s | 350.6166 KOps/s | |
test_tc_second_layer_nontensor | 63.3790μs | 6.0620μs | 164.9629 KOps/s | 166.3476 KOps/s | |
test_unbind | 0.4635s | 13.0243ms | 76.7797 Ops/s | 71.2245 Ops/s | |
test_full_like | 14.0529ms | 8.5487ms | 116.9765 Ops/s | 133.8156 Ops/s | |
test_zeros_like | 6.8364ms | 3.2949ms | 303.5013 Ops/s | 355.5127 Ops/s | |
test_ones_like | 6.8777ms | 3.7629ms | 265.7539 Ops/s | 167.1866 Ops/s | |
test_clone | 9.8628ms | 5.4492ms | 183.5116 Ops/s | 130.4426 Ops/s | |
test_squeeze | 65.8230μs | 12.6423μs | 79.0994 KOps/s | 80.2156 KOps/s | |
test_unsqueeze | 0.3375ms | 93.0169μs | 10.7507 KOps/s | 10.6437 KOps/s | |
test_split | 0.3672ms | 0.1970ms | 5.0751 KOps/s | 4.9805 KOps/s | |
test_permute | 0.3580ms | 0.2262ms | 4.4212 KOps/s | 4.4546 KOps/s | |
test_stack | 27.4948ms | 25.1978ms | 39.6860 Ops/s | 38.6689 Ops/s | |
test_cat | 29.0101ms | 25.5683ms | 39.1110 Ops/s | 38.3051 Ops/s |
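
The Change column is empty in this capture, so regressions and improvements have to be read off the Ops and Ops on Repo HEAD columns by hand. A minimal sketch of that comparison follows, assuming Change is meant as the relative throughput difference between the PR and repo HEAD. That definition, and the helper names `parse_ops` and `relative_change`, are assumptions for illustration and not part of the benchmark tooling.

```python
# Unit multipliers for the throughput cells, as they appear in the tables.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '41.2137 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Relative throughput change of the PR versus repo HEAD.

    Positive means the PR run was faster. This definition is an assumption;
    the source does not say how the Change column is computed.
    """
    head = parse_ops(ops_head)
    return (parse_ops(ops) - head) / head


# Values copied from the test_membership_stacked_nested_last row above.
change = relative_change("241.6465 KOps/s", "127.7012 KOps/s")
print(f"{change:+.1%}")
```

Most rows differ from HEAD by a few percent in either direction, which may simply be run-to-run noise; the Max column shows occasional outliers of tens of milliseconds on tests whose mean is well under a millisecond.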

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1423ms | 16.7357μs | 59.7524 KOps/s | 56.5284 KOps/s | |
test_plain_set_stack_nested | 56.1710μs | 16.8278μs | 59.4256 KOps/s | 55.9927 KOps/s | |
test_plain_set_nested_inplace | 48.6200μs | 18.0181μs | 55.4998 KOps/s | 52.7857 KOps/s | |
test_plain_set_stack_nested_inplace | 48.6700μs | 17.8248μs | 56.1015 KOps/s | 53.1827 KOps/s | |
test_items | 31.4600μs | 2.9069μs | 344.0033 KOps/s | 348.7128 KOps/s | |
test_items_nested | 0.3949ms | 0.3411ms | 2.9321 KOps/s | 2.9389 KOps/s | |
test_items_nested_locked | 0.3926ms | 0.3437ms | 2.9092 KOps/s | 2.9305 KOps/s | |
test_items_nested_leaf | 0.1009ms | 62.6362μs | 15.9652 KOps/s | 15.8279 KOps/s | |
test_items_stack_nested | 0.3997ms | 0.3482ms | 2.8716 KOps/s | 2.9235 KOps/s | |
test_items_stack_nested_leaf | 89.0510μs | 64.8153μs | 15.4285 KOps/s | 15.5715 KOps/s | |
test_items_stack_nested_locked | 0.4038ms | 0.3496ms | 2.8601 KOps/s | 2.9309 KOps/s | |
test_keys | 44.1200μs | 3.4366μs | 290.9821 KOps/s | 292.4053 KOps/s | |
test_keys_nested | 0.1080ms | 71.0590μs | 14.0728 KOps/s | 14.3451 KOps/s | |
test_keys_nested_locked | 2.3715ms | 77.3580μs | 12.9269 KOps/s | 12.8336 KOps/s | |
test_keys_nested_leaf | 95.6320μs | 61.7956μs | 16.1824 KOps/s | 16.4192 KOps/s | |
test_keys_stack_nested | 0.1167ms | 71.5100μs | 13.9841 KOps/s | 14.0911 KOps/s | |
test_keys_stack_nested_leaf | 91.0410μs | 63.6188μs | 15.7186 KOps/s | 15.9523 KOps/s | |
test_keys_stack_nested_locked | 0.1162ms | 77.4400μs | 12.9132 KOps/s | 13.1604 KOps/s | |
test_values | 9.0352μs | 0.8407μs | 1.1894 MOps/s | 1.1658 MOps/s | |
test_values_nested | 80.4810μs | 49.0078μs | 20.4049 KOps/s | 20.4939 KOps/s | |
test_values_nested_locked | 79.5810μs | 50.6781μs | 19.7324 KOps/s | 19.8986 KOps/s | |
test_values_nested_leaf | 79.5420μs | 42.5785μs | 23.4861 KOps/s | 23.3740 KOps/s | |
test_values_stack_nested | 88.5720μs | 50.8311μs | 19.6730 KOps/s | 20.2900 KOps/s | |
test_values_stack_nested_leaf | 72.7310μs | 43.6166μs | 22.9271 KOps/s | 23.3851 KOps/s | |
test_values_stack_nested_locked | 97.1420μs | 51.1885μs | 19.5356 KOps/s | 19.5222 KOps/s | |
test_membership | 1.8995μs | 0.5016μs | 1.9935 MOps/s | 1.9819 MOps/s | |
test_membership_nested | 25.8455μs | 1.8917μs | 528.6371 KOps/s | 525.1662 KOps/s | |
test_membership_nested_leaf | 16.5600μs | 1.9037μs | 525.2828 KOps/s | 538.3513 KOps/s | |
test_membership_stacked_nested | 23.8810μs | 1.9684μs | 508.0165 KOps/s | 515.5881 KOps/s | |
test_membership_stacked_nested_leaf | 28.9500μs | 2.0147μs | 496.3458 KOps/s | 513.3398 KOps/s | |
test_membership_nested_last | 29.7110μs | 3.0638μs | 326.3884 KOps/s | 337.7153 KOps/s | |
test_membership_nested_leaf_last | 37.3800μs | 3.0443μs | 328.4821 KOps/s | 334.2325 KOps/s | |
test_membership_stacked_nested_last | 31.9710μs | 3.5436μs | 282.2017 KOps/s | 120.3593 KOps/s | |
test_membership_stacked_nested_leaf_last | 25.7810μs | 3.5477μs | 281.8728 KOps/s | 121.3895 KOps/s | |
test_nested_getleaf | 31.7610μs | 6.1121μs | 163.6088 KOps/s | 163.7989 KOps/s | |
test_nested_get | 36.3710μs | 5.7616μs | 173.5621 KOps/s | 172.2025 KOps/s | |
test_stacked_getleaf | 37.1310μs | 6.0498μs | 165.2960 KOps/s | 163.6326 KOps/s | |
test_stacked_get | 28.4500μs | 5.6815μs | 176.0112 KOps/s | 171.8623 KOps/s | |
test_nested_getitemleaf | 36.4600μs | 6.1655μs | 162.1917 KOps/s | 161.2197 KOps/s | |
test_nested_getitem | 34.6110μs | 5.7949μs | 172.5665 KOps/s | 170.6196 KOps/s | |
test_stacked_getitemleaf | 36.0500μs | 6.0714μs | 164.7079 KOps/s | 162.8861 KOps/s | |
test_stacked_getitem | 30.5800μs | 5.6554μs | 176.8211 KOps/s | 171.5395 KOps/s | |
test_lock_nested | 4.9128ms | 0.4433ms | 2.2557 KOps/s | 2.2909 KOps/s | |
test_lock_stack_nested | 0.4644ms | 0.4004ms | 2.4975 KOps/s | 2.5767 KOps/s | |
test_unlock_nested | 0.7794ms | 0.3772ms | 2.6508 KOps/s | 2.6654 KOps/s | |
test_unlock_stack_nested | 0.3792ms | 0.3384ms | 2.9547 KOps/s | 3.0780 KOps/s | |
test_flatten_speed | 0.1118ms | 77.3906μs | 12.9215 KOps/s | 13.0224 KOps/s | |
test_unflatten_speed | 0.3806ms | 0.3191ms | 3.1335 KOps/s | 3.0672 KOps/s | |
test_common_ops | 1.6218ms | 1.2709ms | 786.8371 Ops/s | 768.2553 Ops/s | |
test_creation | 26.2910μs | 1.4856μs | 673.1458 KOps/s | 674.8182 KOps/s | |
test_creation_empty | 40.0610μs | 15.4194μs | 64.8532 KOps/s | 57.9181 KOps/s | |
test_creation_nested_1 | 54.6910μs | 17.6016μs | 56.8129 KOps/s | 52.2371 KOps/s | |
test_creation_nested_2 | 52.9200μs | 19.9163μs | 50.2100 KOps/s | 45.8447 KOps/s | |
test_clone | 55.8510μs | 29.9012μs | 33.4435 KOps/s | 35.1329 KOps/s | |
test_getitem[int] | 92.4022ms | 23.3864μs | 42.7599 KOps/s | 59.9319 KOps/s | |
test_getitem[slice_int] | 0.1211ms | 28.3042μs | 35.3304 KOps/s | 35.0527 KOps/s | |
test_getitem[range] | 0.2178ms | 0.1103ms | 9.0702 KOps/s | 9.2988 KOps/s | |
test_getitem[tuple] | 0.1272ms | 24.4828μs | 40.8450 KOps/s | 41.2139 KOps/s | |
test_getitem[list] | 0.1963ms | 98.3504μs | 10.1677 KOps/s | 9.6626 KOps/s | |
test_setitem_dim[int] | 68.1720μs | 45.3140μs | 22.0682 KOps/s | 21.0523 KOps/s | |
test_setitem_dim[slice_int] | 93.3120μs | 66.8406μs | 14.9610 KOps/s | 14.9262 KOps/s | |
test_setitem_dim[range] | 0.1790ms | 0.1312ms | 7.6214 KOps/s | 7.8111 KOps/s | |
test_setitem_dim[tuple] | 96.0820μs | 64.6615μs | 15.4652 KOps/s | 15.6947 KOps/s | |
test_setitem | 88.9820μs | 42.7362μs | 23.3994 KOps/s | 21.6875 KOps/s | |
test_set | 94.5420μs | 43.3233μs | 23.0823 KOps/s | 22.2315 KOps/s | |
test_set_shared | 0.3458ms | 54.7176μs | 18.2756 KOps/s | 18.4538 KOps/s | |
test_update | 96.0420μs | 51.7313μs | 19.3307 KOps/s | 19.2693 KOps/s | |
test_update_nested | 0.1094ms | 59.7830μs | 16.7272 KOps/s | 16.6587 KOps/s | |
test_update__nested | 0.1106ms | 65.5302μs | 15.2601 KOps/s | 15.3016 KOps/s | |
test_set_nested | 83.4220μs | 46.1569μs | 21.6652 KOps/s | 20.7886 KOps/s | |
test_set_nested_new | 89.8520μs | 48.9444μs | 20.4314 KOps/s | 19.6188 KOps/s | |
test_select | 0.1048ms | 62.8124μs | 15.9204 KOps/s | 15.3797 KOps/s | |
test_select_nested | 67.6320μs | 44.1386μs | 22.6559 KOps/s | 23.2932 KOps/s | |
test_exclude_nested | 0.1114ms | 59.5756μs | 16.7854 KOps/s | 16.7120 KOps/s | |
test_empty[True] | 0.3298ms | 0.2596ms | 3.8517 KOps/s | 3.8812 KOps/s | |
test_empty[False] | 2.9811μs | 0.7385μs | 1.3541 MOps/s | 1.3100 MOps/s | |
test_to | 55.9910μs | 26.7009μs | 37.4519 KOps/s | 36.2138 KOps/s | |
test_to_nonblocking | 59.1710μs | 25.6447μs | 38.9944 KOps/s | 35.0508 KOps/s | |
test_unbind_speed | 0.3316ms | 0.2925ms | 3.4194 KOps/s | 3.4157 KOps/s | |
test_unbind_speed_stack0 | 0.3307ms | 0.2856ms | 3.5010 KOps/s | 3.5958 KOps/s | |
test_unbind_speed_stack1 | 91.4093ms | 0.7135ms | 1.4015 KOps/s | 1.4090 KOps/s | |
test_split | 92.8449ms | 2.1980ms | 454.9658 Ops/s | 447.5672 Ops/s | |
test_chunk | 94.7864ms | 2.1938ms | 455.8360 Ops/s | 444.7776 Ops/s | |
test_creation[device0] | 0.3389ms | 0.1297ms | 7.7113 KOps/s | 7.8659 KOps/s | |
test_creation_from_tensor | 0.3469ms | 0.1281ms | 7.8089 KOps/s | 7.4675 KOps/s | |
test_add_one[memmap_tensor0] | 0.2193ms | 9.2367μs | 108.2640 KOps/s | 107.8833 KOps/s | |
test_contiguous[memmap_tensor0] | 40.2000μs | 2.2211μs | 450.2210 KOps/s | 458.2078 KOps/s | |
test_stack[memmap_tensor0] | 38.9410μs | 7.1184μs | 140.4807 KOps/s | 144.9507 KOps/s | |
test_memmaptd_index | 1.2570ms | 0.4363ms | 2.2923 KOps/s | 2.2368 KOps/s | |
test_memmaptd_index_astensor | 1.0010ms | 0.5039ms | 1.9846 KOps/s | 1.9360 KOps/s | |
test_memmaptd_index_op | 1.4649ms | 1.0593ms | 944.0508 Ops/s | 929.4001 Ops/s | |
test_serialize_model | 0.1274s | 0.1267s | 7.8937 Ops/s | 7.9191 Ops/s | |
test_serialize_model_pickle | 1.3692s | 1.2170s | 0.8217 Ops/s | 0.8248 Ops/s | |
test_serialize_weights | 0.1265s | 0.1258s | 7.9498 Ops/s | 7.2366 Ops/s | |
test_serialize_weights_returnearly | 0.2123s | 56.8247ms | 17.5980 Ops/s | 17.1329 Ops/s | |
test_serialize_weights_pickle | 1.3520s | 1.2128s | 0.8245 Ops/s | 0.8222 Ops/s | |
test_reshape_pytree | 79.4510μs | 36.8362μs | 27.1472 KOps/s | 26.9702 KOps/s | |
test_reshape_td | 75.9710μs | 41.4816μs | 24.1071 KOps/s | 21.9840 KOps/s | |
test_view_pytree | 72.3910μs | 37.4068μs | 26.7331 KOps/s | 26.1584 KOps/s | |
test_view_td | 87.1020μs | 46.8623μs | 21.3391 KOps/s | 20.1478 KOps/s | |
test_unbind_pytree | 65.2010μs | 36.9212μs | 27.0847 KOps/s | 28.7121 KOps/s | |
test_unbind_td | 0.4094ms | 47.5471μs | 21.0318 KOps/s | 23.3650 KOps/s | |
test_split_pytree | 0.5058ms | 49.0169μs | 20.4011 KOps/s | 20.4075 KOps/s | |
test_split_td | 92.9630ms | 65.7714μs | 15.2042 KOps/s | 17.1371 KOps/s | |
test_add_pytree | 0.2159ms | 60.2547μs | 16.5962 KOps/s | 17.7736 KOps/s | |
test_add_td | 0.1727ms | 90.5780μs | 11.0402 KOps/s | 9.7175 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2744ms | 0.1622ms | 6.1662 KOps/s | 5.9074 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3122ms | 0.1664ms | 6.0099 KOps/s | 6.0587 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1822ms | 0.1435ms | 6.9679 KOps/s | 6.9724 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2335ms | 0.1834ms | 5.4517 KOps/s | 5.4662 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 49.9410μs | 22.0971μs | 45.2548 KOps/s | 46.0296 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 79.1610μs | 49.2764μs | 20.2937 KOps/s | 20.0404 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1050ms | 64.6360μs | 15.4712 KOps/s | 15.3571 KOps/s | |
test_compile_copy_nested[pytree-eager] | 80.5920μs | 49.4008μs | 20.2426 KOps/s | 20.2432 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3975ms | 0.3209ms | 3.1163 KOps/s | 3.1066 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3250ms | 0.2379ms | 4.2035 KOps/s | 4.3498 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1736ms | 0.1267ms | 7.8921 KOps/s | 7.5782 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1080ms | 66.0398μs | 15.1424 KOps/s | 14.7490 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4687ms | 0.3261ms | 3.0662 KOps/s | 3.1179 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7945ms | 0.6434ms | 1.5542 KOps/s | 1.6148 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4166ms | 0.2923ms | 3.4206 KOps/s | 3.5876 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4639ms | 0.3324ms | 3.0088 KOps/s | 3.1091 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2045ms | 80.3316μs | 12.4484 KOps/s | 13.1272 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 5.1195ms | 0.1354ms | 7.3881 KOps/s | 7.7893 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6511ms | 0.5534ms | 1.8071 KOps/s | 1.8847 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3777ms | 0.3189ms | 3.1355 KOps/s | 3.1363 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 60.0210μs | 20.5942μs | 48.5574 KOps/s | 52.7423 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 75.6310μs | 38.9632μs | 25.6653 KOps/s | 24.2892 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1144ms | 69.1166μs | 14.4683 KOps/s | 14.2719 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1090ms | 51.1624μs | 19.5456 KOps/s | 19.2744 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3962ms | 0.8541ms | 1.1709 KOps/s | 1.0973 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.4039ms | 3.2170ms | 310.8495 Ops/s | 306.3086 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3065ms | 0.8175ms | 1.2233 KOps/s | 1.1165 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.2276ms | 3.1475ms | 317.7091 Ops/s | 317.0454 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1530ms | 0.1074ms | 9.3082 KOps/s | 9.2256 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1876ms | 61.0547μs | 16.3788 KOps/s | 15.6756 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1460ms | 0.1060ms | 9.4302 KOps/s | 9.0480 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 92.7410μs | 45.0695μs | 22.1880 KOps/s | 21.6481 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1522ms | 0.1090ms | 9.1784 KOps/s | 8.9923 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 87.2310μs | 45.1765μs | 22.1354 KOps/s | 21.4942 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2177ms | 0.1409ms | 7.0969 KOps/s | 7.1843 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1583ms | 24.7774μs | 40.3594 KOps/s | 38.3022 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1756ms | 0.1351ms | 7.3993 KOps/s | 7.5364 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 55.2510μs | 21.0112μs | 47.5936 KOps/s | 46.3228 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1769ms | 0.1317ms | 7.5939 KOps/s | 7.1622 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 74.6610μs | 20.9046μs | 47.8365 KOps/s | 46.8255 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1737ms | 0.1374ms | 7.2802 KOps/s | 7.1761 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4930ms | 24.6602μs | 40.5511 KOps/s | 37.0827 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2777ms | 0.1365ms | 7.3264 KOps/s | 7.5099 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 42.6210μs | 21.2679μs | 47.0192 KOps/s | 47.5101 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1906ms | 0.1369ms | 7.3065 KOps/s | 7.1917 KOps/s | |
test_compile_indexing[int-pytree-eager] | 59.8810μs | 20.7755μs | 48.1336 KOps/s | 46.1364 KOps/s | |
test_mod_add[eager] | 79.2420μs | 33.4720μs | 29.8757 KOps/s | 30.4741 KOps/s | |
test_mod_add[compile] | 0.2139ms | 71.9198μs | 13.9044 KOps/s | 13.3007 KOps/s | |
test_mod_add[compile-overhead] | 0.2542ms | 0.1354ms | 7.3861 KOps/s | 7.1248 KOps/s | |
test_mod_wrap[eager] | 0.8793ms | 0.7762ms | 1.2884 KOps/s | 1.2712 KOps/s | |
test_mod_wrap[compile] | 2.0104ms | 0.8370ms | 1.1948 KOps/s | 1.2054 KOps/s | |
test_mod_wrap[compile-overhead] | 4.8816ms | 3.0480ms | 328.0816 Ops/s | 326.3853 Ops/s | |
test_mod_wrap_and_backward[eager] | 4.1679ms | 4.0289ms | 248.2090 Ops/s | 242.1125 Ops/s | |
test_mod_wrap_and_backward[compile] | 4.5927ms | 4.0538ms | 246.6793 Ops/s | 243.1358 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3214ms | 0.9075ms | 1.1019 KOps/s | 947.6109 Ops/s | |
test_seq_add[eager] | 0.1691ms | 0.1025ms | 9.7518 KOps/s | 9.2484 KOps/s | |
test_seq_add[compile] | 0.1512ms | 81.7035μs | 12.2394 KOps/s | 11.9050 KOps/s | |
test_seq_add[compile-overhead] | 0.1631ms | 0.1162ms | 8.6078 KOps/s | 8.7365 KOps/s | |
test_seq_wrap[eager] | 1.1051ms | 0.9270ms | 1.0787 KOps/s | 1.0695 KOps/s | |
test_seq_wrap[compile] | 1.1602ms | 0.8627ms | 1.1591 KOps/s | 1.1730 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2754ms | 0.2223ms | 4.4977 KOps/s | 4.3842 KOps/s | |
test_func_call_runtime[False-eager] | 2.4523ms | 2.3534ms | 424.9210 Ops/s | 415.7170 Ops/s | |
test_func_call_runtime[False-compile] | 2.6346ms | 2.4065ms | 415.5372 Ops/s | 421.0807 Ops/s | |
test_func_call_runtime[False-compile-overhead] | 0.5160ms | 0.3587ms | 2.7880 KOps/s | 2.7621 KOps/s | |
test_func_call_runtime[True-eager] | 2.7246ms | 2.5124ms | 398.0268 Ops/s | 398.3127 Ops/s | |
test_func_call_runtime[True-compile] | 2.5340ms | 2.4269ms | 412.0536 Ops/s | 416.9603 Ops/s | |
test_func_call_runtime[True-compile-overhead] | 0.4361ms | 0.3817ms | 2.6201 KOps/s | 2.5884 KOps/s | |
test_func_call_cm_runtime[False-eager] | 2.5820ms | 2.3504ms | 425.4614 Ops/s | 425.3230 Ops/s | |
test_func_call_cm_runtime[False-compile] | 3.3716ms | 2.4235ms | 412.6323 Ops/s | 421.8261 Ops/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4121ms | 0.3627ms | 2.7570 KOps/s | 2.7277 KOps/s | |
test_func_call_cm_runtime[True-eager] | 2.7337ms | 2.6280ms | 380.5103 Ops/s | 381.8068 Ops/s | |
test_func_call_cm_runtime[True-compile] | 2.5885ms | 2.4570ms | 407.0020 Ops/s | 411.6115 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4524ms | 0.4058ms | 2.4641 KOps/s | 2.4209 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 4.1681ms | 3.7258ms | 268.3953 Ops/s | 265.8805 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.5715ms | 2.4971ms | 400.4624 Ops/s | 407.4220 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5064ms | 0.4203ms | 2.3795 KOps/s | 2.4075 KOps/s | |
test_distributed | 2.2564ms | 0.2277ms | 4.3911 KOps/s | 8.4109 KOps/s | |
test_tdmodule | 33.4010μs | 14.8592μs | 67.2985 KOps/s | 63.0890 KOps/s | |
test_tdmodule_dispatch | 0.1389ms | 28.4702μs | 35.1245 KOps/s | 31.9094 KOps/s | |
test_tdseq | 35.1410μs | 15.4230μs | 64.8382 KOps/s | 59.8306 KOps/s | |
test_tdseq_dispatch | 51.5910μs | 31.0826μs | 32.1724 KOps/s | 29.6275 KOps/s | |
test_instantiation_functorch | 2.0268ms | 1.8565ms | 538.6506 Ops/s | 530.9760 Ops/s | |
test_instantiation_td | 1.8418ms | 1.1940ms | 837.5336 Ops/s | 815.5357 Ops/s | |
test_exec_functorch | 1.0934ms | 1.0021ms | 997.9489 Ops/s | 1.0003 KOps/s | |
test_exec_functional_call | 1.0748ms | 1.0003ms | 999.6634 Ops/s | 991.0057 Ops/s | |
test_exec_td | 1.1040ms | 1.0300ms | 970.8783 Ops/s | 972.3659 Ops/s | |
test_exec_td_decorator | 1.1508ms | 1.0589ms | 944.3695 Ops/s | 938.4328 Ops/s | |
test_vmap_mlp_speed[True-True] | 1.3513ms | 1.2597ms | 793.8120 Ops/s | 786.7953 Ops/s | |
test_vmap_mlp_speed[True-False] | 1.3639ms | 1.2630ms | 791.7664 Ops/s | 789.6626 Ops/s | |
test_vmap_mlp_speed[False-True] | 1.2819ms | 1.1529ms | 867.3409 Ops/s | 869.6870 Ops/s | |
test_vmap_mlp_speed[False-False] | 1.2547ms | 1.1512ms | 868.6713 Ops/s | 863.6865 Ops/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.3386ms | 1.2375ms | 808.1064 Ops/s | 810.0433 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.6914ms | 1.2383ms | 807.5816 Ops/s | 809.4758 Ops/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.2927ms | 1.1536ms | 866.8157 Ops/s | 867.1741 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.2501ms | 1.1560ms | 865.0159 Ops/s | 865.4151 Ops/s | |
test_vmap_transformer_speed[True-True] | 13.3088ms | 12.9652ms | 77.1297 Ops/s | 76.3558 Ops/s | |
test_vmap_transformer_speed[True-False] | 13.4270ms | 12.9892ms | 76.9873 Ops/s | 76.5024 Ops/s | |
test_vmap_transformer_speed[False-True] | 13.3921ms | 12.8237ms | 77.9806 Ops/s | 78.0220 Ops/s | |
test_vmap_transformer_speed[False-False] | 13.2321ms | 12.7641ms | 78.3445 Ops/s | 77.9944 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 33.9418ms | 33.4486ms | 29.8966 Ops/s | 30.0246 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 34.0363ms | 33.4086ms | 29.9325 Ops/s | 29.8874 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 34.4638ms | 33.3476ms | 29.9871 Ops/s | 30.0393 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 33.9998ms | 33.3381ms | 29.9957 Ops/s | 30.0852 Ops/s | |
test_to_module_speed[True] | 1.2446ms | 0.9870ms | 1.0132 KOps/s | 980.9178 Ops/s | |
test_to_module_speed[False] | 1.3460ms | 0.9653ms | 1.0359 KOps/s | 1.0061 KOps/s | |
test_tc_init | 70.6310μs | 35.2663μs | 28.3557 KOps/s | 26.6116 KOps/s | |
test_tc_init_nested | 0.1128ms | 70.3442μs | 14.2158 KOps/s | 13.2240 KOps/s | |
test_tc_first_layer_tensor | 4.1273μs | 0.6876μs | 1.4543 MOps/s | 1.4907 MOps/s | |
test_tc_first_layer_nontensor | 31.9610μs | 2.2752μs | 439.5173 KOps/s | 446.8183 KOps/s | |
test_tc_second_layer_tensor | 8.2575μs | 1.3871μs | 720.9354 KOps/s | 728.5186 KOps/s | |
test_tc_second_layer_nontensor | 26.0010μs | 2.8743μs | 347.9164 KOps/s | 334.5002 KOps/s | |
test_unbind | 0.1929s | 12.1914ms | 82.0250 Ops/s | 92.8402 Ops/s | |
test_full_like | 0.6646ms | 0.5751ms | 1.7388 KOps/s | 1.7402 KOps/s | |
test_zeros_like | 0.2636ms | 0.1978ms | 5.0553 KOps/s | 5.0550 KOps/s | |
test_ones_like | 0.2336ms | 0.1977ms | 5.0593 KOps/s | 5.0576 KOps/s | |
test_clone | 0.4500ms | 0.4139ms | 2.4163 KOps/s | 2.4169 KOps/s | |
test_squeeze | 34.9900μs | 9.8769μs | 101.2466 KOps/s | 98.1188 KOps/s | |
test_unsqueeze | 0.2251ms | 76.1287μs | 13.1356 KOps/s | 12.8015 KOps/s | |
test_split | 0.4429ms | 0.1593ms | 6.2790 KOps/s | 6.2572 KOps/s | |
test_permute | 0.2209ms | 0.1787ms | 5.5944 KOps/s | 5.2968 KOps/s | |
test_stack | 1.2554ms | 0.8653ms | 1.1557 KOps/s | 1.1817 KOps/s | |
test_cat | 1.2503ms | 1.2312ms | 812.1867 Ops/s | 812.1426 Ops/s |
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Refactor
Refactoring code - not a new feature
Stack from ghstack (oldest at bottom):