[Refactor] Better handling of params and buffers in bytes #1059
Merged
vmoens added a commit that referenced this pull request on Oct 24, 2024
ghstack-source-id: 87945c47b376d223bb3dc33bd6ec7cb9bb047455
Pull Request resolved: #1059
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 50.9850μs | 25.2778μs | 39.5604 KOps/s | 36.2971 KOps/s | |
test_plain_set_stack_nested | 54.2310μs | 25.4077μs | 39.3581 KOps/s | 39.2841 KOps/s | |
test_plain_set_nested_inplace | 0.1320ms | 28.1206μs | 35.5611 KOps/s | 36.4272 KOps/s | |
test_plain_set_stack_nested_inplace | 65.5130μs | 27.8515μs | 35.9047 KOps/s | 36.7080 KOps/s | |
test_items | 27.9720μs | 4.5793μs | 218.3757 KOps/s | 240.3508 KOps/s | |
test_items_nested | 0.5260ms | 0.3823ms | 2.6159 KOps/s | 2.6618 KOps/s | |
test_items_nested_locked | 0.6798ms | 0.3822ms | 2.6163 KOps/s | 2.6594 KOps/s | |
test_items_nested_leaf | 0.1630ms | 81.4655μs | 12.2751 KOps/s | 12.3408 KOps/s | |
test_items_stack_nested | 0.9544ms | 0.3950ms | 2.5319 KOps/s | 2.6390 KOps/s | |
test_items_stack_nested_leaf | 0.1924ms | 85.4768μs | 11.6991 KOps/s | 12.3086 KOps/s | |
test_items_stack_nested_locked | 0.5499ms | 0.3833ms | 2.6088 KOps/s | 2.6326 KOps/s | |
test_keys | 27.4520μs | 3.5101μs | 284.8948 KOps/s | 251.5046 KOps/s | |
test_keys_nested | 0.5425ms | 0.1380ms | 7.2485 KOps/s | 7.4542 KOps/s | |
test_keys_nested_locked | 1.8447ms | 0.1421ms | 7.0375 KOps/s | 7.1337 KOps/s | |
test_keys_nested_leaf | 0.1978ms | 0.1194ms | 8.3742 KOps/s | 8.5304 KOps/s | |
test_keys_stack_nested | 0.2348ms | 0.1357ms | 7.3694 KOps/s | 7.4286 KOps/s | |
test_keys_stack_nested_leaf | 0.4376ms | 0.1202ms | 8.3180 KOps/s | 8.4820 KOps/s | |
test_keys_stack_nested_locked | 0.2421ms | 0.1412ms | 7.0826 KOps/s | 7.1711 KOps/s | |
test_values | 7.3396μs | 1.0549μs | 947.9499 KOps/s | 958.9198 KOps/s | |
test_values_nested | 0.1665ms | 93.7434μs | 10.6674 KOps/s | 10.7122 KOps/s | |
test_values_nested_locked | 0.1756ms | 95.6591μs | 10.4538 KOps/s | 10.7122 KOps/s | |
test_values_nested_leaf | 0.1361ms | 79.4629μs | 12.5845 KOps/s | 12.6714 KOps/s | |
test_values_stack_nested | 0.1697ms | 92.4218μs | 10.8200 KOps/s | 10.0742 KOps/s | |
test_values_stack_nested_leaf | 0.1794ms | 80.0854μs | 12.4867 KOps/s | 12.7826 KOps/s | |
test_values_stack_nested_locked | 0.1730ms | 92.5867μs | 10.8007 KOps/s | 10.6162 KOps/s | |
test_membership | 19.8060μs | 0.8869μs | 1.1275 MOps/s | 1.1188 MOps/s | |
test_membership_nested | 37.8600μs | 2.7578μs | 362.6123 KOps/s | 362.6169 KOps/s | |
test_membership_nested_leaf | 38.5820μs | 2.7750μs | 360.3606 KOps/s | 362.6622 KOps/s | |
test_membership_stacked_nested | 22.2820μs | 2.7589μs | 362.4628 KOps/s | 365.2297 KOps/s | |
test_membership_stacked_nested_leaf | 21.6000μs | 2.7399μs | 364.9791 KOps/s | 364.9493 KOps/s | |
test_membership_nested_last | 40.6560μs | 4.0848μs | 244.8089 KOps/s | 239.9212 KOps/s | |
test_membership_nested_leaf_last | 0.1354ms | 4.2217μs | 236.8733 KOps/s | 235.5535 KOps/s | |
test_membership_stacked_nested_last | 29.9060μs | 4.1591μs | 240.4384 KOps/s | 241.6389 KOps/s | |
test_membership_stacked_nested_leaf_last | 20.2780μs | 4.1523μs | 240.8287 KOps/s | 238.5202 KOps/s | |
test_nested_getleaf | 48.2200μs | 10.8263μs | 92.3674 KOps/s | 93.7447 KOps/s | |
test_nested_get | 54.5510μs | 10.4548μs | 95.6495 KOps/s | 97.6538 KOps/s | |
test_stacked_getleaf | 31.0680μs | 10.7153μs | 93.3244 KOps/s | 93.9791 KOps/s | |
test_stacked_get | 59.6510μs | 10.0628μs | 99.3764 KOps/s | 98.6289 KOps/s | |
test_nested_getitemleaf | 0.2401ms | 11.0028μs | 90.8862 KOps/s | 90.1356 KOps/s | |
test_nested_getitem | 38.2210μs | 10.3936μs | 96.2129 KOps/s | 97.3860 KOps/s | |
test_stacked_getitemleaf | 34.8150μs | 11.0169μs | 90.7693 KOps/s | 91.8617 KOps/s | |
test_stacked_getitem | 49.1190μs | 10.0789μs | 99.2169 KOps/s | 96.6376 KOps/s | |
test_lock_nested | 2.0079ms | 0.5070ms | 1.9725 KOps/s | 2.0036 KOps/s | |
test_lock_stack_nested | 0.8646ms | 0.4809ms | 2.0795 KOps/s | 2.0969 KOps/s | |
test_unlock_nested | 0.7429ms | 0.4235ms | 2.3612 KOps/s | 2.4004 KOps/s | |
test_unlock_stack_nested | 0.7564ms | 0.3963ms | 2.5230 KOps/s | 2.5634 KOps/s | |
test_flatten_speed | 0.1967ms | 0.1011ms | 9.8886 KOps/s | 10.0276 KOps/s | |
test_unflatten_speed | 0.6113ms | 0.5289ms | 1.8908 KOps/s | 1.9785 KOps/s | |
test_common_ops | 2.3898ms | 1.1640ms | 859.1179 Ops/s | 874.9428 Ops/s | |
test_creation | 15.4390μs | 2.0885μs | 478.8170 KOps/s | 473.6908 KOps/s | |
test_creation_empty | 56.3850μs | 19.6430μs | 50.9086 KOps/s | 52.3445 KOps/s | |
test_creation_nested_1 | 63.5680μs | 23.3505μs | 42.8256 KOps/s | 44.1723 KOps/s | |
test_creation_nested_2 | 0.1847ms | 28.0021μs | 35.7117 KOps/s | 37.2542 KOps/s | |
test_clone | 0.1069ms | 17.4114μs | 57.4336 KOps/s | 57.7216 KOps/s | |
test_getitem[int] | 1.0862ms | 17.2734μs | 57.8925 KOps/s | 60.4099 KOps/s | |
test_getitem[slice_int] | 0.1409ms | 31.5720μs | 31.6736 KOps/s | 30.8119 KOps/s | |
test_getitem[range] | 0.1678ms | 58.1938μs | 17.1840 KOps/s | 17.3759 KOps/s | |
test_getitem[tuple] | 0.3387ms | 27.1929μs | 36.7743 KOps/s | 40.1226 KOps/s | |
test_getitem[list] | 0.2069ms | 53.4016μs | 18.7260 KOps/s | 18.9127 KOps/s | |
test_setitem_dim[int] | 72.4160μs | 34.5786μs | 28.9196 KOps/s | 30.0335 KOps/s | |
test_setitem_dim[slice_int] | 0.1146ms | 62.7581μs | 15.9342 KOps/s | 16.0455 KOps/s | |
test_setitem_dim[range] | 0.1274ms | 84.9173μs | 11.7762 KOps/s | 11.8995 KOps/s | |
test_setitem_dim[tuple] | 0.1261ms | 51.3566μs | 19.4717 KOps/s | 20.4723 KOps/s | |
test_setitem | 0.1131ms | 31.1630μs | 32.0893 KOps/s | 32.3908 KOps/s | |
test_set | 0.1893ms | 32.7878μs | 30.4991 KOps/s | 33.4985 KOps/s | |
test_set_shared | 3.8127ms | 0.2184ms | 4.5792 KOps/s | 4.5462 KOps/s | |
test_update | 0.8563ms | 40.1160μs | 24.9277 KOps/s | 25.7156 KOps/s | |
test_update_nested | 0.1336ms | 51.2992μs | 19.4935 KOps/s | 20.2399 KOps/s | |
test_update__nested | 0.1362ms | 45.8377μs | 21.8161 KOps/s | 22.3840 KOps/s | |
test_set_nested | 0.1157ms | 34.0409μs | 29.3765 KOps/s | 29.6126 KOps/s | |
test_set_nested_new | 0.1245ms | 39.1609μs | 25.5357 KOps/s | 26.3857 KOps/s | |
test_select | 0.3712ms | 56.8991μs | 17.5750 KOps/s | 18.0551 KOps/s | |
test_select_nested | 0.1443ms | 60.9780μs | 16.3994 KOps/s | 16.8010 KOps/s | |
test_exclude_nested | 0.1848ms | 75.8185μs | 13.1894 KOps/s | 13.4319 KOps/s | |
test_empty[True] | 0.4256ms | 0.3530ms | 2.8325 KOps/s | 2.8476 KOps/s | |
test_empty[False] | 10.8920μs | 1.2762μs | 783.5608 KOps/s | 815.8417 KOps/s | |
test_unbind_speed | 0.4358ms | 0.3069ms | 3.2582 KOps/s | 3.3564 KOps/s | |
test_unbind_speed_stack0 | 0.6287ms | 0.3084ms | 3.2424 KOps/s | 3.3692 KOps/s | |
test_unbind_speed_stack1 | 0.1044s | 0.8169ms | 1.2242 KOps/s | 1.3541 KOps/s | |
test_split | 94.8092ms | 2.2791ms | 438.7619 Ops/s | 465.1243 Ops/s | |
test_chunk | 3.3165ms | 2.0851ms | 479.5905 Ops/s | 463.8114 Ops/s | |
test_creation[device0] | 0.2061ms | 0.1155ms | 8.6578 KOps/s | 8.5613 KOps/s | |
test_creation_from_tensor | 4.0363ms | 0.1189ms | 8.4086 KOps/s | 8.5654 KOps/s | |
test_add_one[memmap_tensor0] | 0.3127ms | 7.4140μs | 134.8805 KOps/s | 131.3967 KOps/s | |
test_contiguous[memmap_tensor0] | 29.6450μs | 1.8680μs | 535.3351 KOps/s | 528.9048 KOps/s | |
test_stack[memmap_tensor0] | 62.9080μs | 5.8100μs | 172.1184 KOps/s | 178.8852 KOps/s | |
test_memmaptd_index | 1.1172ms | 0.4144ms | 2.4131 KOps/s | 2.4443 KOps/s | |
test_memmaptd_index_astensor | 0.7936ms | 0.5150ms | 1.9416 KOps/s | 1.9629 KOps/s | |
test_memmaptd_index_op | 1.5181ms | 1.0907ms | 916.8726 Ops/s | 946.5318 Ops/s | |
test_serialize_model | 0.2190s | 0.1360s | 7.3525 Ops/s | 8.5497 Ops/s | |
test_serialize_model_pickle | 0.4498s | 0.3910s | 2.5575 Ops/s | 2.5343 Ops/s | |
test_serialize_weights | 0.1226s | 0.1149s | 8.7015 Ops/s | 7.5729 Ops/s | |
test_serialize_weights_returnearly | 0.2111s | 0.1641s | 6.0950 Ops/s | 6.3326 Ops/s | |
test_serialize_weights_pickle | 1.2492s | 0.7413s | 1.3490 Ops/s | 1.1823 Ops/s | |
test_serialize_weights_filesystem | 0.1466s | 0.1397s | 7.1579 Ops/s | 6.9977 Ops/s | |
test_serialize_model_filesystem | 0.1492s | 0.1432s | 6.9817 Ops/s | 6.3504 Ops/s | |
test_reshape_pytree | 85.4800μs | 39.1159μs | 25.5651 KOps/s | 25.8883 KOps/s | |
test_reshape_td | 94.9270μs | 46.5750μs | 21.4707 KOps/s | 21.0027 KOps/s | |
test_view_pytree | 0.1142ms | 38.8931μs | 25.7115 KOps/s | 26.0437 KOps/s | |
test_view_td | 0.1287ms | 51.8198μs | 19.2977 KOps/s | 18.7811 KOps/s | |
test_unbind_pytree | 82.6140μs | 35.5093μs | 28.1616 KOps/s | 27.6032 KOps/s | |
test_unbind_td | 0.3025ms | 44.9891μs | 22.2276 KOps/s | 22.3532 KOps/s | |
test_split_pytree | 82.3440μs | 37.9234μs | 26.3690 KOps/s | 26.6139 KOps/s | |
test_split_td | 0.4640ms | 60.3392μs | 16.5730 KOps/s | 17.4408 KOps/s | |
test_add_pytree | 0.1212ms | 44.6161μs | 22.4135 KOps/s | 21.5931 KOps/s | |
test_add_td | 0.2418ms | 89.8526μs | 11.1293 KOps/s | 11.6290 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1448ms | 73.5814μs | 13.5904 KOps/s | 14.0955 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4239ms | 0.2010ms | 4.9752 KOps/s | 4.9636 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1633ms | 55.4518μs | 18.0337 KOps/s | 18.5034 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.4902ms | 0.1476ms | 6.7767 KOps/s | 6.7221 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 83.7860μs | 28.5121μs | 35.0728 KOps/s | 36.8720 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1509ms | 76.7001μs | 13.0378 KOps/s | 13.0191 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1952ms | 78.2960μs | 12.7720 KOps/s | 12.6426 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1348ms | 66.7650μs | 14.9779 KOps/s | 14.7020 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2593ms | 0.1222ms | 8.1805 KOps/s | 8.2369 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.5037ms | 0.2460ms | 4.0646 KOps/s | 4.0690 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1114ms | 54.2452μs | 18.4348 KOps/s | 19.1453 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2110ms | 79.0778μs | 12.6458 KOps/s | 12.7849 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2365ms | 0.1130ms | 8.8516 KOps/s | 9.1545 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4942ms | 0.2994ms | 3.3397 KOps/s | 3.3683 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.5194ms | 0.2781ms | 3.5955 KOps/s | 3.6459 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2772ms | 0.1258ms | 7.9499 KOps/s | 8.3687 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1973ms | 75.3485μs | 13.2717 KOps/s | 13.2071 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1058ms | 53.7655μs | 18.5993 KOps/s | 18.6299 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4140ms | 0.2406ms | 4.1561 KOps/s | 4.0969 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1948ms | 0.1123ms | 8.9073 KOps/s | 9.0839 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 63.8890μs | 28.9546μs | 34.5369 KOps/s | 32.7889 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1950ms | 78.3342μs | 12.7658 KOps/s | 13.1418 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1989ms | 81.6671μs | 12.2448 KOps/s | 12.4928 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1449ms | 69.6925μs | 14.3488 KOps/s | 14.8215 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3187ms | 0.2141ms | 4.6704 KOps/s | 4.7908 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.1096ms | 1.8817ms | 531.4270 Ops/s | 550.4544 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2975ms | 0.2088ms | 4.7882 KOps/s | 4.8541 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.7969ms | 1.1745ms | 851.4409 Ops/s | 862.2249 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5673ms | 0.4644ms | 2.1535 KOps/s | 2.2151 KOps/s | |
test_compile_assign_and_add_stack[eager] | 6.0429ms | 4.2707ms | 234.1532 Ops/s | 238.7514 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 91.3400μs | 43.3303μs | 23.0786 KOps/s | 23.4419 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5126ms | 49.6883μs | 20.1255 KOps/s | 20.2112 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1262ms | 36.7438μs | 27.2155 KOps/s | 27.1077 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 76.6030μs | 29.6479μs | 33.7292 KOps/s | 34.2690 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 98.8440μs | 37.6640μs | 26.5505 KOps/s | 26.1505 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 82.7040μs | 29.3167μs | 34.1102 KOps/s | 34.1464 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1989ms | 78.0715μs | 12.8088 KOps/s | 12.7074 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.7490ms | 29.9881μs | 33.3466 KOps/s | 36.0119 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1316ms | 70.7183μs | 14.1406 KOps/s | 14.1155 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 74.7400μs | 24.4762μs | 40.8560 KOps/s | 42.4356 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1584ms | 71.7133μs | 13.9444 KOps/s | 14.0314 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 64.5000μs | 24.3573μs | 41.0555 KOps/s | 42.8857 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1737ms | 77.9945μs | 12.8214 KOps/s | 12.5016 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.7960ms | 29.7332μs | 33.6324 KOps/s | 35.2287 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1545ms | 71.3077μs | 14.0237 KOps/s | 13.6522 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 74.4390μs | 24.2440μs | 41.2473 KOps/s | 42.9394 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1636ms | 71.2426μs | 14.0365 KOps/s | 13.9865 KOps/s | |
test_compile_indexing[int-pytree-eager] | 78.3860μs | 24.1755μs | 41.3642 KOps/s | 42.4792 KOps/s | |
test_mod_add[eager] | 0.1017ms | 27.0468μs | 36.9729 KOps/s | 37.3886 KOps/s | |
test_mod_add[compile] | 0.1128ms | 43.7501μs | 22.8571 KOps/s | 22.2525 KOps/s | |
test_mod_add[compile-overhead] | 0.1297ms | 45.2390μs | 22.1048 KOps/s | 21.8189 KOps/s | |
test_mod_wrap[eager] | 0.3530ms | 0.2145ms | 4.6621 KOps/s | 4.5375 KOps/s | |
test_mod_wrap[compile] | 1.6844ms | 0.2031ms | 4.9234 KOps/s | 4.8687 KOps/s | |
test_mod_wrap[compile-overhead] | 1.8272ms | 0.2049ms | 4.8803 KOps/s | 4.9270 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.4587ms | 11.3453ms | 88.1419 Ops/s | 89.8715 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.6384ms | 10.7640ms | 92.9025 Ops/s | 91.8278 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.3171ms | 10.7737ms | 92.8184 Ops/s | 91.7907 Ops/s | |
test_seq_add[eager] | 0.2135ms | 91.8749μs | 10.8844 KOps/s | 10.5742 KOps/s | |
test_seq_add[compile] | 0.3087ms | 59.2228μs | 16.8854 KOps/s | 17.0663 KOps/s | |
test_seq_add[compile-overhead] | 0.1119ms | 56.6774μs | 17.6437 KOps/s | 17.1668 KOps/s | |
test_seq_wrap[eager] | 0.7140ms | 0.3922ms | 2.5499 KOps/s | 2.5482 KOps/s | |
test_seq_wrap[compile] | 0.4091ms | 0.2211ms | 4.5238 KOps/s | 4.4813 KOps/s | |
test_seq_wrap[compile-overhead] | 0.7652ms | 0.2241ms | 4.4631 KOps/s | 4.4687 KOps/s | |
test_func_call_runtime[False-eager] | 1.1828ms | 0.5546ms | 1.8032 KOps/s | 1.8117 KOps/s | |
test_func_call_runtime[False-compile] | 0.8060ms | 0.4303ms | 2.3242 KOps/s | 2.4005 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5823ms | 0.4307ms | 2.3218 KOps/s | 2.3804 KOps/s | |
test_func_call_runtime[True-eager] | 1.2490ms | 0.7639ms | 1.3091 KOps/s | 1.3327 KOps/s | |
test_func_call_runtime[True-compile] | 0.8568ms | 0.4710ms | 2.1232 KOps/s | 2.1896 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.9894ms | 0.4700ms | 2.1278 KOps/s | 2.1927 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9872ms | 0.5552ms | 1.8012 KOps/s | 1.8737 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5349ms | 0.4264ms | 2.3451 KOps/s | 2.4025 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5249ms | 0.4244ms | 2.3564 KOps/s | 2.3857 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1289ms | 0.9194ms | 1.0877 KOps/s | 1.1253 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.6174ms | 0.4917ms | 2.0339 KOps/s | 2.0519 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.2408ms | 0.5050ms | 1.9800 KOps/s | 2.0285 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.3608ms | 1.9250ms | 519.4894 Ops/s | 514.3537 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.0841ms | 0.5146ms | 1.9433 KOps/s | 1.9634 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.9386ms | 0.5207ms | 1.9205 KOps/s | 1.9470 KOps/s | |
test_distributed | 0.3528ms | 0.1301ms | 7.6864 KOps/s | 7.9096 KOps/s | |
test_tdmodule | 0.1292ms | 18.8109μs | 53.1608 KOps/s | 54.3624 KOps/s | |
test_tdmodule_dispatch | 69.9000μs | 37.8946μs | 26.3890 KOps/s | 27.1553 KOps/s | |
test_tdseq | 43.9620μs | 22.0067μs | 45.4407 KOps/s | 47.1451 KOps/s | |
test_tdseq_dispatch | 78.9170μs | 43.6641μs | 22.9021 KOps/s | 23.9203 KOps/s | |
test_instantiation_functorch | 2.3553ms | 1.5420ms | 648.4967 Ops/s | 671.5561 Ops/s | |
test_exec_functorch | 0.3241ms | 0.1807ms | 5.5349 KOps/s | 5.6503 KOps/s | |
test_exec_functional_call | 0.3776ms | 0.1749ms | 5.7171 KOps/s | 5.8677 KOps/s | |
test_exec_td_decorator | 0.5443ms | 0.2390ms | 4.1845 KOps/s | 4.3227 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8604ms | 0.6648ms | 1.5043 KOps/s | 1.5151 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.4865ms | 0.6815ms | 1.4673 KOps/s | 1.5476 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7885ms | 0.5435ms | 1.8400 KOps/s | 1.8764 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9429ms | 0.5437ms | 1.8391 KOps/s | 1.8883 KOps/s | |
test_to_module_speed[True] | 1.9313ms | 1.3816ms | 723.7976 Ops/s | 728.2294 Ops/s | |
test_to_module_speed[False] | 2.3072ms | 1.3789ms | 725.2033 Ops/s | 748.7661 Ops/s | |
test_tc_init | 94.9570μs | 49.3110μs | 20.2795 KOps/s | 21.4068 KOps/s | |
test_tc_init_nested | 0.2019ms | 98.8200μs | 10.1194 KOps/s | 10.7995 KOps/s | |
test_tc_first_layer_tensor | 20.5280μs | 1.5658μs | 638.6400 KOps/s | 674.1653 KOps/s | |
test_tc_first_layer_nontensor | 23.1430μs | 4.7552μs | 210.2953 KOps/s | 211.4929 KOps/s | |
test_tc_second_layer_tensor | 31.4280μs | 2.8450μs | 351.4901 KOps/s | 365.8675 KOps/s | |
test_tc_second_layer_nontensor | 40.2150μs | 6.0847μs | 164.3477 KOps/s | 163.9763 KOps/s | |
test_unbind | 0.2244s | 13.4579ms | 74.3056 Ops/s | 80.8105 Ops/s | |
test_full_like | 8.8134ms | 7.8154ms | 127.9529 Ops/s | 141.8460 Ops/s | |
test_zeros_like | 4.1868ms | 3.0080ms | 332.4438 Ops/s | 364.5264 Ops/s | |
test_ones_like | 4.0811ms | 3.5213ms | 283.9878 Ops/s | 323.5869 Ops/s | |
test_clone | 6.1733ms | 5.5212ms | 181.1205 Ops/s | 182.0028 Ops/s | |
test_squeeze | 65.8030μs | 13.0837μs | 76.4310 KOps/s | 78.1954 KOps/s | |
test_unsqueeze | 0.2059ms | 94.4143μs | 10.5916 KOps/s | 10.6146 KOps/s | |
test_split | 0.4893ms | 0.2005ms | 4.9884 KOps/s | 5.1880 KOps/s | |
test_permute | 0.3850ms | 0.2227ms | 4.4909 KOps/s | 4.5270 KOps/s | |
test_stack | 31.1231ms | 26.3454ms | 37.9573 Ops/s | 39.1565 Ops/s | |
test_cat | 31.7582ms | 25.9747ms | 38.4991 Ops/s | 37.4143 Ops/s |
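
The Change column in the table above is empty. As a rough sketch, the relative difference between "Ops" and "Ops on Repo HEAD" can be recomputed from a row once the units are normalized. The helper names below (`parse_ops`, `relative_change`) are hypothetical and not part of the PR or of the benchmark tooling; the sketch assumes rows keep the pipe-separated layout shown here and that throughput is reported in Ops/s, KOps/s, or MOps/s.

```python
# Hypothetical helpers for reading the benchmark rows; not taken from the PR.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell):
    """Convert a cell such as '39.5604 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(row):
    """Return (test name, percent change of Ops relative to Ops on Repo HEAD)."""
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])        # "Ops" column (this PR)
    ops_head = parse_ops(cells[4])   # "Ops on Repo HEAD" column
    return name, 100.0 * (ops - ops_head) / ops_head


row = "test_plain_set_nested | 50.9850μs | 25.2778μs | 39.5604 KOps/s | 36.2971 KOps/s | |"
name, change = relative_change(row)
print(f"{name}: {change:+.2f}%")
```

A positive value means the PR branch ran more operations per second than the repository HEAD for that test. Single-run differences of a few percent can be within benchmark noise, so the sign alone should not be read as a regression or an improvement.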
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 34.1120μs | 16.3559μs | 61.1402 KOps/s | 56.9670 KOps/s | |
test_plain_set_stack_nested | 45.4920μs | 16.4144μs | 60.9220 KOps/s | 56.1203 KOps/s | |
test_plain_set_nested_inplace | 45.8320μs | 17.5377μs | 57.0201 KOps/s | 52.2424 KOps/s | |
test_plain_set_stack_nested_inplace | 51.0520μs | 17.5583μs | 56.9532 KOps/s | 52.1962 KOps/s | |
test_items | 28.1610μs | 2.8939μs | 345.5599 KOps/s | 339.5853 KOps/s | |
test_items_nested | 0.3776ms | 0.3393ms | 2.9477 KOps/s | 2.9190 KOps/s | |
test_items_nested_locked | 0.3751ms | 0.3412ms | 2.9308 KOps/s | 2.9209 KOps/s | |
test_items_nested_leaf | 92.9540μs | 64.3164μs | 15.5481 KOps/s | 15.6053 KOps/s | |
test_items_stack_nested | 0.3847ms | 0.3422ms | 2.9223 KOps/s | 2.9146 KOps/s | |
test_items_stack_nested_leaf | 92.2140μs | 67.6533μs | 14.7813 KOps/s | 15.1308 KOps/s | |
test_items_stack_nested_locked | 0.3843ms | 0.3449ms | 2.8997 KOps/s | 2.8809 KOps/s | |
test_keys | 29.2510μs | 3.4648μs | 288.6185 KOps/s | 287.5643 KOps/s | |
test_keys_nested | 0.2455ms | 71.9638μs | 13.8959 KOps/s | 13.9297 KOps/s | |
test_keys_nested_locked | 0.7104ms | 77.6068μs | 12.8855 KOps/s | 12.8679 KOps/s | |
test_keys_nested_leaf | 92.5140μs | 62.1584μs | 16.0879 KOps/s | 15.9261 KOps/s | |
test_keys_stack_nested | 0.1124ms | 72.9481μs | 13.7084 KOps/s | 13.7180 KOps/s | |
test_keys_stack_nested_leaf | 0.1020ms | 64.2188μs | 15.5718 KOps/s | 15.5271 KOps/s | |
test_keys_stack_nested_locked | 0.1305ms | 77.6926μs | 12.8712 KOps/s | 12.7175 KOps/s | |
test_values | 5.4387μs | 0.8513μs | 1.1747 MOps/s | 1.1749 MOps/s | |
test_values_nested | 85.1540μs | 50.1209μs | 19.9518 KOps/s | 20.0560 KOps/s | |
test_values_nested_locked | 84.2540μs | 51.6620μs | 19.3566 KOps/s | 19.3931 KOps/s | |
test_values_nested_leaf | 80.2230μs | 43.4079μs | 23.0373 KOps/s | 23.1727 KOps/s | |
test_values_stack_nested | 90.8640μs | 51.3826μs | 19.4618 KOps/s | 19.5647 KOps/s | |
test_values_stack_nested_leaf | 76.6740μs | 43.9010μs | 22.7785 KOps/s | 22.4047 KOps/s | |
test_values_stack_nested_locked | 89.7440μs | 52.3210μs | 19.1128 KOps/s | 19.1090 KOps/s | |
test_membership | 1.7676μs | 0.5316μs | 1.8810 MOps/s | 1.8716 MOps/s | |
test_membership_nested | 32.3420μs | 2.0310μs | 492.3575 KOps/s | 521.9606 KOps/s | |
test_membership_nested_leaf | 20.7160μs | 1.9782μs | 505.5062 KOps/s | 523.5041 KOps/s | |
test_membership_stacked_nested | 48.2220μs | 2.0123μs | 496.9341 KOps/s | 517.2289 KOps/s | |
test_membership_stacked_nested_leaf | 32.2310μs | 2.0165μs | 495.9178 KOps/s | 509.2876 KOps/s | |
test_membership_nested_last | 36.8820μs | 3.0614μs | 326.6452 KOps/s | 331.0646 KOps/s | |
test_membership_nested_leaf_last | 51.2420μs | 3.0316μs | 329.8538 KOps/s | 329.2018 KOps/s | |
test_membership_stacked_nested_last | 29.1210μs | 4.3158μs | 231.7074 KOps/s | 281.8751 KOps/s | |
test_membership_stacked_nested_leaf_last | 36.6210μs | 4.3461μs | 230.0888 KOps/s | 285.3937 KOps/s | |
test_nested_getleaf | 67.5030μs | 5.9964μs | 166.7679 KOps/s | 167.0660 KOps/s | |
test_nested_get | 39.7120μs | 5.7080μs | 175.1912 KOps/s | 175.4234 KOps/s | |
test_stacked_getleaf | 37.7010μs | 6.0278μs | 165.8968 KOps/s | 166.3138 KOps/s | |
test_stacked_get | 31.0220μs | 5.7352μs | 174.3612 KOps/s | 175.0660 KOps/s | |
test_nested_getitemleaf | 39.2020μs | 6.1399μs | 162.8700 KOps/s | 166.1194 KOps/s | |
test_nested_getitem | 30.1910μs | 5.7835μs | 172.9052 KOps/s | 173.6457 KOps/s | |
test_stacked_getitemleaf | 27.1020μs | 6.1027μs | 163.8607 KOps/s | 163.9374 KOps/s | |
test_stacked_getitem | 31.2920μs | 5.7629μs | 173.5232 KOps/s | 172.7621 KOps/s | |
test_lock_nested | 0.8497ms | 0.4345ms | 2.3013 KOps/s | 2.3283 KOps/s | |
test_lock_stack_nested | 0.4414ms | 0.3896ms | 2.5667 KOps/s | 2.5113 KOps/s | |
test_unlock_nested | 0.8251ms | 0.3719ms | 2.6892 KOps/s | 2.7052 KOps/s | |
test_unlock_stack_nested | 0.3604ms | 0.3284ms | 3.0453 KOps/s | 2.9844 KOps/s | |
test_flatten_speed | 0.1187ms | 78.6786μs | 12.7099 KOps/s | 12.5378 KOps/s | |
test_unflatten_speed | 0.3663ms | 0.3221ms | 3.1048 KOps/s | 3.1016 KOps/s | |
test_common_ops | 1.7922ms | 1.2467ms | 802.1026 Ops/s | 770.5488 Ops/s | |
test_creation | 22.4910μs | 1.5035μs | 665.1055 KOps/s | 666.6443 KOps/s | |
test_creation_empty | 38.8020μs | 14.9142μs | 67.0501 KOps/s | 55.4549 KOps/s | |
test_creation_nested_1 | 53.8220μs | 16.6288μs | 60.1366 KOps/s | 50.3835 KOps/s | |
test_creation_nested_2 | 47.7420μs | 19.3445μs | 51.6943 KOps/s | 44.9573 KOps/s | |
test_clone | 63.4130μs | 29.1266μs | 34.3329 KOps/s | 34.1029 KOps/s | |
test_getitem[int] | 1.2815ms | 16.6025μs | 60.2321 KOps/s | 59.5450 KOps/s | |
test_getitem[slice_int] | 0.1360ms | 30.1560μs | 33.1609 KOps/s | 33.5631 KOps/s | |
test_getitem[range] | 0.3372ms | 0.1177ms | 8.4964 KOps/s | 8.6731 KOps/s | |
test_getitem[tuple] | 0.1310ms | 26.2633μs | 38.0760 KOps/s | 38.5319 KOps/s | |
test_getitem[list] | 0.2121ms | 0.1053ms | 9.4933 KOps/s | 9.4946 KOps/s | |
test_setitem_dim[int] | 0.1257ms | 45.2111μs | 22.1185 KOps/s | 22.0500 KOps/s | |
test_setitem_dim[slice_int] | 0.1202ms | 71.3952μs | 14.0065 KOps/s | 14.6102 KOps/s | |
test_setitem_dim[range] | 0.1831ms | 0.1312ms | 7.6207 KOps/s | 7.6484 KOps/s | |
test_setitem_dim[tuple] | 96.6540μs | 61.9140μs | 16.1514 KOps/s | 16.2960 KOps/s | |
test_setitem | 84.1740μs | 41.6906μs | 23.9862 KOps/s | 23.3036 KOps/s | |
test_set | 83.1630μs | 42.1660μs | 23.7158 KOps/s | 23.5232 KOps/s | |
test_set_shared | 0.3434ms | 54.8672μs | 18.2258 KOps/s | 18.6235 KOps/s | |
test_update | 0.1056ms | 50.0645μs | 19.9742 KOps/s | 18.9613 KOps/s | |
test_update_nested | 0.1099ms | 58.5212μs | 17.0878 KOps/s | 16.6811 KOps/s | |
test_update__nested | 0.5767ms | 70.5351μs | 14.1773 KOps/s | 15.0213 KOps/s | |
test_set_nested | 0.1016ms | 43.4695μs | 23.0046 KOps/s | 22.6190 KOps/s | |
test_set_nested_new | 0.1017ms | 48.5728μs | 20.5877 KOps/s | 20.9433 KOps/s | |
test_select | 0.1233ms | 62.4751μs | 16.0064 KOps/s | 16.3805 KOps/s | |
test_select_nested | 83.2840μs | 41.6862μs | 23.9888 KOps/s | 23.8263 KOps/s | |
test_exclude_nested | 93.3640μs | 57.9254μs | 17.2636 KOps/s | 16.8931 KOps/s | |
test_empty[True] | 0.2881ms | 0.2527ms | 3.9571 KOps/s | 3.9473 KOps/s | |
test_empty[False] | 3.7322μs | 0.7571μs | 1.3209 MOps/s | 1.2786 MOps/s | |
test_to | 57.0620μs | 26.8996μs | 37.1753 KOps/s | 37.7573 KOps/s | |
test_to_nonblocking | 56.9730μs | 25.1770μs | 39.7188 KOps/s | 39.8040 KOps/s | |
test_unbind_speed | 0.3213ms | 0.2841ms | 3.5201 KOps/s | 3.5744 KOps/s | |
test_unbind_speed_stack0 | 0.3334ms | 0.2759ms | 3.6251 KOps/s | 3.6007 KOps/s | |
test_unbind_speed_stack1 | 92.0338ms | 0.7039ms | 1.4207 KOps/s | 1.3952 KOps/s | |
test_split | 95.4925ms | 2.2943ms | 435.8554 Ops/s | 438.5948 Ops/s | |
test_chunk | 95.0610ms | 2.2931ms | 436.0972 Ops/s | 436.6340 Ops/s | |
test_to[False] | 3.4521ms | 3.3947ms | 294.5789 Ops/s | 297.3516 Ops/s | |
test_to[True] | 4.8450ms | 4.5109ms | 221.6838 Ops/s | 225.6850 Ops/s | |
test_to_njt[False] | 0.3353s | 0.2533s | 3.9486 Ops/s | 3.9538 Ops/s | |
test_to_njt[True] | 0.2615s | 0.2610s | 3.8314 Ops/s | 3.8412 Ops/s | |
test_creation[device0] | 0.3361ms | 0.1325ms | 7.5458 KOps/s | 7.7339 KOps/s | |
test_creation_from_tensor | 0.3924ms | 0.1307ms | 7.6502 KOps/s | 7.6618 KOps/s | |
test_add_one[memmap_tensor0] | 0.2299ms | 9.2964μs | 107.5681 KOps/s | 108.1190 KOps/s | |
test_contiguous[memmap_tensor0] | 36.0210μs | 2.2014μs | 454.2660 KOps/s | 450.8318 KOps/s | |
test_stack[memmap_tensor0] | 34.8320μs | 7.2462μs | 138.0039 KOps/s | 141.9388 KOps/s | |
test_memmaptd_index | 93.6809ms | 0.4968ms | 2.0128 KOps/s | 2.2316 KOps/s | |
test_memmaptd_index_astensor | 0.7855ms | 0.5080ms | 1.9684 KOps/s | 1.9393 KOps/s | |
test_memmaptd_index_op | 1.4116ms | 1.0270ms | 973.7307 Ops/s | 923.5931 Ops/s | |
test_serialize_model | 0.1315s | 0.1302s | 7.6799 Ops/s | 6.9422 Ops/s | |
test_serialize_model_pickle | 1.3510s | 1.2136s | 0.8240 Ops/s | 0.8197 Ops/s | |
test_serialize_weights | 0.1305s | 0.1296s | 7.7185 Ops/s | 7.6875 Ops/s | |
test_serialize_weights_returnearly | 0.2446s | 63.7378ms | 15.6893 Ops/s | 17.7574 Ops/s | |
test_serialize_weights_pickle | 1.3698s | 1.1902s | 0.8402 Ops/s | 0.8354 Ops/s | |
test_reshape_pytree | 80.7430μs | 36.6821μs | 27.2612 KOps/s | 27.4976 KOps/s | |
test_reshape_td | 70.5930μs | 42.7904μs | 23.3697 KOps/s | 24.0160 KOps/s | |
test_view_pytree | 64.2220μs | 37.0852μs | 26.9649 KOps/s | 27.3129 KOps/s | |
test_view_td | 87.0240μs | 48.0722μs | 20.8020 KOps/s | 21.1385 KOps/s | |
test_unbind_pytree | 69.7230μs | 34.7446μs | 28.7814 KOps/s | 28.3163 KOps/s | |
test_unbind_td | 0.4789ms | 43.0176μs | 23.2463 KOps/s | 23.7101 KOps/s | |
test_split_pytree | 0.4912ms | 46.1375μs | 21.6744 KOps/s | 21.2061 KOps/s | |
test_split_td | 0.1539ms | 58.5878μs | 17.0684 KOps/s | 17.5074 KOps/s | |
test_add_pytree | 89.1430μs | 58.5658μs | 17.0748 KOps/s | 17.3764 KOps/s | |
test_add_td | 0.1451ms | 93.3320μs | 10.7144 KOps/s | 10.2704 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2274ms | 0.1640ms | 6.0959 KOps/s | 6.1550 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3358ms | 0.1605ms | 6.2289 KOps/s | 6.1955 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2213ms | 0.1546ms | 6.4697 KOps/s | 6.3328 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2498ms | 0.1884ms | 5.3079 KOps/s | 5.3478 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 52.9130μs | 24.5729μs | 40.6952 KOps/s | 44.8239 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1153ms | 48.7931μs | 20.4947 KOps/s | 20.5891 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1135ms | 65.1860μs | 15.3407 KOps/s | 15.1029 KOps/s | |
test_compile_copy_nested[pytree-eager] | 94.5240μs | 49.8299μs | 20.0683 KOps/s | 20.0359 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3783ms | 0.3189ms | 3.1360 KOps/s | 3.1446 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3252ms | 0.2310ms | 4.3287 KOps/s | 4.2130 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1976ms | 0.1302ms | 7.6805 KOps/s | 7.7046 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1250ms | 65.7620μs | 15.2063 KOps/s | 15.1863 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4489ms | 0.3302ms | 3.0284 KOps/s | 3.0704 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7199ms | 0.6356ms | 1.5734 KOps/s | 1.5736 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4096ms | 0.2816ms | 3.5517 KOps/s | 3.4734 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4394ms | 0.3217ms | 3.1083 KOps/s | 3.1166 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1746ms | 80.0369μs | 12.4942 KOps/s | 12.8613 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1951ms | 0.1307ms | 7.6502 KOps/s | 7.6634 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6246ms | 0.5309ms | 1.8836 KOps/s | 1.8519 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3750ms | 0.3247ms | 3.0802 KOps/s | 3.0595 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 55.3620μs | 20.9704μs | 47.6862 KOps/s | 48.6995 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 81.0530μs | 39.5255μs | 25.3001 KOps/s | 25.5187 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1198ms | 71.1412μs | 14.0566 KOps/s | 14.1813 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1160ms | 52.2543μs | 19.1372 KOps/s | 19.5139 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3805ms | 0.8307ms | 1.2038 KOps/s | 1.1271 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.4183ms | 3.2328ms | 309.3250 Ops/s | 314.6502 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3988ms | 0.8346ms | 1.1982 KOps/s | 1.1023 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.3170ms | 3.2557ms | 307.1495 Ops/s | 306.8132 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1660ms | 0.1204ms | 8.3067 KOps/s | 8.2859 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1972ms | 63.5608μs | 15.7330 KOps/s | 15.7558 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1517ms | 0.1140ms | 8.7701 KOps/s | 8.6266 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 88.8840μs | 43.9176μs | 22.7699 KOps/s | 22.9074 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1932ms | 0.1207ms | 8.2848 KOps/s | 8.5908 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 91.1240μs | 46.3182μs | 21.5898 KOps/s | 21.6283 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1957ms | 0.1556ms | 6.4274 KOps/s | 6.7310 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1588ms | 26.6910μs | 37.4659 KOps/s | 37.6721 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1880ms | 0.1420ms | 7.0416 KOps/s | 7.0079 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 60.2430μs | 21.5214μs | 46.4653 KOps/s | 44.2621 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1903ms | 0.1434ms | 6.9723 KOps/s | 6.6746 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 65.5830μs | 21.2598μs | 47.0372 KOps/s | 45.4718 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2966ms | 0.1552ms | 6.4429 KOps/s | 6.5260 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4375ms | 27.0482μs | 36.9710 KOps/s | 37.1093 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1914ms | 0.1431ms | 6.9862 KOps/s | 6.7529 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 62.6730μs | 21.4222μs | 46.6806 KOps/s | 44.4347 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1835ms | 0.1434ms | 6.9732 KOps/s | 6.9588 KOps/s | |
test_compile_indexing[int-pytree-eager] | 72.4330μs | 21.5375μs | 46.4305 KOps/s | 46.2241 KOps/s | |
test_mod_add[eager] | 88.4340μs | 31.4618μs | 31.7845 KOps/s | 30.0647 KOps/s | |
test_mod_add[compile] | 0.1335ms | 82.5132μs | 12.1193 KOps/s | 12.0709 KOps/s | |
test_mod_add[compile-overhead] | 0.3269ms | 0.1562ms | 6.4004 KOps/s | 5.9363 KOps/s | |
test_mod_wrap[eager] | 0.3265ms | 0.2442ms | 4.0952 KOps/s | 3.9770 KOps/s | |
test_mod_wrap[compile] | 0.3900ms | 0.3054ms | 3.2747 KOps/s | 3.2936 KOps/s | |
test_mod_wrap[compile-overhead] | 7.9493ms | 4.2199ms | 236.9713 Ops/s | 244.7850 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5737ms | 1.3585ms | 736.1255 Ops/s | 720.9859 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5935ms | 1.3462ms | 742.8240 Ops/s | 736.8129 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.5015ms | 1.0145ms | 985.7039 Ops/s | 1.0623 KOps/s | |
test_seq_add[eager] | 0.1627ms | 97.4735μs | 10.2592 KOps/s | 9.5419 KOps/s | |
test_seq_add[compile] | 0.1638ms | 91.5847μs | 10.9189 KOps/s | 10.3438 KOps/s | |
test_seq_add[compile-overhead] | 0.1687ms | 0.1247ms | 8.0188 KOps/s | 7.9056 KOps/s | |
test_seq_wrap[eager] | 0.4589ms | 0.3794ms | 2.6354 KOps/s | 2.4676 KOps/s | |
test_seq_wrap[compile] | 0.4907ms | 0.3176ms | 3.1484 KOps/s | 2.9408 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2858ms | 0.2217ms | 4.5105 KOps/s | 4.4843 KOps/s | |
test_func_call_runtime[False-eager] | 0.8834ms | 0.7452ms | 1.3419 KOps/s | 1.3030 KOps/s | |
test_func_call_runtime[False-compile] | 0.9031ms | 0.8002ms | 1.2497 KOps/s | 1.1840 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4148ms | 0.3603ms | 2.7755 KOps/s | 2.7347 KOps/s | |
test_func_call_runtime[True-eager] | 0.9702ms | 0.9010ms | 1.1098 KOps/s | 1.0823 KOps/s | |
test_func_call_runtime[True-compile] | 1.0687ms | 0.8267ms | 1.2096 KOps/s | 1.2040 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4345ms | 0.3817ms | 2.6199 KOps/s | 2.6098 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8236ms | 0.7414ms | 1.3487 KOps/s | 1.3213 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8609ms | 0.8042ms | 1.2434 KOps/s | 1.2231 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4423ms | 0.3634ms | 2.7516 KOps/s | 2.7463 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1802ms | 1.0193ms | 981.1021 Ops/s | 963.7470 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.9229ms | 0.8507ms | 1.1756 KOps/s | 1.1021 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4655ms | 0.4078ms | 2.4521 KOps/s | 2.4009 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5998ms | 2.1262ms | 470.3218 Ops/s | 462.7400 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9326ms | 0.8626ms | 1.1592 KOps/s | 1.1376 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4933ms | 0.4118ms | 2.4284 KOps/s | 2.4172 KOps/s | |
test_distributed | 5.0958ms | 0.1616ms | 6.1869 KOps/s | 8.8128 KOps/s | |
test_tdmodule | 0.2209ms | 14.3060μs | 69.9008 KOps/s | 63.5952 KOps/s | |
test_tdmodule_dispatch | 49.7320μs | 27.6947μs | 36.1080 KOps/s | 32.0809 KOps/s | |
test_tdseq | 35.0920μs | 15.4424μs | 64.7567 KOps/s | 57.8639 KOps/s | |
test_tdseq_dispatch | 51.4820μs | 30.7313μs | 32.5401 KOps/s | 28.7451 KOps/s | |
test_instantiation_functorch | 2.5389ms | 1.8553ms | 539.0100 Ops/s | 537.9172 Ops/s | |
test_exec_functorch | 0.3030ms | 0.2097ms | 4.7680 KOps/s | 4.5729 KOps/s | |
test_exec_functional_call | 0.3130ms | 0.2102ms | 4.7583 KOps/s | 4.4692 KOps/s | |
test_exec_td_decorator | 0.4233ms | 0.2602ms | 3.8427 KOps/s | 3.5694 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9167ms | 0.6903ms | 1.4486 KOps/s | 1.3820 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8100ms | 0.6880ms | 1.4534 KOps/s | 1.3852 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7211ms | 0.6107ms | 1.6374 KOps/s | 1.5886 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7050ms | 0.6101ms | 1.6390 KOps/s | 1.5878 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.0666ms | 19.8876ms | 50.2825 Ops/s | 50.0729 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.9628ms | 19.8778ms | 50.3074 Ops/s | 49.8945 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.7920ms | 19.7165ms | 50.7190 Ops/s | 50.4128 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.7872ms | 19.7260ms | 50.6946 Ops/s | 50.4192 Ops/s | |
test_to_module_speed[True] | 1.3274ms | 0.9829ms | 1.0174 KOps/s | 1.0031 KOps/s | |
test_to_module_speed[False] | 1.3870ms | 0.9655ms | 1.0357 KOps/s | 1.0251 KOps/s | |
test_tc_init | 63.3720μs | 34.9585μs | 28.6054 KOps/s | 26.7331 KOps/s | |
test_tc_init_nested | 0.1105ms | 71.2361μs | 14.0378 KOps/s | 13.1764 KOps/s | |
test_tc_first_layer_tensor | 3.9074μs | 0.6887μs | 1.4520 MOps/s | 1.4294 MOps/s | |
test_tc_first_layer_nontensor | 20.1810μs | 2.3294μs | 429.2982 KOps/s | 434.1352 KOps/s | |
test_tc_second_layer_tensor | 7.3753μs | 1.4262μs | 701.1818 KOps/s | 705.8370 KOps/s | |
test_tc_second_layer_nontensor | 24.1510μs | 3.0714μs | 325.5875 KOps/s | 329.5853 KOps/s | |
test_unbind | 0.1914s | 9.6175ms | 103.9776 Ops/s | 91.6734 Ops/s | |
test_full_like | 0.6575ms | 0.5725ms | 1.7466 KOps/s | 1.7478 KOps/s | |
test_zeros_like | 0.2564ms | 0.1979ms | 5.0518 KOps/s | 5.0519 KOps/s | |
test_ones_like | 0.2423ms | 0.1977ms | 5.0570 KOps/s | 5.0564 KOps/s | |
test_clone | 0.4494ms | 0.4146ms | 2.4117 KOps/s | 2.4120 KOps/s | |
test_squeeze | 39.9720μs | 9.8462μs | 101.5620 KOps/s | 101.6073 KOps/s | |
test_unsqueeze | 0.2361ms | 74.1137μs | 13.4928 KOps/s | 13.3719 KOps/s | |
test_split | 0.4213ms | 0.1651ms | 6.0553 KOps/s | 6.0893 KOps/s | |
test_permute | 0.2218ms | 0.1746ms | 5.7281 KOps/s | 5.6147 KOps/s | |
test_stack | 1.2660ms | 0.8693ms | 1.1504 KOps/s | 1.1751 KOps/s | |
test_cat | 1.2542ms | 1.2314ms | 812.0897 Ops/s | 812.2780 Ops/s |
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Refactor
Refactoring code - not a new feature
Stack from ghstack (oldest at bottom):