[Performance] Faster clone #1043
Merged
vmoens added a commit that referenced this pull request on Oct 16, 2024
ghstack-source-id: 59cabdc7c27552211b9545bd2f64374648379bb9 Pull Request resolved: #1043
vmoens added a commit that referenced this pull request on Oct 16, 2024
ghstack-source-id: 907aab5dd5e2dd4a4997dd20caba9809de8fcd5b Pull Request resolved: #1043
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 55.2040μs | 24.0702μs | 41.5451 KOps/s | 42.1304 KOps/s | |
test_plain_set_stack_nested | 80.5210μs | 24.7400μs | 40.4204 KOps/s | 41.5060 KOps/s | |
test_plain_set_nested_inplace | 78.7870μs | 26.9802μs | 37.0642 KOps/s | 38.4888 KOps/s | |
test_plain_set_stack_nested_inplace | 62.9870μs | 27.2732μs | 36.6661 KOps/s | 38.6167 KOps/s | |
test_items | 46.6680μs | 4.1555μs | 240.6469 KOps/s | 242.0493 KOps/s | |
test_items_nested | 0.7413ms | 0.3928ms | 2.5458 KOps/s | 2.6189 KOps/s | |
test_items_nested_locked | 0.8704ms | 0.3896ms | 2.5665 KOps/s | 2.5889 KOps/s | |
test_items_nested_leaf | 0.1432ms | 80.3889μs | 12.4395 KOps/s | 12.3296 KOps/s | |
test_items_stack_nested | 0.5721ms | 0.3927ms | 2.5468 KOps/s | 2.6155 KOps/s | |
test_items_stack_nested_leaf | 0.1636ms | 82.7001μs | 12.0919 KOps/s | 11.9363 KOps/s | |
test_items_stack_nested_locked | 0.5050ms | 0.3982ms | 2.5111 KOps/s | 2.6179 KOps/s | |
test_keys | 38.9530μs | 3.5158μs | 284.4341 KOps/s | 282.9181 KOps/s | |
test_keys_nested | 0.2564ms | 0.1359ms | 7.3600 KOps/s | 7.4405 KOps/s | |
test_keys_nested_locked | 0.7893ms | 0.1421ms | 7.0386 KOps/s | 7.1866 KOps/s | |
test_keys_nested_leaf | 0.2295ms | 0.1199ms | 8.3379 KOps/s | 8.6415 KOps/s | |
test_keys_stack_nested | 0.2867ms | 0.1370ms | 7.3015 KOps/s | 7.5869 KOps/s | |
test_keys_stack_nested_leaf | 0.2310ms | 0.1190ms | 8.4001 KOps/s | 8.8305 KOps/s | |
test_keys_stack_nested_locked | 0.2397ms | 0.1418ms | 7.0543 KOps/s | 7.3715 KOps/s | |
test_values | 10.5038μs | 1.0510μs | 951.4767 KOps/s | 952.1750 KOps/s | |
test_values_nested | 0.1542ms | 93.6328μs | 10.6800 KOps/s | 10.7606 KOps/s | |
test_values_nested_locked | 0.1594ms | 93.8030μs | 10.6606 KOps/s | 10.6479 KOps/s | |
test_values_nested_leaf | 0.1515ms | 80.2103μs | 12.4672 KOps/s | 12.6481 KOps/s | |
test_values_stack_nested | 0.1853ms | 93.9342μs | 10.6458 KOps/s | 10.6697 KOps/s | |
test_values_stack_nested_leaf | 0.1567ms | 79.4728μs | 12.5829 KOps/s | 12.3791 KOps/s | |
test_values_stack_nested_locked | 0.1594ms | 93.9118μs | 10.6483 KOps/s | 10.5403 KOps/s | |
test_membership | 43.0500μs | 0.9024μs | 1.1082 MOps/s | 1.3386 MOps/s | |
test_membership_nested | 26.4500μs | 2.7908μs | 358.3192 KOps/s | 368.3031 KOps/s | |
test_membership_nested_leaf | 41.3180μs | 2.7556μs | 362.8919 KOps/s | 364.6585 KOps/s | |
test_membership_stacked_nested | 19.0460μs | 2.7515μs | 363.4420 KOps/s | 361.5474 KOps/s | |
test_membership_stacked_nested_leaf | 24.6660μs | 2.7663μs | 361.4905 KOps/s | 362.5533 KOps/s | |
test_membership_nested_last | 34.3140μs | 4.1169μs | 242.8984 KOps/s | 243.6236 KOps/s | |
test_membership_nested_leaf_last | 23.1230μs | 4.1227μs | 242.5575 KOps/s | 240.2986 KOps/s | |
test_membership_stacked_nested_last | 29.9770μs | 4.1349μs | 241.8431 KOps/s | 74.9929 KOps/s | |
test_membership_stacked_nested_leaf_last | 24.8870μs | 4.1348μs | 241.8510 KOps/s | 74.1683 KOps/s | |
test_nested_getleaf | 69.5710μs | 10.7129μs | 93.3455 KOps/s | 92.2626 KOps/s | |
test_nested_get | 40.5460μs | 10.2951μs | 97.1336 KOps/s | 96.5422 KOps/s | |
test_stacked_getleaf | 31.8190μs | 10.7303μs | 93.1937 KOps/s | 93.5101 KOps/s | |
test_stacked_get | 48.9400μs | 10.2241μs | 97.8083 KOps/s | 96.4769 KOps/s | |
test_nested_getitemleaf | 38.2410μs | 11.1419μs | 89.7510 KOps/s | 89.7716 KOps/s | |
test_nested_getitem | 35.6170μs | 10.5061μs | 95.1826 KOps/s | 94.1227 KOps/s | |
test_stacked_getitemleaf | 51.4280μs | 11.1562μs | 89.6360 KOps/s | 89.0112 KOps/s | |
test_stacked_getitem | 34.8250μs | 10.5937μs | 94.3954 KOps/s | 94.5519 KOps/s | |
test_lock_nested | 0.9031ms | 0.5048ms | 1.9810 KOps/s | 1.9610 KOps/s | |
test_lock_stack_nested | 0.7507ms | 0.4813ms | 2.0778 KOps/s | 2.1816 KOps/s | |
test_unlock_nested | 91.2590ms | 0.5124ms | 1.9515 KOps/s | 2.3140 KOps/s | |
test_unlock_stack_nested | 0.6612ms | 0.3966ms | 2.5214 KOps/s | 2.6844 KOps/s | |
test_flatten_speed | 0.2118ms | 0.1014ms | 9.8662 KOps/s | 9.9177 KOps/s | |
test_unflatten_speed | 1.0434ms | 0.5192ms | 1.9261 KOps/s | 1.9624 KOps/s | |
test_common_ops | 3.3315ms | 1.1337ms | 882.1049 Ops/s | 884.9059 Ops/s | |
test_creation | 20.3890μs | 2.1382μs | 467.6765 KOps/s | 480.6785 KOps/s | |
test_creation_empty | 54.0510μs | 17.8120μs | 56.1420 KOps/s | 55.2601 KOps/s | |
test_creation_nested_1 | 54.5830μs | 20.9369μs | 47.7625 KOps/s | 47.4675 KOps/s | |
test_creation_nested_2 | 63.1590μs | 25.5557μs | 39.1303 KOps/s | 38.6142 KOps/s | |
test_clone | 54.1210μs | 18.1318μs | 55.1517 KOps/s | 57.7977 KOps/s | |
test_getitem[int] | 1.3044ms | 16.8593μs | 59.3144 KOps/s | 59.9380 KOps/s | |
test_getitem[slice_int] | 0.1469ms | 31.3313μs | 31.9170 KOps/s | 32.6633 KOps/s | |
test_getitem[range] | 0.2914ms | 58.1315μs | 17.2024 KOps/s | 17.1892 KOps/s | |
test_getitem[tuple] | 0.1307ms | 25.7436μs | 38.8446 KOps/s | 39.4209 KOps/s | |
test_getitem[list] | 0.5729ms | 53.3210μs | 18.7543 KOps/s | 18.4543 KOps/s | |
test_setitem_dim[int] | 75.3910μs | 32.7954μs | 30.4920 KOps/s | 30.4142 KOps/s | |
test_setitem_dim[slice_int] | 0.1038ms | 60.7738μs | 16.4545 KOps/s | 16.3692 KOps/s | |
test_setitem_dim[range] | 0.1268ms | 84.3316μs | 11.8580 KOps/s | 11.8904 KOps/s | |
test_setitem_dim[tuple] | 80.6910μs | 49.0576μs | 20.3842 KOps/s | 20.4635 KOps/s | |
test_setitem | 0.1235ms | 30.9137μs | 32.3481 KOps/s | 34.1282 KOps/s | |
test_set | 0.1295ms | 29.6659μs | 33.7087 KOps/s | 34.3331 KOps/s | |
test_set_shared | 3.2612ms | 0.2192ms | 4.5625 KOps/s | 4.4957 KOps/s | |
test_update | 0.8240ms | 37.8721μs | 26.4047 KOps/s | 26.6955 KOps/s | |
test_update_nested | 0.2080ms | 48.6730μs | 20.5453 KOps/s | 20.8918 KOps/s | |
test_update__nested | 0.1712ms | 45.5287μs | 21.9642 KOps/s | 21.9893 KOps/s | |
test_set_nested | 0.1171ms | 33.6186μs | 29.7454 KOps/s | 31.0773 KOps/s | |
test_set_nested_new | 0.1088ms | 37.8624μs | 26.4114 KOps/s | 27.0025 KOps/s | |
test_select | 0.2058ms | 54.3074μs | 18.4137 KOps/s | 18.3959 KOps/s | |
test_select_nested | 0.1135ms | 59.4679μs | 16.8158 KOps/s | 16.7770 KOps/s | |
test_exclude_nested | 0.1843ms | 74.9441μs | 13.3433 KOps/s | 13.1351 KOps/s | |
test_empty[True] | 0.5726ms | 0.3507ms | 2.8516 KOps/s | 2.8404 KOps/s | |
test_empty[False] | 11.0985μs | 1.2365μs | 808.7466 KOps/s | 837.4069 KOps/s | |
test_unbind_speed | 0.6641ms | 0.3053ms | 3.2757 KOps/s | 3.2326 KOps/s | |
test_unbind_speed_stack0 | 0.5148ms | 0.2992ms | 3.3423 KOps/s | 3.4537 KOps/s | |
test_unbind_speed_stack1 | 88.4448ms | 0.8164ms | 1.2249 KOps/s | 1.3685 KOps/s | |
test_split | 89.2924ms | 2.1704ms | 460.7355 Ops/s | 449.5882 Ops/s | |
test_chunk | 2.3169ms | 2.0117ms | 497.0838 Ops/s | 451.4708 Ops/s | |
test_creation[device0] | 0.2270ms | 0.1144ms | 8.7418 KOps/s | 8.5561 KOps/s | |
test_creation_from_tensor | 3.2434ms | 0.1159ms | 8.6268 KOps/s | 8.4243 KOps/s | |
test_add_one[memmap_tensor0] | 0.3097ms | 7.5704μs | 132.0930 KOps/s | 132.8288 KOps/s | |
test_contiguous[memmap_tensor0] | 20.8190μs | 1.9203μs | 520.7553 KOps/s | 525.3566 KOps/s | |
test_stack[memmap_tensor0] | 65.6330μs | 5.9090μs | 169.2339 KOps/s | 177.8126 KOps/s | |
test_memmaptd_index | 1.1423ms | 0.4135ms | 2.4182 KOps/s | 2.3495 KOps/s | |
test_memmaptd_index_astensor | 92.6772ms | 0.6412ms | 1.5595 KOps/s | 1.9121 KOps/s | |
test_memmaptd_index_op | 1.7381ms | 1.0418ms | 959.9138 Ops/s | 928.9508 Ops/s | |
test_serialize_model | 0.1237s | 0.1186s | 8.4332 Ops/s | 7.4221 Ops/s | |
test_serialize_model_pickle | 0.4429s | 0.3880s | 2.5772 Ops/s | 2.5460 Ops/s | |
test_serialize_weights | 0.1225s | 0.1163s | 8.5961 Ops/s | 8.3735 Ops/s | |
test_serialize_weights_returnearly | 0.2593s | 0.1733s | 5.7710 Ops/s | 6.3471 Ops/s | |
test_serialize_weights_pickle | 0.5323s | 0.4242s | 2.3576 Ops/s | 2.4555 Ops/s | |
test_serialize_weights_filesystem | 0.1525s | 0.1427s | 7.0063 Ops/s | 7.0131 Ops/s | |
test_serialize_model_filesystem | 0.1638s | 0.1502s | 6.6582 Ops/s | 6.4135 Ops/s | |
test_reshape_pytree | 0.1128ms | 39.7528μs | 25.1555 KOps/s | 25.3383 KOps/s | |
test_reshape_td | 0.1075ms | 47.5626μs | 21.0249 KOps/s | 20.6877 KOps/s | |
test_view_pytree | 82.2450μs | 39.9636μs | 25.0227 KOps/s | 25.5105 KOps/s | |
test_view_td | 0.1256ms | 53.0362μs | 18.8551 KOps/s | 18.0257 KOps/s | |
test_unbind_pytree | 92.5140μs | 36.7187μs | 27.2341 KOps/s | 27.1587 KOps/s | |
test_unbind_td | 0.3234ms | 46.6323μs | 21.4444 KOps/s | 21.9632 KOps/s | |
test_split_pytree | 81.5630μs | 38.3235μs | 26.0937 KOps/s | 26.4352 KOps/s | |
test_split_td | 90.6141ms | 68.4080μs | 14.6182 KOps/s | 17.1582 KOps/s | |
test_add_pytree | 0.1393ms | 46.5985μs | 21.4599 KOps/s | 22.0668 KOps/s | |
test_add_td | 0.1513ms | 84.6751μs | 11.8098 KOps/s | 11.7618 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1260ms | 58.2647μs | 17.1631 KOps/s | 17.2325 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3531ms | 0.1957ms | 5.1089 KOps/s | 5.0839 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1281ms | 57.5424μs | 17.3785 KOps/s | 17.4008 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2717ms | 0.1428ms | 7.0035 KOps/s | 7.0406 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 78.4070μs | 23.7465μs | 42.1115 KOps/s | 43.9284 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1521ms | 74.7398μs | 13.3797 KOps/s | 13.5426 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1606ms | 76.6130μs | 13.0526 KOps/s | 13.2181 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1283ms | 68.3856μs | 14.6230 KOps/s | 14.7656 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3390ms | 0.1841ms | 5.4314 KOps/s | 5.4556 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4953ms | 0.2418ms | 4.1362 KOps/s | 4.1663 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1133ms | 48.3315μs | 20.6904 KOps/s | 21.0563 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4648ms | 78.0996μs | 12.8042 KOps/s | 12.5873 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3873ms | 0.1760ms | 5.6832 KOps/s | 5.7181 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4900ms | 0.2863ms | 3.4923 KOps/s | 3.4899 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 1.1605ms | 0.2711ms | 3.6886 KOps/s | 3.6455 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3870ms | 0.1838ms | 5.4402 KOps/s | 5.5156 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1882ms | 74.6380μs | 13.3980 KOps/s | 13.7642 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1152ms | 48.6756μs | 20.5442 KOps/s | 20.1030 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3458ms | 0.2337ms | 4.2784 KOps/s | 4.2805 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3537ms | 0.1793ms | 5.5788 KOps/s | 5.6716 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2137ms | 0.1107ms | 9.0338 KOps/s | 8.9605 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1604ms | 79.3639μs | 12.6002 KOps/s | 12.9477 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1551ms | 79.1041μs | 12.6416 KOps/s | 12.8786 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1420ms | 70.8130μs | 14.1217 KOps/s | 14.6558 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2892ms | 0.1980ms | 5.0492 KOps/s | 5.1426 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.3442ms | 1.7724ms | 564.2021 Ops/s | 574.8181 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3217ms | 0.1949ms | 5.1304 KOps/s | 5.1444 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.2643ms | 1.1034ms | 906.3135 Ops/s | 905.8048 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5005ms | 0.4185ms | 2.3894 KOps/s | 2.4210 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.2653ms | 3.9545ms | 252.8757 Ops/s | 243.4475 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 75.0500μs | 33.3260μs | 30.0066 KOps/s | 29.7103 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6502ms | 48.8797μs | 20.4584 KOps/s | 20.3323 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 88.0950μs | 30.3429μs | 32.9566 KOps/s | 34.1538 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 90.2590μs | 30.5300μs | 32.7546 KOps/s | 33.6433 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 83.3370μs | 30.0438μs | 33.2847 KOps/s | 33.6350 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 94.7580μs | 30.5871μs | 32.6935 KOps/s | 33.3503 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1683ms | 75.2612μs | 13.2871 KOps/s | 13.4903 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.6519ms | 28.2976μs | 35.3387 KOps/s | 36.3037 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1585ms | 69.3681μs | 14.4159 KOps/s | 14.7127 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 71.2040μs | 24.0686μs | 41.5479 KOps/s | 42.5702 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1465ms | 68.1433μs | 14.6750 KOps/s | 14.3498 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1321ms | 24.3061μs | 41.1420 KOps/s | 42.3124 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1472ms | 74.0812μs | 13.4987 KOps/s | 13.7176 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0199ms | 28.8162μs | 34.7027 KOps/s | 36.6106 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1578ms | 68.2958μs | 14.6422 KOps/s | 14.6586 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1511ms | 24.0253μs | 41.6229 KOps/s | 42.8426 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1578ms | 68.5942μs | 14.5785 KOps/s | 14.4760 KOps/s | |
test_compile_indexing[int-pytree-eager] | 61.1740μs | 23.5147μs | 42.5266 KOps/s | 42.7769 KOps/s | |
test_mod_add[eager] | 80.8820μs | 25.1353μs | 39.7847 KOps/s | 38.9137 KOps/s | |
test_mod_add[compile] | 0.1078ms | 37.8160μs | 26.4438 KOps/s | 25.1341 KOps/s | |
test_mod_add[compile-overhead] | 0.1096ms | 37.9867μs | 26.3250 KOps/s | 25.7056 KOps/s | |
test_mod_wrap[eager] | 0.3068ms | 0.2066ms | 4.8393 KOps/s | 4.7257 KOps/s | |
test_mod_wrap[compile] | 0.3235ms | 0.2270ms | 4.4045 KOps/s | 4.2206 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3646ms | 0.2282ms | 4.3822 KOps/s | 4.2401 KOps/s | |
test_mod_wrap_and_backward[eager] | 15.6249ms | 11.4799ms | 87.1085 Ops/s | 77.0375 Ops/s | |
test_mod_wrap_and_backward[compile] | 15.1078ms | 12.3785ms | 80.7853 Ops/s | 81.2796 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 15.8163ms | 12.0194ms | 83.1989 Ops/s | 84.0429 Ops/s | |
test_seq_add[eager] | 0.1983ms | 90.5419μs | 11.0446 KOps/s | 10.9099 KOps/s | |
test_seq_add[compile] | 0.1322ms | 62.9994μs | 15.8732 KOps/s | 15.5318 KOps/s | |
test_seq_add[compile-overhead] | 0.1440ms | 62.7408μs | 15.9386 KOps/s | 15.6838 KOps/s | |
test_seq_wrap[eager] | 1.7486ms | 0.3915ms | 2.5540 KOps/s | 2.5970 KOps/s | |
test_seq_wrap[compile] | 0.3740ms | 0.2664ms | 3.7534 KOps/s | 3.7408 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3757ms | 0.2654ms | 3.7674 KOps/s | 3.6732 KOps/s | |
test_func_call_runtime[False-eager] | 0.8976ms | 0.5341ms | 1.8724 KOps/s | 1.9381 KOps/s | |
test_func_call_runtime[False-compile] | 0.9391ms | 0.4960ms | 2.0161 KOps/s | 1.9974 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6158ms | 0.4968ms | 2.0131 KOps/s | 1.9979 KOps/s | |
test_func_call_runtime[True-eager] | 1.0286ms | 0.7551ms | 1.3244 KOps/s | 1.3564 KOps/s | |
test_func_call_runtime[True-compile] | 0.6645ms | 0.5113ms | 1.9556 KOps/s | 1.9571 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8654ms | 0.5105ms | 1.9589 KOps/s | 1.9273 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.1769ms | 0.5380ms | 1.8587 KOps/s | 1.9189 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6064ms | 0.4966ms | 2.0135 KOps/s | 1.9795 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.9350ms | 0.4980ms | 2.0082 KOps/s | 1.9725 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1134ms | 0.8949ms | 1.1174 KOps/s | 1.1173 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.1778ms | 0.7478ms | 1.3373 KOps/s | 1.3510 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8603ms | 0.7480ms | 1.3370 KOps/s | 1.3372 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6312ms | 1.9122ms | 522.9534 Ops/s | 517.5645 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.6447ms | 1.9676ms | 508.2256 Ops/s | 503.1250 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.5699ms | 1.9600ms | 510.1922 Ops/s | 502.0780 Ops/s | |
test_distributed | 0.3669ms | 0.1263ms | 7.9164 KOps/s | 7.7516 KOps/s | |
test_tdmodule | 39.7740μs | 18.2648μs | 54.7500 KOps/s | 57.7617 KOps/s | |
test_tdmodule_dispatch | 63.3090μs | 35.5226μs | 28.1511 KOps/s | 28.6891 KOps/s | |
test_tdseq | 47.4290μs | 19.8837μs | 50.2924 KOps/s | 49.4645 KOps/s | |
test_tdseq_dispatch | 60.3530μs | 39.8648μs | 25.0848 KOps/s | 24.6163 KOps/s | |
test_instantiation_functorch | 2.1041ms | 1.5568ms | 642.3261 Ops/s | 628.6173 Ops/s | |
test_exec_functorch | 0.2696ms | 0.1881ms | 5.3154 KOps/s | 5.3737 KOps/s | |
test_exec_functional_call | 0.4156ms | 0.1794ms | 5.5746 KOps/s | 5.5570 KOps/s | |
test_exec_td_decorator | 0.4843ms | 0.2348ms | 4.2591 KOps/s | 4.2952 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9729ms | 0.6367ms | 1.5707 KOps/s | 1.5347 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8994ms | 0.6367ms | 1.5706 KOps/s | 1.4999 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7270ms | 0.5295ms | 1.8886 KOps/s | 1.8519 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8280ms | 0.5306ms | 1.8845 KOps/s | 1.8588 KOps/s | |
test_to_module_speed[True] | 1.9765ms | 1.4106ms | 708.9265 Ops/s | 713.4014 Ops/s | |
test_to_module_speed[False] | 2.6806ms | 1.3809ms | 724.1843 Ops/s | 715.5962 Ops/s | |
test_tc_init | 85.9910μs | 45.4337μs | 22.0101 KOps/s | 21.9750 KOps/s | |
test_tc_init_nested | 0.1779ms | 91.2759μs | 10.9558 KOps/s | 10.9910 KOps/s | |
test_tc_first_layer_tensor | 15.7100μs | 1.4863μs | 672.8186 KOps/s | 659.7225 KOps/s | |
test_tc_first_layer_nontensor | 25.9590μs | 4.7461μs | 210.6992 KOps/s | 211.7136 KOps/s | |
test_tc_second_layer_tensor | 39.8940μs | 2.7262μs | 366.8170 KOps/s | 347.1926 KOps/s | |
test_tc_second_layer_nontensor | 41.6280μs | 6.0046μs | 166.5392 KOps/s | 165.0120 KOps/s | |
test_unbind | 0.4614s | 13.0136ms | 76.8427 Ops/s | 74.7783 Ops/s | |
test_full_like | 8.2863ms | 7.4034ms | 135.0723 Ops/s | 83.9534 Ops/s | |
test_zeros_like | 3.1878ms | 2.8251ms | 353.9718 Ops/s | 125.7677 Ops/s | |
test_ones_like | 3.8028ms | 3.3804ms | 295.8267 Ops/s | 128.3713 Ops/s | |
test_clone | 5.7404ms | 5.1627ms | 193.6960 Ops/s | 105.3918 Ops/s | |
test_squeeze | 60.5330μs | 13.0309μs | 76.7405 KOps/s | 82.0392 KOps/s | |
test_unsqueeze | 0.3446ms | 93.0395μs | 10.7481 KOps/s | 10.7240 KOps/s | |
test_split | 0.3855ms | 0.1935ms | 5.1683 KOps/s | 5.1208 KOps/s | |
test_permute | 0.4293ms | 0.2234ms | 4.4756 KOps/s | 4.5486 KOps/s | |
test_stack | 25.0799ms | 24.2709ms | 41.2015 Ops/s | 38.2512 Ops/s | |
test_cat | 32.3145ms | 24.8184ms | 40.2927 Ops/s | 38.8803 Ops/s |
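
The Change column is empty in the tables as posted, so the difference between this PR and repo HEAD has to be read off the two Ops columns by hand. A minimal sketch of that arithmetic follows, assuming the change is the relative difference `(Ops - Ops on Repo HEAD) / Ops on Repo HEAD`; the formula the benchmark bot actually uses is not shown on this page, and the helper names `parse_ops` and `relative_change` are hypothetical.

```python
from typing import Tuple

# Multipliers for the throughput units that appear in the tables.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '41.5451 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(row: str) -> Tuple[str, float]:
    """Return (test name, percent change of Ops relative to Ops on Repo HEAD).

    Assumes the column order Name | Max | Mean | Ops | Ops on Repo HEAD | Change.
    """
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])
    head = parse_ops(cells[4])
    return name, 100.0 * (ops - head) / head


if __name__ == "__main__":
    # Rows copied verbatim from the table above.
    rows = [
        "test_clone | 5.7404ms | 5.1627ms | 193.6960 Ops/s | 105.3918 Ops/s | |",
        "test_membership | 43.0500μs | 0.9024μs | 1.1082 MOps/s | 1.3386 MOps/s | |",
    ]
    for row in rows:
        name, pct = relative_change(row)
        print(f"{name}: {pct:+.1f}%")
```

A positive value means the PR branch ran more operations per second than HEAD. Single-run differences of a few percent are usually within benchmark noise, so only large gaps should be read as real changes.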

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 64.6910μs | 17.4915μs | 57.1707 KOps/s | 62.3736 KOps/s | |
test_plain_set_stack_nested | 61.0510μs | 17.6351μs | 56.7050 KOps/s | 61.8447 KOps/s | |
test_plain_set_nested_inplace | 51.9310μs | 18.7937μs | 53.2092 KOps/s | 57.9485 KOps/s | |
test_plain_set_stack_nested_inplace | 50.3310μs | 18.8273μs | 53.1143 KOps/s | 57.8303 KOps/s | |
test_items | 30.1800μs | 2.8688μs | 348.5787 KOps/s | 344.6387 KOps/s | |
test_items_nested | 0.3750ms | 0.3435ms | 2.9116 KOps/s | 2.9412 KOps/s | |
test_items_nested_locked | 0.4565ms | 0.3421ms | 2.9233 KOps/s | 2.9214 KOps/s | |
test_items_nested_leaf | 88.5910μs | 62.5189μs | 15.9952 KOps/s | 15.9143 KOps/s | |
test_items_stack_nested | 0.3957ms | 0.3421ms | 2.9235 KOps/s | 2.8939 KOps/s | |
test_items_stack_nested_leaf | 0.1054ms | 63.9476μs | 15.6378 KOps/s | 15.8625 KOps/s | |
test_items_stack_nested_locked | 0.3730ms | 0.3440ms | 2.9066 KOps/s | 2.8698 KOps/s | |
test_keys | 22.6100μs | 3.4198μs | 292.4136 KOps/s | 291.7586 KOps/s | |
test_keys_nested | 0.1181ms | 72.0712μs | 13.8752 KOps/s | 14.1085 KOps/s | |
test_keys_nested_locked | 0.7818ms | 78.0458μs | 12.8130 KOps/s | 12.7888 KOps/s | |
test_keys_nested_leaf | 94.4010μs | 62.8639μs | 15.9074 KOps/s | 16.3361 KOps/s | |
test_keys_stack_nested | 0.1167ms | 72.4579μs | 13.8011 KOps/s | 13.8705 KOps/s | |
test_keys_stack_nested_leaf | 0.1012ms | 64.6452μs | 15.4691 KOps/s | 16.0431 KOps/s | |
test_keys_stack_nested_locked | 0.1273ms | 79.0397μs | 12.6519 KOps/s | 12.7910 KOps/s | |
test_values | 5.4685μs | 0.8464μs | 1.1815 MOps/s | 1.1630 MOps/s | |
test_values_nested | 79.0810μs | 48.6889μs | 20.5386 KOps/s | 20.4816 KOps/s | |
test_values_nested_locked | 99.0420μs | 50.7454μs | 19.7062 KOps/s | 19.9361 KOps/s | |
test_values_nested_leaf | 81.0610μs | 42.9070μs | 23.3062 KOps/s | 23.2175 KOps/s | |
test_values_stack_nested | 97.2910μs | 50.3509μs | 19.8606 KOps/s | 20.3821 KOps/s | |
test_values_stack_nested_leaf | 86.3410μs | 43.9669μs | 22.7444 KOps/s | 22.9951 KOps/s | |
test_values_stack_nested_locked | 0.2830ms | 51.8002μs | 19.3050 KOps/s | 19.6115 KOps/s | |
test_membership | 1.9230μs | 0.5085μs | 1.9664 MOps/s | 1.9526 MOps/s | |
test_membership_nested | 13.4450μs | 1.9315μs | 517.7387 KOps/s | 519.0301 KOps/s | |
test_membership_nested_leaf | 14.3405μs | 1.9279μs | 518.7058 KOps/s | 519.2948 KOps/s | |
test_membership_stacked_nested | 17.1200μs | 1.9686μs | 507.9818 KOps/s | 484.4739 KOps/s | |
test_membership_stacked_nested_leaf | 23.1100μs | 1.9682μs | 508.0746 KOps/s | 497.9849 KOps/s | |
test_membership_nested_last | 29.8710μs | 3.1277μs | 319.7264 KOps/s | 322.5704 KOps/s | |
test_membership_nested_leaf_last | 30.3610μs | 3.1268μs | 319.8177 KOps/s | 320.9798 KOps/s | |
test_membership_stacked_nested_last | 26.8700μs | 3.6756μs | 272.0645 KOps/s | 315.6095 KOps/s | |
test_membership_stacked_nested_leaf_last | 36.9500μs | 3.6295μs | 275.5170 KOps/s | 323.0423 KOps/s | |
test_nested_getleaf | 26.5410μs | 6.1258μs | 163.2442 KOps/s | 165.8852 KOps/s | |
test_nested_get | 35.8000μs | 5.7884μs | 172.7604 KOps/s | 174.9783 KOps/s | |
test_stacked_getleaf | 33.4800μs | 6.0536μs | 165.1917 KOps/s | 166.0817 KOps/s | |
test_stacked_get | 38.0510μs | 5.6332μs | 177.5179 KOps/s | 176.9269 KOps/s | |
test_nested_getitemleaf | 31.5510μs | 6.1944μs | 161.4360 KOps/s | 162.5207 KOps/s | |
test_nested_getitem | 32.4300μs | 5.8417μs | 171.1835 KOps/s | 172.6993 KOps/s | |
test_stacked_getitemleaf | 31.9400μs | 6.0982μs | 163.9825 KOps/s | 165.0406 KOps/s | |
test_stacked_getitem | 37.3700μs | 5.7173μs | 174.9091 KOps/s | 178.3609 KOps/s | |
test_lock_nested | 0.8504ms | 0.4347ms | 2.3003 KOps/s | 2.3490 KOps/s | |
test_lock_stack_nested | 0.4489ms | 0.4018ms | 2.4885 KOps/s | 2.5525 KOps/s | |
test_unlock_nested | 0.8101ms | 0.3726ms | 2.6838 KOps/s | 2.7418 KOps/s | |
test_unlock_stack_nested | 0.4114ms | 0.3378ms | 2.9603 KOps/s | 3.0173 KOps/s | |
test_flatten_speed | 0.1526ms | 76.3581μs | 13.0962 KOps/s | 13.0093 KOps/s | |
test_unflatten_speed | 0.3775ms | 0.3261ms | 3.0666 KOps/s | 3.0903 KOps/s | |
test_common_ops | 1.7208ms | 1.3119ms | 762.2257 Ops/s | 793.2897 Ops/s | |
test_creation | 28.6110μs | 1.4901μs | 671.1012 KOps/s | 678.2139 KOps/s | |
test_creation_empty | 43.7910μs | 17.2515μs | 57.9660 KOps/s | 70.7596 KOps/s | |
test_creation_nested_1 | 51.2910μs | 18.9771μs | 52.6950 KOps/s | 62.5801 KOps/s | |
test_creation_nested_2 | 57.3610μs | 21.4234μs | 46.6779 KOps/s | 54.0383 KOps/s | |
test_clone | 67.4110μs | 30.0882μs | 33.2356 KOps/s | 33.6444 KOps/s | |
test_getitem[int] | 1.2897ms | 16.3094μs | 61.3142 KOps/s | 62.0766 KOps/s | |
test_getitem[slice_int] | 0.1202ms | 28.1484μs | 35.5260 KOps/s | 34.9454 KOps/s | |
test_getitem[range] | 0.1494ms | 0.1094ms | 9.1392 KOps/s | 8.9000 KOps/s | |
test_getitem[tuple] | 0.1208ms | 24.4164μs | 40.9560 KOps/s | 40.8593 KOps/s | |
test_getitem[list] | 0.1936ms | 0.1025ms | 9.7603 KOps/s | 9.7655 KOps/s | |
test_setitem_dim[int] | 72.0410μs | 48.2160μs | 20.7400 KOps/s | 21.3862 KOps/s | |
test_setitem_dim[slice_int] | 0.1231ms | 69.0780μs | 14.4764 KOps/s | 14.0173 KOps/s | |
test_setitem_dim[range] | 0.1584ms | 0.1300ms | 7.6943 KOps/s | 7.6015 KOps/s | |
test_setitem_dim[tuple] | 0.1086ms | 63.0949μs | 15.8491 KOps/s | 15.6227 KOps/s | |
test_setitem | 83.5410μs | 43.8131μs | 22.8242 KOps/s | 24.1336 KOps/s | |
test_set | 0.1733ms | 42.5957μs | 23.4766 KOps/s | 24.6457 KOps/s | |
test_set_shared | 0.3786ms | 55.0897μs | 18.1522 KOps/s | 17.8960 KOps/s | |
test_update | 0.1021ms | 52.5460μs | 19.0309 KOps/s | 20.2961 KOps/s | |
test_update_nested | 97.2310μs | 60.6895μs | 16.4773 KOps/s | 17.6588 KOps/s | |
test_update__nested | 0.4641ms | 68.2978μs | 14.6418 KOps/s | 15.8124 KOps/s | |
test_set_nested | 91.5810μs | 45.0816μs | 22.1820 KOps/s | 22.8718 KOps/s | |
test_set_nested_new | 86.7620μs | 48.8387μs | 20.4756 KOps/s | 21.5101 KOps/s | |
test_select | 0.1067ms | 63.0227μs | 15.8673 KOps/s | 16.6216 KOps/s | |
test_select_nested | 81.3010μs | 41.7126μs | 23.9736 KOps/s | 23.5750 KOps/s | |
test_exclude_nested | 92.4520μs | 59.4788μs | 16.8127 KOps/s | 16.5625 KOps/s | |
test_empty[True] | 0.3204ms | 0.2603ms | 3.8418 KOps/s | 3.7951 KOps/s | |
test_empty[False] | 3.0891μs | 0.7503μs | 1.3329 MOps/s | 1.3264 MOps/s | |
test_to | 55.8810μs | 27.2170μs | 36.7417 KOps/s | 37.0662 KOps/s | |
test_to_nonblocking | 60.1910μs | 26.2218μs | 38.1362 KOps/s | 38.7924 KOps/s | |
test_unbind_speed | 0.3252ms | 0.2877ms | 3.4756 KOps/s | 3.5875 KOps/s | |
test_unbind_speed_stack0 | 0.4245ms | 0.2858ms | 3.4986 KOps/s | 3.6170 KOps/s | |
test_unbind_speed_stack1 | 92.3428ms | 0.7280ms | 1.3737 KOps/s | 1.3942 KOps/s | |
test_split | 94.3142ms | 2.1895ms | 456.7283 Ops/s | 455.2783 Ops/s | |
test_chunk | 94.5432ms | 2.2068ms | 453.1395 Ops/s | 451.5402 Ops/s | |
test_to[False] | 3.6285ms | 3.5007ms | 285.6550 Ops/s | 283.2320 Ops/s | |
test_to[True] | 4.8257ms | 4.5107ms | 221.6942 Ops/s | 216.2505 Ops/s | |
test_to_njt[False] | 0.3304s | 0.2533s | 3.9486 Ops/s | 3.9586 Ops/s | |
test_to_njt[True] | 0.3654s | 0.2825s | 3.5397 Ops/s | 3.5432 Ops/s | |
test_creation[device0] | 0.3442ms | 0.1290ms | 7.7515 KOps/s | 7.7318 KOps/s | |
test_creation_from_tensor | 0.3536ms | 0.1349ms | 7.4130 KOps/s | 7.5734 KOps/s | |
test_add_one[memmap_tensor0] | 0.1920ms | 9.0742μs | 110.2030 KOps/s | 108.9577 KOps/s | |
test_contiguous[memmap_tensor0] | 28.7700μs | 2.2242μs | 449.5964 KOps/s | 455.0590 KOps/s | |
test_stack[memmap_tensor0] | 44.3210μs | 6.9143μs | 144.6277 KOps/s | 142.6505 KOps/s | |
test_memmaptd_index | 1.0628ms | 0.4521ms | 2.2120 KOps/s | 2.2411 KOps/s | |
test_memmaptd_index_astensor | 0.7913ms | 0.5223ms | 1.9145 KOps/s | 1.9330 KOps/s | |
test_memmaptd_index_op | 1.4671ms | 1.0886ms | 918.5924 Ops/s | 948.8396 Ops/s | |
test_serialize_model | 0.1314s | 0.1302s | 7.6815 Ops/s | 7.6935 Ops/s | |
test_serialize_model_pickle | 1.3612s | 1.2166s | 0.8220 Ops/s | 0.8236 Ops/s | |
test_serialize_weights | 0.1312s | 0.1302s | 7.6806 Ops/s | 7.7149 Ops/s | |
test_serialize_weights_returnearly | 0.2413s | 63.0868ms | 15.8512 Ops/s | 18.1619 Ops/s | |
test_serialize_weights_pickle | 1.3697s | 1.2220s | 0.8183 Ops/s | 0.8382 Ops/s | |
test_reshape_pytree | 63.9810μs | 35.9113μs | 27.8464 KOps/s | 27.4553 KOps/s | |
test_reshape_td | 74.2310μs | 42.4497μs | 23.5573 KOps/s | 21.8817 KOps/s | |
test_view_pytree | 69.8810μs | 36.1139μs | 27.6902 KOps/s | 27.0243 KOps/s | |
test_view_td | 96.9620μs | 48.0660μs | 20.8047 KOps/s | 21.2325 KOps/s | |
test_unbind_pytree | 76.1710μs | 35.4824μs | 28.1830 KOps/s | 28.6585 KOps/s | |
test_unbind_td | 0.5644ms | 44.0766μs | 22.6878 KOps/s | 22.9953 KOps/s | |
test_split_pytree | 0.5734ms | 46.9196μs | 21.3131 KOps/s | 21.4071 KOps/s | |
test_split_td | 0.1791ms | 58.9396μs | 16.9665 KOps/s | 17.2427 KOps/s | |
test_add_pytree | 0.1098ms | 58.3665μs | 17.1331 KOps/s | 15.9888 KOps/s | |
test_add_td | 0.1699ms | 97.0100μs | 10.3082 KOps/s | 9.9450 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2172ms | 0.1631ms | 6.1327 KOps/s | 6.1130 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2905ms | 0.1630ms | 6.1349 KOps/s | 6.1734 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.3607ms | 0.1594ms | 6.2718 KOps/s | 6.2452 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2516ms | 0.1878ms | 5.3249 KOps/s | 5.0231 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 64.5220μs | 21.9615μs | 45.5342 KOps/s | 45.7283 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 79.5810μs | 49.6280μs | 20.1499 KOps/s | 20.3719 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4664ms | 64.9909μs | 15.3868 KOps/s | 15.3952 KOps/s | |
test_compile_copy_nested[pytree-eager] | 89.4110μs | 49.7008μs | 20.1204 KOps/s | 20.2017 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3800ms | 0.3215ms | 3.1101 KOps/s | 3.0927 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3143ms | 0.2304ms | 4.3411 KOps/s | 4.2955 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1817ms | 0.1335ms | 7.4881 KOps/s | 7.6395 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1785ms | 68.0735μs | 14.6900 KOps/s | 14.7162 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4903ms | 0.3362ms | 2.9746 KOps/s | 3.0313 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7903ms | 0.6545ms | 1.5279 KOps/s | 1.5331 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4435ms | 0.2893ms | 3.4565 KOps/s | 3.5042 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4725ms | 0.3301ms | 3.0296 KOps/s | 3.0899 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1679ms | 82.7514μs | 12.0844 KOps/s | 12.5207 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2904ms | 0.1381ms | 7.2411 KOps/s | 7.6142 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.7554ms | 0.5426ms | 1.8430 KOps/s | 1.7981 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4033ms | 0.3386ms | 2.9534 KOps/s | 3.0222 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 71.3410μs | 19.9336μs | 50.1665 KOps/s | 51.0579 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 76.2110μs | 38.5944μs | 25.9105 KOps/s | 26.2121 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1125ms | 69.8947μs | 14.3072 KOps/s | 14.2886 KOps/s | |
test_compile_copy_flat[pytree-eager] | 88.7820μs | 51.4224μs | 19.4468 KOps/s | 19.5588 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.5289ms | 0.8882ms | 1.1259 KOps/s | 1.1088 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.6716ms | 3.3804ms | 295.8260 Ops/s | 296.0345 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.4824ms | 0.8829ms | 1.1327 KOps/s | 1.0946 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.6454ms | 3.3741ms | 296.3732 Ops/s | 298.6134 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1891ms | 0.1291ms | 7.7477 KOps/s | 8.3317 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1956ms | 65.0053μs | 15.3834 KOps/s | 16.0779 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1710ms | 0.1210ms | 8.2633 KOps/s | 8.7372 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 94.0720μs | 46.9377μs | 21.3048 KOps/s | 20.8601 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1718ms | 0.1227ms | 8.1488 KOps/s | 8.2741 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 98.0920μs | 47.0325μs | 21.2619 KOps/s | 20.7284 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2182ms | 0.1569ms | 6.3748 KOps/s | 6.5581 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1611ms | 27.1722μs | 36.8023 KOps/s | 36.3397 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2012ms | 0.1476ms | 6.7748 KOps/s | 6.9058 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 58.2310μs | 21.2804μs | 46.9916 KOps/s | 45.5771 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2146ms | 0.1514ms | 6.6040 KOps/s | 6.9250 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1029ms | 21.2453μs | 47.0693 KOps/s | 46.7919 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2583ms | 0.1510ms | 6.6238 KOps/s | 6.6613 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4889ms | 25.9147μs | 38.5882 KOps/s | 38.0216 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2349ms | 0.1443ms | 6.9295 KOps/s | 6.9366 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 70.0410μs | 22.0001μs | 45.4544 KOps/s | 47.1263 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1906ms | 0.1523ms | 6.5640 KOps/s | 6.9709 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1092ms | 22.4767μs | 44.4906 KOps/s | 47.3117 KOps/s | |
test_mod_add[eager] | 73.6420μs | 35.7101μs | 28.0032 KOps/s | 30.4730 KOps/s | |
test_mod_add[compile] | 0.1536ms | 86.7590μs | 11.5262 KOps/s | 11.9345 KOps/s | |
test_mod_add[compile-overhead] | 0.3079ms | 0.1538ms | 6.5029 KOps/s | 6.2570 KOps/s | |
test_mod_wrap[eager] | 0.3213ms | 0.2612ms | 3.8283 KOps/s | 3.7986 KOps/s | |
test_mod_wrap[compile] | 1.4774ms | 0.3006ms | 3.3271 KOps/s | 3.2573 KOps/s | |
test_mod_wrap[compile-overhead] | 7.6998ms | 4.0456ms | 247.1815 Ops/s | 250.5227 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.7780ms | 1.3833ms | 722.9038 Ops/s | 659.5659 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.7281ms | 1.3805ms | 724.3693 Ops/s | 671.5750 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3531ms | 0.9238ms | 1.0825 KOps/s | 969.5488 Ops/s | |
test_seq_add[eager] | 0.1703ms | 0.1035ms | 9.6584 KOps/s | 9.8411 KOps/s | |
test_seq_add[compile] | 0.2846ms | 92.7136μs | 10.7859 KOps/s | 10.6824 KOps/s | |
test_seq_add[compile-overhead] | 0.1706ms | 0.1264ms | 7.9129 KOps/s | 7.9276 KOps/s | |
test_seq_wrap[eager] | 0.4655ms | 0.4066ms | 2.4591 KOps/s | 2.5692 KOps/s | |
test_seq_wrap[compile] | 0.4328ms | 0.3195ms | 3.1303 KOps/s | 2.9395 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2715ms | 0.2249ms | 4.4471 KOps/s | 4.4061 KOps/s | |
test_func_call_runtime[False-eager] | 0.8174ms | 0.7472ms | 1.3384 KOps/s | 1.2785 KOps/s | |
test_func_call_runtime[False-compile] | 0.8903ms | 0.8071ms | 1.2390 KOps/s | 1.2132 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4305ms | 0.3626ms | 2.7575 KOps/s | 2.7212 KOps/s | |
test_func_call_runtime[True-eager] | 0.9784ms | 0.9193ms | 1.0877 KOps/s | 1.0560 KOps/s | |
test_func_call_runtime[True-compile] | 1.1557ms | 0.8419ms | 1.1878 KOps/s | 1.1133 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4757ms | 0.3847ms | 2.5995 KOps/s | 2.5888 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8246ms | 0.7432ms | 1.3455 KOps/s | 1.2709 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9112ms | 0.8399ms | 1.1906 KOps/s | 1.2135 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4344ms | 0.3654ms | 2.7364 KOps/s | 2.7238 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1288ms | 1.0192ms | 981.1979 Ops/s | 951.6626 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.9470ms | 0.8568ms | 1.1671 KOps/s | 1.1504 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5045ms | 0.4072ms | 2.4558 KOps/s | 2.4307 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5721ms | 2.1138ms | 473.0709 Ops/s | 456.9252 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9462ms | 0.8719ms | 1.1470 KOps/s | 1.1227 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5054ms | 0.4175ms | 2.3951 KOps/s | 2.3865 KOps/s | |
test_distributed | 4.6462ms | 0.1999ms | 5.0013 KOps/s | 8.4112 KOps/s | |
test_tdmodule | 54.3600μs | 15.6687μs | 63.8216 KOps/s | 69.0712 KOps/s | |
test_tdmodule_dispatch | 60.3610μs | 30.7425μs | 32.5283 KOps/s | 36.7650 KOps/s | |
test_tdseq | 39.7210μs | 16.7627μs | 59.6563 KOps/s | 66.5648 KOps/s | |
test_tdseq_dispatch | 54.7210μs | 33.6382μs | 29.7281 KOps/s | 32.8211 KOps/s | |
test_instantiation_functorch | 2.0408ms | 1.8950ms | 527.7035 Ops/s | 524.1952 Ops/s | |
test_exec_functorch | 0.2999ms | 0.2116ms | 4.7252 KOps/s | 4.5700 KOps/s | |
test_exec_functional_call | 0.2587ms | 0.2129ms | 4.6967 KOps/s | 4.3597 KOps/s | |
test_exec_td_decorator | 0.4377ms | 0.2659ms | 3.7601 KOps/s | 3.6357 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8116ms | 0.6889ms | 1.4515 KOps/s | 1.4108 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8179ms | 0.6907ms | 1.4477 KOps/s | 1.3929 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7056ms | 0.6037ms | 1.6565 KOps/s | 1.5751 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7341ms | 0.6048ms | 1.6535 KOps/s | 1.5708 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.8737ms | 19.7956ms | 50.5163 Ops/s | 49.7888 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.8927ms | 19.7992ms | 50.5072 Ops/s | 49.8491 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.7446ms | 19.6578ms | 50.8705 Ops/s | 50.1061 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.8836ms | 19.6820ms | 50.8079 Ops/s | 50.2137 Ops/s | |
test_to_module_speed[True] | 1.4213ms | 1.0062ms | 993.8417 Ops/s | 996.5710 Ops/s | |
test_to_module_speed[False] | 1.3933ms | 0.9824ms | 1.0179 KOps/s | 1.0155 KOps/s | |
test_tc_init | 68.6610μs | 37.9908μs | 26.3222 KOps/s | 28.8119 KOps/s | |
test_tc_init_nested | 0.1218ms | 78.3289μs | 12.7667 KOps/s | 14.0810 KOps/s | |
test_tc_first_layer_tensor | 5.3844μs | 0.7055μs | 1.4175 MOps/s | 1.4476 MOps/s | |
test_tc_first_layer_nontensor | 22.8600μs | 2.3071μs | 433.4489 KOps/s | 436.6492 KOps/s | |
test_tc_second_layer_tensor | 12.0527μs | 1.4152μs | 706.6054 KOps/s | 673.5233 KOps/s | |
test_tc_second_layer_nontensor | 31.6510μs | 3.0174μs | 331.4149 KOps/s | 331.7546 KOps/s | |
test_unbind | 0.1935s | 9.5226ms | 105.0138 Ops/s | 91.7937 Ops/s | |
test_full_like | 0.6565ms | 0.5731ms | 1.7448 KOps/s | 1.7471 KOps/s | |
test_zeros_like | 0.2586ms | 0.1979ms | 5.0534 KOps/s | 5.0550 KOps/s | |
test_ones_like | 0.2545ms | 0.1977ms | 5.0587 KOps/s | 5.0596 KOps/s | |
test_clone | 0.4461ms | 0.4145ms | 2.4123 KOps/s | 2.4116 KOps/s | |
test_squeeze | 29.8100μs | 9.8382μs | 101.6444 KOps/s | 90.0261 KOps/s | |
test_unsqueeze | 0.2814ms | 76.8047μs | 13.0200 KOps/s | 12.2990 KOps/s | |
test_split | 0.1794s | 0.2026ms | 4.9362 KOps/s | 5.9923 KOps/s | |
test_permute | 0.2407ms | 0.1899ms | 5.2651 KOps/s | 5.3567 KOps/s | |
test_stack | 1.2584ms | 0.8519ms | 1.1739 KOps/s | 1.1777 KOps/s | |
test_cat | 1.2564ms | 1.2313ms | 812.1439 Ops/s | 812.1742 Ops/s |
vmoens
added a commit
that referenced
this pull request
Oct 16, 2024
ghstack-source-id: 8d0a33544643d5105d79896583d0e05e50d350e2 Pull Request resolved: #1043
vmoens
added a commit
that referenced
this pull request
Oct 16, 2024
ghstack-source-id: 683407f1fff61e6a44ddd41f510e859d053df5a7 Pull Request resolved: #1043
vmoens
added a commit
that referenced
this pull request
Oct 16, 2024
ghstack-source-id: ec35e5e2341cf926f87fe4f25746966e032927b7 Pull Request resolved: #1043
vmoens
added a commit
that referenced
this pull request
Oct 16, 2024
ghstack-source-id: 60f8c7d2ea2870a9ae51f1c74acb23555f087615 Pull Request resolved: #1043
vmoens
added a commit
that referenced
this pull request
Oct 16, 2024
ghstack-source-id: e192b1ba8598993e48ffa78d8b067eea6395bae5 Pull Request resolved: #1043
vmoens
added a commit
that referenced
this pull request
Oct 16, 2024
ghstack-source-id: 48025831ccce161787bdd18e4a6b1fa80a15c0ab Pull Request resolved: #1043
vmoens
added a commit
that referenced
this pull request
Oct 16, 2024
ghstack-source-id: 185f3b4dc68a60f68688a3c652972e09b687ad58 Pull Request resolved: #1043
vmoens
added a commit
that referenced
this pull request
Oct 16, 2024
ghstack-source-id: 2345c916deea3b070878f122302e0ff3378b5fcf Pull Request resolved: #1043
vmoens
added a commit
that referenced
this pull request
Oct 16, 2024
ghstack-source-id: 922360e440b1c5fa21aed61d6bc6e465f50dd2e7 Pull Request resolved: #1043
vmoens
added a commit
that referenced
this pull request
Oct 16, 2024
ghstack-source-id: 14d558692120ea48c40188f4eaaced9c506c0f17 Pull Request resolved: #1043
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Performance
Stack from ghstack (oldest at bottom):