[BugFix] Compatibility with non-tensor inputs in CudaGraphModule #1039
Merged
vmoens added a commit that referenced this pull request on Oct 11, 2024
ghstack-source-id: 3965461dd9b4b3684cf2013093797d5306c11008 Pull Request resolved: #1039
facebook-github-bot added the CLA Signed label on Oct 11, 2024
vmoens added a commit that referenced this pull request on Oct 11, 2024
ghstack-source-id: 756234dd11739d521ca04ac42cb0d83a1d361b6d Pull Request resolved: #1039
vmoens added a commit that referenced this pull request on Oct 11, 2024
ghstack-source-id: 3eff6c24b0fa381665823deb5a1efdb7d2cc1bd2 Pull Request resolved: #1039
vmoens added a commit that referenced this pull request on Oct 11, 2024
ghstack-source-id: f6f296cdcf00498cae4be818f15eec75309906ce Pull Request resolved: #1039
vmoens added a commit that referenced this pull request on Oct 11, 2024
ghstack-source-id: f09bb358e4941818dad4a0ae1e4d48d347492d6f Pull Request resolved: #1039
vmoens added a commit that referenced this pull request on Oct 11, 2024
ghstack-source-id: 61e14db945864a8bcf5c19934b7761b8a94d1fc8 Pull Request resolved: #1039
vmoens added a commit that referenced this pull request on Oct 11, 2024
ghstack-source-id: f5a48452c26ae0c28399355573fe0458e402574c Pull Request resolved: #1039
vmoens added a commit that referenced this pull request on Oct 11, 2024
ghstack-source-id: f5a48452c26ae0c28399355573fe0458e402574c Pull Request resolved: #1039
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 68.6680μs | 25.5609μs | 39.1222 KOps/s | 41.5688 KOps/s | |
test_plain_set_stack_nested | 90.6830μs | 26.1473μs | 38.2449 KOps/s | 41.0413 KOps/s | |
test_plain_set_nested_inplace | 90.4160μs | 28.1242μs | 35.5565 KOps/s | 37.8238 KOps/s | |
test_plain_set_stack_nested_inplace | 68.2080μs | 28.5149μs | 35.0694 KOps/s | 37.8899 KOps/s | |
test_items | 42.5390μs | 4.1822μs | 239.1058 KOps/s | 234.9301 KOps/s | |
test_items_nested | 0.5270ms | 0.3860ms | 2.5907 KOps/s | 2.5815 KOps/s | |
test_items_nested_locked | 0.7049ms | 0.3879ms | 2.5779 KOps/s | 2.5746 KOps/s | |
test_items_nested_leaf | 0.1532ms | 81.7796μs | 12.2280 KOps/s | 12.4404 KOps/s | |
test_items_stack_nested | 0.6681ms | 0.3884ms | 2.5749 KOps/s | 2.5679 KOps/s | |
test_items_stack_nested_leaf | 0.1524ms | 83.3459μs | 11.9982 KOps/s | 11.9533 KOps/s | |
test_items_stack_nested_locked | 0.8341ms | 0.3876ms | 2.5801 KOps/s | 2.5399 KOps/s | |
test_keys | 36.2750μs | 3.5054μs | 285.2717 KOps/s | 286.5756 KOps/s | |
test_keys_nested | 0.2553ms | 0.1398ms | 7.1506 KOps/s | 7.3854 KOps/s | |
test_keys_nested_locked | 1.4961ms | 0.1397ms | 7.1588 KOps/s | 7.0919 KOps/s | |
test_keys_nested_leaf | 0.1837ms | 0.1177ms | 8.4939 KOps/s | 8.3937 KOps/s | |
test_keys_stack_nested | 0.2543ms | 0.1355ms | 7.3815 KOps/s | 7.4291 KOps/s | |
test_keys_stack_nested_leaf | 0.2079ms | 0.1183ms | 8.4548 KOps/s | 8.5049 KOps/s | |
test_keys_stack_nested_locked | 0.2582ms | 0.1444ms | 6.9229 KOps/s | 7.1095 KOps/s | |
test_values | 9.0548μs | 1.0426μs | 959.1143 KOps/s | 946.4509 KOps/s | |
test_values_nested | 0.1722ms | 94.9326μs | 10.5338 KOps/s | 10.9274 KOps/s | |
test_values_nested_locked | 0.1580ms | 94.6643μs | 10.5636 KOps/s | 10.7749 KOps/s | |
test_values_nested_leaf | 0.1480ms | 81.2121μs | 12.3134 KOps/s | 12.3745 KOps/s | |
test_values_stack_nested | 0.1587ms | 96.8799μs | 10.3221 KOps/s | 10.6034 KOps/s | |
test_values_stack_nested_leaf | 0.1881ms | 80.8342μs | 12.3710 KOps/s | 12.5551 KOps/s | |
test_values_stack_nested_locked | 0.1576ms | 95.1402μs | 10.5108 KOps/s | 10.3305 KOps/s | |
test_membership | 37.6100μs | 0.9122μs | 1.0962 MOps/s | 1.1124 MOps/s | |
test_membership_nested | 37.7000μs | 2.8388μs | 352.2670 KOps/s | 364.4237 KOps/s | |
test_membership_nested_leaf | 51.1860μs | 2.8331μs | 352.9666 KOps/s | 362.3185 KOps/s | |
test_membership_stacked_nested | 17.6430μs | 2.8411μs | 351.9819 KOps/s | 363.7218 KOps/s | |
test_membership_stacked_nested_leaf | 38.9390μs | 2.8625μs | 349.3403 KOps/s | 360.0837 KOps/s | |
test_membership_nested_last | 37.3500μs | 4.3139μs | 231.8083 KOps/s | 238.8381 KOps/s | |
test_membership_nested_leaf_last | 41.7880μs | 4.2854μs | 233.3489 KOps/s | 239.3048 KOps/s | |
test_membership_stacked_nested_last | 37.4400μs | 5.6021μs | 178.5046 KOps/s | 136.5082 KOps/s | |
test_membership_stacked_nested_leaf_last | 48.4410μs | 5.5444μs | 180.3617 KOps/s | 135.7152 KOps/s | |
test_nested_getleaf | 47.6630μs | 10.4839μs | 95.3841 KOps/s | 92.7710 KOps/s | |
test_nested_get | 48.1000μs | 9.9818μs | 100.1823 KOps/s | 98.3399 KOps/s | |
test_stacked_getleaf | 52.3380μs | 10.5587μs | 94.7084 KOps/s | 94.4509 KOps/s | |
test_stacked_get | 46.7180μs | 10.0954μs | 99.0550 KOps/s | 98.6173 KOps/s | |
test_nested_getitemleaf | 47.6490μs | 11.0752μs | 90.2915 KOps/s | 89.8590 KOps/s | |
test_nested_getitem | 56.7870μs | 10.3668μs | 96.4616 KOps/s | 96.4934 KOps/s | |
test_stacked_getitemleaf | 53.0290μs | 11.0686μs | 90.3456 KOps/s | 91.2466 KOps/s | |
test_stacked_getitem | 58.6200μs | 10.3961μs | 96.1902 KOps/s | 97.2583 KOps/s | |
test_lock_nested | 89.4535ms | 0.5997ms | 1.6675 KOps/s | 1.9778 KOps/s | |
test_lock_stack_nested | 0.8378ms | 0.4730ms | 2.1141 KOps/s | 2.1290 KOps/s | |
test_unlock_nested | 99.7273ms | 0.5322ms | 1.8792 KOps/s | 2.3616 KOps/s | |
test_unlock_stack_nested | 0.6320ms | 0.3850ms | 2.5972 KOps/s | 2.5791 KOps/s | |
test_flatten_speed | 0.1928ms | 0.1019ms | 9.8111 KOps/s | 10.0484 KOps/s | |
test_unflatten_speed | 0.9445ms | 0.5163ms | 1.9367 KOps/s | 1.9405 KOps/s | |
test_common_ops | 3.7999ms | 1.2220ms | 818.3106 Ops/s | 898.5405 Ops/s | |
test_creation | 36.0570μs | 2.1618μs | 462.5734 KOps/s | 472.4821 KOps/s | |
test_creation_empty | 54.5220μs | 21.6864μs | 46.1118 KOps/s | 56.1588 KOps/s | |
test_creation_nested_1 | 75.5920μs | 25.1915μs | 39.6960 KOps/s | 48.4045 KOps/s | |
test_creation_nested_2 | 0.1013ms | 29.3001μs | 34.1296 KOps/s | 39.3737 KOps/s | |
test_clone | 0.1469ms | 17.2667μs | 57.9149 KOps/s | 56.3933 KOps/s | |
test_getitem[int] | 1.2246ms | 16.9448μs | 59.0153 KOps/s | 59.8372 KOps/s | |
test_getitem[slice_int] | 0.1646ms | 31.2298μs | 32.0207 KOps/s | 33.1821 KOps/s | |
test_getitem[range] | 0.2556ms | 57.0881μs | 17.5168 KOps/s | 17.4704 KOps/s | |
test_getitem[tuple] | 0.1425ms | 25.4526μs | 39.2888 KOps/s | 40.0696 KOps/s | |
test_getitem[list] | 0.3315ms | 52.5512μs | 19.0291 KOps/s | 18.8991 KOps/s | |
test_setitem_dim[int] | 82.1750μs | 33.4065μs | 29.9343 KOps/s | 30.8317 KOps/s | |
test_setitem_dim[slice_int] | 0.1117ms | 61.2164μs | 16.3355 KOps/s | 16.4271 KOps/s | |
test_setitem_dim[range] | 0.1354ms | 84.1958μs | 11.8771 KOps/s | 12.0553 KOps/s | |
test_setitem_dim[tuple] | 0.1177ms | 49.4817μs | 20.2095 KOps/s | 20.4041 KOps/s | |
test_setitem | 0.1773ms | 31.8906μs | 31.3572 KOps/s | 33.0976 KOps/s | |
test_set | 0.1633ms | 30.9857μs | 32.2730 KOps/s | 34.3446 KOps/s | |
test_set_shared | 3.9570ms | 0.2221ms | 4.5015 KOps/s | 4.6107 KOps/s | |
test_update | 0.2035ms | 40.3438μs | 24.7870 KOps/s | 27.2428 KOps/s | |
test_update_nested | 0.1957ms | 50.6821μs | 19.7308 KOps/s | 20.9333 KOps/s | |
test_update__nested | 0.4235ms | 45.5109μs | 21.9728 KOps/s | 22.1768 KOps/s | |
test_set_nested | 0.1718ms | 33.7762μs | 29.6067 KOps/s | 31.2477 KOps/s | |
test_set_nested_new | 0.1635ms | 38.4937μs | 25.9783 KOps/s | 27.1531 KOps/s | |
test_select | 0.2140ms | 58.0130μs | 17.2375 KOps/s | 18.0311 KOps/s | |
test_select_nested | 0.1223ms | 59.8790μs | 16.7004 KOps/s | 16.8351 KOps/s | |
test_exclude_nested | 0.1548ms | 76.9315μs | 12.9986 KOps/s | 13.3060 KOps/s | |
test_empty[True] | 0.6459ms | 0.3577ms | 2.7954 KOps/s | 2.8330 KOps/s | |
test_empty[False] | 10.3870μs | 1.2660μs | 789.8949 KOps/s | 805.6344 KOps/s | |
test_unbind_speed | 0.5711ms | 0.3044ms | 3.2851 KOps/s | 3.2731 KOps/s | |
test_unbind_speed_stack0 | 0.6610ms | 0.2957ms | 3.3814 KOps/s | 3.4083 KOps/s | |
test_unbind_speed_stack1 | 96.5183ms | 0.8092ms | 1.2358 KOps/s | 1.3589 KOps/s | |
test_split | 91.1060ms | 2.1886ms | 456.9046 Ops/s | 456.5758 Ops/s | |
test_chunk | 3.1672ms | 2.0099ms | 497.5456 Ops/s | 455.3382 Ops/s | |
test_creation[device0] | 3.3161ms | 0.1201ms | 8.3270 KOps/s | 8.7671 KOps/s | |
test_creation_from_tensor | 0.3255ms | 0.1172ms | 8.5325 KOps/s | 8.5932 KOps/s | |
test_add_one[memmap_tensor0] | 0.1748ms | 6.9227μs | 144.4519 KOps/s | 137.5889 KOps/s | |
test_contiguous[memmap_tensor0] | 33.8330μs | 1.9477μs | 513.4239 KOps/s | 533.9819 KOps/s | |
test_stack[memmap_tensor0] | 61.5260μs | 5.5022μs | 181.7439 KOps/s | 177.3345 KOps/s | |
test_memmaptd_index | 0.6604ms | 0.4139ms | 2.4160 KOps/s | 2.4659 KOps/s | |
test_memmaptd_index_astensor | 1.0218ms | 0.5193ms | 1.9256 KOps/s | 1.9606 KOps/s | |
test_memmaptd_index_op | 1.6408ms | 1.1209ms | 892.1718 Ops/s | 968.5715 Ops/s | |
test_serialize_model | 0.2386s | 0.1337s | 7.4811 Ops/s | 8.4839 Ops/s | |
test_serialize_model_pickle | 0.4477s | 0.3950s | 2.5314 Ops/s | 2.4156 Ops/s | |
test_serialize_weights | 0.1238s | 0.1155s | 8.6598 Ops/s | 8.2248 Ops/s | |
test_serialize_weights_returnearly | 0.2639s | 0.1727s | 5.7908 Ops/s | 5.5926 Ops/s | |
test_serialize_weights_pickle | 1.0130s | 0.6918s | 1.4454 Ops/s | 2.5175 Ops/s | |
test_serialize_weights_filesystem | 0.1506s | 0.1431s | 6.9882 Ops/s | 7.1176 Ops/s | |
test_serialize_model_filesystem | 0.1559s | 0.1443s | 6.9319 Ops/s | 6.6802 Ops/s | |
test_reshape_pytree | 85.0590μs | 38.6935μs | 25.8441 KOps/s | 25.3629 KOps/s | |
test_reshape_td | 95.3090μs | 44.9784μs | 22.2329 KOps/s | 22.0480 KOps/s | |
test_view_pytree | 95.3790μs | 38.9839μs | 25.6516 KOps/s | 26.0911 KOps/s | |
test_view_td | 0.1137ms | 52.1441μs | 19.1776 KOps/s | 19.3224 KOps/s | |
test_unbind_pytree | 79.1890μs | 35.9758μs | 27.7965 KOps/s | 27.6849 KOps/s | |
test_unbind_td | 0.3032ms | 45.2341μs | 22.1072 KOps/s | 21.9040 KOps/s | |
test_split_pytree | 91.8120μs | 37.8812μs | 26.3983 KOps/s | 25.8690 KOps/s | |
test_split_td | 0.4998ms | 57.7244μs | 17.3237 KOps/s | 17.2897 KOps/s | |
test_add_pytree | 0.1019ms | 44.8682μs | 22.2875 KOps/s | 22.1637 KOps/s | |
test_add_td | 0.2580ms | 90.4067μs | 11.0611 KOps/s | 11.8552 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1543ms | 59.3970μs | 16.8359 KOps/s | 17.4008 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4438ms | 0.1943ms | 5.1479 KOps/s | 5.0858 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1292ms | 56.6683μs | 17.6466 KOps/s | 17.6242 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3300ms | 0.1368ms | 7.3089 KOps/s | 7.1393 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 58.8610μs | 23.4061μs | 42.7238 KOps/s | 41.7495 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1758ms | 76.4328μs | 13.0834 KOps/s | 13.3360 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1550ms | 75.4810μs | 13.2484 KOps/s | 12.9833 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1477ms | 69.0063μs | 14.4914 KOps/s | 14.5170 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2690ms | 0.1859ms | 5.3795 KOps/s | 5.5532 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4361ms | 0.2415ms | 4.1401 KOps/s | 4.1599 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1195ms | 49.5013μs | 20.2015 KOps/s | 21.0713 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.6788ms | 76.5115μs | 13.0699 KOps/s | 12.7764 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2779ms | 0.1731ms | 5.7756 KOps/s | 5.7547 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3792ms | 0.2801ms | 3.5695 KOps/s | 3.4207 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.5562ms | 0.2781ms | 3.5953 KOps/s | 3.6315 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.6518ms | 0.1873ms | 5.3399 KOps/s | 5.5452 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1863ms | 76.5071μs | 13.0707 KOps/s | 13.4980 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1251ms | 49.3500μs | 20.2634 KOps/s | 20.6997 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4994ms | 0.2287ms | 4.3724 KOps/s | 4.2532 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2387ms | 0.1754ms | 5.7007 KOps/s | 5.7733 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2100ms | 0.1108ms | 9.0244 KOps/s | 9.0277 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1572ms | 82.8723μs | 12.0668 KOps/s | 12.1751 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1471ms | 77.9259μs | 12.8327 KOps/s | 12.3232 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1698ms | 69.9335μs | 14.2993 KOps/s | 13.6020 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2951ms | 0.1933ms | 5.1745 KOps/s | 5.1878 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.0485ms | 1.7646ms | 566.7028 Ops/s | 558.8199 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3601ms | 0.1892ms | 5.2862 KOps/s | 5.2120 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3326ms | 1.0641ms | 939.7382 Ops/s | 906.8611 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5336ms | 0.4097ms | 2.4408 KOps/s | 2.4296 KOps/s | |
test_compile_assign_and_add_stack[eager] | 6.4719ms | 4.2142ms | 237.2906 Ops/s | 255.8997 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 99.4360μs | 34.2576μs | 29.1906 KOps/s | 29.3765 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.0700ms | 47.9226μs | 20.8670 KOps/s | 21.0265 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 99.5970μs | 29.7912μs | 33.5669 KOps/s | 33.9720 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 96.5610μs | 29.6479μs | 33.7292 KOps/s | 33.5001 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 98.8050μs | 29.4615μs | 33.9426 KOps/s | 34.4206 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1061ms | 29.4809μs | 33.9203 KOps/s | 34.6707 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1804ms | 74.6895μs | 13.3888 KOps/s | 13.6045 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5405ms | 27.8492μs | 35.9077 KOps/s | 35.2151 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1491ms | 68.4474μs | 14.6098 KOps/s | 14.6393 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 86.6620μs | 22.9814μs | 43.5134 KOps/s | 42.8390 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1601ms | 68.1011μs | 14.6841 KOps/s | 14.7263 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 75.1610μs | 22.8500μs | 43.7637 KOps/s | 42.3332 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1715ms | 72.7920μs | 13.7378 KOps/s | 13.4590 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.3984s | 43.4155μs | 23.0333 KOps/s | 36.7152 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1286ms | 67.9327μs | 14.7204 KOps/s | 14.7863 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 75.4320μs | 22.5543μs | 44.3375 KOps/s | 43.0920 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1441ms | 67.2737μs | 14.8647 KOps/s | 14.8154 KOps/s | |
test_compile_indexing[int-pytree-eager] | 65.0910μs | 22.7458μs | 43.9642 KOps/s | 43.1101 KOps/s | |
test_mod_add[eager] | 81.0420μs | 26.7137μs | 37.4339 KOps/s | 39.5779 KOps/s | |
test_mod_add[compile] | 82.6350μs | 38.1781μs | 26.1930 KOps/s | 26.6170 KOps/s | |
test_mod_add[compile-overhead] | 0.1118ms | 38.1698μs | 26.1987 KOps/s | 26.6869 KOps/s | |
test_mod_wrap[eager] | 0.3740ms | 0.2042ms | 4.8975 KOps/s | 4.8439 KOps/s | |
test_mod_wrap[compile] | 0.4345ms | 0.2291ms | 4.3640 KOps/s | 4.3471 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4312ms | 0.2280ms | 4.3852 KOps/s | 4.4068 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.8316ms | 11.8182ms | 84.6153 Ops/s | 90.7208 Ops/s | |
test_mod_wrap_and_backward[compile] | 14.2215ms | 11.4195ms | 87.5695 Ops/s | 91.8601 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 15.6912ms | 12.3351ms | 81.0698 Ops/s | 92.1110 Ops/s | |
test_seq_add[eager] | 0.1714ms | 96.6755μs | 10.3439 KOps/s | 11.0069 KOps/s | |
test_seq_add[compile] | 0.1349ms | 64.1625μs | 15.5854 KOps/s | 15.4166 KOps/s | |
test_seq_add[compile-overhead] | 0.1405ms | 63.7505μs | 15.6862 KOps/s | 15.4104 KOps/s | |
test_seq_wrap[eager] | 0.7068ms | 0.3860ms | 2.5909 KOps/s | 2.6518 KOps/s | |
test_seq_wrap[compile] | 0.4818ms | 0.2641ms | 3.7869 KOps/s | 3.7308 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4964ms | 0.2636ms | 3.7939 KOps/s | 3.7605 KOps/s | |
test_func_call_runtime[False-eager] | 0.9878ms | 0.5084ms | 1.9671 KOps/s | 1.8828 KOps/s | |
test_func_call_runtime[False-compile] | 0.6433ms | 0.4891ms | 2.0448 KOps/s | 2.0304 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6603ms | 0.4919ms | 2.0329 KOps/s | 2.0339 KOps/s | |
test_func_call_runtime[True-eager] | 0.8584ms | 0.7250ms | 1.3793 KOps/s | 1.3272 KOps/s | |
test_func_call_runtime[True-compile] | 0.7807ms | 0.5082ms | 1.9677 KOps/s | 1.9675 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.7129ms | 0.5109ms | 1.9574 KOps/s | 1.9560 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9305ms | 0.5169ms | 1.9348 KOps/s | 1.9200 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6114ms | 0.4923ms | 2.0311 KOps/s | 2.0185 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6484ms | 0.4976ms | 2.0096 KOps/s | 2.0228 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0805ms | 0.8730ms | 1.1454 KOps/s | 1.1191 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0016ms | 0.7143ms | 1.3999 KOps/s | 1.3499 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0233ms | 0.7109ms | 1.4067 KOps/s | 1.3476 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5226ms | 1.8812ms | 531.5659 Ops/s | 518.4103 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.6473ms | 1.9369ms | 516.2857 Ops/s | 507.9148 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.6104ms | 1.9308ms | 517.9296 Ops/s | 509.6742 Ops/s | |
test_distributed | 0.2952ms | 0.1275ms | 7.8420 KOps/s | 7.7174 KOps/s | |
test_tdmodule | 87.5940μs | 19.6569μs | 50.8728 KOps/s | 57.1917 KOps/s | |
test_tdmodule_dispatch | 58.1990μs | 38.8629μs | 25.7315 KOps/s | 28.4388 KOps/s | |
test_tdseq | 59.8420μs | 22.1425μs | 45.1619 KOps/s | 51.1099 KOps/s | |
test_tdseq_dispatch | 0.1061ms | 46.3251μs | 21.5866 KOps/s | 24.6488 KOps/s | |
test_instantiation_functorch | 1.7108ms | 1.5617ms | 640.3273 Ops/s | 627.5378 Ops/s | |
test_exec_functorch | 0.2721ms | 0.1829ms | 5.4685 KOps/s | 5.5778 KOps/s | |
test_exec_functional_call | 0.3031ms | 0.1694ms | 5.9040 KOps/s | 5.7886 KOps/s | |
test_exec_td_decorator | 0.5168ms | 0.2267ms | 4.4105 KOps/s | 4.2915 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8866ms | 0.6395ms | 1.5636 KOps/s | 1.5739 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0184ms | 0.6420ms | 1.5577 KOps/s | 1.5631 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9681ms | 0.5307ms | 1.8842 KOps/s | 1.8891 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7866ms | 0.5208ms | 1.9201 KOps/s | 1.8782 KOps/s | |
test_to_module_speed[True] | 1.7326ms | 1.4337ms | 697.5130 Ops/s | 704.9324 Ops/s | |
test_to_module_speed[False] | 1.9690ms | 1.4132ms | 707.6301 Ops/s | 723.3719 Ops/s | |
test_tc_init | 0.1245ms | 51.0511μs | 19.5882 KOps/s | 21.2275 KOps/s | |
test_tc_init_nested | 0.1885ms | 0.1037ms | 9.6409 KOps/s | 10.7345 KOps/s | |
test_tc_first_layer_tensor | 43.8920μs | 1.6058μs | 622.7614 KOps/s | 663.2911 KOps/s | |
test_tc_first_layer_nontensor | 29.0440μs | 4.9544μs | 201.8401 KOps/s | 211.4343 KOps/s | |
test_tc_second_layer_tensor | 42.3700μs | 2.9922μs | 334.2058 KOps/s | 356.0910 KOps/s | |
test_tc_second_layer_nontensor | 35.9170μs | 6.3340μs | 157.8789 KOps/s | 164.2914 KOps/s | |
test_unbind | 0.4440s | 13.0013ms | 76.9155 Ops/s | 75.2684 Ops/s | |
test_full_like | 8.5405ms | 7.2319ms | 138.2754 Ops/s | 134.3570 Ops/s | |
test_zeros_like | 12.8500ms | 7.3232ms | 136.5517 Ops/s | 351.0415 Ops/s | |
test_ones_like | 15.1793ms | 7.4877ms | 133.5517 Ops/s | 162.1008 Ops/s | |
test_clone | 15.0942ms | 8.9759ms | 111.4099 Ops/s | 125.0199 Ops/s | |
test_squeeze | 61.4660μs | 12.8262μs | 77.9656 KOps/s | 77.0240 KOps/s | |
test_unsqueeze | 0.1673ms | 92.7590μs | 10.7806 KOps/s | 10.5117 KOps/s | |
test_split | 0.4916ms | 0.1946ms | 5.1384 KOps/s | 5.0088 KOps/s | |
test_permute | 0.3569ms | 0.2143ms | 4.6673 KOps/s | 4.4731 KOps/s | |
test_stack | 28.9969ms | 24.2488ms | 41.2391 Ops/s | 39.1042 Ops/s | |
test_cat | 27.3352ms | 24.0222ms | 41.6282 Ops/s | 39.3059 Ops/s |
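
The Change column is empty in the rows above. As a rough sketch only, the relative difference between Ops and Ops on Repo HEAD could be recomputed from a row as shown below. The helper names `parse_ops` and `relative_change` are hypothetical, and treating Change as a percentage relative to Repo HEAD is an assumption, not something the benchmark report states.

```python
from __future__ import annotations

# Multipliers for the unit prefixes that appear in the Ops columns.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '39.1222 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, percent change of Ops versus Ops on Repo HEAD).

    Assumption: 'Change' means 100 * (Ops - HEAD) / HEAD.
    """
    cells = [c.strip() for c in row.split("|")]
    name, ops, head = cells[0], parse_ops(cells[3]), parse_ops(cells[4])
    return name, 100.0 * (ops - head) / head


# One row copied verbatim from the table above.
row = "test_plain_set_nested | 68.6680μs | 25.5609μs | 39.1222 KOps/s | 41.5688 KOps/s | |"
name, change = relative_change(row)
print(f"{name}: {change:+.2f}%")
```

A negative value means the PR branch measured fewer operations per second than Repo HEAD for that test. Differences of a few percent on microsecond-scale benchmarks may well be within run-to-run noise.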

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1638ms | 16.6855μs | 59.9321 KOps/s | 55.1854 KOps/s | |
test_plain_set_stack_nested | 44.3010μs | 16.7760μs | 59.6088 KOps/s | 54.5676 KOps/s | |
test_plain_set_nested_inplace | 51.6710μs | 18.0707μs | 55.3383 KOps/s | 51.2363 KOps/s | |
test_plain_set_stack_nested_inplace | 51.2110μs | 18.0208μs | 55.4915 KOps/s | 51.5282 KOps/s | |
test_items | 25.0100μs | 2.9777μs | 335.8285 KOps/s | 329.3874 KOps/s | |
test_items_nested | 0.3820ms | 0.3422ms | 2.9226 KOps/s | 2.9415 KOps/s | |
test_items_nested_locked | 0.3976ms | 0.3460ms | 2.8904 KOps/s | 2.9250 KOps/s | |
test_items_nested_leaf | 97.8620μs | 62.6628μs | 15.9584 KOps/s | 15.8679 KOps/s | |
test_items_stack_nested | 0.3993ms | 0.3469ms | 2.8823 KOps/s | 2.9574 KOps/s | |
test_items_stack_nested_leaf | 99.0730μs | 64.3699μs | 15.5352 KOps/s | 15.8721 KOps/s | |
test_items_stack_nested_locked | 0.3861ms | 0.3476ms | 2.8770 KOps/s | 2.9382 KOps/s | |
test_keys | 34.0510μs | 3.5176μs | 284.2809 KOps/s | 288.2366 KOps/s | |
test_keys_nested | 0.1435ms | 71.9040μs | 13.9074 KOps/s | 14.0864 KOps/s | |
test_keys_nested_locked | 2.3270ms | 78.0250μs | 12.8164 KOps/s | 12.8048 KOps/s | |
test_keys_nested_leaf | 85.5220μs | 63.0413μs | 15.8626 KOps/s | 16.1462 KOps/s | |
test_keys_stack_nested | 0.1078ms | 73.0579μs | 13.6878 KOps/s | 13.7224 KOps/s | |
test_keys_stack_nested_leaf | 92.8020μs | 64.9582μs | 15.3945 KOps/s | 15.7413 KOps/s | |
test_keys_stack_nested_locked | 0.1115ms | 78.8803μs | 12.6774 KOps/s | 12.8138 KOps/s | |
test_values | 6.6368μs | 0.8835μs | 1.1319 MOps/s | 1.1242 MOps/s | |
test_values_nested | 0.1162ms | 48.8264μs | 20.4807 KOps/s | 20.3163 KOps/s | |
test_values_nested_locked | 72.6820μs | 50.8498μs | 19.6657 KOps/s | 19.6397 KOps/s | |
test_values_nested_leaf | 65.9610μs | 43.5726μs | 22.9502 KOps/s | 23.2398 KOps/s | |
test_values_stack_nested | 74.2010μs | 50.8001μs | 19.6850 KOps/s | 20.1578 KOps/s | |
test_values_stack_nested_leaf | 80.2810μs | 44.5285μs | 22.4575 KOps/s | 22.8818 KOps/s | |
test_values_stack_nested_locked | 78.4120μs | 52.5020μs | 19.0469 KOps/s | 19.3749 KOps/s | |
test_membership | 1.7050μs | 0.5351μs | 1.8687 MOps/s | 1.8775 MOps/s | |
test_membership_nested | 19.2705μs | 1.9833μs | 504.2115 KOps/s | 500.5065 KOps/s | |
test_membership_nested_leaf | 14.8600μs | 1.9921μs | 501.9769 KOps/s | 513.0412 KOps/s | |
test_membership_stacked_nested | 35.8310μs | 2.0647μs | 484.3360 KOps/s | 499.8336 KOps/s | |
test_membership_stacked_nested_leaf | 33.1610μs | 2.0584μs | 485.8164 KOps/s | 494.3424 KOps/s | |
test_membership_nested_last | 33.0810μs | 3.1096μs | 321.5813 KOps/s | 324.3490 KOps/s | |
test_membership_nested_leaf_last | 36.0010μs | 3.1278μs | 319.7144 KOps/s | 326.1055 KOps/s | |
test_membership_stacked_nested_last | 39.9710μs | 4.4993μs | 222.2549 KOps/s | 327.2699 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.2310μs | 4.4850μs | 222.9636 KOps/s | 325.4662 KOps/s | |
test_nested_getleaf | 42.3410μs | 6.1998μs | 161.2947 KOps/s | 160.4330 KOps/s | |
test_nested_get | 39.6510μs | 5.8866μs | 169.8779 KOps/s | 169.4269 KOps/s | |
test_stacked_getleaf | 27.4310μs | 6.2417μs | 160.2133 KOps/s | 160.6070 KOps/s | |
test_stacked_get | 31.9010μs | 5.8209μs | 171.7957 KOps/s | 173.0850 KOps/s | |
test_nested_getitemleaf | 26.8310μs | 6.3190μs | 158.2532 KOps/s | 159.0815 KOps/s | |
test_nested_getitem | 41.4010μs | 5.7831μs | 172.9168 KOps/s | 168.6792 KOps/s | |
test_stacked_getitemleaf | 25.2200μs | 6.2621μs | 159.6908 KOps/s | 160.0689 KOps/s | |
test_stacked_getitem | 36.9610μs | 5.8546μs | 170.8061 KOps/s | 170.1766 KOps/s | |
test_lock_nested | 5.0274ms | 0.4549ms | 2.1984 KOps/s | 2.3029 KOps/s | |
test_lock_stack_nested | 0.4454ms | 0.4077ms | 2.4530 KOps/s | 2.5164 KOps/s | |
test_unlock_nested | 0.7779ms | 0.3773ms | 2.6501 KOps/s | 2.7161 KOps/s | |
test_unlock_stack_nested | 0.3734ms | 0.3404ms | 2.9379 KOps/s | 3.0092 KOps/s | |
test_flatten_speed | 0.1170ms | 77.6010μs | 12.8864 KOps/s | 12.9764 KOps/s | |
test_unflatten_speed | 0.3734ms | 0.3311ms | 3.0205 KOps/s | 3.0725 KOps/s | |
test_common_ops | 1.5734ms | 1.2308ms | 812.4669 Ops/s | 778.7113 Ops/s | |
test_creation | 22.3800μs | 1.5560μs | 642.6835 KOps/s | 651.5088 KOps/s | |
test_creation_empty | 45.2010μs | 15.1133μs | 66.1669 KOps/s | 54.6025 KOps/s | |
test_creation_nested_1 | 44.0010μs | 16.8857μs | 59.2219 KOps/s | 47.7056 KOps/s | |
test_creation_nested_2 | 54.2610μs | 19.6859μs | 50.7978 KOps/s | 41.9500 KOps/s | |
test_clone | 60.0310μs | 27.5019μs | 36.3612 KOps/s | 35.9424 KOps/s | |
test_getitem[int] | 1.2710ms | 16.1128μs | 62.0624 KOps/s | 63.3410 KOps/s | |
test_getitem[slice_int] | 92.7539ms | 38.8388μs | 25.7474 KOps/s | 36.8771 KOps/s | |
test_getitem[range] | 0.2148ms | 0.1042ms | 9.5940 KOps/s | 9.4816 KOps/s | |
test_getitem[tuple] | 0.1236ms | 23.8936μs | 41.8522 KOps/s | 40.7321 KOps/s | |
test_getitem[list] | 0.1838ms | 94.0290μs | 10.6350 KOps/s | 10.0418 KOps/s | |
test_setitem_dim[int] | 67.1720μs | 42.2750μs | 23.6546 KOps/s | 21.5672 KOps/s | |
test_setitem_dim[slice_int] | 90.4820μs | 64.0936μs | 15.6022 KOps/s | 15.5505 KOps/s | |
test_setitem_dim[range] | 0.1529ms | 0.1226ms | 8.1568 KOps/s | 8.1159 KOps/s | |
test_setitem_dim[tuple] | 85.9020μs | 62.8826μs | 15.9027 KOps/s | 17.0291 KOps/s | |
test_setitem | 88.7620μs | 44.2428μs | 22.6025 KOps/s | 23.9670 KOps/s | |
test_set | 85.6920μs | 43.4240μs | 23.0287 KOps/s | 24.3297 KOps/s | |
test_set_shared | 0.3498ms | 54.5132μs | 18.3442 KOps/s | 18.8673 KOps/s | |
test_update | 89.6120μs | 48.3047μs | 20.7019 KOps/s | 18.8238 KOps/s | |
test_update_nested | 99.9920μs | 55.4945μs | 18.0198 KOps/s | 16.0261 KOps/s | |
test_update__nested | 0.1837ms | 61.7679μs | 16.1896 KOps/s | 16.5999 KOps/s | |
test_set_nested | 76.9420μs | 41.6296μs | 24.0214 KOps/s | 22.6074 KOps/s | |
test_set_nested_new | 98.1130μs | 45.2258μs | 22.1113 KOps/s | 20.8934 KOps/s | |
test_select | 0.1038ms | 59.2452μs | 16.8790 KOps/s | 16.5600 KOps/s | |
test_select_nested | 0.4399ms | 45.6784μs | 21.8922 KOps/s | 22.3497 KOps/s | |
test_exclude_nested | 0.1115ms | 61.7320μs | 16.1990 KOps/s | 16.1919 KOps/s | |
test_empty[True] | 0.3034ms | 0.2653ms | 3.7689 KOps/s | 3.8101 KOps/s | |
test_empty[False] | 3.0501μs | 0.7592μs | 1.3172 MOps/s | 1.3077 MOps/s | |
test_to | 57.8110μs | 26.7585μs | 37.3713 KOps/s | 36.0802 KOps/s | |
test_to_nonblocking | 67.1320μs | 26.2691μs | 38.0676 KOps/s | 38.7117 KOps/s | |
test_unbind_speed | 0.3252ms | 0.2871ms | 3.4830 KOps/s | 3.5531 KOps/s | |
test_unbind_speed_stack0 | 0.3286ms | 0.2881ms | 3.4704 KOps/s | 3.6168 KOps/s | |
test_unbind_speed_stack1 | 92.1259ms | 0.7323ms | 1.3655 KOps/s | 1.3969 KOps/s | |
test_split | 94.1119ms | 2.1896ms | 456.7016 Ops/s | 459.7550 Ops/s | |
test_chunk | 95.5782ms | 2.1873ms | 457.1811 Ops/s | 459.7560 Ops/s | |
test_creation[device0] | 0.3556ms | 0.1257ms | 7.9562 KOps/s | 7.8731 KOps/s | |
test_creation_from_tensor | 0.4395ms | 0.1284ms | 7.7860 KOps/s | 7.5137 KOps/s | |
test_add_one[memmap_tensor0] | 0.2337ms | 8.7571μs | 114.1930 KOps/s | 109.4873 KOps/s | |
test_contiguous[memmap_tensor0] | 33.6210μs | 2.1764μs | 459.4643 KOps/s | 453.8293 KOps/s | |
test_stack[memmap_tensor0] | 38.1210μs | 6.8399μs | 146.2002 KOps/s | 153.5152 KOps/s | |
test_memmaptd_index | 1.2629ms | 0.4243ms | 2.3566 KOps/s | 2.3637 KOps/s | |
test_memmaptd_index_astensor | 0.9803ms | 0.4963ms | 2.0149 KOps/s | 2.0114 KOps/s | |
test_memmaptd_index_op | 1.4217ms | 1.0218ms | 978.6520 Ops/s | 940.8073 Ops/s | |
test_serialize_model | 0.1306s | 0.1299s | 7.6994 Ops/s | 7.6720 Ops/s | |
test_serialize_model_pickle | 1.3521s | 1.2175s | 0.8214 Ops/s | 0.8242 Ops/s | |
test_serialize_weights | 0.1306s | 0.1301s | 7.6868 Ops/s | 7.7118 Ops/s | |
test_serialize_weights_returnearly | 0.2158s | 56.6634ms | 17.6481 Ops/s | 17.8598 Ops/s | |
test_serialize_weights_pickle | 1.3471s | 1.1856s | 0.8435 Ops/s | 0.8218 Ops/s | |
test_reshape_pytree | 70.4820μs | 39.2319μs | 25.4894 KOps/s | 27.0514 KOps/s | |
test_reshape_td | 80.3320μs | 46.8819μs | 21.3302 KOps/s | 23.1507 KOps/s | |
test_view_pytree | 73.1920μs | 38.7283μs | 25.8209 KOps/s | 27.4944 KOps/s | |
test_view_td | 0.1116ms | 49.6978μs | 20.1216 KOps/s | 20.7581 KOps/s | |
test_unbind_pytree | 69.6810μs | 35.0751μs | 28.5103 KOps/s | 29.1902 KOps/s | |
test_unbind_td | 0.5031ms | 44.2393μs | 22.6044 KOps/s | 23.1951 KOps/s | |
test_split_pytree | 0.1080ms | 47.5342μs | 21.0375 KOps/s | 21.3982 KOps/s | |
test_split_td | 0.6599ms | 57.5211μs | 17.3849 KOps/s | 15.3502 KOps/s | |
test_add_pytree | 92.3320μs | 55.9325μs | 17.8787 KOps/s | 17.4784 KOps/s | |
test_add_td | 0.1340ms | 91.3042μs | 10.9524 KOps/s | 9.7555 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2451ms | 0.1602ms | 6.2435 KOps/s | 6.0786 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2917ms | 0.1629ms | 6.1397 KOps/s | 6.2090 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1999ms | 0.1522ms | 6.5688 KOps/s | 6.4492 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2331ms | 0.1824ms | 5.4813 KOps/s | 5.4822 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 60.1310μs | 22.2424μs | 44.9591 KOps/s | 45.2081 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 93.0620μs | 48.4121μs | 20.6560 KOps/s | 20.2864 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4360ms | 65.3987μs | 15.2908 KOps/s | 14.9866 KOps/s | |
test_compile_copy_nested[pytree-eager] | 80.8420μs | 51.3374μs | 19.4790 KOps/s | 19.3675 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3545ms | 0.3163ms | 3.1614 KOps/s | 3.1299 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3163ms | 0.2320ms | 4.3104 KOps/s | 4.3117 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1661ms | 0.1266ms | 7.8995 KOps/s | 7.8143 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1228ms | 65.0247μs | 15.3788 KOps/s | 14.9531 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3812ms | 0.3249ms | 3.0778 KOps/s | 3.0343 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6954ms | 0.6189ms | 1.6159 KOps/s | 1.6546 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4249ms | 0.2825ms | 3.5401 KOps/s | 3.5256 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3733ms | 0.3201ms | 3.1241 KOps/s | 3.1213 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1538ms | 78.2538μs | 12.7789 KOps/s | 12.8435 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1763ms | 0.1277ms | 7.8303 KOps/s | 7.7407 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6343ms | 0.5192ms | 1.9259 KOps/s | 1.9503 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3791ms | 0.3250ms | 3.0770 KOps/s | 3.0307 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 49.0520μs | 19.9900μs | 50.0250 KOps/s | 51.5626 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 70.9310μs | 39.7833μs | 25.1362 KOps/s | 25.2256 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1084ms | 71.7595μs | 13.9354 KOps/s | 13.9247 KOps/s | |
test_compile_copy_flat[pytree-eager] | 87.2220μs | 53.5551μs | 18.6724 KOps/s | 19.0043 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3736ms | 0.8339ms | 1.1991 KOps/s | 1.1234 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.2985ms | 3.1249ms | 320.0059 Ops/s | 315.0157 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3846ms | 0.8328ms | 1.2007 KOps/s | 1.0929 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.2782ms | 3.1069ms | 321.8671 Ops/s | 323.5072 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2000ms | 0.1202ms | 8.3165 KOps/s | 8.5074 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1880ms | 63.3903μs | 15.7753 KOps/s | 15.9382 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1768ms | 0.1110ms | 9.0126 KOps/s | 8.8531 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1038ms | 45.9610μs | 21.7576 KOps/s | 22.8745 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1652ms | 0.1168ms | 8.5587 KOps/s | 8.3283 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1104ms | 44.6229μs | 22.4100 KOps/s | 21.5430 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2030ms | 0.1476ms | 6.7764 KOps/s | 6.9179 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1459ms | 25.1804μs | 39.7134 KOps/s | 40.2599 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1967ms | 0.1432ms | 6.9843 KOps/s | 7.2206 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 59.9410μs | 20.8403μs | 47.9840 KOps/s | 46.0266 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1963ms | 0.1442ms | 6.9371 KOps/s | 7.1573 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 60.2710μs | 20.5491μs | 48.6639 KOps/s | 47.3919 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2587ms | 0.1476ms | 6.7729 KOps/s | 6.7884 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4697ms | 24.8065μs | 40.3120 KOps/s | 39.7571 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1961ms | 0.1432ms | 6.9825 KOps/s | 7.1772 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 57.9710μs | 20.8092μs | 48.0556 KOps/s | 47.8228 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1894ms | 0.1442ms | 6.9327 KOps/s | 7.1763 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.3293ms | 20.6698μs | 48.3799 KOps/s | 47.6125 KOps/s | |
test_mod_add[eager] | 75.1120μs | 33.1251μs | 30.1886 KOps/s | 30.1242 KOps/s | |
test_mod_add[compile] | 0.1229ms | 84.0373μs | 11.8995 KOps/s | 11.8354 KOps/s | |
test_mod_add[compile-overhead] | 0.2924ms | 0.1474ms | 6.7860 KOps/s | 6.5127 KOps/s | |
test_mod_wrap[eager] | 0.3157ms | 0.2440ms | 4.0984 KOps/s | 4.0047 KOps/s | |
test_mod_wrap[compile] | 1.3793ms | 0.2989ms | 3.3455 KOps/s | 3.3262 KOps/s | |
test_mod_wrap[compile-overhead] | 7.6636ms | 4.0512ms | 246.8407 Ops/s | 247.9119 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4182ms | 1.2965ms | 771.3288 Ops/s | 714.9866 Ops/s | |
test_mod_wrap_and_backward[compile] | 7.5924ms | 1.3230ms | 755.8310 Ops/s | 708.8568 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3231ms | 0.8908ms | 1.1226 KOps/s | 942.6499 Ops/s | |
test_seq_add[eager] | 0.1471ms | 96.5888μs | 10.3532 KOps/s | 9.8774 KOps/s | |
test_seq_add[compile] | 0.2690ms | 89.6352μs | 11.1563 KOps/s | 11.2082 KOps/s | |
test_seq_add[compile-overhead] | 0.1674ms | 0.1229ms | 8.1397 KOps/s | 7.7378 KOps/s | |
test_seq_wrap[eager] | 0.4374ms | 0.3665ms | 2.7286 KOps/s | 2.4539 KOps/s | |
test_seq_wrap[compile] | 0.3756ms | 0.3092ms | 3.2337 KOps/s | 3.1272 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2678ms | 0.2171ms | 4.6059 KOps/s | 4.5697 KOps/s | |
test_func_call_runtime[False-eager] | 0.8411ms | 0.7643ms | 1.3083 KOps/s | 1.3203 KOps/s | |
test_func_call_runtime[False-compile] | 0.8971ms | 0.7779ms | 1.2855 KOps/s | 1.2040 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4365ms | 0.3548ms | 2.8185 KOps/s | 2.7834 KOps/s | |
test_func_call_runtime[True-eager] | 0.9487ms | 0.8667ms | 1.1538 KOps/s | 1.1314 KOps/s | |
test_func_call_runtime[True-compile] | 0.9175ms | 0.8014ms | 1.2478 KOps/s | 1.2594 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4562ms | 0.3759ms | 2.6603 KOps/s | 2.6280 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8849ms | 0.7001ms | 1.4284 KOps/s | 1.4204 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8642ms | 0.7821ms | 1.2787 KOps/s | 1.2803 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4187ms | 0.3573ms | 2.7986 KOps/s | 2.7698 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1066ms | 0.9829ms | 1.0174 KOps/s | 1.0055 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8957ms | 0.8262ms | 1.2103 KOps/s | 1.2035 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4615ms | 0.4060ms | 2.4629 KOps/s | 2.4548 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4633ms | 2.0203ms | 494.9743 Ops/s | 492.5395 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9746ms | 0.9027ms | 1.1078 KOps/s | 1.1941 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4501ms | 0.4041ms | 2.4747 KOps/s | 2.4534 KOps/s | |
test_distributed | 2.4661ms | 0.1876ms | 5.3305 KOps/s | 8.4803 KOps/s | |
test_tdmodule | 0.2523ms | 15.4241μs | 64.8335 KOps/s | 57.0932 KOps/s | |
test_tdmodule_dispatch | 49.2310μs | 29.3286μs | 34.0965 KOps/s | 30.9073 KOps/s | |
test_tdseq | 36.1500μs | 15.8427μs | 63.1205 KOps/s | 58.2916 KOps/s | |
test_tdseq_dispatch | 52.0210μs | 31.6685μs | 31.5771 KOps/s | 28.6993 KOps/s | |
test_instantiation_functorch | 2.0723ms | 1.8912ms | 528.7719 Ops/s | 529.8665 Ops/s | |
test_exec_functorch | 0.3268ms | 0.2094ms | 4.7749 KOps/s | 4.8022 KOps/s | |
test_exec_functional_call | 0.3037ms | 0.2132ms | 4.6901 KOps/s | 4.5580 KOps/s | |
test_exec_td_decorator | 0.4427ms | 0.2567ms | 3.8959 KOps/s | 3.7034 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8414ms | 0.6613ms | 1.5121 KOps/s | 1.4634 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8321ms | 0.6692ms | 1.4944 KOps/s | 1.4674 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7128ms | 0.5817ms | 1.7190 KOps/s | 1.6190 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7400ms | 0.5816ms | 1.7194 KOps/s | 1.7160 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.2268ms | 18.9621ms | 52.7366 Ops/s | 52.8356 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.0585ms | 18.9413ms | 52.7946 Ops/s | 52.8079 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.3243ms | 18.8541ms | 53.0389 Ops/s | 53.4392 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.5072ms | 18.9682ms | 52.7198 Ops/s | 53.1335 Ops/s | |
test_to_module_speed[True] | 1.4409ms | 1.0554ms | 947.5408 Ops/s | 957.5927 Ops/s | |
test_to_module_speed[False] | 1.4630ms | 1.0207ms | 979.7345 Ops/s | 966.3061 Ops/s | |
test_tc_init | 60.6510μs | 34.3597μs | 29.1039 KOps/s | 25.1654 KOps/s | |
test_tc_init_nested | 0.1262ms | 70.3515μs | 14.2143 KOps/s | 12.6109 KOps/s | |
test_tc_first_layer_tensor | 6.8862μs | 0.7063μs | 1.4157 MOps/s | 1.4242 MOps/s | |
test_tc_first_layer_nontensor | 19.9510μs | 2.2899μs | 436.6947 KOps/s | 435.5944 KOps/s | |
test_tc_second_layer_tensor | 20.9310μs | 1.4819μs | 674.7998 KOps/s | 710.0120 KOps/s | |
test_tc_second_layer_nontensor | 56.8010μs | 2.9344μs | 340.7887 KOps/s | 329.1007 KOps/s | |
test_unbind | 0.1893s | 11.9083ms | 83.9750 Ops/s | 91.4091 Ops/s | |
test_full_like | 0.6594ms | 0.5755ms | 1.7376 KOps/s | 1.7434 KOps/s | |
test_zeros_like | 0.2796ms | 0.1978ms | 5.0551 KOps/s | 5.0519 KOps/s | |
test_ones_like | 0.2380ms | 0.1977ms | 5.0578 KOps/s | 5.0549 KOps/s | |
test_clone | 0.4535ms | 0.4146ms | 2.4118 KOps/s | 2.4084 KOps/s | |
test_squeeze | 38.5210μs | 10.0275μs | 99.7257 KOps/s | 99.6260 KOps/s | |
test_unsqueeze | 0.2402ms | 77.4732μs | 12.9077 KOps/s | 12.9102 KOps/s | |
test_split | 0.4552ms | 0.1658ms | 6.0303 KOps/s | 6.2583 KOps/s | |
test_permute | 0.2993ms | 0.1867ms | 5.3553 KOps/s | 5.5243 KOps/s | |
test_stack | 1.2605ms | 0.8732ms | 1.1452 KOps/s | 1.1471 KOps/s | |
test_cat | 1.2495ms | 1.2312ms | 812.1946 Ops/s | 812.0175 Ops/s |
Stack from ghstack (oldest at bottom):