[BugFix] Avoid lazy stacks in stack if not asked explicitly #741
Merged
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 37.9910μs | 16.8653μs | 59.2934 KOps/s | 60.6679 KOps/s | |
test_plain_set_stack_nested | 34.3350μs | 16.9020μs | 59.1645 KOps/s | 61.0408 KOps/s | |
test_plain_set_nested_inplace | 52.6390μs | 19.2138μs | 52.0459 KOps/s | 53.2914 KOps/s | |
test_plain_set_stack_nested_inplace | 63.8500μs | 19.2798μs | 51.8678 KOps/s | 53.0946 KOps/s | |
test_items | 26.6700μs | 2.6370μs | 379.2171 KOps/s | 404.7363 KOps/s | |
test_items_nested | 0.3988ms | 0.2688ms | 3.7200 KOps/s | 3.7210 KOps/s | |
test_items_nested_locked | 1.3369ms | 0.2701ms | 3.7027 KOps/s | 3.6865 KOps/s | |
test_items_nested_leaf | 0.1458ms | 77.3302μs | 12.9316 KOps/s | 12.9325 KOps/s | |
test_items_stack_nested | 0.3383ms | 0.2712ms | 3.6871 KOps/s | 3.6914 KOps/s | |
test_items_stack_nested_leaf | 0.1559ms | 77.1024μs | 12.9698 KOps/s | 12.1852 KOps/s | |
test_items_stack_nested_locked | 1.3708ms | 0.2734ms | 3.6578 KOps/s | 3.6883 KOps/s | |
test_keys | 18.1540μs | 3.8836μs | 257.4943 KOps/s | 250.9740 KOps/s | |
test_keys_nested | 0.2409ms | 0.1349ms | 7.4139 KOps/s | 7.4525 KOps/s | |
test_keys_nested_locked | 0.7060ms | 0.1405ms | 7.1191 KOps/s | 7.0935 KOps/s | |
test_keys_nested_leaf | 0.2100ms | 0.1152ms | 8.6781 KOps/s | 8.7151 KOps/s | |
test_keys_stack_nested | 0.2875ms | 0.1359ms | 7.3588 KOps/s | 7.2093 KOps/s | |
test_keys_stack_nested_leaf | 0.2059ms | 0.1148ms | 8.7127 KOps/s | 8.6006 KOps/s | |
test_keys_stack_nested_locked | 0.2618ms | 0.1413ms | 7.0765 KOps/s | 7.1671 KOps/s | |
test_values | 8.9390μs | 1.1500μs | 869.5673 KOps/s | 869.2098 KOps/s | |
test_values_nested | 91.6020μs | 50.5258μs | 19.7919 KOps/s | 19.6348 KOps/s | |
test_values_nested_locked | 0.1004ms | 50.8488μs | 19.6661 KOps/s | 19.5522 KOps/s | |
test_values_nested_leaf | 90.5700μs | 45.8766μs | 21.7976 KOps/s | 21.6091 KOps/s | |
test_values_stack_nested | 97.6330μs | 51.0064μs | 19.6054 KOps/s | 19.6463 KOps/s | |
test_values_stack_nested_leaf | 91.2710μs | 46.1064μs | 21.6889 KOps/s | 21.5593 KOps/s | |
test_values_stack_nested_locked | 0.1009ms | 51.0062μs | 19.6055 KOps/s | 19.3876 KOps/s | |
test_membership | 10.3290μs | 1.3299μs | 751.9475 KOps/s | 726.9149 KOps/s | |
test_membership_nested | 40.4160μs | 3.3955μs | 294.5068 KOps/s | 293.2729 KOps/s | |
test_membership_nested_leaf | 21.6310μs | 3.4000μs | 294.1210 KOps/s | 292.1164 KOps/s | |
test_membership_stacked_nested | 41.3410μs | 3.3950μs | 294.5498 KOps/s | 282.9978 KOps/s | |
test_membership_stacked_nested_leaf | 26.9810μs | 3.4170μs | 292.6516 KOps/s | 284.2080 KOps/s | |
test_membership_nested_last | 25.5080μs | 4.1883μs | 238.7624 KOps/s | 240.4607 KOps/s | |
test_membership_nested_leaf_last | 20.6890μs | 4.1988μs | 238.1646 KOps/s | 239.6757 KOps/s | |
test_membership_stacked_nested_last | 21.9010μs | 4.2034μs | 237.9008 KOps/s | 242.4831 KOps/s | |
test_membership_stacked_nested_leaf_last | 20.3490μs | 4.1606μs | 240.3515 KOps/s | 240.4869 KOps/s | |
test_nested_getleaf | 53.1100μs | 10.6676μs | 93.7419 KOps/s | 95.3893 KOps/s | |
test_nested_get | 47.6490μs | 10.2403μs | 97.6529 KOps/s | 101.4319 KOps/s | |
test_stacked_getleaf | 30.1870μs | 10.7013μs | 93.4468 KOps/s | 90.5809 KOps/s | |
test_stacked_get | 48.2210μs | 10.0375μs | 99.6261 KOps/s | 100.6660 KOps/s | |
test_nested_getitemleaf | 46.7080μs | 11.3771μs | 87.8961 KOps/s | 89.8128 KOps/s | |
test_nested_getitem | 31.5890μs | 10.3815μs | 96.3251 KOps/s | 96.8545 KOps/s | |
test_stacked_getitemleaf | 45.2750μs | 11.3105μs | 88.4134 KOps/s | 89.7589 KOps/s | |
test_stacked_getitem | 50.3640μs | 10.2734μs | 97.3386 KOps/s | 97.7359 KOps/s | |
test_lock_nested | 46.3967ms | 0.3893ms | 2.5687 KOps/s | 2.9001 KOps/s | |
test_lock_stack_nested | 0.4983ms | 0.3161ms | 3.1637 KOps/s | 3.2064 KOps/s | |
test_unlock_nested | 73.0626ms | 0.4172ms | 2.3968 KOps/s | 2.3700 KOps/s | |
test_unlock_stack_nested | 0.6586ms | 0.3208ms | 3.1175 KOps/s | 3.1263 KOps/s | |
test_flatten_speed | 0.3444ms | 93.7554μs | 10.6661 KOps/s | 10.7679 KOps/s | |
test_unflatten_speed | 0.6631ms | 0.4038ms | 2.4765 KOps/s | 2.4621 KOps/s | |
test_common_ops | 4.6027ms | 0.7100ms | 1.4084 KOps/s | 1.4512 KOps/s | |
test_creation | 19.9880μs | 1.8662μs | 535.8560 KOps/s | 538.6131 KOps/s | |
test_creation_empty | 48.2400μs | 10.0343μs | 99.6583 KOps/s | 98.0101 KOps/s | |
test_creation_nested_1 | 40.9370μs | 12.7230μs | 78.5977 KOps/s | 79.9746 KOps/s | |
test_creation_nested_2 | 55.8140μs | 15.9289μs | 62.7791 KOps/s | 62.3078 KOps/s | |
test_clone | 61.3850μs | 13.6961μs | 73.0137 KOps/s | 76.2110 KOps/s | |
test_getitem[int] | 36.6790μs | 11.6859μs | 85.5732 KOps/s | 89.4407 KOps/s | |
test_getitem[slice_int] | 62.8280μs | 23.1488μs | 43.1987 KOps/s | 43.5609 KOps/s | |
test_getitem[range] | 0.1496ms | 42.5617μs | 23.4953 KOps/s | 24.0318 KOps/s | |
test_getitem[tuple] | 59.5220μs | 19.1110μs | 52.3258 KOps/s | 53.6353 KOps/s | |
test_getitem[list] | 0.1634ms | 38.2350μs | 26.1541 KOps/s | 26.2533 KOps/s | |
test_setitem_dim[int] | 61.7660μs | 35.0863μs | 28.5012 KOps/s | 29.6438 KOps/s | |
test_setitem_dim[slice_int] | 88.5860μs | 61.7142μs | 16.2037 KOps/s | 16.1458 KOps/s | |
test_setitem_dim[range] | 0.1417ms | 80.9019μs | 12.3606 KOps/s | 12.7977 KOps/s | |
test_setitem_dim[tuple] | 89.5980μs | 51.3648μs | 19.4686 KOps/s | 20.4427 KOps/s | |
test_setitem | 73.9390μs | 20.3916μs | 49.0399 KOps/s | 51.5327 KOps/s | |
test_set | 73.9390μs | 19.6654μs | 50.8508 KOps/s | 52.1421 KOps/s | |
test_set_shared | 4.5140ms | 0.1453ms | 6.8832 KOps/s | 7.1454 KOps/s | |
test_update | 87.5250μs | 21.4967μs | 46.5189 KOps/s | 49.1197 KOps/s | |
test_update_nested | 82.2040μs | 29.8179μs | 33.5369 KOps/s | 35.0195 KOps/s | |
test_update__nested | 79.2590μs | 25.1926μs | 39.6941 KOps/s | 40.5822 KOps/s | |
test_set_nested | 68.5580μs | 21.8545μs | 45.7572 KOps/s | 48.2058 KOps/s | |
test_set_nested_new | 70.5630μs | 25.6161μs | 39.0380 KOps/s | 39.8565 KOps/s | |
test_select | 94.1160μs | 40.5600μs | 24.6548 KOps/s | 25.1059 KOps/s | |
test_select_nested | 0.8625ms | 60.2280μs | 16.6036 KOps/s | 16.8835 KOps/s | |
test_exclude_nested | 0.2263ms | 0.1184ms | 8.4426 KOps/s | 8.4431 KOps/s | |
test_empty[True] | 0.4654ms | 0.3884ms | 2.5749 KOps/s | 2.5495 KOps/s | |
test_empty[False] | 7.8488μs | 1.0605μs | 942.9478 KOps/s | 948.3856 KOps/s | |
test_unbind_speed | 1.5905ms | 0.2546ms | 3.9279 KOps/s | 3.9386 KOps/s | |
test_unbind_speed_stack0 | 0.4635ms | 0.2515ms | 3.9755 KOps/s | 3.9571 KOps/s | |
test_unbind_speed_stack1 | 0.1138s | 0.6912ms | 1.4468 KOps/s | 1.4353 KOps/s | |
test_split | 1.7293ms | 1.5217ms | 657.1680 Ops/s | 598.8692 Ops/s | |
test_chunk | 0.1054s | 1.6788ms | 595.6663 Ops/s | 661.5490 Ops/s | |
test_creation[device0] | 5.6049ms | 0.1060ms | 9.4365 KOps/s | 9.8724 KOps/s | |
test_creation_from_tensor | 0.1797ms | 82.7910μs | 12.0786 KOps/s | 12.3546 KOps/s | |
test_add_one[memmap_tensor0] | 95.4700μs | 5.6713μs | 176.3265 KOps/s | 185.8565 KOps/s | |
test_contiguous[memmap_tensor0] | 13.7760μs | 0.6277μs | 1.5931 MOps/s | 1.5957 MOps/s | |
test_stack[memmap_tensor0] | 19.7370μs | 3.7178μs | 268.9772 KOps/s | 289.3241 KOps/s | |
test_memmaptd_index | 0.9085ms | 0.2452ms | 4.0776 KOps/s | 4.1117 KOps/s | |
test_memmaptd_index_astensor | 0.6869ms | 0.3056ms | 3.2727 KOps/s | 3.2414 KOps/s | |
test_memmaptd_index_op | 0.9179ms | 0.5972ms | 1.6744 KOps/s | 1.6742 KOps/s | |
test_serialize_model | 0.2261s | 0.1167s | 8.5659 Ops/s | 8.6452 Ops/s | |
test_serialize_model_pickle | 0.4584s | 0.3767s | 2.6549 Ops/s | 2.6254 Ops/s | |
test_serialize_weights | 0.1126s | 0.1004s | 9.9641 Ops/s | 9.9147 Ops/s | |
test_serialize_weights_returnearly | 0.2426s | 0.1364s | 7.3289 Ops/s | 7.9228 Ops/s | |
test_serialize_weights_pickle | 0.9951s | 0.5646s | 1.7712 Ops/s | 2.3854 Ops/s | |
test_serialize_weights_filesystem | 98.8387ms | 90.9870ms | 10.9906 Ops/s | 9.2639 Ops/s | |
test_serialize_model_filesystem | 0.1009s | 92.2321ms | 10.8422 Ops/s | 10.6292 Ops/s | |
test_reshape_pytree | 58.0900μs | 20.9537μs | 47.7243 KOps/s | 48.1205 KOps/s | |
test_reshape_td | 70.4930μs | 32.2198μs | 31.0368 KOps/s | 31.6235 KOps/s | |
test_view_pytree | 54.9330μs | 20.7368μs | 48.2234 KOps/s | 47.6439 KOps/s | |
test_view_td | 0.1174s | 58.3808μs | 17.1289 KOps/s | 16.2562 KOps/s | |
test_unbind_pytree | 60.6630μs | 25.0117μs | 39.9814 KOps/s | 40.4244 KOps/s | |
test_unbind_td | 0.1064ms | 37.6417μs | 26.5663 KOps/s | 26.9859 KOps/s | |
test_split_pytree | 52.1080μs | 24.5072μs | 40.8044 KOps/s | 41.7278 KOps/s | |
test_split_td | 0.1105ms | 41.2610μs | 24.2360 KOps/s | 24.6831 KOps/s | |
test_add_pytree | 73.0370μs | 31.0638μs | 32.1918 KOps/s | 33.2804 KOps/s | |
test_add_td | 95.7900μs | 56.4743μs | 17.7072 KOps/s | 18.0278 KOps/s | |
test_distributed | 0.2319ms | 99.1956μs | 10.0811 KOps/s | 9.9516 KOps/s | |
test_tdmodule | 39.3940μs | 17.2536μs | 57.9589 KOps/s | 57.4246 KOps/s | |
test_tdmodule_dispatch | 65.8030μs | 34.6716μs | 28.8420 KOps/s | 29.1216 KOps/s | |
test_tdseq | 42.3300μs | 20.2441μs | 49.3970 KOps/s | 50.3398 KOps/s | |
test_tdseq_dispatch | 70.9240μs | 39.3434μs | 25.4172 KOps/s | 26.0131 KOps/s | |
test_instantiation_functorch | 1.5600ms | 1.3140ms | 761.0140 Ops/s | 770.7965 Ops/s | |
test_instantiation_td | 1.5153ms | 1.0145ms | 985.7409 Ops/s | 946.5618 Ops/s | |
test_exec_functorch | 0.3077ms | 0.1617ms | 6.1836 KOps/s | 6.3425 KOps/s | |
test_exec_functional_call | 0.3848ms | 0.1518ms | 6.5875 KOps/s | 6.7629 KOps/s | |
test_exec_td | 0.2212ms | 0.1473ms | 6.7894 KOps/s | 6.8925 KOps/s | |
test_exec_td_decorator | 0.8643ms | 0.1992ms | 5.0206 KOps/s | 5.0973 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7377ms | 0.4742ms | 2.1089 KOps/s | 2.1054 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8242ms | 0.4690ms | 2.1323 KOps/s | 2.1084 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6384ms | 0.3980ms | 2.5128 KOps/s | 2.5884 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6730ms | 0.3839ms | 2.6048 KOps/s | 2.5949 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9807ms | 0.4921ms | 2.0320 KOps/s | 2.0229 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.6538ms | 0.4907ms | 2.0379 KOps/s | 2.0204 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6475ms | 0.3983ms | 2.5104 KOps/s | 2.4803 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7450ms | 0.3997ms | 2.5016 KOps/s | 2.4886 KOps/s | |
test_to_module_speed[True] | 1.4692ms | 1.3831ms | 723.0249 Ops/s | 701.8447 Ops/s | |
test_to_module_speed[False] | 1.4436ms | 1.3590ms | 735.8256 Ops/s | 718.7213 Ops/s |
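The Change column in the table above is empty, so the relative difference between the two throughput columns has to be worked out by hand. A minimal sketch of that arithmetic follows, assuming that `Ops` is the throughput measured on this PR's branch and `Ops on Repo HEAD` is the baseline; the helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling.

```python
import re

# Multipliers for the unit prefixes that appear in the two Ops columns.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell):
    """Convert a cell such as '59.2934 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(row):
    """Return (test name, (ops - ops_head) / ops_head) for one table row.

    Assumes the column order Name | Max | Mean | Ops | Ops on Repo HEAD | Change.
    """
    cells = [cell.strip() for cell in row.split("|")]
    ops_pr = parse_ops(cells[3])
    ops_head = parse_ops(cells[4])
    return cells[0], (ops_pr - ops_head) / ops_head


# Two rows copied verbatim from the table above.
rows = [
    "test_plain_set_nested | 37.9910μs | 16.8653μs | 59.2934 KOps/s | 60.6679 KOps/s | |",
    "test_split | 1.7293ms | 1.5217ms | 657.1680 Ops/s | 598.8692 Ops/s | |",
]

for row in rows:
    name, change = relative_change(row)
    # Positive values mean the Ops column is higher than Repo HEAD.
    print(f"{name}: {change:+.2%}")
```

Under that assumption, a negative value means the PR branch is slower than HEAD for that test. The table reports a single comparison per test, so small differences may be run-to-run noise rather than a real change.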
AlexandreBrown approved these changes on Apr 22, 2024
Labels
BC-breaking
bug
Something isn't working
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
No description provided.