[Performance] Faster update_ #705
Merged
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 36.6080μs | 16.7742μs | 59.6155 KOps/s | 65.3912 KOps/s | |
test_plain_set_stack_nested | 47.2080μs | 17.0917μs | 58.5079 KOps/s | 64.3961 KOps/s | |
test_plain_set_nested_inplace | 51.9970μs | 19.5930μs | 51.0386 KOps/s | 55.9707 KOps/s | |
test_plain_set_stack_nested_inplace | 51.0450μs | 19.6943μs | 50.7762 KOps/s | 55.3362 KOps/s | |
test_items | 26.1290μs | 2.4454μs | 408.9262 KOps/s | 417.6913 KOps/s | |
test_items_nested | 0.4140ms | 0.2752ms | 3.6331 KOps/s | 3.5859 KOps/s | |
test_items_nested_locked | 1.7412ms | 0.2749ms | 3.6377 KOps/s | 3.7054 KOps/s | |
test_items_nested_leaf | 0.6529ms | 0.1692ms | 5.9107 KOps/s | 6.0357 KOps/s | |
test_items_stack_nested | 0.4051ms | 0.2740ms | 3.6501 KOps/s | 3.6980 KOps/s | |
test_items_stack_nested_leaf | 0.9413ms | 0.1698ms | 5.8899 KOps/s | 5.9458 KOps/s | |
test_items_stack_nested_locked | 0.4559ms | 0.2750ms | 3.6362 KOps/s | 3.6969 KOps/s | |
test_keys | 22.4310μs | 3.8692μs | 258.4516 KOps/s | 263.1094 KOps/s | |
test_keys_nested | 0.7718ms | 0.1473ms | 6.7898 KOps/s | 6.8504 KOps/s | |
test_keys_nested_locked | 0.2093ms | 0.1524ms | 6.5634 KOps/s | 6.6245 KOps/s | |
test_keys_nested_leaf | 35.5739ms | 0.1365ms | 7.3268 KOps/s | 7.8628 KOps/s | |
test_keys_stack_nested | 0.2474ms | 0.1501ms | 6.6643 KOps/s | 6.8496 KOps/s | |
test_keys_stack_nested_leaf | 0.2588ms | 0.1324ms | 7.5555 KOps/s | 7.8474 KOps/s | |
test_keys_stack_nested_locked | 0.2710ms | 0.1544ms | 6.4776 KOps/s | 6.5755 KOps/s | |
test_values | 12.6410μs | 1.1917μs | 839.1459 KOps/s | 809.4615 KOps/s | |
test_values_nested | 0.1047ms | 50.8064μs | 19.6826 KOps/s | 19.4361 KOps/s | |
test_values_nested_locked | 91.2590μs | 51.2797μs | 19.5009 KOps/s | 19.5511 KOps/s | |
test_values_nested_leaf | 0.1513ms | 45.6529μs | 21.9044 KOps/s | 21.6493 KOps/s | |
test_values_stack_nested | 0.1201ms | 52.3516μs | 19.1016 KOps/s | 19.1898 KOps/s | |
test_values_stack_nested_leaf | 83.8660μs | 45.3760μs | 22.0381 KOps/s | 21.9390 KOps/s | |
test_values_stack_nested_locked | 98.5630μs | 52.2151μs | 19.1515 KOps/s | 19.5515 KOps/s | |
test_membership | 19.4460μs | 1.3747μs | 727.4266 KOps/s | 736.2559 KOps/s | |
test_membership_nested | 33.4030μs | 3.5432μs | 282.2272 KOps/s | 286.6400 KOps/s | |
test_membership_nested_leaf | 27.4710μs | 3.5428μs | 282.2662 KOps/s | 279.4608 KOps/s | |
test_membership_stacked_nested | 26.5990μs | 3.5375μs | 282.6875 KOps/s | 260.5286 KOps/s | |
test_membership_stacked_nested_leaf | 23.1430μs | 3.5349μs | 282.8967 KOps/s | 265.4931 KOps/s | |
test_membership_nested_last | 32.5100μs | 4.4272μs | 225.8759 KOps/s | 229.4920 KOps/s | |
test_membership_nested_leaf_last | 33.3220μs | 4.3917μs | 227.7031 KOps/s | 228.1332 KOps/s | |
test_membership_stacked_nested_last | 38.4110μs | 13.5856μs | 73.6075 KOps/s | 224.3840 KOps/s | |
test_membership_stacked_nested_leaf_last | 36.8180μs | 13.6310μs | 73.3620 KOps/s | 230.0190 KOps/s | |
test_nested_getleaf | 34.8250μs | 10.7934μs | 92.6491 KOps/s | 95.2148 KOps/s | |
test_nested_get | 38.9120μs | 10.1420μs | 98.6003 KOps/s | 99.3382 KOps/s | |
test_stacked_getleaf | 76.9630μs | 10.5736μs | 94.5751 KOps/s | 95.6506 KOps/s | |
test_stacked_get | 32.3400μs | 10.0509μs | 99.4936 KOps/s | 99.8975 KOps/s | |
test_nested_getitemleaf | 35.0760μs | 11.2139μs | 89.1752 KOps/s | 91.5527 KOps/s | |
test_nested_getitem | 31.2180μs | 10.6397μs | 93.9878 KOps/s | 96.7237 KOps/s | |
test_stacked_getitemleaf | 39.7140μs | 11.1004μs | 90.0870 KOps/s | 91.2608 KOps/s | |
test_stacked_getitem | 29.8160μs | 10.3939μs | 96.2104 KOps/s | 97.4283 KOps/s | |
test_lock_nested | 0.7008ms | 0.3357ms | 2.9792 KOps/s | 3.0155 KOps/s | |
test_lock_stack_nested | 0.3457ms | 0.2888ms | 3.4625 KOps/s | 3.3494 KOps/s | |
test_unlock_nested | 92.4157ms | 0.4318ms | 2.3159 KOps/s | 2.3924 KOps/s | |
test_unlock_stack_nested | 0.3907ms | 0.2999ms | 3.3340 KOps/s | 3.2501 KOps/s | |
test_flatten_speed | 0.6391ms | 0.2683ms | 3.7269 KOps/s | 3.6207 KOps/s | |
test_unflatten_speed | 0.4760ms | 0.4029ms | 2.4819 KOps/s | 2.5003 KOps/s | |
test_common_ops | 1.1617ms | 0.6957ms | 1.4373 KOps/s | 1.5301 KOps/s | |
test_creation | 17.6420μs | 1.8988μs | 526.6521 KOps/s | 539.1511 KOps/s | |
test_creation_empty | 29.9860μs | 10.9832μs | 91.0484 KOps/s | 120.3333 KOps/s | |
test_creation_nested_1 | 33.6430μs | 13.6431μs | 73.2973 KOps/s | 91.4941 KOps/s | |
test_creation_nested_2 | 43.4600μs | 17.2192μs | 58.0747 KOps/s | 71.6715 KOps/s | |
test_clone | 56.0940μs | 13.4251μs | 74.4874 KOps/s | 75.7378 KOps/s | |
test_getitem[int] | 1.2785ms | 11.3559μs | 88.0598 KOps/s | 89.7148 KOps/s | |
test_getitem[slice_int] | 85.3820μs | 22.4455μs | 44.5524 KOps/s | 44.3274 KOps/s | |
test_getitem[range] | 0.1984ms | 42.5682μs | 23.4917 KOps/s | 24.8693 KOps/s | |
test_getitem[tuple] | 59.7310μs | 18.6999μs | 53.4761 KOps/s | 55.5157 KOps/s | |
test_getitem[list] | 0.1403ms | 37.2895μs | 26.8172 KOps/s | 28.1163 KOps/s | |
test_setitem_dim[int] | 78.1850μs | 35.7171μs | 27.9978 KOps/s | 32.5572 KOps/s | |
test_setitem_dim[slice_int] | 0.1184ms | 62.6953μs | 15.9502 KOps/s | 17.6092 KOps/s | |
test_setitem_dim[range] | 0.1415ms | 81.2298μs | 12.3108 KOps/s | 13.3178 KOps/s | |
test_setitem_dim[tuple] | 73.2360μs | 50.6564μs | 19.7408 KOps/s | 21.6933 KOps/s | |
test_setitem | 61.5950μs | 20.2807μs | 49.3079 KOps/s | 53.2781 KOps/s | |
test_set | 61.5640μs | 20.0386μs | 49.9037 KOps/s | 56.0829 KOps/s | |
test_set_shared | 3.7279ms | 0.1398ms | 7.1526 KOps/s | 7.1525 KOps/s | |
test_update | 0.1305ms | 22.9260μs | 43.6185 KOps/s | 49.4064 KOps/s | |
test_update_nested | 90.2470μs | 30.2586μs | 33.0484 KOps/s | 36.5053 KOps/s | |
test_update__nested | 57.1770μs | 24.1074μs | 41.4810 KOps/s | 20.9955 KOps/s | |
test_set_nested | 93.4840μs | 21.6617μs | 46.1644 KOps/s | 50.3663 KOps/s | |
test_set_nested_new | 0.1179ms | 25.4011μs | 39.3684 KOps/s | 41.3625 KOps/s | |
test_select | 0.1784ms | 40.6833μs | 24.5801 KOps/s | 25.8226 KOps/s | |
test_select_nested | 0.1257ms | 59.7363μs | 16.7402 KOps/s | 16.5139 KOps/s | |
test_exclude_nested | 0.2475ms | 0.1192ms | 8.3919 KOps/s | 8.5300 KOps/s | |
test_empty[True] | 0.6357ms | 0.4155ms | 2.4068 KOps/s | 2.4155 KOps/s | |
test_empty[False] | 5.7868μs | 1.0413μs | 960.3614 KOps/s | 940.0215 KOps/s | |
test_unbind_speed | 0.4469ms | 0.2470ms | 4.0483 KOps/s | 4.0688 KOps/s | |
test_unbind_speed_stack0 | 0.3655ms | 0.2343ms | 4.2684 KOps/s | 4.1318 KOps/s | |
test_unbind_speed_stack1 | 0.1232s | 0.6541ms | 1.5289 KOps/s | 1.4371 KOps/s | |
test_split | 0.1276s | 1.6887ms | 592.1848 Ops/s | 604.5364 Ops/s | |
test_chunk | 2.3566ms | 1.4746ms | 678.1602 Ops/s | 679.8037 Ops/s | |
test_creation[device0] | 0.2230ms | 0.1032ms | 9.6893 KOps/s | 9.9367 KOps/s | |
test_creation_from_tensor | 5.0387ms | 83.6333μs | 11.9570 KOps/s | 12.3383 KOps/s | |
test_add_one[memmap_tensor0] | 0.1013ms | 5.2147μs | 191.7674 KOps/s | 180.9024 KOps/s | |
test_contiguous[memmap_tensor0] | 16.1300μs | 0.6846μs | 1.4606 MOps/s | 1.5290 MOps/s | |
test_stack[memmap_tensor0] | 25.7380μs | 3.5565μs | 281.1752 KOps/s | 274.5264 KOps/s | |
test_memmaptd_index | 1.0109ms | 0.2413ms | 4.1443 KOps/s | 4.1259 KOps/s | |
test_memmaptd_index_astensor | 5.9069ms | 0.3030ms | 3.3007 KOps/s | 3.3037 KOps/s | |
test_memmaptd_index_op | 1.3396ms | 0.6207ms | 1.6112 KOps/s | 1.7818 KOps/s | |
test_serialize_model | 0.2376s | 0.1194s | 8.3769 Ops/s | 8.4475 Ops/s | |
test_serialize_model_pickle | 0.4769s | 0.3761s | 2.6590 Ops/s | 2.6223 Ops/s | |
test_serialize_weights | 0.1060s | 97.7062ms | 10.2348 Ops/s | 10.0345 Ops/s | |
test_serialize_weights_returnearly | 0.2465s | 0.1334s | 7.4974 Ops/s | 6.9689 Ops/s | |
test_serialize_weights_pickle | 1.0564s | 0.5621s | 1.7792 Ops/s | 2.4439 Ops/s | |
test_serialize_weights_filesystem | 95.5331ms | 91.2711ms | 10.9564 Ops/s | 10.5841 Ops/s | |
test_serialize_model_filesystem | 0.1034s | 93.8612ms | 10.6540 Ops/s | 10.2761 Ops/s | |
test_reshape_pytree | 54.0610μs | 20.9865μs | 47.6498 KOps/s | 47.6457 KOps/s | |
test_reshape_td | 71.7240μs | 32.1326μs | 31.1210 KOps/s | 32.1184 KOps/s | |
test_view_pytree | 49.1910μs | 20.9010μs | 47.8446 KOps/s | 48.1943 KOps/s | |
test_view_td | 0.1236s | 60.9269μs | 16.4131 KOps/s | 15.9129 KOps/s | |
test_unbind_pytree | 70.0100μs | 25.0766μs | 39.8778 KOps/s | 41.1612 KOps/s | |
test_unbind_td | 0.4225ms | 35.7528μs | 27.9699 KOps/s | 27.6845 KOps/s | |
test_split_pytree | 53.0590μs | 24.0299μs | 41.6148 KOps/s | 41.8565 KOps/s | |
test_split_td | 0.1073ms | 39.3643μs | 25.4037 KOps/s | 25.1911 KOps/s | |
test_add_pytree | 65.5620μs | 29.4182μs | 33.9925 KOps/s | 34.1365 KOps/s | |
test_add_td | 0.1555ms | 54.2659μs | 18.4278 KOps/s | 20.2241 KOps/s | |
test_distributed | 0.1748ms | 99.8269μs | 10.0173 KOps/s | 9.8705 KOps/s | |
test_tdmodule | 66.2740μs | 17.8206μs | 56.1149 KOps/s | 61.2206 KOps/s | |
test_tdmodule_dispatch | 52.0070μs | 33.7972μs | 29.5882 KOps/s | 32.7924 KOps/s | |
test_tdseq | 41.8780μs | 20.7147μs | 48.2748 KOps/s | 52.1299 KOps/s | |
test_tdseq_dispatch | 70.8110μs | 39.7952μs | 25.1286 KOps/s | 27.4682 KOps/s | |
test_instantiation_functorch | 1.4711ms | 1.2938ms | 772.9405 Ops/s | 776.5865 Ops/s | |
test_instantiation_td | 2.1660ms | 1.0269ms | 973.8241 Ops/s | 1.0003 KOps/s | |
test_exec_functorch | 0.2253ms | 0.1547ms | 6.4648 KOps/s | 6.4048 KOps/s | |
test_exec_functional_call | 0.2971ms | 0.1440ms | 6.9440 KOps/s | 6.9657 KOps/s | |
test_exec_td | 0.2055ms | 0.1380ms | 7.2468 KOps/s | 7.0968 KOps/s | |
test_exec_td_decorator | 0.3386ms | 0.1904ms | 5.2520 KOps/s | 5.1598 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.5528ms | 0.4627ms | 2.1613 KOps/s | 2.1541 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8920ms | 0.4769ms | 2.0968 KOps/s | 2.1371 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.5663ms | 0.3804ms | 2.6290 KOps/s | 2.6200 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5853ms | 0.3800ms | 2.6317 KOps/s | 2.6084 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1575ms | 0.5060ms | 1.9763 KOps/s | 2.0732 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7723ms | 0.4907ms | 2.0378 KOps/s | 2.0636 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6143ms | 0.3996ms | 2.5027 KOps/s | 2.5040 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8557ms | 0.3972ms | 2.5179 KOps/s | 2.5230 KOps/s | |
test_to_module_speed[True] | 2.1178ms | 1.3855ms | 721.7693 Ops/s | 726.0526 Ops/s | |
test_to_module_speed[False] | 2.2069ms | 1.3575ms | 736.6596 Ops/s | 744.0239 Ops/s |
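The Change column is empty in the table above, so the relative difference has to be read off the two throughput columns. A minimal sketch of that arithmetic follows, assuming that "Ops" was measured on this PR's branch and "Ops on Repo HEAD" is the baseline. The helper names `parse_ops` and `speedup` are hypothetical and are not part of the benchmark tooling. The three sample rows are copied verbatim from the table.

```python
# Hypothetical helpers for reading the table; not part of the benchmark suite.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a throughput cell such as '41.4810 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def speedup(ops: str, ops_head: str) -> float:
    """Ratio of the 'Ops' column to the 'Ops on Repo HEAD' column (above 1.0 means faster)."""
    return parse_ops(ops) / parse_ops(ops_head)


# (Ops, Ops on Repo HEAD) pairs copied from the table above.
rows = {
    "test_update__nested": ("41.4810 KOps/s", "20.9955 KOps/s"),
    "test_membership_stacked_nested_last": ("73.6075 KOps/s", "224.3840 KOps/s"),
    "test_instantiation_td": ("973.8241 Ops/s", "1.0003 KOps/s"),
}

for name, (ops, ops_head) in rows.items():
    print(f"{name}: {speedup(ops, ops_head):.2f}x")
```

Under that assumption, `test_update__nested` runs at roughly twice the baseline throughput, which matches the PR title, while the two `test_membership_stacked_nested*_last` rows come out at about a third of the baseline. A single benchmark run is noisy, so small ratios near 1.0 should not be over-interpreted.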
vmoens added a commit that referenced this pull request on Mar 24, 2024
vmoens added a commit that referenced this pull request on Mar 24, 2024 (cherry picked from commit ed22554)
vmoens added a commit that referenced this pull request on Mar 25, 2024 (cherry picked from commit ed22554)
Labels: CLA Signed, Performance
No description provided.