[BugFix] Fix lazy stack features (where and norm) #795
Merged
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 32.5110μs | 16.4753μs | 60.6969 KOps/s | 59.1149 KOps/s | |
test_plain_set_stack_nested | 33.5330μs | 16.7667μs | 59.6420 KOps/s | 59.0406 KOps/s | |
test_plain_set_nested_inplace | 83.6370μs | 18.5374μs | 53.9450 KOps/s | 52.8077 KOps/s | |
test_plain_set_stack_nested_inplace | 47.2080μs | 18.5799μs | 53.8217 KOps/s | 53.2443 KOps/s | |
test_items | 17.3630μs | 2.5743μs | 388.4596 KOps/s | 369.2930 KOps/s | |
test_items_nested | 0.4507ms | 0.2694ms | 3.7121 KOps/s | 3.7555 KOps/s | |
test_items_nested_locked | 1.0538ms | 0.2716ms | 3.6813 KOps/s | 3.6497 KOps/s | |
test_items_nested_leaf | 0.1787ms | 76.7361μs | 13.0317 KOps/s | 13.0042 KOps/s | |
test_items_stack_nested | 0.5015ms | 0.2702ms | 3.7014 KOps/s | 3.6879 KOps/s | |
test_items_stack_nested_leaf | 0.1378ms | 78.0173μs | 12.8177 KOps/s | 13.0026 KOps/s | |
test_items_stack_nested_locked | 0.5737ms | 0.2696ms | 3.7094 KOps/s | 3.7018 KOps/s | |
test_keys | 18.6750μs | 3.9043μs | 256.1291 KOps/s | 261.8254 KOps/s | |
test_keys_nested | 0.2322ms | 0.1366ms | 7.3192 KOps/s | 7.3772 KOps/s | |
test_keys_nested_locked | 0.7996ms | 0.1419ms | 7.0469 KOps/s | 7.0946 KOps/s | |
test_keys_nested_leaf | 0.2070ms | 0.1168ms | 8.5647 KOps/s | 8.6319 KOps/s | |
test_keys_stack_nested | 0.2652ms | 0.1371ms | 7.2916 KOps/s | 7.4708 KOps/s | |
test_keys_stack_nested_leaf | 0.1961ms | 0.1155ms | 8.6589 KOps/s | 8.7283 KOps/s | |
test_keys_stack_nested_locked | 0.3013ms | 0.1410ms | 7.0928 KOps/s | 7.2009 KOps/s | |
test_values | 7.8673μs | 1.2693μs | 787.8661 KOps/s | 851.3185 KOps/s | |
test_values_nested | 0.1066ms | 49.7647μs | 20.0946 KOps/s | 19.8498 KOps/s | |
test_values_nested_locked | 0.1250ms | 49.5554μs | 20.1794 KOps/s | 19.8123 KOps/s | |
test_values_nested_leaf | 86.1620μs | 45.3733μs | 22.0394 KOps/s | 21.8687 KOps/s | |
test_values_stack_nested | 0.1018ms | 51.2029μs | 19.5301 KOps/s | 19.3530 KOps/s | |
test_values_stack_nested_leaf | 83.3960μs | 45.3182μs | 22.0662 KOps/s | 22.2093 KOps/s | |
test_values_stack_nested_locked | 0.1117ms | 50.2956μs | 19.8825 KOps/s | 19.4123 KOps/s | |
test_membership | 24.8760μs | 1.3320μs | 750.7612 KOps/s | 735.5225 KOps/s | |
test_membership_nested | 18.3740μs | 3.4106μs | 293.2011 KOps/s | 292.8551 KOps/s | |
test_membership_nested_leaf | 25.3580μs | 3.4239μs | 292.0669 KOps/s | 293.0918 KOps/s | |
test_membership_stacked_nested | 21.5510μs | 3.4291μs | 291.6197 KOps/s | 289.2081 KOps/s | |
test_membership_stacked_nested_leaf | 30.3770μs | 3.4147μs | 292.8520 KOps/s | 294.1225 KOps/s | |
test_membership_nested_last | 19.7170μs | 4.0955μs | 244.1702 KOps/s | 237.5661 KOps/s | |
test_membership_nested_leaf_last | 22.9230μs | 4.1927μs | 238.5116 KOps/s | 237.9096 KOps/s | |
test_membership_stacked_nested_last | 21.3000μs | 4.1014μs | 243.8212 KOps/s | 160.8667 KOps/s | |
test_membership_stacked_nested_leaf_last | 28.2430μs | 4.1681μs | 239.9197 KOps/s | 159.6156 KOps/s | |
test_nested_getleaf | 37.7510μs | 10.6827μs | 93.6092 KOps/s | 95.2766 KOps/s | |
test_nested_get | 48.4400μs | 10.0003μs | 99.9969 KOps/s | 100.2458 KOps/s | |
test_stacked_getleaf | 38.6830μs | 10.6759μs | 93.6685 KOps/s | 95.4106 KOps/s | |
test_stacked_get | 35.6960μs | 10.0283μs | 99.7182 KOps/s | 101.1248 KOps/s | |
test_nested_getitemleaf | 25.4770μs | 11.2833μs | 88.6267 KOps/s | 90.7655 KOps/s | |
test_nested_getitem | 30.7280μs | 10.2209μs | 97.8384 KOps/s | 98.1395 KOps/s | |
test_stacked_getitemleaf | 33.3730μs | 11.4015μs | 87.7074 KOps/s | 90.4865 KOps/s | |
test_stacked_getitem | 25.5080μs | 10.2776μs | 97.2993 KOps/s | 98.7549 KOps/s | |
test_lock_nested | 49.0281ms | 0.3874ms | 2.5814 KOps/s | 2.9020 KOps/s | |
test_lock_stack_nested | 0.4836ms | 0.3040ms | 3.2897 KOps/s | 3.2997 KOps/s | |
test_unlock_nested | 1.4767ms | 0.3496ms | 2.8604 KOps/s | 2.5450 KOps/s | |
test_unlock_stack_nested | 0.6807ms | 0.3116ms | 3.2089 KOps/s | 3.2095 KOps/s | |
test_flatten_speed | 0.5241ms | 95.7501μs | 10.4439 KOps/s | 10.5293 KOps/s | |
test_unflatten_speed | 0.5992ms | 0.4112ms | 2.4316 KOps/s | 2.4046 KOps/s | |
test_common_ops | 5.6873ms | 0.7047ms | 1.4191 KOps/s | 1.4097 KOps/s | |
test_creation | 27.1210μs | 1.8670μs | 535.6052 KOps/s | 521.9958 KOps/s | |
test_creation_empty | 35.1960μs | 9.7490μs | 102.5745 KOps/s | 95.4073 KOps/s | |
test_creation_nested_1 | 35.3860μs | 12.4046μs | 80.6151 KOps/s | 75.3363 KOps/s | |
test_creation_nested_2 | 63.5990μs | 15.7214μs | 63.6076 KOps/s | 60.3370 KOps/s | |
test_clone | 75.8820μs | 13.1255μs | 76.1875 KOps/s | 75.4294 KOps/s | |
test_getitem[int] | 51.9770μs | 11.1650μs | 89.5655 KOps/s | 85.9222 KOps/s | |
test_getitem[slice_int] | 65.1120μs | 22.5371μs | 44.3712 KOps/s | 42.5402 KOps/s | |
test_getitem[range] | 81.1120μs | 61.8478μs | 16.1687 KOps/s | 14.6524 KOps/s | |
test_getitem[tuple] | 58.4900μs | 18.8509μs | 53.0478 KOps/s | 52.2638 KOps/s | |
test_getitem[list] | 0.1209ms | 43.5720μs | 22.9505 KOps/s | 23.5021 KOps/s | |
test_setitem_dim[int] | 63.8190μs | 33.7638μs | 29.6175 KOps/s | 28.8157 KOps/s | |
test_setitem_dim[slice_int] | 0.1027ms | 60.1646μs | 16.6211 KOps/s | 16.5119 KOps/s | |
test_setitem_dim[range] | 0.1316ms | 81.3005μs | 12.3000 KOps/s | 11.6744 KOps/s | |
test_setitem_dim[tuple] | 94.7170μs | 48.7229μs | 20.5242 KOps/s | 20.1435 KOps/s | |
test_setitem | 63.9400μs | 20.0191μs | 49.9523 KOps/s | 50.8299 KOps/s | |
test_set | 52.9590μs | 19.0377μs | 52.5272 KOps/s | 52.1053 KOps/s | |
test_set_shared | 0.9093ms | 0.1399ms | 7.1500 KOps/s | 7.0564 KOps/s | |
test_update | 0.1330ms | 22.1822μs | 45.0812 KOps/s | 47.6303 KOps/s | |
test_update_nested | 79.3180μs | 30.2077μs | 33.1042 KOps/s | 34.0953 KOps/s | |
test_update__nested | 56.2050μs | 25.1701μs | 39.7297 KOps/s | 40.6232 KOps/s | |
test_set_nested | 58.2890μs | 21.2851μs | 46.9812 KOps/s | 44.4966 KOps/s | |
test_set_nested_new | 71.7940μs | 25.6730μs | 38.9514 KOps/s | 39.9760 KOps/s | |
test_select | 0.1232ms | 39.8591μs | 25.0884 KOps/s | 24.2024 KOps/s | |
test_select_nested | 0.1569ms | 58.8633μs | 16.9885 KOps/s | 16.5838 KOps/s | |
test_exclude_nested | 0.2644ms | 0.1181ms | 8.4642 KOps/s | 8.3087 KOps/s | |
test_empty[True] | 0.6123ms | 0.3922ms | 2.5496 KOps/s | 2.5406 KOps/s | |
test_empty[False] | 5.7628μs | 1.1402μs | 877.0767 KOps/s | 841.9907 KOps/s | |
test_unbind_speed | 1.4473ms | 0.2532ms | 3.9489 KOps/s | 3.8241 KOps/s | |
test_unbind_speed_stack0 | 0.4170ms | 0.2514ms | 3.9775 KOps/s | 4.0221 KOps/s | |
test_unbind_speed_stack1 | 68.2983ms | 0.7315ms | 1.3671 KOps/s | 1.3466 KOps/s | |
test_split | 65.1105ms | 1.6012ms | 624.5245 Ops/s | 611.3114 Ops/s | |
test_chunk | 61.0315ms | 1.6019ms | 624.2600 Ops/s | 609.2231 Ops/s | |
test_creation[device0] | 3.7769ms | 87.1928μs | 11.4688 KOps/s | 11.9263 KOps/s | |
test_creation_from_tensor | 0.1712ms | 85.8647μs | 11.6462 KOps/s | 11.7341 KOps/s | |
test_add_one[memmap_tensor0] | 58.5400μs | 5.6511μs | 176.9563 KOps/s | 182.8364 KOps/s | |
test_contiguous[memmap_tensor0] | 8.2050μs | 0.6509μs | 1.5362 MOps/s | 1.5201 MOps/s | |
test_stack[memmap_tensor0] | 21.5000μs | 3.5392μs | 282.5518 KOps/s | 271.8132 KOps/s | |
test_memmaptd_index | 0.9757ms | 0.2572ms | 3.8875 KOps/s | 3.8217 KOps/s | |
test_memmaptd_index_astensor | 0.7692ms | 0.3354ms | 2.9818 KOps/s | 2.9833 KOps/s | |
test_memmaptd_index_op | 0.9872ms | 0.6026ms | 1.6596 KOps/s | 1.6233 KOps/s | |
test_serialize_model | 0.1634s | 0.1101s | 9.0842 Ops/s | 8.8004 Ops/s | |
test_serialize_model_pickle | 0.4488s | 0.3776s | 2.6482 Ops/s | 2.5996 Ops/s | |
test_serialize_weights | 0.1649s | 0.1093s | 9.1480 Ops/s | 9.1307 Ops/s | |
test_serialize_weights_returnearly | 0.1316s | 0.1238s | 8.0783 Ops/s | 7.3695 Ops/s | |
test_serialize_weights_pickle | 0.4510s | 0.3917s | 2.5532 Ops/s | 2.3359 Ops/s | |
test_serialize_weights_filesystem | 0.1603s | 0.1003s | 9.9694 Ops/s | 10.7423 Ops/s | |
test_serialize_model_filesystem | 97.1785ms | 92.7006ms | 10.7874 Ops/s | 10.5045 Ops/s | |
test_reshape_pytree | 58.5700μs | 25.3277μs | 39.4824 KOps/s | 40.0666 KOps/s | |
test_reshape_td | 65.7930μs | 33.6412μs | 29.7255 KOps/s | 29.7921 KOps/s | |
test_view_pytree | 56.6260μs | 25.0168μs | 39.9731 KOps/s | 40.2436 KOps/s | |
test_view_td | 80.6000μs | 37.8568μs | 26.4153 KOps/s | 26.7092 KOps/s | |
test_unbind_pytree | 62.8080μs | 28.5312μs | 35.0493 KOps/s | 34.2720 KOps/s | |
test_unbind_td | 0.4270ms | 37.7013μs | 26.5243 KOps/s | 26.1913 KOps/s | |
test_split_pytree | 61.7450μs | 28.9277μs | 34.5689 KOps/s | 34.7007 KOps/s | |
test_split_td | 0.5112ms | 39.9226μs | 25.0485 KOps/s | 24.1559 KOps/s | |
test_add_pytree | 83.7270μs | 34.5010μs | 28.9847 KOps/s | 29.0142 KOps/s | |
test_add_td | 0.1142ms | 54.0898μs | 18.4878 KOps/s | 18.3283 KOps/s | |
test_distributed | 0.2058ms | 0.1013ms | 9.8724 KOps/s | 9.8461 KOps/s | |
test_tdmodule | 33.6530μs | 16.8534μs | 59.3352 KOps/s | 59.1954 KOps/s | |
test_tdmodule_dispatch | 51.6270μs | 33.6487μs | 29.7189 KOps/s | 29.7752 KOps/s | |
test_tdseq | 39.4140μs | 19.2784μs | 51.8715 KOps/s | 49.9093 KOps/s | |
test_tdseq_dispatch | 63.7890μs | 38.2924μs | 26.1148 KOps/s | 25.5545 KOps/s | |
test_instantiation_functorch | 1.9355ms | 1.3247ms | 754.8724 Ops/s | 772.7657 Ops/s | |
test_instantiation_td | 65.9535ms | 1.0731ms | 931.8640 Ops/s | 1.0009 KOps/s | |
test_exec_functorch | 0.2901ms | 0.1596ms | 6.2655 KOps/s | 6.3244 KOps/s | |
test_exec_functional_call | 0.2862ms | 0.1519ms | 6.5848 KOps/s | 6.8505 KOps/s | |
test_exec_td | 0.2700ms | 0.1470ms | 6.8043 KOps/s | 6.8867 KOps/s | |
test_exec_td_decorator | 0.8618ms | 0.2239ms | 4.4669 KOps/s | 4.6228 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.6758ms | 0.4840ms | 2.0663 KOps/s | 2.0637 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7562ms | 0.4823ms | 2.0736 KOps/s | 2.0733 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6551ms | 0.3958ms | 2.5263 KOps/s | 2.5501 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6696ms | 0.3961ms | 2.5246 KOps/s | 2.5659 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1063ms | 0.5649ms | 1.7703 KOps/s | 1.8051 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.6910ms | 0.5474ms | 1.8270 KOps/s | 1.8065 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9051ms | 0.4557ms | 2.1945 KOps/s | 2.2125 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.5710ms | 0.4525ms | 2.2099 KOps/s | 2.2151 KOps/s | |
test_to_module_speed[True] | 2.2224ms | 1.6543ms | 604.4762 Ops/s | 598.0521 Ops/s | |
test_to_module_speed[False] | 2.2241ms | 1.6349ms | 611.6489 Ops/s | 600.5241 Ops/s | |
test_tc_init | 56.2850μs | 26.7007μs | 37.4522 KOps/s | 34.7707 KOps/s | |
test_tc_init_nested | 0.1026ms | 51.6512μs | 19.3606 KOps/s | 16.9753 KOps/s | |
test_tc_first_layer_tensor | 3.4550μs | 0.6753μs | 1.4809 MOps/s | 1.5049 MOps/s | |
test_tc_first_layer_nontensor | 2.4336μs | 0.6541μs | 1.5287 MOps/s | 1.4803 MOps/s | |
test_tc_second_layer_tensor | 15.4890μs | 1.8301μs | 546.4168 KOps/s | 540.8885 KOps/s | |
test_tc_second_layer_nontensor | 9.2137μs | 1.5197μs | 658.0197 KOps/s | 654.8949 KOps/s | |
test_unbind | 80.2740ms | 7.4583ms | 134.0786 Ops/s | 189.9941 Ops/s | |
test_full_like | 15.8167ms | 11.0333ms | 90.6350 Ops/s | 100.9392 Ops/s | |
test_zeros_like | 11.9512ms | 5.4736ms | 182.6965 Ops/s | 179.2579 Ops/s | |
test_ones_like | 10.9852ms | 6.0765ms | 164.5673 Ops/s | 166.2605 Ops/s | |
test_clone | 11.3230ms | 7.2807ms | 137.3496 Ops/s | 135.7431 Ops/s | |
test_squeeze | 74.2590μs | 13.5525μs | 73.7869 KOps/s | 74.7391 KOps/s | |
test_unsqueeze | 0.1162ms | 65.4150μs | 15.2870 KOps/s | 15.0601 KOps/s | |
test_split | 0.2541ms | 0.1118ms | 8.9475 KOps/s | 9.0011 KOps/s | |
test_permute | 0.3499ms | 0.1359ms | 7.3576 KOps/s | 7.4351 KOps/s | |
test_stack | 24.4249ms | 21.2711ms | 47.0122 Ops/s | 48.6721 Ops/s | |
test_cat | 43.9749ms | 22.5347ms | 44.3760 Ops/s | 48.9653 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.4933ms | 12.4093μs | 80.5846 KOps/s | 76.3646 KOps/s | |
test_plain_set_stack_nested | 25.6110μs | 12.4985μs | 80.0096 KOps/s | 76.4620 KOps/s | |
test_plain_set_nested_inplace | 37.3220μs | 13.7142μs | 72.9173 KOps/s | 69.7825 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1348ms | 13.7713μs | 72.6149 KOps/s | 69.0948 KOps/s | |
test_items | 18.2410μs | 4.6779μs | 213.7689 KOps/s | 210.3029 KOps/s | |
test_items_nested | 0.3733ms | 0.3386ms | 2.9530 KOps/s | 2.9380 KOps/s | |
test_items_nested_locked | 0.5517ms | 0.3397ms | 2.9434 KOps/s | 2.9381 KOps/s | |
test_items_nested_leaf | 0.2729ms | 82.2863μs | 12.1527 KOps/s | 12.0922 KOps/s | |
test_items_stack_nested | 0.5272ms | 0.3451ms | 2.8975 KOps/s | 2.9423 KOps/s | |
test_items_stack_nested_leaf | 0.1041ms | 83.8028μs | 11.9328 KOps/s | 11.9515 KOps/s | |
test_items_stack_nested_locked | 0.3881ms | 0.3440ms | 2.9073 KOps/s | 2.8978 KOps/s | |
test_keys | 17.6510μs | 4.3427μs | 230.2733 KOps/s | 230.7186 KOps/s | |
test_keys_nested | 89.7840μs | 66.9066μs | 14.9462 KOps/s | 14.8680 KOps/s | |
test_keys_nested_locked | 0.7432ms | 71.4548μs | 13.9949 KOps/s | 13.8212 KOps/s | |
test_keys_nested_leaf | 77.7140μs | 57.2494μs | 17.4674 KOps/s | 17.2604 KOps/s | |
test_keys_stack_nested | 0.1206ms | 67.0212μs | 14.9206 KOps/s | 15.0144 KOps/s | |
test_keys_stack_nested_leaf | 89.9140μs | 58.2466μs | 17.1684 KOps/s | 17.2186 KOps/s | |
test_keys_stack_nested_locked | 0.1064ms | 72.1977μs | 13.8509 KOps/s | 14.0270 KOps/s | |
test_values | 11.1307μs | 1.8050μs | 554.0224 KOps/s | 551.7121 KOps/s | |
test_values_nested | 65.0030μs | 35.2330μs | 28.3825 KOps/s | 28.1191 KOps/s | |
test_values_nested_locked | 86.0940μs | 37.0810μs | 26.9680 KOps/s | 26.5568 KOps/s | |
test_values_nested_leaf | 56.5830μs | 31.6088μs | 31.6368 KOps/s | 31.7191 KOps/s | |
test_values_stack_nested | 0.2141ms | 36.3548μs | 27.5067 KOps/s | 27.7861 KOps/s | |
test_values_stack_nested_leaf | 0.2086ms | 32.4891μs | 30.7795 KOps/s | 30.7406 KOps/s | |
test_values_stack_nested_locked | 61.5730μs | 38.0014μs | 26.3148 KOps/s | 26.2372 KOps/s | |
test_membership | 3.9816μs | 0.7320μs | 1.3661 MOps/s | 1.3982 MOps/s | |
test_membership_nested | 0.1607ms | 2.6204μs | 381.6206 KOps/s | 389.3968 KOps/s | |
test_membership_nested_leaf | 18.6800μs | 2.6225μs | 381.3176 KOps/s | 387.8696 KOps/s | |
test_membership_stacked_nested | 23.5410μs | 2.6012μs | 384.4338 KOps/s | 386.0127 KOps/s | |
test_membership_stacked_nested_leaf | 33.4710μs | 2.5848μs | 386.8719 KOps/s | 385.8627 KOps/s | |
test_membership_nested_last | 0.1881ms | 3.0961μs | 322.9843 KOps/s | 325.3619 KOps/s | |
test_membership_nested_leaf_last | 42.7420μs | 3.1168μs | 320.8438 KOps/s | 325.1759 KOps/s | |
test_membership_stacked_nested_last | 31.8220μs | 3.5639μs | 280.5885 KOps/s | 257.7704 KOps/s | |
test_membership_stacked_nested_leaf_last | 21.3210μs | 3.5547μs | 281.3165 KOps/s | 258.4316 KOps/s | |
test_nested_getleaf | 23.8920μs | 8.3483μs | 119.7851 KOps/s | 118.0945 KOps/s | |
test_nested_get | 30.3710μs | 7.8741μs | 126.9991 KOps/s | 126.2290 KOps/s | |
test_stacked_getleaf | 35.7220μs | 8.4105μs | 118.8997 KOps/s | 117.4553 KOps/s | |
test_stacked_get | 56.8430μs | 7.9023μs | 126.5452 KOps/s | 125.5465 KOps/s | |
test_nested_getitemleaf | 32.2010μs | 8.5365μs | 117.1444 KOps/s | 115.8331 KOps/s | |
test_nested_getitem | 29.3410μs | 8.0519μs | 124.1938 KOps/s | 122.9695 KOps/s | |
test_stacked_getitemleaf | 22.5610μs | 8.5619μs | 116.7961 KOps/s | 115.3495 KOps/s | |
test_stacked_getitem | 37.0020μs | 8.0548μs | 124.1499 KOps/s | 123.0407 KOps/s | |
test_lock_nested | 59.7635ms | 0.4063ms | 2.4613 KOps/s | 2.4595 KOps/s | |
test_lock_stack_nested | 0.4110ms | 0.3007ms | 3.3259 KOps/s | 3.2987 KOps/s | |
test_unlock_nested | 0.7239ms | 0.3474ms | 2.8788 KOps/s | 2.8500 KOps/s | |
test_unlock_stack_nested | 0.4113ms | 0.3095ms | 3.2309 KOps/s | 3.1905 KOps/s | |
test_flatten_speed | 0.1872ms | 0.1035ms | 9.6600 KOps/s | 9.8445 KOps/s | |
test_unflatten_speed | 0.4286ms | 0.2921ms | 3.4235 KOps/s | 3.4708 KOps/s | |
test_common_ops | 1.1066ms | 0.5577ms | 1.7931 KOps/s | 1.7322 KOps/s | |
test_creation | 34.5220μs | 1.6194μs | 617.5174 KOps/s | 620.0551 KOps/s | |
test_creation_empty | 25.4210μs | 7.8557μs | 127.2956 KOps/s | 107.5311 KOps/s | |
test_creation_nested_1 | 27.2620μs | 9.5556μs | 104.6505 KOps/s | 90.5997 KOps/s | |
test_creation_nested_2 | 41.8020μs | 11.7843μs | 84.8587 KOps/s | 75.0125 KOps/s | |
test_clone | 0.2057ms | 11.4486μs | 87.3465 KOps/s | 87.7159 KOps/s | |
test_getitem[int] | 30.6110μs | 10.7323μs | 93.1763 KOps/s | 95.0563 KOps/s | |
test_getitem[slice_int] | 50.6420μs | 20.2518μs | 49.3783 KOps/s | 50.7703 KOps/s | |
test_getitem[range] | 63.1630μs | 44.7322μs | 22.3553 KOps/s | 21.9915 KOps/s | |
test_getitem[tuple] | 39.9820μs | 18.0194μs | 55.4959 KOps/s | 56.0926 KOps/s | |
test_getitem[list] | 0.1518ms | 31.6603μs | 31.5853 KOps/s | 31.7330 KOps/s | |
test_setitem_dim[int] | 45.0420μs | 28.6127μs | 34.9496 KOps/s | 33.4903 KOps/s | |
test_setitem_dim[slice_int] | 83.8630μs | 50.6383μs | 19.7479 KOps/s | 20.5910 KOps/s | |
test_setitem_dim[range] | 94.5540μs | 64.3174μs | 15.5479 KOps/s | 14.9980 KOps/s | |
test_setitem_dim[tuple] | 0.1479ms | 42.9187μs | 23.2999 KOps/s | 22.9398 KOps/s | |
test_setitem | 68.5230μs | 15.5277μs | 64.4011 KOps/s | 61.0391 KOps/s | |
test_set | 49.9830μs | 15.2202μs | 65.7023 KOps/s | 62.8732 KOps/s | |
test_set_shared | 1.1462ms | 94.5110μs | 10.5808 KOps/s | 10.5403 KOps/s | |
test_update | 0.1085ms | 17.0096μs | 58.7905 KOps/s | 53.7541 KOps/s | |
test_update_nested | 87.9240μs | 22.2227μs | 44.9990 KOps/s | 42.6199 KOps/s | |
test_update__nested | 86.9340μs | 21.7987μs | 45.8743 KOps/s | 45.6262 KOps/s | |
test_set_nested | 63.0930μs | 16.2837μs | 61.4113 KOps/s | 59.0093 KOps/s | |
test_set_nested_new | 58.8730μs | 18.6396μs | 53.6493 KOps/s | 50.3140 KOps/s | |
test_select | 0.1368ms | 32.0809μs | 31.1712 KOps/s | 30.8488 KOps/s | |
test_select_nested | 0.7607ms | 54.5383μs | 18.3357 KOps/s | 18.4053 KOps/s | |
test_exclude_nested | 0.2489ms | 0.1099ms | 9.0953 KOps/s | 9.0050 KOps/s | |
test_empty[True] | 0.3814ms | 0.3430ms | 2.9153 KOps/s | 2.8525 KOps/s | |
test_empty[False] | 17.7209μs | 0.9387μs | 1.0653 MOps/s | 1.0850 MOps/s | |
test_to | 0.1006ms | 73.4259μs | 13.6192 KOps/s | 13.5282 KOps/s | |
test_to_nonblocking | 0.2094ms | 61.5963μs | 16.2348 KOps/s | 16.6467 KOps/s | |
test_unbind_speed | 0.3171ms | 0.2645ms | 3.7802 KOps/s | 3.7869 KOps/s | |
test_unbind_speed_stack0 | 0.4396ms | 0.2670ms | 3.7456 KOps/s | 3.7302 KOps/s | |
test_unbind_speed_stack1 | 77.4090ms | 0.8014ms | 1.2477 KOps/s | 1.2279 KOps/s | |
test_split | 76.7394ms | 1.6245ms | 615.5565 Ops/s | 614.7972 Ops/s | |
test_chunk | 77.2103ms | 1.6202ms | 617.2063 Ops/s | 614.7506 Ops/s | |
test_creation[device0] | 0.2031ms | 56.3542μs | 17.7449 KOps/s | 17.5383 KOps/s | |
test_creation_from_tensor | 0.2107ms | 53.4156μs | 18.7211 KOps/s | 18.3623 KOps/s | |
test_add_one[memmap_tensor0] | 0.1097ms | 6.6156μs | 151.1583 KOps/s | 149.8920 KOps/s | |
test_contiguous[memmap_tensor0] | 13.6110μs | 0.6714μs | 1.4894 MOps/s | 1.5191 MOps/s | |
test_stack[memmap_tensor0] | 30.7810μs | 4.4320μs | 225.6307 KOps/s | 226.0126 KOps/s | |
test_memmaptd_index | 1.1647ms | 0.2790ms | 3.5842 KOps/s | 3.5856 KOps/s | |
test_memmaptd_index_astensor | 0.6596ms | 0.3501ms | 2.8561 KOps/s | 2.8617 KOps/s | |
test_memmaptd_index_op | 1.1424ms | 0.6276ms | 1.5933 KOps/s | 1.5507 KOps/s | |
test_serialize_model | 0.1802s | 0.1101s | 9.0861 Ops/s | 8.6961 Ops/s | |
test_serialize_model_pickle | 1.3652s | 1.2383s | 0.8076 Ops/s | 0.8064 Ops/s | |
test_serialize_weights | 0.1784s | 0.1078s | 9.2797 Ops/s | 8.7875 Ops/s | |
test_serialize_weights_returnearly | 0.2268s | 96.3780ms | 10.3758 Ops/s | 10.8432 Ops/s | |
test_serialize_weights_pickle | 1.3579s | 1.2482s | 0.8011 Ops/s | 0.8011 Ops/s | |
test_reshape_pytree | 50.4530μs | 25.5003μs | 39.2153 KOps/s | 39.3801 KOps/s | |
test_reshape_td | 0.1101ms | 30.5062μs | 32.7802 KOps/s | 32.1407 KOps/s | |
test_view_pytree | 0.1399ms | 25.3473μs | 39.4520 KOps/s | 39.5215 KOps/s | |
test_view_td | 0.1339ms | 34.7984μs | 28.7370 KOps/s | 26.9273 KOps/s | |
test_unbind_pytree | 0.1895ms | 31.2076μs | 32.0435 KOps/s | 32.0596 KOps/s | |
test_unbind_td | 0.4710ms | 41.4711μs | 24.1132 KOps/s | 24.8210 KOps/s | |
test_split_pytree | 0.3873ms | 38.9177μs | 25.6952 KOps/s | 29.6908 KOps/s | |
test_split_td | 0.5721ms | 39.1479μs | 25.5441 KOps/s | 26.4058 KOps/s | |
test_add_pytree | 0.1822ms | 37.5023μs | 26.6651 KOps/s | 27.3117 KOps/s | |
test_add_td | 0.1770ms | 49.9626μs | 20.0150 KOps/s | 20.3138 KOps/s | |
test_distributed | 0.2255ms | 66.7298μs | 14.9858 KOps/s | 11.6997 KOps/s | |
test_tdmodule | 0.1455ms | 14.4143μs | 69.3757 KOps/s | 63.8250 KOps/s | |
test_tdmodule_dispatch | 43.7920μs | 27.8616μs | 35.8917 KOps/s | 32.8784 KOps/s | |
test_tdseq | 32.0420μs | 16.1504μs | 61.9181 KOps/s | 58.6231 KOps/s | |
test_tdseq_dispatch | 0.1373ms | 31.0856μs | 32.1693 KOps/s | 30.1497 KOps/s | |
test_instantiation_functorch | 80.7402ms | 1.6518ms | 605.3963 Ops/s | 656.9609 Ops/s | |
test_instantiation_td | 1.5563ms | 1.0500ms | 952.3976 Ops/s | 884.7582 Ops/s | |
test_exec_functorch | 0.2383ms | 0.1451ms | 6.8916 KOps/s | 6.9855 KOps/s | |
test_exec_functional_call | 0.2295ms | 0.1337ms | 7.4771 KOps/s | 7.5365 KOps/s | |
test_exec_td | 0.1854ms | 0.1311ms | 7.6297 KOps/s | 7.6856 KOps/s | |
test_exec_td_decorator | 0.8023ms | 0.2054ms | 4.8676 KOps/s | 4.9466 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8212ms | 0.5883ms | 1.6999 KOps/s | 1.7057 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8993ms | 0.5935ms | 1.6849 KOps/s | 1.6757 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6893ms | 0.5290ms | 1.8905 KOps/s | 1.9083 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6910ms | 0.5191ms | 1.9265 KOps/s | 1.9003 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0025ms | 0.6618ms | 1.5110 KOps/s | 1.5325 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9293ms | 0.6556ms | 1.5253 KOps/s | 1.5200 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8184ms | 0.5687ms | 1.7583 KOps/s | 1.7161 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7586ms | 0.5656ms | 1.7680 KOps/s | 1.7223 KOps/s | |
test_vmap_transformer_speed[True-True] | 7.9217ms | 7.6793ms | 130.2197 Ops/s | 130.5505 Ops/s | |
test_vmap_transformer_speed[True-False] | 7.8132ms | 7.6458ms | 130.7914 Ops/s | 129.7299 Ops/s | |
test_vmap_transformer_speed[False-True] | 9.0051ms | 8.0433ms | 124.3275 Ops/s | 131.4705 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.4899ms | 7.8549ms | 127.3093 Ops/s | 131.2682 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.3330ms | 19.3967ms | 51.5551 Ops/s | 53.5636 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.3669ms | 19.6905ms | 50.7859 Ops/s | 53.5854 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.1219ms | 19.4104ms | 51.5187 Ops/s | 53.7673 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.9952ms | 19.2322ms | 51.9961 Ops/s | 53.7679 Ops/s | |
test_to_module_speed[True] | 1.8027ms | 1.5461ms | 646.7781 Ops/s | 594.5193 Ops/s | |
test_to_module_speed[False] | 1.7661ms | 1.5290ms | 654.0290 Ops/s | 665.9482 Ops/s | |
test_tc_init | 0.1526ms | 22.8614μs | 43.7418 KOps/s | 38.6579 KOps/s | |
test_tc_init_nested | 0.1839ms | 51.1931μs | 19.5339 KOps/s | 18.0414 KOps/s | |
test_tc_first_layer_tensor | 4.4964μs | 0.3638μs | 2.7485 MOps/s | 2.7988 MOps/s | |
test_tc_first_layer_nontensor | 9.4227μs | 0.3966μs | 2.5214 MOps/s | 2.5884 MOps/s | |
test_tc_second_layer_tensor | 24.0230μs | 0.9918μs | 1.0083 MOps/s | 928.9901 KOps/s | |
test_tc_second_layer_nontensor | 34.6333μs | 0.8427μs | 1.1866 MOps/s | 1.2575 MOps/s | |
test_unbind | 0.1013s | 6.7728ms | 147.6487 Ops/s | 144.2992 Ops/s | |
test_full_like | 14.4599ms | 13.8243ms | 72.3363 Ops/s | 71.9612 Ops/s | |
test_zeros_like | 7.7296ms | 7.0648ms | 141.5468 Ops/s | 140.5136 Ops/s | |
test_ones_like | 8.5813ms | 7.9991ms | 125.0134 Ops/s | 125.3927 Ops/s | |
test_clone | 9.8975ms | 9.5309ms | 104.9214 Ops/s | 105.0241 Ops/s | |
test_squeeze | 62.6930μs | 11.1303μs | 89.8447 KOps/s | 90.0583 KOps/s | |
test_unsqueeze | 0.2060ms | 59.6988μs | 16.7508 KOps/s | 16.4910 KOps/s | |
test_split | 0.2424ms | 96.6557μs | 10.3460 KOps/s | 10.4109 KOps/s | |
test_permute | 0.2655ms | 0.1236ms | 8.0882 KOps/s | 8.3160 KOps/s | |
test_stack | 30.3348ms | 27.7650ms | 36.0165 Ops/s | 36.2238 Ops/s | |
test_cat | 28.1897ms | 27.6509ms | 36.1651 Ops/s | 36.1333 Ops/s |
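
The Change column is empty in both tables, so any regression or speed-up has to be read off by comparing Ops with Ops on Repo HEAD by hand. The sketch below shows one way to do that arithmetic. It assumes that Ops is the throughput measured on this PR's branch, that Ops on Repo HEAD is the baseline, and that Ops is simply the reciprocal of Mean. The helper names `ops_per_second` and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
def ops_per_second(mean_us: float) -> float:
    """Throughput implied by a mean runtime given in microseconds."""
    return 1e6 / mean_us


def relative_change(ops: float, ops_head: float) -> float:
    """Fractional change of `ops` relative to the `ops_head` baseline (assumed convention)."""
    return (ops - ops_head) / ops_head


# Row `test_plain_set_nested` of the first table: mean 16.4753 μs.
kops = ops_per_second(16.4753) / 1e3

# Row `test_membership_stacked_nested_last` of the first table (both values in KOps/s).
change = relative_change(243.8212, 160.8667)

print(f"{kops:.2f} KOps/s, change {change:+.1%}")
```

A positive value means the PR branch is faster than the baseline under this sign convention. Differences of a few percent in these tables are likely within run-to-run noise, so only the larger gaps are worth a closer look.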
Labels: bug, CLA Signed
No description provided.