[BugFix] Faster empty_like for MemoryMappedTensor (dup) #586
Merged
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 32.3500μs | 15.7147μs | 63.6349 KOps/s | 58.4044 KOps/s | |
test_plain_set_stack_nested | 0.1856ms | 0.1413ms | 7.0748 KOps/s | 6.4709 KOps/s | |
test_plain_set_nested_inplace | 43.1100μs | 18.9639μs | 52.7317 KOps/s | 48.2482 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3256ms | 0.1730ms | 5.7798 KOps/s | 5.3360 KOps/s | |
test_items | 29.9830μs | 2.3980μs | 417.0157 KOps/s | 385.0612 KOps/s | |
test_items_nested | 0.3285ms | 0.2695ms | 3.7105 KOps/s | 3.7163 KOps/s | |
test_items_nested_locked | 1.2234ms | 0.2700ms | 3.7043 KOps/s | 3.6897 KOps/s | |
test_items_nested_leaf | 0.3153ms | 0.1645ms | 6.0787 KOps/s | 5.9750 KOps/s | |
test_items_stack_nested | 2.2205ms | 1.4841ms | 673.8048 Ops/s | 646.4464 Ops/s | |
test_items_stack_nested_leaf | 2.0577ms | 1.3457ms | 743.1312 Ops/s | 718.6469 Ops/s | |
test_items_stack_nested_locked | 1.2870ms | 0.7642ms | 1.3085 KOps/s | 1.2843 KOps/s | |
test_keys | 20.6590μs | 3.9152μs | 255.4123 KOps/s | 258.5095 KOps/s | |
test_keys_nested | 3.3161ms | 0.1403ms | 7.1280 KOps/s | 6.6527 KOps/s | |
test_keys_nested_locked | 0.2711ms | 0.1381ms | 7.2416 KOps/s | 6.9381 KOps/s | |
test_keys_nested_leaf | 0.4106ms | 0.1391ms | 7.1901 KOps/s | 6.6329 KOps/s | |
test_keys_stack_nested | 2.3084ms | 1.4059ms | 711.3059 Ops/s | 683.0801 Ops/s | |
test_keys_stack_nested_leaf | 1.5775ms | 1.4056ms | 711.4519 Ops/s | 680.4736 Ops/s | |
test_keys_stack_nested_locked | 0.8078ms | 0.6744ms | 1.4827 KOps/s | 1.4229 KOps/s | |
test_values | 8.9770μs | 1.2052μs | 829.7110 KOps/s | 835.5122 KOps/s | |
test_values_nested | 98.2630μs | 49.3377μs | 20.2685 KOps/s | 19.5174 KOps/s | |
test_values_nested_locked | 88.5250μs | 49.7591μs | 20.0968 KOps/s | 19.7274 KOps/s | |
test_values_nested_leaf | 64.5500μs | 43.9020μs | 22.7780 KOps/s | 21.3623 KOps/s | |
test_values_stack_nested | 1.9200ms | 1.1926ms | 838.5027 Ops/s | 795.8124 Ops/s | |
test_values_stack_nested_leaf | 1.8629ms | 1.2075ms | 828.1355 Ops/s | 809.8044 Ops/s | |
test_values_stack_nested_locked | 0.9592ms | 0.5144ms | 1.9441 KOps/s | 1.8922 KOps/s | |
test_membership | 11.6320μs | 1.3690μs | 730.4756 KOps/s | 747.7469 KOps/s | |
test_membership_nested | 20.9290μs | 2.8257μs | 353.8975 KOps/s | 351.6043 KOps/s | |
test_membership_nested_leaf | 20.6480μs | 2.8437μs | 351.6594 KOps/s | 319.7146 KOps/s | |
test_membership_stacked_nested | 35.4760μs | 11.7878μs | 84.8338 KOps/s | 81.4649 KOps/s | |
test_membership_stacked_nested_leaf | 46.2560μs | 11.8537μs | 84.3620 KOps/s | 81.7635 KOps/s | |
test_membership_nested_last | 24.4850μs | 5.9907μs | 166.9242 KOps/s | 163.7231 KOps/s | |
test_membership_nested_leaf_last | 25.2370μs | 5.9866μs | 167.0408 KOps/s | 163.3438 KOps/s | |
test_membership_stacked_nested_last | 0.2359ms | 0.1691ms | 5.9133 KOps/s | 5.7499 KOps/s | |
test_membership_stacked_nested_leaf_last | 79.5390μs | 13.7308μs | 72.8289 KOps/s | 68.8470 KOps/s | |
test_nested_getleaf | 33.7320μs | 10.9538μs | 91.2929 KOps/s | 90.3576 KOps/s | |
test_nested_get | 26.4800μs | 10.3786μs | 96.3525 KOps/s | 94.8474 KOps/s | |
test_stacked_getleaf | 1.2122ms | 0.6413ms | 1.5592 KOps/s | 1.5144 KOps/s | |
test_stacked_get | 5.0487ms | 0.6182ms | 1.6176 KOps/s | 1.5885 KOps/s | |
test_nested_getitemleaf | 27.6510μs | 10.7848μs | 92.7229 KOps/s | 90.4617 KOps/s | |
test_nested_getitem | 45.2740μs | 10.1970μs | 98.0683 KOps/s | 94.4652 KOps/s | |
test_stacked_getitemleaf | 1.1055ms | 0.6385ms | 1.5662 KOps/s | 1.4836 KOps/s | |
test_stacked_getitem | 0.9658ms | 0.6084ms | 1.6435 KOps/s | 1.5674 KOps/s | |
test_lock_nested | 7.4964ms | 0.5667ms | 1.7647 KOps/s | 1.7472 KOps/s | |
test_lock_stack_nested | 7.5857ms | 5.0192ms | 199.2364 Ops/s | 193.9755 Ops/s | |
test_unlock_nested | 76.4495ms | 0.5172ms | 1.9334 KOps/s | 2.2127 KOps/s | |
test_unlock_stack_nested | 71.3180ms | 6.9242ms | 144.4217 Ops/s | 139.8136 Ops/s | |
test_flatten_speed | 0.5899ms | 0.2714ms | 3.6841 KOps/s | 3.5543 KOps/s | |
test_unflatten_speed | 0.7834ms | 0.4660ms | 2.1458 KOps/s | 2.0972 KOps/s | |
test_common_ops | 1.2149ms | 0.6642ms | 1.5055 KOps/s | 1.3979 KOps/s | |
test_creation | 59.7510μs | 2.4620μs | 406.1704 KOps/s | 401.0375 KOps/s | |
test_creation_empty | 45.4050μs | 8.0693μs | 123.9262 KOps/s | 111.9301 KOps/s | |
test_creation_nested_1 | 31.3590μs | 11.2872μs | 88.5963 KOps/s | 80.2395 KOps/s | |
test_creation_nested_2 | 32.6710μs | 14.8109μs | 67.5179 KOps/s | 62.3953 KOps/s | |
test_clone | 0.1046ms | 14.0140μs | 71.3571 KOps/s | 70.8623 KOps/s | |
test_getitem[int] | 48.9010μs | 13.3278μs | 75.0310 KOps/s | 75.5807 KOps/s | |
test_getitem[slice_int] | 0.1003ms | 26.1206μs | 38.2840 KOps/s | 39.0660 KOps/s | |
test_getitem[range] | 0.1036ms | 45.1239μs | 22.1612 KOps/s | 21.2577 KOps/s | |
test_getitem[tuple] | 60.2020μs | 21.0192μs | 47.5756 KOps/s | 48.3943 KOps/s | |
test_getitem[list] | 0.2094ms | 39.4847μs | 25.3263 KOps/s | 23.8507 KOps/s | |
test_setitem_dim[int] | 54.0410μs | 28.0697μs | 35.6255 KOps/s | 33.3169 KOps/s | |
test_setitem_dim[slice_int] | 85.7500μs | 52.9914μs | 18.8710 KOps/s | 18.0037 KOps/s | |
test_setitem_dim[range] | 0.1077ms | 73.9568μs | 13.5214 KOps/s | 13.1371 KOps/s | |
test_setitem_dim[tuple] | 68.4580μs | 41.7295μs | 23.9639 KOps/s | 23.0422 KOps/s | |
test_setitem | 84.2370μs | 18.6788μs | 53.5366 KOps/s | 49.0663 KOps/s | |
test_set | 83.0350μs | 17.8829μs | 55.9193 KOps/s | 51.1588 KOps/s | |
test_set_shared | 3.7122ms | 0.1402ms | 7.1317 KOps/s | 6.9349 KOps/s | |
test_update | 0.1398ms | 19.0417μs | 52.5164 KOps/s | 46.6363 KOps/s | |
test_update_nested | 75.0800μs | 26.5568μs | 37.6552 KOps/s | 33.6362 KOps/s | |
test_set_nested | 0.1407ms | 19.5550μs | 51.1377 KOps/s | 45.7392 KOps/s | |
test_set_nested_new | 80.5510μs | 24.6949μs | 40.4942 KOps/s | 35.4099 KOps/s | |
test_select | 0.1286ms | 49.3752μs | 20.2531 KOps/s | 18.4452 KOps/s | |
test_unbind_speed | 0.7775ms | 0.3784ms | 2.6426 KOps/s | 2.6513 KOps/s | |
test_unbind_speed_stack0 | 70.5046ms | 4.7333ms | 211.2703 Ops/s | 231.8049 Ops/s | |
test_unbind_speed_stack1 | 2.1415μs | 0.6221μs | 1.6075 MOps/s | 1.5725 MOps/s | |
test_split | 60.2952ms | 1.7778ms | 562.4863 Ops/s | 596.9287 Ops/s | |
test_chunk | 59.9495ms | 1.7452ms | 573.0077 Ops/s | 594.5460 Ops/s | |
test_creation[device0] | 0.4307ms | 0.2968ms | 3.3694 KOps/s | 2.9400 KOps/s | |
test_creation_from_tensor | 4.4269ms | 0.3333ms | 3.0001 KOps/s | 2.9992 KOps/s | |
test_add_one[memmap_tensor0] | 79.8390μs | 25.5037μs | 39.2100 KOps/s | 38.6742 KOps/s | |
test_contiguous[memmap_tensor0] | 27.8720μs | 5.9762μs | 167.3306 KOps/s | 172.3632 KOps/s | |
test_stack[memmap_tensor0] | 86.5310μs | 19.5312μs | 51.2001 KOps/s | 50.0867 KOps/s | |
test_memmaptd_index | 0.4740ms | 0.1980ms | 5.0498 KOps/s | 4.9071 KOps/s | |
test_memmaptd_index_astensor | 0.5269ms | 0.2564ms | 3.9002 KOps/s | 3.7649 KOps/s | |
test_memmaptd_index_op | 0.6012ms | 0.5064ms | 1.9746 KOps/s | 1.8775 KOps/s | |
test_reshape_pytree | 59.5510μs | 23.7644μs | 42.0798 KOps/s | 40.8690 KOps/s | |
test_reshape_td | 78.0960μs | 32.8546μs | 30.4371 KOps/s | 30.2399 KOps/s | |
test_view_pytree | 71.6440μs | 23.4450μs | 42.6531 KOps/s | 40.7374 KOps/s | |
test_view_td | 18.8850μs | 4.9731μs | 201.0806 KOps/s | 202.9827 KOps/s | |
test_unbind_pytree | 55.8340μs | 26.4294μs | 37.8367 KOps/s | 36.3180 KOps/s | |
test_unbind_td | 0.1229ms | 60.3295μs | 16.5756 KOps/s | 16.3909 KOps/s | |
test_split_pytree | 64.6010μs | 26.3619μs | 37.9335 KOps/s | 36.2440 KOps/s | |
test_split_td | 0.1395ms | 47.6734μs | 20.9760 KOps/s | 20.8094 KOps/s | |
test_add_pytree | 75.5910μs | 32.1217μs | 31.1316 KOps/s | 29.5056 KOps/s | |
test_add_td | 0.1098ms | 45.6047μs | 21.9275 KOps/s | 20.0098 KOps/s | |
test_distributed | 20.4680μs | 6.0705μs | 164.7318 KOps/s | 162.0530 KOps/s | |
test_tdmodule | 0.1617ms | 20.7210μs | 48.2601 KOps/s | 43.1558 KOps/s | |
test_tdmodule_dispatch | 0.1719ms | 38.9960μs | 25.6437 KOps/s | 24.7318 KOps/s | |
test_tdseq | 50.1240μs | 23.5306μs | 42.4979 KOps/s | 40.3764 KOps/s | |
test_tdseq_dispatch | 0.4339ms | 41.9900μs | 23.8152 KOps/s | 22.1751 KOps/s | |
test_instantiation_functorch | 2.0506ms | 1.3045ms | 766.5915 Ops/s | 738.7450 Ops/s | |
test_instantiation_td | 1.5820ms | 1.0224ms | 978.0639 Ops/s | 949.6779 Ops/s | |
test_exec_functorch | 0.2504ms | 0.1576ms | 6.3470 KOps/s | 6.1270 KOps/s | |
test_exec_functional_call | 0.3568ms | 0.1482ms | 6.7477 KOps/s | 6.6817 KOps/s | |
test_exec_td | 0.2124ms | 0.1439ms | 6.9495 KOps/s | 6.6377 KOps/s | |
test_exec_td_decorator | 0.9921ms | 0.1762ms | 5.6756 KOps/s | 4.9413 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.4355ms | 0.9068ms | 1.1028 KOps/s | 1.0779 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.6442ms | 0.4677ms | 2.1380 KOps/s | 2.0481 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.3129ms | 0.7935ms | 1.2602 KOps/s | 1.2422 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5561ms | 0.3853ms | 2.5956 KOps/s | 2.5112 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 4.3440ms | 1.9668ms | 508.4402 Ops/s | 545.3326 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0076ms | 0.5122ms | 1.9525 KOps/s | 1.8666 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.9865ms | 1.4775ms | 676.8255 Ops/s | 657.2215 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.4797ms | 0.4069ms | 2.4576 KOps/s | 2.4481 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.4324ms | 12.7048μs | 78.7102 KOps/s | 78.8013 KOps/s | |
test_plain_set_stack_nested | 0.2956ms | 0.1148ms | 8.7077 KOps/s | 8.6159 KOps/s | |
test_plain_set_nested_inplace | 37.8220μs | 15.1438μs | 66.0335 KOps/s | 65.6276 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3287ms | 0.1408ms | 7.1003 KOps/s | 7.0893 KOps/s | |
test_items | 0.1871ms | 4.6392μs | 215.5540 KOps/s | 214.4428 KOps/s | |
test_items_nested | 0.5173ms | 0.3357ms | 2.9791 KOps/s | 2.9473 KOps/s | |
test_items_nested_locked | 0.5295ms | 0.3390ms | 2.9498 KOps/s | 2.9171 KOps/s | |
test_items_nested_leaf | 0.2260ms | 0.1980ms | 5.0512 KOps/s | 4.9794 KOps/s | |
test_items_stack_nested | 1.6867ms | 1.4740ms | 678.4414 Ops/s | 669.8370 Ops/s | |
test_items_stack_nested_leaf | 1.5601ms | 1.3091ms | 763.8784 Ops/s | 752.9782 Ops/s | |
test_items_stack_nested_locked | 0.9782ms | 0.8110ms | 1.2330 KOps/s | 1.1987 KOps/s | |
test_keys | 0.2398ms | 4.5531μs | 219.6301 KOps/s | 217.8906 KOps/s | |
test_keys_nested | 3.2905ms | 90.3565μs | 11.0673 KOps/s | 11.0174 KOps/s | |
test_keys_nested_locked | 0.1144ms | 90.0552μs | 11.1043 KOps/s | 11.0592 KOps/s | |
test_keys_nested_leaf | 41.2258ms | 86.8247μs | 11.5175 KOps/s | 12.2659 KOps/s | |
test_keys_stack_nested | 1.4788ms | 1.2972ms | 770.9083 Ops/s | 758.1007 Ops/s | |
test_keys_stack_nested_leaf | 1.4805ms | 1.2985ms | 770.1440 Ops/s | 764.0835 Ops/s | |
test_keys_stack_nested_locked | 0.7862ms | 0.6282ms | 1.5918 KOps/s | 1.5507 KOps/s | |
test_values | 60.5503μs | 1.8908μs | 528.8835 KOps/s | 526.8461 KOps/s | |
test_values_nested | 70.5810μs | 42.7336μs | 23.4008 KOps/s | 23.3112 KOps/s | |
test_values_nested_locked | 0.2167ms | 44.9050μs | 22.2692 KOps/s | 22.0385 KOps/s | |
test_values_nested_leaf | 0.2173ms | 36.9636μs | 27.0536 KOps/s | 26.7961 KOps/s | |
test_values_stack_nested | 1.3269ms | 1.1316ms | 883.7411 Ops/s | 876.3233 Ops/s | |
test_values_stack_nested_leaf | 1.3229ms | 1.1076ms | 902.8678 Ops/s | 886.5659 Ops/s | |
test_values_stack_nested_locked | 0.7310ms | 0.4971ms | 2.0116 KOps/s | 1.9572 KOps/s | |
test_membership | 36.7644μs | 0.9432μs | 1.0602 MOps/s | 937.4679 KOps/s | |
test_membership_nested | 18.9500μs | 2.1908μs | 456.4499 KOps/s | 456.5388 KOps/s | |
test_membership_nested_leaf | 96.6210μs | 2.0973μs | 476.8023 KOps/s | 473.1796 KOps/s | |
test_membership_stacked_nested | 0.2039ms | 10.9728μs | 91.1345 KOps/s | 91.4223 KOps/s | |
test_membership_stacked_nested_leaf | 49.0400μs | 10.9167μs | 91.6027 KOps/s | 91.7035 KOps/s | |
test_membership_nested_last | 0.2012ms | 4.5734μs | 218.6550 KOps/s | 217.7921 KOps/s | |
test_membership_nested_leaf_last | 0.1689ms | 4.5980μs | 217.4882 KOps/s | 217.7050 KOps/s | |
test_membership_stacked_nested_last | 0.3328ms | 0.1336ms | 7.4862 KOps/s | 7.4050 KOps/s | |
test_membership_stacked_nested_leaf_last | 43.6210μs | 12.7825μs | 78.2321 KOps/s | 78.2056 KOps/s | |
test_nested_getleaf | 0.1946ms | 8.4201μs | 118.7637 KOps/s | 119.1052 KOps/s | |
test_nested_get | 0.2243ms | 7.9975μs | 125.0389 KOps/s | 124.9539 KOps/s | |
test_stacked_getleaf | 0.8600ms | 0.5664ms | 1.7655 KOps/s | 1.7442 KOps/s | |
test_stacked_get | 0.7265ms | 0.5311ms | 1.8828 KOps/s | 1.8785 KOps/s | |
test_nested_getitemleaf | 34.1000μs | 8.4085μs | 118.9269 KOps/s | 118.3651 KOps/s | |
test_nested_getitem | 0.1756ms | 7.9372μs | 125.9892 KOps/s | 125.3536 KOps/s | |
test_stacked_getitemleaf | 0.7669ms | 0.5669ms | 1.7639 KOps/s | 1.7489 KOps/s | |
test_stacked_getitem | 0.7172ms | 0.5294ms | 1.8890 KOps/s | 1.8406 KOps/s | |
test_lock_nested | 3.2298ms | 0.5578ms | 1.7928 KOps/s | 1.7674 KOps/s | |
test_lock_stack_nested | 81.9395ms | 7.2192ms | 138.5199 Ops/s | 137.4667 Ops/s | |
test_unlock_nested | 2.4443ms | 0.4353ms | 2.2972 KOps/s | 2.3309 KOps/s | |
test_unlock_stack_nested | 66.8960ms | 6.2636ms | 159.6519 Ops/s | 158.9672 Ops/s | |
test_flatten_speed | 0.3823ms | 0.1877ms | 5.3271 KOps/s | 5.3588 KOps/s | |
test_unflatten_speed | 0.5448ms | 0.3663ms | 2.7299 KOps/s | 2.7555 KOps/s | |
test_common_ops | 1.1481ms | 0.6161ms | 1.6232 KOps/s | 1.6593 KOps/s | |
test_creation | 0.2025ms | 2.0641μs | 484.4629 KOps/s | 470.5455 KOps/s | |
test_creation_empty | 41.3910μs | 7.1169μs | 140.5112 KOps/s | 141.8243 KOps/s | |
test_creation_nested_1 | 28.6500μs | 9.5198μs | 105.0448 KOps/s | 106.8206 KOps/s | |
test_creation_nested_2 | 0.1749ms | 12.0935μs | 82.6893 KOps/s | 83.2139 KOps/s | |
test_clone | 0.1167ms | 14.6695μs | 68.1687 KOps/s | 70.2588 KOps/s | |
test_getitem[int] | 31.9700μs | 12.3745μs | 80.8116 KOps/s | 81.0366 KOps/s | |
test_getitem[slice_int] | 0.2193ms | 24.2419μs | 41.2510 KOps/s | 40.8534 KOps/s | |
test_getitem[range] | 83.9710μs | 42.2844μs | 23.6494 KOps/s | 23.9528 KOps/s | |
test_getitem[tuple] | 59.1410μs | 22.3563μs | 44.7301 KOps/s | 49.0861 KOps/s | |
test_getitem[list] | 69.1810μs | 35.9232μs | 27.8372 KOps/s | 26.7278 KOps/s | |
test_setitem_dim[int] | 41.8410μs | 25.9023μs | 38.6066 KOps/s | 37.7186 KOps/s | |
test_setitem_dim[slice_int] | 72.5110μs | 46.3888μs | 21.5569 KOps/s | 21.0737 KOps/s | |
test_setitem_dim[range] | 0.2641ms | 63.6025μs | 15.7226 KOps/s | 15.5568 KOps/s | |
test_setitem_dim[tuple] | 66.9610μs | 39.5039μs | 25.3139 KOps/s | 24.6110 KOps/s | |
test_setitem | 97.2310μs | 18.4774μs | 54.1201 KOps/s | 53.5202 KOps/s | |
test_set | 0.2084ms | 18.0783μs | 55.3150 KOps/s | 55.8734 KOps/s | |
test_set_shared | 2.6649ms | 0.1056ms | 9.4692 KOps/s | 8.4441 KOps/s | |
test_update | 0.1012ms | 19.4826μs | 51.3279 KOps/s | 51.2423 KOps/s | |
test_update_nested | 93.6820μs | 25.8805μs | 38.6392 KOps/s | 37.8714 KOps/s | |
test_set_nested | 90.4510μs | 19.5247μs | 51.2172 KOps/s | 51.2014 KOps/s | |
test_set_nested_new | 94.2710μs | 23.4235μs | 42.6922 KOps/s | 41.3084 KOps/s | |
test_select | 0.1067ms | 45.8638μs | 21.8037 KOps/s | 20.6956 KOps/s | |
test_to | 75.0610μs | 54.6005μs | 18.3149 KOps/s | 18.3784 KOps/s | |
test_to_nonblocking | 65.4310μs | 34.6090μs | 28.8942 KOps/s | 28.0389 KOps/s | |
test_unbind_speed | 0.3960ms | 0.3632ms | 2.7531 KOps/s | 2.7718 KOps/s | |
test_unbind_speed_stack0 | 62.8789ms | 4.3839ms | 228.1079 Ops/s | 245.8537 Ops/s | |
test_unbind_speed_stack1 | 1.3126μs | 0.5265μs | 1.8992 MOps/s | 1.8955 MOps/s | |
test_split | 53.6680ms | 1.8489ms | 540.8683 Ops/s | 543.5124 Ops/s | |
test_chunk | 53.3020ms | 1.8418ms | 542.9454 Ops/s | 548.1018 Ops/s | |
test_creation[device0] | 0.5618ms | 0.3124ms | 3.2007 KOps/s | 3.2494 KOps/s | |
test_creation[device1] | 0.9407ms | 0.3187ms | 3.1380 KOps/s | 3.2172 KOps/s | |
test_creation_from_tensor | 57.9628ms | 0.3626ms | 2.7578 KOps/s | 2.9650 KOps/s | |
test_add_one[memmap_tensor0] | 70.6510μs | 23.0537μs | 43.3770 KOps/s | 40.6979 KOps/s | |
test_add_one[memmap_tensor1] | 0.2057ms | 72.5518μs | 13.7833 KOps/s | 13.5834 KOps/s | |
test_contiguous[memmap_tensor0] | 26.0910μs | 5.7302μs | 174.5125 KOps/s | 178.3858 KOps/s | |
test_contiguous[memmap_tensor1] | 44.2410μs | 21.1118μs | 47.3668 KOps/s | 45.9570 KOps/s | |
test_stack[memmap_tensor0] | 39.9610μs | 18.5539μs | 53.8970 KOps/s | 53.0594 KOps/s | |
test_stack[memmap_tensor1] | 0.1532ms | 72.4787μs | 13.7972 KOps/s | 13.5325 KOps/s | |
test_memmaptd_index | 0.2971ms | 0.2383ms | 4.1968 KOps/s | 4.1099 KOps/s | |
test_memmaptd_index_astensor | 0.3738ms | 0.2946ms | 3.3950 KOps/s | 3.3380 KOps/s | |
test_memmaptd_index_op | 0.6300ms | 0.5551ms | 1.8014 KOps/s | 1.7488 KOps/s | |
test_reshape_pytree | 37.4300μs | 21.0231μs | 47.5668 KOps/s | 47.4455 KOps/s | |
test_reshape_td | 64.1300μs | 30.3800μs | 32.9164 KOps/s | 32.7014 KOps/s | |
test_view_pytree | 40.3100μs | 20.7903μs | 48.0994 KOps/s | 48.3743 KOps/s | |
test_view_td | 15.6910μs | 4.0240μs | 248.5105 KOps/s | 248.6434 KOps/s | |
test_unbind_pytree | 0.5975ms | 25.9899μs | 38.4765 KOps/s | 37.8777 KOps/s | |
test_unbind_td | 90.2910μs | 56.7266μs | 17.6284 KOps/s | 17.4529 KOps/s | |
test_split_pytree | 39.8600μs | 23.9518μs | 41.7506 KOps/s | 41.4543 KOps/s | |
test_split_td | 73.2510μs | 43.9277μs | 22.7647 KOps/s | 22.0217 KOps/s | |
test_add_pytree | 57.1700μs | 31.9360μs | 31.3127 KOps/s | 31.0576 KOps/s | |
test_add_td | 76.8710μs | 44.5553μs | 22.4440 KOps/s | 21.1448 KOps/s | |
test_distributed | 19.4000μs | 5.5425μs | 180.4237 KOps/s | 179.1373 KOps/s | |
test_tdmodule | 31.9610μs | 16.8308μs | 59.4149 KOps/s | 58.1935 KOps/s | |
test_tdmodule_dispatch | 0.2202ms | 33.5048μs | 29.8465 KOps/s | 29.3838 KOps/s | |
test_tdseq | 35.6810μs | 19.7803μs | 50.5553 KOps/s | 49.4485 KOps/s | |
test_tdseq_dispatch | 52.2010μs | 36.2912μs | 27.5549 KOps/s | 27.7366 KOps/s | |
test_instantiation_functorch | 1.7709ms | 1.6899ms | 591.7516 Ops/s | 596.2229 Ops/s | |
test_instantiation_td | 1.6710ms | 1.1840ms | 844.5845 Ops/s | 844.3364 Ops/s | |
test_exec_functorch | 0.2180ms | 0.1588ms | 6.2985 KOps/s | 6.3201 KOps/s | |
test_exec_functional_call | 0.2215ms | 0.1580ms | 6.3275 KOps/s | 6.4764 KOps/s | |
test_exec_td | 0.1852ms | 0.1487ms | 6.7249 KOps/s | 6.7325 KOps/s | |
test_exec_td_decorator | 0.7071ms | 0.1844ms | 5.4233 KOps/s | 5.3756 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1765ms | 1.0822ms | 924.0014 Ops/s | 923.8306 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.7240ms | 0.6252ms | 1.5996 KOps/s | 1.6000 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0853ms | 0.9985ms | 1.0015 KOps/s | 1.0036 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6322ms | 0.5485ms | 1.8231 KOps/s | 1.8166 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.8878ms | 2.0417ms | 489.7875 Ops/s | 486.5477 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0596ms | 0.6629ms | 1.5086 KOps/s | 1.5037 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.2081ms | 1.7678ms | 565.6695 Ops/s | 560.0388 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9498ms | 0.5643ms | 1.7722 KOps/s | 1.7775 KOps/s | |
test_vmap_transformer_speed[True-True] | 12.7418ms | 12.6500ms | 79.0514 Ops/s | 79.1770 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.3924ms | 8.3126ms | 120.2987 Ops/s | 120.6385 Ops/s | |
test_vmap_transformer_speed[False-True] | 12.6558ms | 12.5405ms | 79.7413 Ops/s | 79.5146 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.5368ms | 8.2259ms | 121.5666 Ops/s | 121.0841 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 66.2524ms | 64.8493ms | 15.4204 Ops/s | 15.2701 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 22.2305ms | 20.0401ms | 49.8999 Ops/s | 49.8328 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 60.1096ms | 58.7430ms | 17.0233 Ops/s | 15.5847 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 21.6214ms | 19.6341ms | 50.9317 Ops/s | 50.9411 Ops/s |
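
The Change column is empty in both tables. As a rough aid for reading them, the sketch below computes a relative change from the `Ops` and `Ops on Repo HEAD` strings. It assumes that "change" means the percent difference of the PR throughput against repo HEAD, which is one plausible reading and not necessarily what the benchmark tooling reports. The helper names `parse_ops` and `relative_change` are hypothetical.

```python
# Multipliers for the throughput units that appear in the tables.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '63.6349 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * _UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD (assumed definition)."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base * 100.0


# test_membership row of the second table; note the two columns use different units.
print(f"{relative_change('1.0602 MOps/s', '937.4679 KOps/s'):+.2f}%")
```

Normalising the units first matters for rows like `test_membership`, where one column is in MOps/s and the other in KOps/s. Many of the differences in these tables are only a few percent, so run-to-run noise should be kept in mind before reading them as real changes.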
Labels
bug
Something isn't working
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Performance
No description provided.