[Performance] Faster tensorclass set #880
Merged

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 36.5680μs | 16.4427μs | 60.8174 KOps/s | 57.0725 KOps/s | |
test_plain_set_stack_nested | 44.5230μs | 16.7501μs | 59.7013 KOps/s | 57.5169 KOps/s | |
test_plain_set_nested_inplace | 50.2040μs | 18.5411μs | 53.9344 KOps/s | 51.9336 KOps/s | |
test_plain_set_stack_nested_inplace | 52.3080μs | 18.5047μs | 54.0404 KOps/s | 51.8639 KOps/s | |
test_items | 22.5820μs | 2.6250μs | 380.9463 KOps/s | 375.6204 KOps/s | |
test_items_nested | 1.6499ms | 0.3669ms | 2.7258 KOps/s | 2.7369 KOps/s | |
test_items_nested_locked | 0.8026ms | 0.3667ms | 2.7271 KOps/s | 2.7113 KOps/s | |
test_items_nested_leaf | 0.1482ms | 85.9318μs | 11.6371 KOps/s | 11.6459 KOps/s | |
test_items_stack_nested | 0.6719ms | 0.3721ms | 2.6872 KOps/s | 2.7231 KOps/s | |
test_items_stack_nested_leaf | 0.1562ms | 85.3369μs | 11.7183 KOps/s | 11.9082 KOps/s | |
test_items_stack_nested_locked | 0.6933ms | 0.3710ms | 2.6957 KOps/s | 2.7353 KOps/s | |
test_keys | 48.8110μs | 3.9720μs | 251.7620 KOps/s | 254.0530 KOps/s | |
test_keys_nested | 0.3092ms | 0.1457ms | 6.8645 KOps/s | 6.8826 KOps/s | |
test_keys_nested_locked | 2.1192ms | 0.1518ms | 6.5863 KOps/s | 6.6413 KOps/s | |
test_keys_nested_leaf | 0.2328ms | 0.1241ms | 8.0595 KOps/s | 8.1027 KOps/s | |
test_keys_stack_nested | 0.2506ms | 0.1450ms | 6.8958 KOps/s | 6.9860 KOps/s | |
test_keys_stack_nested_leaf | 0.2374ms | 0.1238ms | 8.0745 KOps/s | 8.2219 KOps/s | |
test_keys_stack_nested_locked | 0.2621ms | 0.1504ms | 6.6508 KOps/s | 6.7671 KOps/s | |
test_values | 7.1985μs | 1.2333μs | 810.8359 KOps/s | 844.6142 KOps/s | |
test_values_nested | 0.1124ms | 49.2816μs | 20.2916 KOps/s | 20.1597 KOps/s | |
test_values_nested_locked | 0.1472ms | 49.8826μs | 20.0471 KOps/s | 20.3494 KOps/s | |
test_values_nested_leaf | 84.6480μs | 45.1046μs | 22.1707 KOps/s | 22.4476 KOps/s | |
test_values_stack_nested | 0.1427ms | 49.5468μs | 20.1829 KOps/s | 19.5765 KOps/s | |
test_values_stack_nested_leaf | 0.1122ms | 44.1921μs | 22.6285 KOps/s | 22.7514 KOps/s | |
test_values_stack_nested_locked | 90.2580μs | 49.5433μs | 20.1843 KOps/s | 19.7605 KOps/s | |
test_membership | 4.5256μs | 0.7367μs | 1.3573 MOps/s | 1.0897 MOps/s | |
test_membership_nested | 45.0640μs | 2.6547μs | 376.6916 KOps/s | 334.8930 KOps/s | |
test_membership_nested_leaf | 24.8170μs | 2.7081μs | 369.2583 KOps/s | 368.0026 KOps/s | |
test_membership_stacked_nested | 31.0880μs | 2.6676μs | 374.8625 KOps/s | 369.9455 KOps/s | |
test_membership_stacked_nested_leaf | 18.3040μs | 2.7147μs | 368.3654 KOps/s | 370.8934 KOps/s | |
test_membership_nested_last | 36.4080μs | 4.0026μs | 249.8363 KOps/s | 250.2794 KOps/s | |
test_membership_nested_leaf_last | 31.1180μs | 4.0094μs | 249.4111 KOps/s | 244.5662 KOps/s | |
test_membership_stacked_nested_last | 30.7080μs | 3.9566μs | 252.7404 KOps/s | 78.4059 KOps/s | |
test_membership_stacked_nested_leaf_last | 28.9640μs | 4.0196μs | 248.7803 KOps/s | 77.5393 KOps/s | |
test_nested_getleaf | 35.8570μs | 10.9676μs | 91.1779 KOps/s | 92.1690 KOps/s | |
test_nested_get | 38.5820μs | 10.3522μs | 96.5981 KOps/s | 98.4252 KOps/s | |
test_stacked_getleaf | 48.4200μs | 10.9302μs | 91.4900 KOps/s | 94.0127 KOps/s | |
test_stacked_get | 36.7780μs | 10.3663μs | 96.4663 KOps/s | 98.3106 KOps/s | |
test_nested_getitemleaf | 37.7510μs | 11.3697μs | 87.9532 KOps/s | 89.2116 KOps/s | |
test_nested_getitem | 38.9130μs | 10.6048μs | 94.2967 KOps/s | 96.7293 KOps/s | |
test_stacked_getitemleaf | 50.5960μs | 11.4027μs | 87.6988 KOps/s | 89.0341 KOps/s | |
test_stacked_getitem | 32.4810μs | 10.5830μs | 94.4913 KOps/s | 98.2178 KOps/s | |
test_lock_nested | 3.4835ms | 0.4419ms | 2.2630 KOps/s | 2.2845 KOps/s | |
test_lock_stack_nested | 0.7341ms | 0.4101ms | 2.4383 KOps/s | 2.5299 KOps/s | |
test_unlock_nested | 0.7243ms | 0.3593ms | 2.7829 KOps/s | 2.3652 KOps/s | |
test_unlock_stack_nested | 0.5388ms | 0.3268ms | 3.0603 KOps/s | 3.2324 KOps/s | |
test_flatten_speed | 0.2373ms | 0.1050ms | 9.5252 KOps/s | 9.6057 KOps/s | |
test_unflatten_speed | 0.7708ms | 0.4425ms | 2.2598 KOps/s | 2.2702 KOps/s | |
test_common_ops | 5.2313ms | 0.7163ms | 1.3961 KOps/s | 1.2824 KOps/s | |
test_creation | 70.9820μs | 2.2993μs | 434.9102 KOps/s | 434.9955 KOps/s | |
test_creation_empty | 51.1950μs | 9.3236μs | 107.2547 KOps/s | 87.1683 KOps/s | |
test_creation_nested_1 | 38.9520μs | 12.6059μs | 79.3280 KOps/s | 67.8378 KOps/s | |
test_creation_nested_2 | 44.0820μs | 16.0634μs | 62.2533 KOps/s | 55.4781 KOps/s | |
test_clone | 69.2090μs | 13.2030μs | 75.7401 KOps/s | 78.0975 KOps/s | |
test_getitem[int] | 37.8610μs | 11.7856μs | 84.8492 KOps/s | 87.1455 KOps/s | |
test_getitem[slice_int] | 76.0520μs | 23.9848μs | 41.6931 KOps/s | 43.4937 KOps/s | |
test_getitem[range] | 0.1975ms | 44.3853μs | 22.5300 KOps/s | 21.5570 KOps/s | |
test_getitem[tuple] | 56.3750μs | 19.4061μs | 51.5303 KOps/s | 52.7721 KOps/s | |
test_getitem[list] | 0.1564ms | 39.8530μs | 25.0922 KOps/s | 24.3392 KOps/s | |
test_setitem_dim[int] | 56.9660μs | 29.4446μs | 33.9621 KOps/s | 30.5968 KOps/s | |
test_setitem_dim[slice_int] | 0.1120ms | 56.7043μs | 17.6354 KOps/s | 16.0631 KOps/s | |
test_setitem_dim[range] | 0.1133ms | 76.5954μs | 13.0556 KOps/s | 12.0372 KOps/s | |
test_setitem_dim[tuple] | 69.1790μs | 45.4444μs | 22.0049 KOps/s | 20.0201 KOps/s | |
test_setitem | 82.3340μs | 18.8865μs | 52.9480 KOps/s | 50.8290 KOps/s | |
test_set | 91.2800μs | 18.3934μs | 54.3675 KOps/s | 51.0313 KOps/s | |
test_set_shared | 2.3313ms | 0.1669ms | 5.9934 KOps/s | 6.0280 KOps/s | |
test_update | 0.1320ms | 20.2026μs | 49.4985 KOps/s | 44.3283 KOps/s | |
test_update_nested | 0.1454ms | 30.0125μs | 33.3195 KOps/s | 31.1137 KOps/s | |
test_update__nested | 83.0550μs | 25.0295μs | 39.9529 KOps/s | 40.4570 KOps/s | |
test_set_nested | 0.1529ms | 19.9493μs | 50.1271 KOps/s | 46.5141 KOps/s | |
test_set_nested_new | 0.1316ms | 24.8988μs | 40.1626 KOps/s | 38.1614 KOps/s | |
test_select | 0.1658ms | 40.7220μs | 24.5568 KOps/s | 23.9314 KOps/s | |
test_select_nested | 0.1306ms | 61.9022μs | 16.1545 KOps/s | 17.0040 KOps/s | |
test_exclude_nested | 0.1437ms | 81.7474μs | 12.2328 KOps/s | 12.8058 KOps/s | |
test_empty[True] | 0.4070ms | 0.3412ms | 2.9310 KOps/s | 2.9905 KOps/s | |
test_empty[False] | 11.6192μs | 1.3142μs | 760.9007 KOps/s | 775.2453 KOps/s | |
test_unbind_speed | 0.3349ms | 0.2596ms | 3.8523 KOps/s | 3.8981 KOps/s | |
test_unbind_speed_stack0 | 0.3600ms | 0.2582ms | 3.8727 KOps/s | 4.0297 KOps/s | |
test_unbind_speed_stack1 | 80.8449ms | 0.7617ms | 1.3129 KOps/s | 1.5107 KOps/s | |
test_split | 77.9265ms | 1.6480ms | 606.8031 Ops/s | 620.4178 Ops/s | |
test_chunk | 75.7063ms | 1.6539ms | 604.6136 Ops/s | 617.7223 Ops/s | |
test_creation[device0] | 4.3493ms | 95.3848μs | 10.4838 KOps/s | 10.6288 KOps/s | |
test_creation_from_tensor | 0.2532ms | 96.7175μs | 10.3394 KOps/s | 10.4599 KOps/s | |
test_add_one[memmap_tensor0] | 0.1986ms | 5.5040μs | 181.6852 KOps/s | 182.0623 KOps/s | |
test_contiguous[memmap_tensor0] | 18.2040μs | 0.6353μs | 1.5741 MOps/s | 1.5394 MOps/s | |
test_stack[memmap_tensor0] | 43.3910μs | 3.6931μs | 270.7787 KOps/s | 269.6970 KOps/s | |
test_memmaptd_index | 0.9684ms | 0.2528ms | 3.9551 KOps/s | 3.8869 KOps/s | |
test_memmaptd_index_astensor | 0.8899ms | 0.3269ms | 3.0592 KOps/s | 3.0229 KOps/s | |
test_memmaptd_index_op | 1.3610ms | 0.5785ms | 1.7286 KOps/s | 1.6251 KOps/s | |
test_serialize_model | 0.1287s | 0.1217s | 8.2168 Ops/s | 7.1286 Ops/s | |
test_serialize_model_pickle | 0.4600s | 0.3912s | 2.5562 Ops/s | 2.5418 Ops/s | |
test_serialize_weights | 0.2079s | 0.1333s | 7.5027 Ops/s | 8.0350 Ops/s | |
test_serialize_weights_returnearly | 0.1721s | 0.1610s | 6.2113 Ops/s | 6.0853 Ops/s | |
test_serialize_weights_pickle | 0.4617s | 0.4129s | 2.4217 Ops/s | 2.5344 Ops/s | |
test_serialize_weights_filesystem | 0.1522s | 0.1420s | 7.0414 Ops/s | 6.9212 Ops/s | |
test_serialize_model_filesystem | 0.1609s | 0.1523s | 6.5649 Ops/s | 6.5889 Ops/s | |
test_reshape_pytree | 70.6410μs | 25.8727μs | 38.6508 KOps/s | 38.9750 KOps/s | |
test_reshape_td | 78.0560μs | 34.2945μs | 29.1592 KOps/s | 30.0694 KOps/s | |
test_view_pytree | 92.3890μs | 25.6274μs | 39.0207 KOps/s | 38.5869 KOps/s | |
test_view_td | 0.1309ms | 40.3641μs | 24.7745 KOps/s | 26.0369 KOps/s | |
test_unbind_pytree | 80.2100μs | 29.8295μs | 33.5238 KOps/s | 34.2462 KOps/s | |
test_unbind_td | 0.3550ms | 38.5228μs | 25.9587 KOps/s | 26.4676 KOps/s | |
test_split_pytree | 61.3350μs | 29.5111μs | 33.8855 KOps/s | 34.2265 KOps/s | |
test_split_td | 0.4707ms | 39.9424μs | 25.0360 KOps/s | 25.3058 KOps/s | |
test_add_pytree | 83.4350μs | 35.4789μs | 28.1858 KOps/s | 28.9087 KOps/s | |
test_add_td | 0.1138ms | 51.9229μs | 19.2593 KOps/s | 17.8895 KOps/s | |
test_distributed | 0.2678ms | 0.1310ms | 7.6360 KOps/s | 7.5354 KOps/s | |
test_tdmodule | 28.5640μs | 15.2074μs | 65.7573 KOps/s | 58.8963 KOps/s | |
test_tdmodule_dispatch | 62.7480μs | 32.3615μs | 30.9009 KOps/s | 28.3202 KOps/s | |
test_tdseq | 36.6690μs | 16.9253μs | 59.0831 KOps/s | 52.2886 KOps/s | |
test_tdseq_dispatch | 71.3330μs | 35.7599μs | 27.9643 KOps/s | 24.9445 KOps/s | |
test_instantiation_functorch | 1.6107ms | 1.3212ms | 756.9161 Ops/s | 749.0238 Ops/s | |
test_instantiation_td | 1.6382ms | 1.0375ms | 963.8401 Ops/s | 890.2906 Ops/s | |
test_exec_functorch | 0.2864ms | 0.1674ms | 5.9735 KOps/s | 6.2156 KOps/s | |
test_exec_functional_call | 0.2824ms | 0.1517ms | 6.5918 KOps/s | 6.6425 KOps/s | |
test_exec_td | 0.2406ms | 0.1510ms | 6.6236 KOps/s | 6.7799 KOps/s | |
test_exec_td_decorator | 0.5210ms | 0.2372ms | 4.2157 KOps/s | 4.3415 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.6865ms | 0.4800ms | 2.0832 KOps/s | 2.0493 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8128ms | 0.4787ms | 2.0888 KOps/s | 2.0600 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6767ms | 0.3984ms | 2.5099 KOps/s | 2.5217 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5969ms | 0.3996ms | 2.5027 KOps/s | 2.5204 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1672ms | 0.5780ms | 1.7301 KOps/s | 1.7176 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8237ms | 0.5807ms | 1.7220 KOps/s | 1.7394 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7912ms | 0.4751ms | 2.1047 KOps/s | 2.1320 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7863ms | 0.4816ms | 2.0763 KOps/s | 2.1226 KOps/s | |
test_to_module_speed[True] | 2.4602ms | 1.8351ms | 544.9220 Ops/s | 560.9202 Ops/s | |
test_to_module_speed[False] | 2.8385ms | 1.7848ms | 560.2867 Ops/s | 569.6100 Ops/s | |
test_tc_init | 98.5840μs | 33.4518μs | 29.8937 KOps/s | 17.8970 KOps/s | |
test_tc_init_nested | 0.1275ms | 68.8372μs | 14.5270 KOps/s | 8.6041 KOps/s | |
test_tc_first_layer_tensor | 45.8960μs | 8.0951μs | 123.5322 KOps/s | 120.8341 KOps/s | |
test_tc_first_layer_nontensor | 35.8160μs | 8.0456μs | 124.2913 KOps/s | 120.8220 KOps/s | |
test_tc_second_layer_tensor | 40.3350μs | 2.5023μs | 399.6248 KOps/s | 412.2442 KOps/s | |
test_tc_second_layer_nontensor | 37.3490μs | 9.0181μs | 110.8879 KOps/s | 109.6898 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 72.3483ms | 16.2061μs | 61.7052 KOps/s | 84.9381 KOps/s | |
test_plain_set_stack_nested | 29.6700μs | 13.0026μs | 76.9077 KOps/s | 84.7436 KOps/s | |
test_plain_set_nested_inplace | 29.8800μs | 13.5403μs | 73.8537 KOps/s | 77.6066 KOps/s | |
test_plain_set_stack_nested_inplace | 36.5900μs | 13.5905μs | 73.5805 KOps/s | 77.8230 KOps/s | |
test_items | 19.0500μs | 4.7778μs | 209.3008 KOps/s | 210.7509 KOps/s | |
test_items_nested | 0.4634ms | 0.3935ms | 2.5416 KOps/s | 2.5398 KOps/s | |
test_items_nested_locked | 0.4772ms | 0.3949ms | 2.5323 KOps/s | 2.5103 KOps/s | |
test_items_nested_leaf | 0.1144ms | 87.0381μs | 11.4892 KOps/s | 11.6094 KOps/s | |
test_items_stack_nested | 0.4924ms | 0.3996ms | 2.5026 KOps/s | 2.5250 KOps/s | |
test_items_stack_nested_leaf | 0.1129ms | 85.9541μs | 11.6341 KOps/s | 11.5961 KOps/s | |
test_items_stack_nested_locked | 0.4824ms | 0.3976ms | 2.5154 KOps/s | 2.5289 KOps/s | |
test_keys | 19.7800μs | 4.3852μs | 228.0384 KOps/s | 229.5498 KOps/s | |
test_keys_nested | 93.6610μs | 68.5627μs | 14.5852 KOps/s | 14.7949 KOps/s | |
test_keys_nested_locked | 2.0834ms | 75.5931μs | 13.2287 KOps/s | 13.5152 KOps/s | |
test_keys_nested_leaf | 87.3510μs | 57.7691μs | 17.3103 KOps/s | 17.1791 KOps/s | |
test_keys_stack_nested | 86.9810μs | 67.3466μs | 14.8486 KOps/s | 14.6798 KOps/s | |
test_keys_stack_nested_leaf | 75.2210μs | 57.1981μs | 17.4831 KOps/s | 17.0538 KOps/s | |
test_keys_stack_nested_locked | 99.6620μs | 72.4353μs | 13.8054 KOps/s | 13.4693 KOps/s | |
test_values | 7.3633μs | 1.7595μs | 568.3308 KOps/s | 572.6121 KOps/s | |
test_values_nested | 47.6410μs | 34.3500μs | 29.1120 KOps/s | 29.0164 KOps/s | |
test_values_nested_locked | 58.2110μs | 36.1995μs | 27.6247 KOps/s | 27.3766 KOps/s | |
test_values_nested_leaf | 0.5280ms | 30.5224μs | 32.7628 KOps/s | 32.5515 KOps/s | |
test_values_stack_nested | 54.3700μs | 35.0870μs | 28.5006 KOps/s | 28.3969 KOps/s | |
test_values_stack_nested_leaf | 52.5810μs | 31.1516μs | 32.1010 KOps/s | 31.7247 KOps/s | |
test_values_stack_nested_locked | 54.3710μs | 36.7451μs | 27.2145 KOps/s | 27.1170 KOps/s | |
test_membership | 1.2340μs | 0.5556μs | 1.7998 MOps/s | 1.8781 MOps/s | |
test_membership_nested | 15.3300μs | 2.0887μs | 478.7612 KOps/s | 479.2769 KOps/s | |
test_membership_nested_leaf | 10.3800μs | 2.0057μs | 498.5733 KOps/s | 494.4746 KOps/s | |
test_membership_stacked_nested | 15.5400μs | 2.0857μs | 479.4552 KOps/s | 480.4516 KOps/s | |
test_membership_stacked_nested_leaf | 24.2100μs | 2.0744μs | 482.0559 KOps/s | 481.8150 KOps/s | |
test_membership_nested_last | 21.1800μs | 2.9743μs | 336.2175 KOps/s | 337.9324 KOps/s | |
test_membership_nested_leaf_last | 16.9700μs | 3.0297μs | 330.0707 KOps/s | 329.3451 KOps/s | |
test_membership_stacked_nested_last | 28.2710μs | 9.1980μs | 108.7187 KOps/s | 334.0949 KOps/s | |
test_membership_stacked_nested_leaf_last | 36.6910μs | 9.1629μs | 109.1356 KOps/s | 329.8273 KOps/s | |
test_nested_getleaf | 26.5610μs | 8.0147μs | 124.7710 KOps/s | 124.4205 KOps/s | |
test_nested_get | 22.8600μs | 7.5312μs | 132.7817 KOps/s | 132.3651 KOps/s | |
test_stacked_getleaf | 29.6910μs | 8.0245μs | 124.6190 KOps/s | 124.4203 KOps/s | |
test_stacked_get | 0.1928ms | 7.5401μs | 132.6241 KOps/s | 132.7370 KOps/s | |
test_nested_getitemleaf | 22.5800μs | 8.2482μs | 121.2382 KOps/s | 122.4789 KOps/s | |
test_nested_getitem | 23.0800μs | 7.7018μs | 129.8400 KOps/s | 129.6281 KOps/s | |
test_stacked_getitemleaf | 31.8300μs | 8.1955μs | 122.0179 KOps/s | 122.4285 KOps/s | |
test_stacked_getitem | 23.3000μs | 7.6928μs | 129.9921 KOps/s | 130.1241 KOps/s | |
test_lock_nested | 4.0169ms | 0.4225ms | 2.3666 KOps/s | 2.4132 KOps/s | |
test_lock_stack_nested | 0.4011ms | 0.3750ms | 2.6669 KOps/s | 2.6189 KOps/s | |
test_unlock_nested | 88.1824ms | 0.4248ms | 2.3543 KOps/s | 3.0033 KOps/s | |
test_unlock_stack_nested | 0.3205ms | 0.2932ms | 3.4105 KOps/s | 3.3373 KOps/s | |
test_flatten_speed | 0.3496ms | 0.1066ms | 9.3836 KOps/s | 9.4314 KOps/s | |
test_unflatten_speed | 0.3183ms | 0.2878ms | 3.4746 KOps/s | 3.4212 KOps/s | |
test_common_ops | 0.9630ms | 0.5616ms | 1.7807 KOps/s | 1.6346 KOps/s | |
test_creation | 15.5400μs | 1.8588μs | 537.9711 KOps/s | 533.0910 KOps/s | |
test_creation_empty | 23.5510μs | 8.8128μs | 113.4718 KOps/s | 133.1204 KOps/s | |
test_creation_nested_1 | 29.6900μs | 10.6035μs | 94.3087 KOps/s | 109.2056 KOps/s | |
test_creation_nested_2 | 28.9400μs | 12.9554μs | 77.1878 KOps/s | 86.0788 KOps/s | |
test_clone | 57.0610μs | 10.9108μs | 91.6520 KOps/s | 91.6961 KOps/s | |
test_getitem[int] | 28.6810μs | 10.0680μs | 99.3250 KOps/s | 97.3141 KOps/s | |
test_getitem[slice_int] | 43.9510μs | 19.5044μs | 51.2705 KOps/s | 51.2673 KOps/s | |
test_getitem[range] | 0.1574ms | 38.0440μs | 26.2853 KOps/s | 26.8312 KOps/s | |
test_getitem[tuple] | 33.8600μs | 17.4255μs | 57.3872 KOps/s | 56.9187 KOps/s | |
test_getitem[list] | 0.1619ms | 31.2780μs | 31.9714 KOps/s | 31.4708 KOps/s | |
test_setitem_dim[int] | 59.4410μs | 24.5652μs | 40.7080 KOps/s | 43.3845 KOps/s | |
test_setitem_dim[slice_int] | 73.5510μs | 44.8001μs | 22.3214 KOps/s | 22.1633 KOps/s | |
test_setitem_dim[range] | 94.5110μs | 60.7692μs | 16.4557 KOps/s | 16.8510 KOps/s | |
test_setitem_dim[tuple] | 65.3010μs | 38.9330μs | 25.6851 KOps/s | 26.0204 KOps/s | |
test_setitem | 65.1310μs | 15.3387μs | 65.1948 KOps/s | 68.9567 KOps/s | |
test_set | 67.0610μs | 14.8874μs | 67.1707 KOps/s | 70.8883 KOps/s | |
test_set_shared | 2.7831ms | 95.4996μs | 10.4712 KOps/s | 10.1596 KOps/s | |
test_update | 84.7410μs | 17.5662μs | 56.9274 KOps/s | 61.8422 KOps/s | |
test_update_nested | 92.5210μs | 23.1134μs | 43.2649 KOps/s | 47.0844 KOps/s | |
test_update__nested | 79.2810μs | 21.3643μs | 46.8070 KOps/s | 47.5971 KOps/s | |
test_set_nested | 78.3410μs | 15.8513μs | 63.0865 KOps/s | 66.9140 KOps/s | |
test_set_nested_new | 0.1107ms | 19.1289μs | 52.2771 KOps/s | 54.3748 KOps/s | |
test_select | 0.1064ms | 31.9616μs | 31.2876 KOps/s | 32.2158 KOps/s | |
test_select_nested | 86.1310μs | 52.7404μs | 18.9608 KOps/s | 18.2226 KOps/s | |
test_exclude_nested | 95.4710μs | 71.0778μs | 14.0691 KOps/s | 13.8422 KOps/s | |
test_empty[True] | 0.3459ms | 0.2980ms | 3.3553 KOps/s | 3.3729 KOps/s | |
test_empty[False] | 2.1990μs | 0.9153μs | 1.0925 MOps/s | 1.0268 MOps/s | |
test_to | 87.8310μs | 58.2296μs | 17.1734 KOps/s | 16.7980 KOps/s | |
test_to_nonblocking | 61.3110μs | 34.6040μs | 28.8984 KOps/s | 27.3899 KOps/s | |
test_unbind_speed | 0.2744ms | 0.2544ms | 3.9312 KOps/s | 3.9417 KOps/s | |
test_unbind_speed_stack0 | 0.2960ms | 0.2481ms | 4.0305 KOps/s | 3.9251 KOps/s | |
test_unbind_speed_stack1 | 91.4297ms | 0.7697ms | 1.2991 KOps/s | 1.3646 KOps/s | |
test_split | 89.4906ms | 1.5678ms | 637.8270 Ops/s | 624.6162 Ops/s | |
test_chunk | 1.4810ms | 1.4286ms | 700.0103 Ops/s | 686.6939 Ops/s | |
test_creation[device0] | 0.1272ms | 53.5222μs | 18.6838 KOps/s | 17.6267 KOps/s | |
test_creation_from_tensor | 0.1902ms | 53.2773μs | 18.7697 KOps/s | 18.1860 KOps/s | |
test_add_one[memmap_tensor0] | 76.2610μs | 6.4768μs | 154.3972 KOps/s | 157.7866 KOps/s | |
test_contiguous[memmap_tensor0] | 23.9110μs | 0.5792μs | 1.7264 MOps/s | 1.7006 MOps/s | |
test_stack[memmap_tensor0] | 30.8800μs | 4.3154μs | 231.7257 KOps/s | 230.0374 KOps/s | |
test_memmaptd_index | 1.1580ms | 0.2514ms | 3.9780 KOps/s | 4.0027 KOps/s | |
test_memmaptd_index_astensor | 0.6472ms | 0.3140ms | 3.1845 KOps/s | 3.1548 KOps/s | |
test_memmaptd_index_op | 0.8466ms | 0.5796ms | 1.7253 KOps/s | 1.6254 KOps/s | |
test_serialize_model | 0.1867s | 0.1014s | 9.8609 Ops/s | 10.5217 Ops/s | |
test_serialize_model_pickle | 1.3507s | 1.2355s | 0.8094 Ops/s | 0.8078 Ops/s | |
test_serialize_weights | 92.4095ms | 88.1544ms | 11.3437 Ops/s | 9.6534 Ops/s | |
test_serialize_weights_returnearly | 0.1700s | 71.0804ms | 14.0686 Ops/s | 12.7420 Ops/s | |
test_serialize_weights_pickle | 1.3506s | 1.2434s | 0.8042 Ops/s | 0.8012 Ops/s | |
test_reshape_pytree | 57.3710μs | 24.9392μs | 40.0975 KOps/s | 40.1308 KOps/s | |
test_reshape_td | 54.1910μs | 29.8241μs | 33.5299 KOps/s | 33.7264 KOps/s | |
test_view_pytree | 54.7410μs | 24.6256μs | 40.6082 KOps/s | 39.0081 KOps/s | |
test_view_td | 61.7010μs | 36.6280μs | 27.3015 KOps/s | 27.2293 KOps/s | |
test_unbind_pytree | 0.1382ms | 30.4208μs | 32.8722 KOps/s | 33.2645 KOps/s | |
test_unbind_td | 0.4929ms | 37.8550μs | 26.4166 KOps/s | 26.2325 KOps/s | |
test_split_pytree | 57.8910μs | 32.5895μs | 30.6847 KOps/s | 29.9882 KOps/s | |
test_split_td | 0.1728ms | 36.8942μs | 27.1045 KOps/s | 27.8412 KOps/s | |
test_add_pytree | 0.1977ms | 38.1657μs | 26.2015 KOps/s | 27.5814 KOps/s | |
test_add_td | 79.9310μs | 47.2661μs | 21.1568 KOps/s | 21.5317 KOps/s | |
test_distributed | 1.8492ms | 71.7807μs | 13.9313 KOps/s | 12.6709 KOps/s | |
test_tdmodule | 0.1382ms | 14.2458μs | 70.1959 KOps/s | 76.8187 KOps/s | |
test_tdmodule_dispatch | 45.8110μs | 28.3875μs | 35.2267 KOps/s | 37.5329 KOps/s | |
test_tdseq | 30.8200μs | 15.1658μs | 65.9380 KOps/s | 69.2038 KOps/s | |
test_tdseq_dispatch | 51.3200μs | 30.8989μs | 32.3637 KOps/s | 34.1273 KOps/s | |
test_instantiation_functorch | 1.5365ms | 1.3667ms | 731.6872 Ops/s | 734.3950 Ops/s | |
test_instantiation_td | 92.6883ms | 1.0848ms | 921.8122 Ops/s | 1.0397 KOps/s | |
test_exec_functorch | 0.1805ms | 0.1421ms | 7.0350 KOps/s | 6.9509 KOps/s | |
test_exec_functional_call | 0.1693ms | 0.1339ms | 7.4692 KOps/s | 7.6727 KOps/s | |
test_exec_td | 0.1704ms | 0.1334ms | 7.4973 KOps/s | 7.7864 KOps/s | |
test_exec_td_decorator | 0.7454ms | 0.2122ms | 4.7117 KOps/s | 4.8847 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7677ms | 0.5903ms | 1.6941 KOps/s | 1.7364 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.6456ms | 0.5896ms | 1.6960 KOps/s | 1.7454 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6996ms | 0.5340ms | 1.8726 KOps/s | 1.9572 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6971ms | 0.5317ms | 1.8807 KOps/s | 1.9433 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2255ms | 0.6610ms | 1.5128 KOps/s | 1.5423 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9377ms | 0.6705ms | 1.4913 KOps/s | 1.5451 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8116ms | 0.5758ms | 1.7368 KOps/s | 1.7439 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7733ms | 0.5751ms | 1.7389 KOps/s | 1.7506 KOps/s | |
test_vmap_transformer_speed[True-True] | 7.7946ms | 7.6034ms | 131.5196 Ops/s | 126.4953 Ops/s | |
test_vmap_transformer_speed[True-False] | 7.9651ms | 7.5830ms | 131.8735 Ops/s | 128.8308 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.9126ms | 7.5354ms | 132.7065 Ops/s | 129.4671 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.4337ms | 7.8352ms | 127.6294 Ops/s | 131.0116 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.4653ms | 19.1562ms | 52.2023 Ops/s | 52.5511 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.6081ms | 19.1634ms | 52.1829 Ops/s | 52.7703 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.6467ms | 19.0140ms | 52.5928 Ops/s | 52.9564 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.0737ms | 19.1902ms | 52.1099 Ops/s | 52.8977 Ops/s | |
test_to_module_speed[True] | 2.9206ms | 1.5976ms | 625.9245 Ops/s | 648.6298 Ops/s | |
test_to_module_speed[False] | 2.0358ms | 1.5688ms | 637.4180 Ops/s | 651.8659 Ops/s | |
test_tc_init | 0.1689ms | 34.2705μs | 29.1796 KOps/s | 19.8105 KOps/s | |
test_tc_init_nested | 0.3959ms | 70.1147μs | 14.2624 KOps/s | 10.0658 KOps/s | |
test_tc_first_layer_tensor | 0.1355ms | 3.5831μs | 279.0878 KOps/s | 285.2919 KOps/s | |
test_tc_first_layer_nontensor | 0.1284ms | 3.5945μs | 278.2041 KOps/s | 281.5400 KOps/s | |
test_tc_second_layer_tensor | 27.5964μs | 1.1356μs | 880.5617 KOps/s | 906.5597 KOps/s | |
test_tc_second_layer_nontensor | 0.1168ms | 4.1234μs | 242.5176 KOps/s | 247.0890 KOps/s |
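
The Change column is empty in both tables above, so the relative difference between Ops and Ops on Repo HEAD has to be worked out by hand. Below is a minimal sketch of one way to do that in Python. It assumes Change means `(Ops - Ops on Repo HEAD) / Ops on Repo HEAD`, which the page itself does not confirm. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
# Unit multipliers for the throughput columns, normalised to operations/second.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '29.8937 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNITS[unit]


def relative_change(row: str) -> float:
    """Return (Ops - Ops on Repo HEAD) / Ops on Repo HEAD for one table row.

    Assumption: the row layout is 'name | max | mean | ops | ops_head | |',
    as in the tables above. The formula itself is also an assumption.
    """
    cells = [c.strip() for c in row.split("|")]
    ops, ops_head = parse_ops(cells[3]), parse_ops(cells[4])
    return (ops - ops_head) / ops_head


# Rows copied verbatim from the tables above.
row_init = "test_tc_init | 98.5840μs | 33.4518μs | 29.8937 KOps/s | 17.8970 KOps/s | |"
row_mixed = "test_instantiation_td | 92.6883ms | 1.0848ms | 921.8122 Ops/s | 1.0397 KOps/s | |"

for row in (row_init, row_mixed):
    name = row.split("|")[0].strip()
    print(f"{name}: {relative_change(row):+.1%}")
```

Normalising units before dividing matters for rows such as `test_instantiation_td`, where the two throughput columns use different prefixes (Ops/s and KOps/s).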
Labels: CLA Signed, Performance
No description provided.