[BugFix] Patch pad_sequence #742

Merged: vmoens merged 2 commits into main from patch-pad-dim on Apr 22, 2024

Conversation

vmoens
Contributor

@vmoens vmoens commented Apr 22, 2024

No description provided.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 22, 2024
@vmoens vmoens added the bug Something isn't working label Apr 22, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 127. Improved: $\large\color{#35bf28}20$. Worsened: $\large\color{#d91a1a}4$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 45.3650μs 16.0962μs 62.1263 KOps/s 57.4723 KOps/s $\textbf{\color{#35bf28}+8.10\%}$
test_plain_set_stack_nested 58.8590μs 16.2609μs 61.4972 KOps/s 57.7234 KOps/s $\textbf{\color{#35bf28}+6.54\%}$
test_plain_set_nested_inplace 63.7890μs 18.3240μs 54.5733 KOps/s 51.1304 KOps/s $\textbf{\color{#35bf28}+6.73\%}$
test_plain_set_stack_nested_inplace 61.2440μs 19.6300μs 50.9423 KOps/s 50.7700 KOps/s $\color{#35bf28}+0.34\%$
test_items 31.9300μs 2.6009μs 384.4777 KOps/s 400.4404 KOps/s $\color{#d91a1a}-3.99\%$
test_items_nested 0.4807ms 0.2703ms 3.6993 KOps/s 3.6164 KOps/s $\color{#35bf28}+2.29\%$
test_items_nested_locked 0.6212ms 0.2714ms 3.6846 KOps/s 3.6941 KOps/s $\color{#d91a1a}-0.26\%$
test_items_nested_leaf 0.6404ms 76.6506μs 13.0462 KOps/s 12.9245 KOps/s $\color{#35bf28}+0.94\%$
test_items_stack_nested 0.4007ms 0.2736ms 3.6544 KOps/s 3.6691 KOps/s $\color{#d91a1a}-0.40\%$
test_items_stack_nested_leaf 0.1562ms 78.4766μs 12.7427 KOps/s 12.9322 KOps/s $\color{#d91a1a}-1.47\%$
test_items_stack_nested_locked 0.3695ms 0.2728ms 3.6650 KOps/s 3.6647 KOps/s $+0.01\%$
test_keys 18.9450μs 3.8350μs 260.7578 KOps/s 257.7531 KOps/s $\color{#35bf28}+1.17\%$
test_keys_nested 0.2672ms 0.1371ms 7.2913 KOps/s 7.2526 KOps/s $\color{#35bf28}+0.53\%$
test_keys_nested_locked 2.5458ms 0.1426ms 7.0110 KOps/s 7.0247 KOps/s $\color{#d91a1a}-0.20\%$
test_keys_nested_leaf 0.1974ms 0.1164ms 8.5893 KOps/s 8.5451 KOps/s $\color{#35bf28}+0.52\%$
test_keys_stack_nested 0.2558ms 0.1354ms 7.3836 KOps/s 7.5178 KOps/s $\color{#d91a1a}-1.79\%$
test_keys_stack_nested_leaf 0.2149ms 0.1165ms 8.5846 KOps/s 8.6536 KOps/s $\color{#d91a1a}-0.80\%$
test_keys_stack_nested_locked 0.2653ms 0.1409ms 7.0953 KOps/s 6.7169 KOps/s $\textbf{\color{#35bf28}+5.63\%}$
test_values 7.5640μs 1.1808μs 846.8965 KOps/s 824.2440 KOps/s $\color{#35bf28}+2.75\%$
test_values_nested 2.3781ms 51.6838μs 19.3484 KOps/s 19.7643 KOps/s $\color{#d91a1a}-2.10\%$
test_values_nested_locked 0.1064ms 51.6356μs 19.3665 KOps/s 19.8008 KOps/s $\color{#d91a1a}-2.19\%$
test_values_nested_leaf 90.2170μs 45.9165μs 21.7787 KOps/s 22.0139 KOps/s $\color{#d91a1a}-1.07\%$
test_values_stack_nested 98.2020μs 51.4889μs 19.4216 KOps/s 19.5892 KOps/s $\color{#d91a1a}-0.86\%$
test_values_stack_nested_leaf 97.2400μs 46.1627μs 21.6625 KOps/s 22.0887 KOps/s $\color{#d91a1a}-1.93\%$
test_values_stack_nested_locked 0.1089ms 51.3215μs 19.4850 KOps/s 19.1471 KOps/s $\color{#35bf28}+1.76\%$
test_membership 15.4180μs 1.4033μs 712.5902 KOps/s 747.1353 KOps/s $\color{#d91a1a}-4.62\%$
test_membership_nested 21.7810μs 3.4973μs 285.9349 KOps/s 290.0796 KOps/s $\color{#d91a1a}-1.43\%$
test_membership_nested_leaf 30.4960μs 3.5068μs 285.1634 KOps/s 290.9084 KOps/s $\color{#d91a1a}-1.97\%$
test_membership_stacked_nested 29.4050μs 3.5020μs 285.5473 KOps/s 286.1488 KOps/s $\color{#d91a1a}-0.21\%$
test_membership_stacked_nested_leaf 34.6450μs 3.5855μs 278.9050 KOps/s 289.3580 KOps/s $\color{#d91a1a}-3.61\%$
test_membership_nested_last 24.6960μs 4.3252μs 231.2016 KOps/s 236.2278 KOps/s $\color{#d91a1a}-2.13\%$
test_membership_nested_leaf_last 37.8900μs 4.3181μs 231.5831 KOps/s 238.1023 KOps/s $\color{#d91a1a}-2.74\%$
test_membership_stacked_nested_last 22.1510μs 4.2458μs 235.5253 KOps/s 160.2664 KOps/s $\textbf{\color{#35bf28}+46.96\%}$
test_membership_stacked_nested_leaf_last 36.0670μs 4.2758μs 233.8763 KOps/s 159.3658 KOps/s $\textbf{\color{#35bf28}+46.75\%}$
test_nested_getleaf 30.4560μs 10.5225μs 95.0348 KOps/s 96.5537 KOps/s $\color{#d91a1a}-1.57\%$
test_nested_get 48.5400μs 10.0149μs 99.8512 KOps/s 100.5815 KOps/s $\color{#d91a1a}-0.73\%$
test_stacked_getleaf 44.9130μs 10.6538μs 93.8636 KOps/s 97.6684 KOps/s $\color{#d91a1a}-3.90\%$
test_stacked_get 22.0710μs 10.0158μs 99.8425 KOps/s 101.8340 KOps/s $\color{#d91a1a}-1.96\%$
test_nested_getitemleaf 62.8860μs 11.2341μs 89.0146 KOps/s 92.3797 KOps/s $\color{#d91a1a}-3.64\%$
test_nested_getitem 40.5150μs 10.3186μs 96.9128 KOps/s 100.7683 KOps/s $\color{#d91a1a}-3.83\%$
test_stacked_getitemleaf 49.6030μs 11.0800μs 90.2530 KOps/s 93.1764 KOps/s $\color{#d91a1a}-3.14\%$
test_stacked_getitem 32.8110μs 10.3512μs 96.6069 KOps/s 100.5636 KOps/s $\color{#d91a1a}-3.93\%$
test_lock_nested 48.4379ms 0.3887ms 2.5729 KOps/s 2.9592 KOps/s $\textbf{\color{#d91a1a}-13.05\%}$
test_lock_stack_nested 0.6005ms 0.3097ms 3.2292 KOps/s 3.3609 KOps/s $\color{#d91a1a}-3.92\%$
test_unlock_nested 78.5967ms 0.4248ms 2.3540 KOps/s 2.4118 KOps/s $\color{#d91a1a}-2.40\%$
test_unlock_stack_nested 0.4995ms 0.3202ms 3.1230 KOps/s 3.2695 KOps/s $\color{#d91a1a}-4.48\%$
test_flatten_speed 0.4241ms 90.7919μs 11.0142 KOps/s 11.0650 KOps/s $\color{#d91a1a}-0.46\%$
test_unflatten_speed 0.6864ms 0.4009ms 2.4944 KOps/s 2.5106 KOps/s $\color{#d91a1a}-0.64\%$
test_common_ops 3.9207ms 0.6799ms 1.4708 KOps/s 1.4158 KOps/s $\color{#35bf28}+3.88\%$
test_creation 18.0730μs 1.9839μs 504.0581 KOps/s 543.2029 KOps/s $\textbf{\color{#d91a1a}-7.21\%}$
test_creation_empty 31.8490μs 9.6026μs 104.1380 KOps/s 86.8463 KOps/s $\textbf{\color{#35bf28}+19.91\%}$
test_creation_nested_1 33.7830μs 12.5480μs 79.6937 KOps/s 71.5536 KOps/s $\textbf{\color{#35bf28}+11.38\%}$
test_creation_nested_2 59.3000μs 15.5097μs 64.4756 KOps/s 57.6432 KOps/s $\textbf{\color{#35bf28}+11.85\%}$
test_clone 59.4700μs 13.1098μs 76.2790 KOps/s 77.2276 KOps/s $\color{#d91a1a}-1.23\%$
test_getitem[int] 33.5620μs 11.4193μs 87.5710 KOps/s 90.5901 KOps/s $\color{#d91a1a}-3.33\%$
test_getitem[slice_int] 67.0750μs 22.4340μs 44.5751 KOps/s 44.1965 KOps/s $\color{#35bf28}+0.86\%$
test_getitem[range] 0.1482ms 40.8994μs 24.4502 KOps/s 25.3286 KOps/s $\color{#d91a1a}-3.47\%$
test_getitem[tuple] 52.4780μs 18.3042μs 54.6324 KOps/s 53.9146 KOps/s $\color{#35bf28}+1.33\%$
test_getitem[list] 0.1496ms 36.3459μs 27.5134 KOps/s 27.8562 KOps/s $\color{#d91a1a}-1.23\%$
test_setitem_dim[int] 88.9060μs 32.3365μs 30.9248 KOps/s 27.6065 KOps/s $\textbf{\color{#35bf28}+12.02\%}$
test_setitem_dim[slice_int] 83.4050μs 59.2965μs 16.8644 KOps/s 16.0230 KOps/s $\textbf{\color{#35bf28}+5.25\%}$
test_setitem_dim[range] 0.1300ms 77.4454μs 12.9123 KOps/s 12.4403 KOps/s $\color{#35bf28}+3.79\%$
test_setitem_dim[tuple] 75.2900μs 47.6413μs 20.9902 KOps/s 19.6667 KOps/s $\textbf{\color{#35bf28}+6.73\%}$
test_setitem 67.0550μs 19.0092μs 52.6062 KOps/s 50.3702 KOps/s $\color{#35bf28}+4.44\%$
test_set 63.6880μs 18.5663μs 53.8609 KOps/s 51.7838 KOps/s $\color{#35bf28}+4.01\%$
test_set_shared 3.3885ms 0.1399ms 7.1496 KOps/s 7.0690 KOps/s $\color{#35bf28}+1.14\%$
test_update 0.1326ms 19.8750μs 50.3144 KOps/s 45.1939 KOps/s $\textbf{\color{#35bf28}+11.33\%}$
test_update_nested 68.8180μs 27.7005μs 36.1004 KOps/s 33.8351 KOps/s $\textbf{\color{#35bf28}+6.70\%}$
test_update__nested 62.7660μs 23.9874μs 41.6885 KOps/s 41.5944 KOps/s $\color{#35bf28}+0.23\%$
test_set_nested 64.4500μs 19.9817μs 50.0458 KOps/s 46.9248 KOps/s $\textbf{\color{#35bf28}+6.65\%}$
test_set_nested_new 61.0830μs 23.8595μs 41.9119 KOps/s 39.8964 KOps/s $\textbf{\color{#35bf28}+5.05\%}$
test_select 0.1043ms 39.3510μs 25.4123 KOps/s 25.9081 KOps/s $\color{#d91a1a}-1.91\%$
test_select_nested 0.1140ms 59.5427μs 16.7947 KOps/s 17.0576 KOps/s $\color{#d91a1a}-1.54\%$
test_exclude_nested 0.2612ms 0.1173ms 8.5222 KOps/s 8.3211 KOps/s $\color{#35bf28}+2.42\%$
test_empty[True] 0.5677ms 0.3888ms 2.5717 KOps/s 2.5791 KOps/s $\color{#d91a1a}-0.29\%$
test_empty[False] 6.6404μs 1.0517μs 950.8729 KOps/s 965.0780 KOps/s $\color{#d91a1a}-1.47\%$
test_unbind_speed 1.6176ms 0.2483ms 4.0268 KOps/s 3.9879 KOps/s $\color{#35bf28}+0.98\%$
test_unbind_speed_stack0 0.3643ms 0.2467ms 4.0529 KOps/s 4.1756 KOps/s $\color{#d91a1a}-2.94\%$
test_unbind_speed_stack1 0.1181s 0.6893ms 1.4508 KOps/s 1.6634 KOps/s $\textbf{\color{#d91a1a}-12.78\%}$
test_split 0.1099s 1.6437ms 608.3711 Ops/s 601.2203 Ops/s $\color{#35bf28}+1.19\%$
test_chunk 2.2225ms 1.4845ms 673.6134 Ops/s 686.1736 Ops/s $\color{#d91a1a}-1.83\%$
test_creation[device0] 0.1806ms 98.9344μs 10.1077 KOps/s 9.7014 KOps/s $\color{#35bf28}+4.19\%$
test_creation_from_tensor 4.4736ms 80.3441μs 12.4465 KOps/s 12.4873 KOps/s $\color{#d91a1a}-0.33\%$
test_add_one[memmap_tensor0] 86.4310μs 5.1410μs 194.5158 KOps/s 190.9020 KOps/s $\color{#35bf28}+1.89\%$
test_contiguous[memmap_tensor0] 29.7660μs 0.6370μs 1.5699 MOps/s 1.6132 MOps/s $\color{#d91a1a}-2.69\%$
test_stack[memmap_tensor0] 40.3150μs 3.5263μs 283.5826 KOps/s 285.4268 KOps/s $\color{#d91a1a}-0.65\%$
test_memmaptd_index 1.0463ms 0.2342ms 4.2697 KOps/s 4.2679 KOps/s $\color{#35bf28}+0.04\%$
test_memmaptd_index_astensor 0.6379ms 0.2951ms 3.3887 KOps/s 3.3566 KOps/s $\color{#35bf28}+0.96\%$
test_memmaptd_index_op 0.9083ms 0.5626ms 1.7774 KOps/s 1.6721 KOps/s $\textbf{\color{#35bf28}+6.30\%}$
test_serialize_model 0.2240s 0.1153s 8.6698 Ops/s 8.4975 Ops/s $\color{#35bf28}+2.03\%$
test_serialize_model_pickle 0.4500s 0.3761s 2.6591 Ops/s 2.6425 Ops/s $\color{#35bf28}+0.63\%$
test_serialize_weights 0.1016s 96.5435ms 10.3580 Ops/s 10.0393 Ops/s $\color{#35bf28}+3.18\%$
test_serialize_weights_returnearly 0.1291s 0.1233s 8.1091 Ops/s 8.1256 Ops/s $\color{#d91a1a}-0.20\%$
test_serialize_weights_pickle 1.0937s 0.5937s 1.6843 Ops/s 2.5282 Ops/s $\textbf{\color{#d91a1a}-33.38\%}$
test_serialize_weights_filesystem 95.3136ms 92.7908ms 10.7769 Ops/s 10.9738 Ops/s $\color{#d91a1a}-1.79\%$
test_serialize_model_filesystem 98.3222ms 92.9316ms 10.7606 Ops/s 10.3527 Ops/s $\color{#35bf28}+3.94\%$
test_reshape_pytree 42.9200μs 20.8340μs 47.9986 KOps/s 48.6905 KOps/s $\color{#d91a1a}-1.42\%$
test_reshape_td 83.7950μs 32.1415μs 31.1124 KOps/s 31.8844 KOps/s $\color{#d91a1a}-2.42\%$
test_view_pytree 57.3470μs 20.7463μs 48.2014 KOps/s 48.4764 KOps/s $\color{#d91a1a}-0.57\%$
test_view_td 0.1268s 58.8098μs 17.0040 KOps/s 17.8728 KOps/s $\color{#d91a1a}-4.86\%$
test_unbind_pytree 78.1550μs 23.9607μs 41.7349 KOps/s 41.6272 KOps/s $\color{#35bf28}+0.26\%$
test_unbind_td 0.1568ms 37.4095μs 26.7312 KOps/s 27.2801 KOps/s $\color{#d91a1a}-2.01\%$
test_split_pytree 60.1920μs 23.9622μs 41.7325 KOps/s 42.8293 KOps/s $\color{#d91a1a}-2.56\%$
test_split_td 0.4156ms 39.8237μs 25.1107 KOps/s 25.0632 KOps/s $\color{#35bf28}+0.19\%$
test_add_pytree 78.0650μs 29.3751μs 34.0425 KOps/s 34.2313 KOps/s $\color{#d91a1a}-0.55\%$
test_add_td 0.1115ms 50.0860μs 19.9657 KOps/s 18.1327 KOps/s $\textbf{\color{#35bf28}+10.11\%}$
test_distributed 0.2920ms 99.3095μs 10.0695 KOps/s 9.9592 KOps/s $\color{#35bf28}+1.11\%$
test_tdmodule 71.0120μs 16.5863μs 60.2908 KOps/s 55.0877 KOps/s $\textbf{\color{#35bf28}+9.45\%}$
test_tdmodule_dispatch 70.2210μs 32.7508μs 30.5336 KOps/s 26.7057 KOps/s $\textbf{\color{#35bf28}+14.33\%}$
test_tdseq 37.0490μs 20.0451μs 49.8875 KOps/s 47.9871 KOps/s $\color{#35bf28}+3.96\%$
test_tdseq_dispatch 70.1700μs 39.3884μs 25.3882 KOps/s 24.7303 KOps/s $\color{#35bf28}+2.66\%$
test_instantiation_functorch 2.1128ms 1.3014ms 768.3958 Ops/s 768.5483 Ops/s $\color{#d91a1a}-0.02\%$
test_instantiation_td 1.4704ms 1.0023ms 997.7134 Ops/s 1.0045 KOps/s $\color{#d91a1a}-0.68\%$
test_exec_functorch 0.3371ms 0.1559ms 6.4159 KOps/s 6.4153 KOps/s $+0.01\%$
test_exec_functional_call 0.2868ms 0.1445ms 6.9204 KOps/s 6.8150 KOps/s $\color{#35bf28}+1.55\%$
test_exec_td 0.2258ms 0.1398ms 7.1538 KOps/s 7.0987 KOps/s $\color{#35bf28}+0.78\%$
test_exec_td_decorator 0.5601ms 0.1943ms 5.1475 KOps/s 5.2442 KOps/s $\color{#d91a1a}-1.85\%$
test_vmap_mlp_speed[True-True] 0.7160ms 0.4668ms 2.1422 KOps/s 2.1128 KOps/s $\color{#35bf28}+1.39\%$
test_vmap_mlp_speed[True-False] 0.7047ms 0.4623ms 2.1632 KOps/s 2.1382 KOps/s $\color{#35bf28}+1.17\%$
test_vmap_mlp_speed[False-True] 0.5228ms 0.3748ms 2.6684 KOps/s 2.6409 KOps/s $\color{#35bf28}+1.04\%$
test_vmap_mlp_speed[False-False] 0.7084ms 0.3758ms 2.6612 KOps/s 2.6292 KOps/s $\color{#35bf28}+1.22\%$
test_vmap_mlp_speed_decorator[True-True] 0.9376ms 0.4865ms 2.0554 KOps/s 2.0342 KOps/s $\color{#35bf28}+1.04\%$
test_vmap_mlp_speed_decorator[True-False] 0.8548ms 0.4879ms 2.0498 KOps/s 2.0331 KOps/s $\color{#35bf28}+0.82\%$
test_vmap_mlp_speed_decorator[False-True] 0.5729ms 0.3952ms 2.5304 KOps/s 2.4978 KOps/s $\color{#35bf28}+1.31\%$
test_vmap_mlp_speed_decorator[False-False] 0.5545ms 0.3940ms 2.5382 KOps/s 2.4978 KOps/s $\color{#35bf28}+1.62\%$
test_to_module_speed[True] 1.5075ms 1.4154ms 706.5291 Ops/s 724.8744 Ops/s $\color{#d91a1a}-2.53\%$
test_to_module_speed[False] 2.2433ms 1.4020ms 713.2728 Ops/s 729.8641 Ops/s $\color{#d91a1a}-2.27\%$
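
For anyone reading the table above, the Change column appears to be the relative difference between the Ops column (this PR) and Ops on Repo HEAD. The sketch below reproduces three rows under that reading. The helper names `percent_change` and `classify` are hypothetical, and the 5% cutoff for counting a row as improved or worsened is an assumption inferred from which rows are bold; it is not taken from the benchmark tooling.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent.

    Both values must be in the same unit (e.g. both KOps/s).
    """
    return (ops_pr / ops_head - 1.0) * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption, not a documented value."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) in KOps/s, copied from the table above.
rows = {
    "test_plain_set_nested": (62.1263, 57.4723),
    "test_lock_nested": (2.5729, 2.9592),
    "test_items": (384.4777, 400.4404),
}

for name, (ops_pr, ops_head) in rows.items():
    change = percent_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Rows that mix units, such as test_instantiation_td (Ops/s against KOps/s), need converting to a common unit before the ratio is taken.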

@vmoens vmoens merged commit dc5d451 into main Apr 22, 2024
44 of 48 checks passed
@vmoens vmoens deleted the patch-pad-dim branch April 22, 2024 17:49