[BugFix] Make functorch.dim optional #737

Merged 1 commit on Apr 20, 2024

@vmoens (Contributor) commented on Apr 19, 2024

No description provided.
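
The PR has no description, so the actual change is not shown on this page. As a rough illustration of what making `functorch.dim` optional can look like, here is a minimal sketch of a guarded import. The helpers `optional_import` and `has_functorch_dim` are hypothetical names chosen for this example and are not taken from this PR.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # Covers ModuleNotFoundError too, e.g. when functorch is not installed.
        return None


def has_functorch_dim() -> bool:
    """Hypothetical feature check: True only if functorch.dim is importable."""
    # Evaluated lazily so that importing this file never fails by itself.
    return optional_import("functorch.dim") is not None
```

Code that relies on first-class dimensions could then branch on `has_functorch_dim()` and fall back to a plain-tensor path when it returns `False`.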

@facebook-github-bot added the "CLA Signed" label on Apr 19, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 127. Improved: 6. Worsened: 31.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 32.6310μs | 17.4729μs | 57.2316 KOps/s | 62.3447 KOps/s | **-8.20%** |
| test_plain_set_stack_nested | 46.5870μs | 17.6414μs | 56.6848 KOps/s | 61.9329 KOps/s | **-8.47%** |
| test_plain_set_nested_inplace | 69.4090μs | 20.1565μs | 49.6117 KOps/s | 54.4922 KOps/s | **-8.96%** |
| test_plain_set_stack_nested_inplace | 51.5360μs | 19.9819μs | 50.0453 KOps/s | 54.0047 KOps/s | **-7.33%** |
| test_items | 44.2320μs | 2.5303μs | 395.2030 KOps/s | 390.3154 KOps/s | +1.25% |
| test_items_nested | 0.4528ms | 0.2707ms | 3.6943 KOps/s | 3.7143 KOps/s | -0.54% |
| test_items_nested_locked | 1.4587ms | 0.2803ms | 3.5679 KOps/s | 3.6679 KOps/s | -2.73% |
| test_items_nested_leaf | 0.2382ms | 80.1937μs | 12.4698 KOps/s | 12.9561 KOps/s | -3.75% |
| test_items_stack_nested | 0.4410ms | 0.2737ms | 3.6542 KOps/s | 3.6899 KOps/s | -0.97% |
| test_items_stack_nested_leaf | 0.1353ms | 75.9638μs | 13.1642 KOps/s | 12.5490 KOps/s | +4.90% |
| test_items_stack_nested_locked | 0.4840ms | 0.2741ms | 3.6484 KOps/s | 3.6373 KOps/s | +0.31% |
| test_keys | 40.4960μs | 3.7985μs | 263.2638 KOps/s | 261.2755 KOps/s | +0.76% |
| test_keys_nested | 0.8705ms | 0.1403ms | 7.1286 KOps/s | 7.2918 KOps/s | -2.24% |
| test_keys_nested_locked | 0.8067ms | 0.1416ms | 7.0646 KOps/s | 6.9939 KOps/s | +1.01% |
| test_keys_nested_leaf | 0.2123ms | 0.1160ms | 8.6201 KOps/s | 8.6476 KOps/s | -0.32% |
| test_keys_stack_nested | 0.2221ms | 0.1335ms | 7.4908 KOps/s | 7.3026 KOps/s | +2.58% |
| test_keys_stack_nested_leaf | 0.1855ms | 0.1131ms | 8.8422 KOps/s | 8.6058 KOps/s | +2.75% |
| test_keys_stack_nested_locked | 0.5880ms | 0.1383ms | 7.2291 KOps/s | 7.1005 KOps/s | +1.81% |
| test_values | 7.8120μs | 1.1628μs | 860.0295 KOps/s | 863.2631 KOps/s | -0.37% |
| test_values_nested | 0.1087ms | 51.0354μs | 19.5942 KOps/s | 19.6216 KOps/s | -0.14% |
| test_values_nested_locked | 0.1088ms | 50.7580μs | 19.7013 KOps/s | 19.6233 KOps/s | +0.40% |
| test_values_nested_leaf | 0.1018ms | 45.8984μs | 21.7873 KOps/s | 21.7525 KOps/s | +0.16% |
| test_values_stack_nested | 0.1069ms | 51.6477μs | 19.3620 KOps/s | 19.4903 KOps/s | -0.66% |
| test_values_stack_nested_leaf | 98.7440μs | 44.9473μs | 22.2483 KOps/s | 21.8753 KOps/s | +1.70% |
| test_values_stack_nested_locked | 91.6910μs | 51.6510μs | 19.3607 KOps/s | 19.5091 KOps/s | -0.76% |
| test_membership | 27.7320μs | 1.3342μs | 749.5349 KOps/s | 750.4514 KOps/s | -0.12% |
| test_membership_nested | 31.9700μs | 3.3593μs | 297.6796 KOps/s | 299.0077 KOps/s | -0.44% |
| test_membership_nested_leaf | 47.4080μs | 3.3785μs | 295.9883 KOps/s | 296.6851 KOps/s | -0.23% |
| test_membership_stacked_nested | 19.8370μs | 3.3385μs | 299.5345 KOps/s | 295.9701 KOps/s | +1.20% |
| test_membership_stacked_nested_leaf | 28.0120μs | 3.3689μs | 296.8349 KOps/s | 299.0230 KOps/s | -0.73% |
| test_membership_nested_last | 24.7060μs | 4.1894μs | 238.7003 KOps/s | 238.3609 KOps/s | +0.14% |
| test_membership_nested_leaf_last | 46.5260μs | 4.1455μs | 241.2230 KOps/s | 240.3908 KOps/s | +0.35% |
| test_membership_stacked_nested_last | 51.0350μs | 13.2400μs | 75.5285 KOps/s | 242.2201 KOps/s | **-68.82%** |
| test_membership_stacked_nested_leaf_last | 89.1460μs | 13.3736μs | 74.7743 KOps/s | 241.8073 KOps/s | **-69.08%** |
| test_nested_getleaf | 73.6470μs | 10.6847μs | 93.5921 KOps/s | 96.2710 KOps/s | -2.78% |
| test_nested_get | 32.0690μs | 10.0374μs | 99.6271 KOps/s | 100.7491 KOps/s | -1.11% |
| test_stacked_getleaf | 67.3550μs | 10.5169μs | 95.0851 KOps/s | 96.3927 KOps/s | -1.36% |
| test_stacked_get | 35.5560μs | 9.9315μs | 100.6896 KOps/s | 101.4956 KOps/s | -0.79% |
| test_nested_getitemleaf | 44.3730μs | 11.2524μs | 88.8698 KOps/s | 90.2692 KOps/s | -1.55% |
| test_nested_getitem | 47.1370μs | 10.3847μs | 96.2958 KOps/s | 98.7894 KOps/s | -2.52% |
| test_stacked_getitemleaf | 57.4370μs | 11.2055μs | 89.2418 KOps/s | 91.0385 KOps/s | -1.97% |
| test_stacked_getitem | 39.7140μs | 10.3627μs | 96.4996 KOps/s | 99.0067 KOps/s | -2.53% |
| test_lock_nested | 51.5105ms | 0.3897ms | 2.5658 KOps/s | 2.9051 KOps/s | **-11.68%** |
| test_lock_stack_nested | 0.3870ms | 0.2911ms | 3.4357 KOps/s | 3.2718 KOps/s | **+5.01%** |
| test_unlock_nested | 96.0812ms | 0.4421ms | 2.2619 KOps/s | 2.2431 KOps/s | +0.84% |
| test_unlock_stack_nested | 0.5089ms | 0.3008ms | 3.3240 KOps/s | 3.1617 KOps/s | **+5.13%** |
| test_flatten_speed | 0.4226ms | 92.2586μs | 10.8391 KOps/s | 10.8128 KOps/s | +0.24% |
| test_unflatten_speed | 0.9413ms | 0.4037ms | 2.4774 KOps/s | 2.4733 KOps/s | +0.16% |
| test_common_ops | 4.7463ms | 0.7316ms | 1.3669 KOps/s | 1.5043 KOps/s | **-9.13%** |
| test_creation | 60.8130μs | 1.8502μs | 540.4676 KOps/s | 533.7655 KOps/s | +1.26% |
| test_creation_empty | 39.5430μs | 11.6438μs | 85.8826 KOps/s | 112.1533 KOps/s | **-23.42%** |
| test_creation_nested_1 | 48.1200μs | 14.3510μs | 69.6814 KOps/s | 87.7240 KOps/s | **-20.57%** |
| test_creation_nested_2 | 45.8450μs | 17.6015μs | 56.8132 KOps/s | 67.5687 KOps/s | **-15.92%** |
| test_clone | 98.2330μs | 13.7055μs | 72.9632 KOps/s | 75.7303 KOps/s | -3.65% |
| test_getitem[int] | 0.1684ms | 11.8494μs | 84.3927 KOps/s | 89.0819 KOps/s | **-5.26%** |
| test_getitem[slice_int] | 69.2990μs | 22.7451μs | 43.9656 KOps/s | 44.9182 KOps/s | -2.12% |
| test_getitem[range] | 0.2467ms | 41.1018μs | 24.3298 KOps/s | 25.0553 KOps/s | -2.90% |
| test_getitem[tuple] | 61.6570μs | 18.0969μs | 55.2582 KOps/s | 54.1376 KOps/s | +2.07% |
| test_getitem[list] | 0.2537ms | 37.1176μs | 26.9414 KOps/s | 27.3783 KOps/s | -1.60% |
| test_setitem_dim[int] | 82.4040μs | 36.3106μs | 27.5402 KOps/s | 31.9629 KOps/s | **-13.84%** |
| test_setitem_dim[slice_int] | 0.1076ms | 61.8712μs | 16.1626 KOps/s | 17.7527 KOps/s | **-8.96%** |
| test_setitem_dim[range] | 0.1432ms | 88.0681μs | 11.3549 KOps/s | 13.1834 KOps/s | **-13.87%** |
| test_setitem_dim[tuple] | 0.1060ms | 51.8402μs | 19.2900 KOps/s | 22.0299 KOps/s | **-12.44%** |
| test_setitem | 0.1239ms | 21.6248μs | 46.2433 KOps/s | 53.2083 KOps/s | **-13.09%** |
| test_set | 0.1169ms | 20.5637μs | 48.6295 KOps/s | 53.9025 KOps/s | **-9.78%** |
| test_set_shared | 2.3718ms | 0.1420ms | 7.0404 KOps/s | 7.0196 KOps/s | +0.30% |
| test_update | 0.2273ms | 23.1870μs | 43.1275 KOps/s | 51.6359 KOps/s | **-16.48%** |
| test_update_nested | 0.2895ms | 32.6377μs | 30.6394 KOps/s | 36.2263 KOps/s | **-15.42%** |
| test_update__nested | 81.7720μs | 25.4307μs | 39.3225 KOps/s | 40.3720 KOps/s | -2.60% |
| test_set_nested | 0.1031ms | 23.0810μs | 43.3256 KOps/s | 49.6631 KOps/s | **-12.76%** |
| test_set_nested_new | 2.1882ms | 26.7191μs | 37.4264 KOps/s | 42.0416 KOps/s | **-10.98%** |
| test_select | 0.1319ms | 41.0378μs | 24.3678 KOps/s | 25.4899 KOps/s | -4.40% |
| test_select_nested | 0.1294ms | 59.4082μs | 16.8327 KOps/s | 16.3542 KOps/s | +2.93% |
| test_exclude_nested | 0.3072ms | 0.1189ms | 8.4079 KOps/s | 8.3995 KOps/s | +0.10% |
| test_empty[True] | 1.0504ms | 0.4060ms | 2.4632 KOps/s | 2.5543 KOps/s | -3.56% |
| test_empty[False] | 7.5762μs | 1.0691μs | 935.3651 KOps/s | 927.0303 KOps/s | +0.90% |
| test_unbind_speed | 1.7041ms | 0.2481ms | 4.0300 KOps/s | 3.9897 KOps/s | +1.01% |
| test_unbind_speed_stack0 | 0.5127ms | 0.2371ms | 4.2177 KOps/s | 4.1094 KOps/s | +2.64% |
| test_unbind_speed_stack1 | 0.1333s | 0.6685ms | 1.4960 KOps/s | 1.4134 KOps/s | **+5.84%** |
| test_split | 0.1311s | 1.6997ms | 588.3338 Ops/s | 601.0074 Ops/s | -2.11% |
| test_chunk | 1.7298ms | 1.4857ms | 673.1051 Ops/s | 682.8477 Ops/s | -1.43% |
| test_creation[device0] | 0.2499ms | 0.1018ms | 9.8210 KOps/s | 9.8620 KOps/s | -0.42% |
| test_creation_from_tensor | 4.3133ms | 80.7937μs | 12.3772 KOps/s | 12.2644 KOps/s | +0.92% |
| test_add_one[memmap_tensor0] | 0.2164ms | 5.4410μs | 183.7905 KOps/s | 188.6869 KOps/s | -2.59% |
| test_contiguous[memmap_tensor0] | 8.0860μs | 0.6318μs | 1.5828 MOps/s | 1.6178 MOps/s | -2.16% |
| test_stack[memmap_tensor0] | 21.3600μs | 3.6828μs | 271.5316 KOps/s | 286.9762 KOps/s | **-5.38%** |
| test_memmaptd_index | 0.9071ms | 0.2361ms | 4.2353 KOps/s | 4.2679 KOps/s | -0.76% |
| test_memmaptd_index_astensor | 0.6724ms | 0.2982ms | 3.3529 KOps/s | 3.3610 KOps/s | -0.24% |
| test_memmaptd_index_op | 0.9668ms | 0.6054ms | 1.6518 KOps/s | 1.7953 KOps/s | **-7.99%** |
| test_serialize_model | 0.2281s | 0.1145s | 8.7301 Ops/s | 8.3091 Ops/s | **+5.07%** |
| test_serialize_model_pickle | 0.4514s | 0.3796s | 2.6344 Ops/s | 2.6065 Ops/s | +1.07% |
| test_serialize_weights | 0.1032s | 99.2803ms | 10.0725 Ops/s | 9.7464 Ops/s | +3.35% |
| test_serialize_weights_returnearly | 0.2467s | 0.1344s | 7.4421 Ops/s | 8.0773 Ops/s | **-7.86%** |
| test_serialize_weights_pickle | 0.7550s | 0.4833s | 2.0693 Ops/s | 2.2939 Ops/s | **-9.79%** |
| test_serialize_weights_filesystem | 96.1295ms | 91.3945ms | 10.9416 Ops/s | 9.4225 Ops/s | **+16.12%** |
| test_serialize_model_filesystem | 95.5151ms | 92.3947ms | 10.8231 Ops/s | 10.4820 Ops/s | +3.25% |
| test_reshape_pytree | 72.9480μs | 20.5540μs | 48.6523 KOps/s | 48.4349 KOps/s | +0.45% |
| test_reshape_td | 66.0430μs | 32.1481μs | 31.1060 KOps/s | 32.1555 KOps/s | -3.26% |
| test_view_pytree | 48.6400μs | 20.6136μs | 48.5117 KOps/s | 48.3263 KOps/s | +0.38% |
| test_view_td | 0.1386s | 62.3404μs | 16.0410 KOps/s | 15.8833 KOps/s | +0.99% |
| test_unbind_pytree | 58.7800μs | 24.2070μs | 41.3104 KOps/s | 41.0211 KOps/s | +0.71% |
| test_unbind_td | 0.1374ms | 36.2309μs | 27.6007 KOps/s | 27.5613 KOps/s | +0.14% |
| test_split_pytree | 52.8380μs | 23.7198μs | 42.1589 KOps/s | 42.8822 KOps/s | -1.69% |
| test_split_td | 89.6060μs | 39.9188μs | 25.0508 KOps/s | 25.1313 KOps/s | -0.32% |
| test_add_pytree | 84.1260μs | 29.6527μs | 33.7238 KOps/s | 34.5052 KOps/s | -2.26% |
| test_add_td | 0.1807ms | 55.7588μs | 17.9344 KOps/s | 19.8686 KOps/s | **-9.74%** |
| test_distributed | 0.2249ms | 99.3268μs | 10.0678 KOps/s | 9.9486 KOps/s | +1.20% |
| test_tdmodule | 32.2100μs | 18.2211μs | 54.8814 KOps/s | 61.6255 KOps/s | **-10.94%** |
| test_tdmodule_dispatch | 95.8980μs | 36.1262μs | 27.6807 KOps/s | 31.3457 KOps/s | **-11.69%** |
| test_tdseq | 45.0740μs | 21.6426μs | 46.2052 KOps/s | 53.5012 KOps/s | **-13.64%** |
| test_tdseq_dispatch | 71.4530μs | 41.6042μs | 24.0360 KOps/s | 27.1567 KOps/s | **-11.49%** |
| test_instantiation_functorch | 1.4936ms | 1.3108ms | 762.9044 Ops/s | 771.5154 Ops/s | -1.12% |
| test_instantiation_td | 1.9126ms | 1.0095ms | 990.5698 Ops/s | 853.3788 Ops/s | **+16.08%** |
| test_exec_functorch | 0.3084ms | 0.1581ms | 6.3237 KOps/s | 6.3478 KOps/s | -0.38% |
| test_exec_functional_call | 0.3382ms | 0.1464ms | 6.8283 KOps/s | 6.6155 KOps/s | +3.22% |
| test_exec_td | 0.2985ms | 0.1435ms | 6.9708 KOps/s | 6.9235 KOps/s | +0.68% |
| test_exec_td_decorator | 0.4780ms | 0.1950ms | 5.1285 KOps/s | 4.9293 KOps/s | +4.04% |
| test_vmap_mlp_speed[True-True] | 0.6832ms | 0.4741ms | 2.1093 KOps/s | 2.0567 KOps/s | +2.55% |
| test_vmap_mlp_speed[True-False] | 0.6591ms | 0.4708ms | 2.1240 KOps/s | 2.1341 KOps/s | -0.47% |
| test_vmap_mlp_speed[False-True] | 0.6406ms | 0.3847ms | 2.5996 KOps/s | 2.5729 KOps/s | +1.04% |
| test_vmap_mlp_speed[False-False] | 0.5386ms | 0.3812ms | 2.6230 KOps/s | 2.5916 KOps/s | +1.21% |
| test_vmap_mlp_speed_decorator[True-True] | 1.0384ms | 0.4907ms | 2.0380 KOps/s | 2.0356 KOps/s | +0.12% |
| test_vmap_mlp_speed_decorator[True-False] | 0.6764ms | 0.4875ms | 2.0512 KOps/s | 1.9972 KOps/s | +2.70% |
| test_vmap_mlp_speed_decorator[False-True] | 0.7724ms | 0.3997ms | 2.5021 KOps/s | 2.4530 KOps/s | +2.00% |
| test_vmap_mlp_speed_decorator[False-False] | 0.7905ms | 0.4123ms | 2.4256 KOps/s | 2.4704 KOps/s | -1.81% |
| test_to_module_speed[True] | 2.3018ms | 1.4096ms | 709.3990 Ops/s | 715.7041 Ops/s | -0.88% |
| test_to_module_speed[False] | 1.6193ms | 1.3985ms | 715.0325 Ops/s | 716.6258 Ops/s | -0.22% |
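
The Change column appears to be the relative difference between the Ops measured on this PR and the Ops on Repo HEAD. The sketch below checks that reading against three rows copied from the results; `percent_change` is a name chosen for this example, and the formula is an assumption inferred from the numbers, not taken from the benchmark tooling.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of PR throughput versus the HEAD baseline, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) in KOps/s, copied from the benchmark results.
rows = {
    "test_plain_set_nested": (57.2316, 62.3447),
    "test_membership_stacked_nested_last": (75.5285, 242.2201),
    "test_items": (395.2030, 390.3154),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {percent_change(ops_pr, ops_head):+.2f}%")
```

Both values in a row must use the same unit (Ops/s, KOps/s, or MOps/s) for the ratio to be meaningful.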

@vmoens merged commit d09626f into main on Apr 20, 2024 (42 of 48 checks passed).
@vmoens deleted the optional-funcdim branch on April 20, 2024, 11:04.