[BugFix] Fix torch_function for uninit param #683

Merged
vmoens merged 1 commit into main from fix-torch-func-uninittensor
Feb 20, 2024


@vmoens (Contributor) commented Feb 20, 2024

No description provided.

@facebook-github-bot added the "CLA Signed" label Feb 20, 2024
@vmoens added the "bug" label Feb 20, 2024
@vmoens merged commit 8ea4e87 into main Feb 20, 2024
15 of 26 checks passed
@vmoens deleted the fix-torch-func-uninittensor branch February 20, 2024 20:46

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 126. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}6$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_plain_set_nested 49.7440μs 16.5982μs 60.2475 KOps/s 60.0585 KOps/s $\color{#35bf28}+0.31\%$
test_plain_set_stack_nested 60.2130μs 16.8851μs 59.2239 KOps/s 58.8517 KOps/s $\color{#35bf28}+0.63\%$
test_plain_set_nested_inplace 48.6620μs 19.1221μs 52.2955 KOps/s 51.9640 KOps/s $\color{#35bf28}+0.64\%$
test_plain_set_stack_nested_inplace 69.5310μs 19.1584μs 52.1965 KOps/s 52.0703 KOps/s $\color{#35bf28}+0.24\%$
test_items 29.3250μs 2.6554μs 376.5907 KOps/s 374.4965 KOps/s $\color{#35bf28}+0.56\%$
test_items_nested 0.3772ms 0.2684ms 3.7252 KOps/s 3.6548 KOps/s $\color{#35bf28}+1.93\%$
test_items_nested_locked 1.1248ms 0.2804ms 3.5666 KOps/s 3.6677 KOps/s $\color{#d91a1a}-2.76\%$
test_items_nested_leaf 0.2885ms 0.1657ms 6.0349 KOps/s 5.9286 KOps/s $\color{#35bf28}+1.79\%$
test_items_stack_nested 0.3409ms 0.2693ms 3.7137 KOps/s 3.6446 KOps/s $\color{#35bf28}+1.90\%$
test_items_stack_nested_leaf 2.5874ms 0.1666ms 6.0010 KOps/s 5.9574 KOps/s $\color{#35bf28}+0.73\%$
test_items_stack_nested_locked 0.3964ms 0.2699ms 3.7050 KOps/s 3.5769 KOps/s $\color{#35bf28}+3.58\%$
test_keys 38.1520μs 4.1945μs 238.4084 KOps/s 261.3547 KOps/s $\textbf{\color{#d91a1a}-8.78\%}$
test_keys_nested 1.8161ms 0.1519ms 6.5844 KOps/s 6.6511 KOps/s $\color{#d91a1a}-1.00\%$
test_keys_nested_locked 3.7730ms 0.1555ms 6.4329 KOps/s 6.4358 KOps/s $\color{#d91a1a}-0.04\%$
test_keys_nested_leaf 37.6799ms 0.1381ms 7.2432 KOps/s 7.5517 KOps/s $\color{#d91a1a}-4.09\%$
test_keys_stack_nested 3.9111ms 0.1535ms 6.5146 KOps/s 6.5998 KOps/s $\color{#d91a1a}-1.29\%$
test_keys_stack_nested_leaf 0.2405ms 0.1326ms 7.5390 KOps/s 7.4981 KOps/s $\color{#35bf28}+0.55\%$
test_keys_stack_nested_locked 0.2971ms 0.1569ms 6.3732 KOps/s 6.3429 KOps/s $\color{#35bf28}+0.48\%$
test_values 10.2212μs 1.1507μs 869.0528 KOps/s 692.1186 KOps/s $\textbf{\color{#35bf28}+25.56\%}$
test_values_nested 0.1116ms 52.3546μs 19.1005 KOps/s 19.2394 KOps/s $\color{#d91a1a}-0.72\%$
test_values_nested_locked 0.1069ms 51.7765μs 19.3138 KOps/s 19.4008 KOps/s $\color{#d91a1a}-0.45\%$
test_values_nested_leaf 0.1084ms 47.2113μs 21.1814 KOps/s 21.5945 KOps/s $\color{#d91a1a}-1.91\%$
test_values_stack_nested 0.1065ms 53.1808μs 18.8038 KOps/s 18.9412 KOps/s $\color{#d91a1a}-0.73\%$
test_values_stack_nested_leaf 95.2490μs 47.1614μs 21.2038 KOps/s 21.4403 KOps/s $\color{#d91a1a}-1.10\%$
test_values_stack_nested_locked 92.7940μs 52.7115μs 18.9712 KOps/s 18.9717 KOps/s $-0.00\%$
test_membership 11.6820μs 1.3280μs 753.0305 KOps/s 750.4729 KOps/s $\color{#35bf28}+0.34\%$
test_membership_nested 22.4030μs 3.4310μs 291.4634 KOps/s 291.3704 KOps/s $\color{#35bf28}+0.03\%$
test_membership_nested_leaf 36.1270μs 3.4292μs 291.6163 KOps/s 276.6300 KOps/s $\textbf{\color{#35bf28}+5.42\%}$
test_membership_stacked_nested 30.4770μs 3.4023μs 293.9190 KOps/s 292.5149 KOps/s $\color{#35bf28}+0.48\%$
test_membership_stacked_nested_leaf 37.8120μs 3.4019μs 293.9493 KOps/s 290.1486 KOps/s $\color{#35bf28}+1.31\%$
test_membership_nested_last 26.8100μs 6.6002μs 151.5108 KOps/s 151.8928 KOps/s $\color{#d91a1a}-0.25\%$
test_membership_nested_leaf_last 25.8880μs 6.5879μs 151.7944 KOps/s 150.6048 KOps/s $\color{#35bf28}+0.79\%$
test_membership_stacked_nested_last 33.7830μs 6.5485μs 152.7070 KOps/s 152.2474 KOps/s $\color{#35bf28}+0.30\%$
test_membership_stacked_nested_leaf_last 55.4040μs 6.5097μs 153.6170 KOps/s 150.9384 KOps/s $\color{#35bf28}+1.77\%$
test_nested_getleaf 47.6100μs 10.6769μs 93.6602 KOps/s 93.6071 KOps/s $\color{#35bf28}+0.06\%$
test_nested_get 32.4610μs 10.0371μs 99.6305 KOps/s 97.6147 KOps/s $\color{#35bf28}+2.07\%$
test_stacked_getleaf 31.3290μs 10.5751μs 94.5622 KOps/s 94.2430 KOps/s $\color{#35bf28}+0.34\%$
test_stacked_get 44.2340μs 10.0546μs 99.4566 KOps/s 99.8315 KOps/s $\color{#d91a1a}-0.38\%$
test_nested_getitemleaf 39.4850μs 12.1133μs 82.5538 KOps/s 82.6621 KOps/s $\color{#d91a1a}-0.13\%$
test_nested_getitem 49.7440μs 11.5894μs 86.2857 KOps/s 87.0348 KOps/s $\color{#d91a1a}-0.86\%$
test_stacked_getitemleaf 55.4140μs 12.0199μs 83.1955 KOps/s 82.5202 KOps/s $\color{#35bf28}+0.82\%$
test_stacked_getitem 41.7580μs 11.4167μs 87.5913 KOps/s 86.8588 KOps/s $\color{#35bf28}+0.84\%$
test_lock_nested 0.7321ms 0.3359ms 2.9772 KOps/s 2.9532 KOps/s $\color{#35bf28}+0.81\%$
test_lock_stack_nested 0.4167ms 0.2981ms 3.3546 KOps/s 3.3575 KOps/s $\color{#d91a1a}-0.09\%$
test_unlock_nested 80.8727ms 0.4184ms 2.3898 KOps/s 2.3880 KOps/s $\color{#35bf28}+0.08\%$
test_unlock_stack_nested 0.5252ms 0.3075ms 3.2524 KOps/s 3.2681 KOps/s $\color{#d91a1a}-0.48\%$
test_flatten_speed 0.6826ms 0.3749ms 2.6672 KOps/s 2.7549 KOps/s $\color{#d91a1a}-3.18\%$
test_unflatten_speed 0.7725ms 0.4579ms 2.1837 KOps/s 2.1710 KOps/s $\color{#35bf28}+0.59\%$
test_common_ops 1.2478ms 0.6757ms 1.4800 KOps/s 1.4892 KOps/s $\color{#d91a1a}-0.62\%$
test_creation 42.3390μs 1.8395μs 543.6405 KOps/s 551.3450 KOps/s $\color{#d91a1a}-1.40\%$
test_creation_empty 30.0570μs 9.2671μs 107.9084 KOps/s 102.7554 KOps/s $\textbf{\color{#35bf28}+5.01\%}$
test_creation_nested_1 41.4890μs 11.8095μs 84.6776 KOps/s 81.3579 KOps/s $\color{#35bf28}+4.08\%$
test_creation_nested_2 65.2630μs 15.2574μs 65.5418 KOps/s 64.6031 KOps/s $\color{#35bf28}+1.45\%$
test_clone 70.7330μs 12.8229μs 77.9856 KOps/s 74.9400 KOps/s $\color{#35bf28}+4.06\%$
test_getitem[int] 32.8220μs 10.9029μs 91.7191 KOps/s 92.3567 KOps/s $\color{#d91a1a}-0.69\%$
test_getitem[slice_int] 61.4160μs 21.4242μs 46.6762 KOps/s 44.2365 KOps/s $\textbf{\color{#35bf28}+5.52\%}$
test_getitem[range] 0.3049ms 42.3414μs 23.6176 KOps/s 25.1654 KOps/s $\textbf{\color{#d91a1a}-6.15\%}$
test_getitem[tuple] 78.5570μs 17.6891μs 56.5319 KOps/s 56.5350 KOps/s $-0.01\%$
test_getitem[list] 0.1738ms 37.5091μs 26.6602 KOps/s 28.0701 KOps/s $\textbf{\color{#d91a1a}-5.02\%}$
test_setitem_dim[int] 52.5290μs 28.9925μs 34.4917 KOps/s 34.9697 KOps/s $\color{#d91a1a}-1.37\%$
test_setitem_dim[slice_int] 0.1200ms 54.5937μs 18.3171 KOps/s 18.8425 KOps/s $\color{#d91a1a}-2.79\%$
test_setitem_dim[range] 0.1141ms 74.9702μs 13.3386 KOps/s 13.5901 KOps/s $\color{#d91a1a}-1.85\%$
test_setitem_dim[tuple] 0.1081ms 44.7220μs 22.3604 KOps/s 23.5819 KOps/s $\textbf{\color{#d91a1a}-5.18\%}$
test_setitem 85.7410μs 18.6119μs 53.7291 KOps/s 53.2809 KOps/s $\color{#35bf28}+0.84\%$
test_set 95.1290μs 18.1687μs 55.0398 KOps/s 53.2680 KOps/s $\color{#35bf28}+3.33\%$
test_set_shared 4.3913ms 0.1419ms 7.0458 KOps/s 7.3467 KOps/s $\color{#d91a1a}-4.10\%$
test_update 96.5310μs 21.3419μs 46.8563 KOps/s 47.8174 KOps/s $\color{#d91a1a}-2.01\%$
test_update_nested 95.1390μs 28.6453μs 34.9098 KOps/s 34.3646 KOps/s $\color{#35bf28}+1.59\%$
test_set_nested 88.8570μs 20.2096μs 49.4813 KOps/s 48.8857 KOps/s $\color{#35bf28}+1.22\%$
test_set_nested_new 79.1990μs 23.6522μs 42.2793 KOps/s 41.2110 KOps/s $\color{#35bf28}+2.59\%$
test_select 0.1633ms 37.4902μs 26.6736 KOps/s 26.8530 KOps/s $\color{#d91a1a}-0.67\%$
test_select_nested 0.1260ms 57.3125μs 17.4482 KOps/s 17.2410 KOps/s $\color{#35bf28}+1.20\%$
test_exclude_nested 0.2189ms 0.1148ms 8.7083 KOps/s 8.6831 KOps/s $\color{#35bf28}+0.29\%$
test_empty[True] 0.6285ms 0.3940ms 2.5379 KOps/s 2.4645 KOps/s $\color{#35bf28}+2.98\%$
test_empty[False] 5.0756μs 1.0522μs 950.4112 KOps/s 976.1547 KOps/s $\color{#d91a1a}-2.64\%$
test_unbind_speed 0.3114ms 0.2460ms 4.0655 KOps/s 4.1309 KOps/s $\color{#d91a1a}-1.58\%$
test_unbind_speed_stack0 0.4022ms 0.2391ms 4.1818 KOps/s 4.2556 KOps/s $\color{#d91a1a}-1.73\%$
test_unbind_speed_stack1 0.1235s 0.6735ms 1.4847 KOps/s 1.5086 KOps/s $\color{#d91a1a}-1.58\%$
test_split 0.1317s 1.6332ms 612.2842 Ops/s 612.1778 Ops/s $\color{#35bf28}+0.02\%$
test_chunk 2.2576ms 1.4332ms 697.7175 Ops/s 689.5931 Ops/s $\color{#35bf28}+1.18\%$
test_creation[device0] 0.1875ms 0.1019ms 9.8170 KOps/s 9.8641 KOps/s $\color{#d91a1a}-0.48\%$
test_creation_from_tensor 5.7835ms 81.4053μs 12.2842 KOps/s 11.2950 KOps/s $\textbf{\color{#35bf28}+8.76\%}$
test_add_one[memmap_tensor0] 0.1605ms 5.4119μs 184.7777 KOps/s 187.6203 KOps/s $\color{#d91a1a}-1.52\%$
test_contiguous[memmap_tensor0] 22.5030μs 0.6355μs 1.5735 MOps/s 1.5855 MOps/s $\color{#d91a1a}-0.76\%$
test_stack[memmap_tensor0] 23.8950μs 3.5248μs 283.7035 KOps/s 278.1929 KOps/s $\color{#35bf28}+1.98\%$
test_memmaptd_index 0.9742ms 0.2353ms 4.2495 KOps/s 4.1422 KOps/s $\color{#35bf28}+2.59\%$
test_memmaptd_index_astensor 0.6762ms 0.2969ms 3.3683 KOps/s 3.2757 KOps/s $\color{#35bf28}+2.83\%$
test_memmaptd_index_op 0.9485ms 0.5805ms 1.7227 KOps/s 1.7376 KOps/s $\color{#d91a1a}-0.86\%$
test_serialize_model 0.2183s 0.1128s 8.8658 Ops/s 8.2843 Ops/s $\textbf{\color{#35bf28}+7.02\%}$
test_serialize_model_pickle 0.4515s 0.3789s 2.6391 Ops/s 2.5694 Ops/s $\color{#35bf28}+2.71\%$
test_serialize_weights 99.8589ms 96.5017ms 10.3625 Ops/s 9.8495 Ops/s $\textbf{\color{#35bf28}+5.21\%}$
test_serialize_weights_returnearly 0.2494s 0.1360s 7.3503 Ops/s 6.8806 Ops/s $\textbf{\color{#35bf28}+6.83\%}$
test_serialize_weights_pickle 0.7073s 0.4934s 2.0266 Ops/s 2.4209 Ops/s $\textbf{\color{#d91a1a}-16.29\%}$
test_serialize_weights_filesystem 98.4019ms 90.7704ms 11.0168 Ops/s 10.4515 Ops/s $\textbf{\color{#35bf28}+5.41\%}$
test_serialize_model_filesystem 98.8084ms 93.2255ms 10.7267 Ops/s 10.6769 Ops/s $\color{#35bf28}+0.47\%$
test_reshape_pytree 55.9660μs 21.0440μs 47.5195 KOps/s 47.5677 KOps/s $\color{#d91a1a}-0.10\%$
test_reshape_td 66.3550μs 30.5179μs 32.7676 KOps/s 33.2401 KOps/s $\color{#d91a1a}-1.42\%$
test_view_pytree 56.2660μs 20.8538μs 47.9528 KOps/s 48.3307 KOps/s $\color{#d91a1a}-0.78\%$
test_view_td 0.1326s 61.0426μs 16.3820 KOps/s 16.2902 KOps/s $\color{#35bf28}+0.56\%$
test_unbind_pytree 53.5910μs 24.4245μs 40.9425 KOps/s 40.0752 KOps/s $\color{#35bf28}+2.16\%$
test_unbind_td 0.5974ms 36.1637μs 27.6520 KOps/s 28.5121 KOps/s $\color{#d91a1a}-3.02\%$
test_split_pytree 60.2230μs 24.1861μs 41.3460 KOps/s 42.1210 KOps/s $\color{#d91a1a}-1.84\%$
test_split_td 0.1186ms 38.1944μs 26.1818 KOps/s 25.9884 KOps/s $\color{#35bf28}+0.74\%$
test_add_pytree 68.0880μs 29.8039μs 33.5526 KOps/s 34.1350 KOps/s $\color{#d91a1a}-1.71\%$
test_add_td 94.9590μs 50.0468μs 19.9813 KOps/s 19.6495 KOps/s $\color{#35bf28}+1.69\%$
test_distributed 0.1859ms 99.0475μs 10.0962 KOps/s 9.8608 KOps/s $\color{#35bf28}+2.39\%$
test_tdmodule 0.1112ms 21.7837μs 45.9059 KOps/s 44.7924 KOps/s $\color{#35bf28}+2.49\%$
test_tdmodule_dispatch 0.1798ms 41.4013μs 24.1538 KOps/s 23.4840 KOps/s $\color{#35bf28}+2.85\%$
test_tdseq 0.3381ms 24.8320μs 40.2706 KOps/s 37.8819 KOps/s $\textbf{\color{#35bf28}+6.31\%}$
test_tdseq_dispatch 0.4018ms 46.0790μs 21.7018 KOps/s 21.3337 KOps/s $\color{#35bf28}+1.73\%$
test_instantiation_functorch 1.5508ms 1.2954ms 771.9786 Ops/s 763.0296 Ops/s $\color{#35bf28}+1.17\%$
test_instantiation_td 1.7136ms 1.0023ms 997.6664 Ops/s 997.8892 Ops/s $\color{#d91a1a}-0.02\%$
test_exec_functorch 0.2923ms 0.1594ms 6.2751 KOps/s 6.3407 KOps/s $\color{#d91a1a}-1.03\%$
test_exec_functional_call 0.3813ms 0.1490ms 6.7109 KOps/s 6.6639 KOps/s $\color{#35bf28}+0.71\%$
test_exec_td 0.2945ms 0.1485ms 6.7345 KOps/s 6.7557 KOps/s $\color{#d91a1a}-0.31\%$
test_exec_td_decorator 0.4066ms 0.1945ms 5.1421 KOps/s 5.1506 KOps/s $\color{#d91a1a}-0.17\%$
test_vmap_mlp_speed[True-True] 0.6326ms 0.4800ms 2.0834 KOps/s 2.1183 KOps/s $\color{#d91a1a}-1.65\%$
test_vmap_mlp_speed[True-False] 0.8968ms 0.4818ms 2.0753 KOps/s 2.1378 KOps/s $\color{#d91a1a}-2.92\%$
test_vmap_mlp_speed[False-True] 0.6221ms 0.3940ms 2.5383 KOps/s 2.5816 KOps/s $\color{#d91a1a}-1.68\%$
test_vmap_mlp_speed[False-False] 0.6602ms 0.3967ms 2.5208 KOps/s 2.5820 KOps/s $\color{#d91a1a}-2.37\%$
test_vmap_mlp_speed_decorator[True-True] 1.2107ms 0.5276ms 1.8954 KOps/s 1.9465 KOps/s $\color{#d91a1a}-2.63\%$
test_vmap_mlp_speed_decorator[True-False] 0.8157ms 0.5279ms 1.8944 KOps/s 1.9378 KOps/s $\color{#d91a1a}-2.24\%$
test_vmap_mlp_speed_decorator[False-True] 0.8365ms 0.4298ms 2.3265 KOps/s 2.4884 KOps/s $\textbf{\color{#d91a1a}-6.51\%}$
test_vmap_mlp_speed_decorator[False-False] 0.6848ms 0.4124ms 2.4248 KOps/s 2.4839 KOps/s $\color{#d91a1a}-2.38\%$
test_to_module_speed[True] 2.1123ms 1.3828ms 723.1760 Ops/s 718.8324 Ops/s $\color{#35bf28}+0.60\%$
test_to_module_speed[False] 2.4169ms 1.3623ms 734.0542 Ops/s 728.5378 Ops/s $\color{#35bf28}+0.76\%$


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 134. Improved: $\large\color{#35bf28}20$. Worsened: $\large\color{#d91a1a}3$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_plain_set_nested 0.6923ms 13.6810μs 73.0941 KOps/s 70.9719 KOps/s $\color{#35bf28}+2.99\%$
test_plain_set_stack_nested 30.0100μs 13.6849μs 73.0735 KOps/s 71.3244 KOps/s $\color{#35bf28}+2.45\%$
test_plain_set_nested_inplace 42.3010μs 14.9776μs 66.7662 KOps/s 65.1353 KOps/s $\color{#35bf28}+2.50\%$
test_plain_set_stack_nested_inplace 37.3510μs 15.1889μs 65.8374 KOps/s 65.3098 KOps/s $\color{#35bf28}+0.81\%$
test_items 36.3000μs 4.7034μs 212.6131 KOps/s 212.2160 KOps/s $\color{#35bf28}+0.19\%$
test_items_nested 0.4196ms 0.3426ms 2.9190 KOps/s 2.9323 KOps/s $\color{#d91a1a}-0.45\%$
test_items_nested_locked 0.4651ms 0.3485ms 2.8692 KOps/s 2.8909 KOps/s $\color{#d91a1a}-0.75\%$
test_items_nested_leaf 0.2624ms 0.2038ms 4.9074 KOps/s 4.9127 KOps/s $\color{#d91a1a}-0.11\%$
test_items_stack_nested 0.4202ms 0.3439ms 2.9078 KOps/s 2.8724 KOps/s $\color{#35bf28}+1.23\%$
test_items_stack_nested_leaf 0.2576ms 0.2030ms 4.9261 KOps/s 4.9212 KOps/s $\color{#35bf28}+0.10\%$
test_items_stack_nested_locked 0.4033ms 0.3479ms 2.8747 KOps/s 2.8654 KOps/s $\color{#35bf28}+0.32\%$
test_keys 21.8300μs 4.5910μs 217.8194 KOps/s 219.5591 KOps/s $\color{#d91a1a}-0.79\%$
test_keys_nested 43.7019ms 0.1018ms 9.8191 KOps/s 10.5281 KOps/s $\textbf{\color{#d91a1a}-6.73\%}$
test_keys_nested_locked 0.1425ms 99.0066μs 10.1003 KOps/s 10.1216 KOps/s $\color{#d91a1a}-0.21\%$
test_keys_nested_leaf 0.1126ms 78.1840μs 12.7903 KOps/s 12.7419 KOps/s $\color{#35bf28}+0.38\%$
test_keys_stack_nested 0.1217ms 94.9337μs 10.5337 KOps/s 10.5367 KOps/s $\color{#d91a1a}-0.03\%$
test_keys_stack_nested_leaf 0.1034ms 78.2819μs 12.7743 KOps/s 12.6060 KOps/s $\color{#35bf28}+1.34\%$
test_keys_stack_nested_locked 0.1306ms 99.4341μs 10.0569 KOps/s 10.1256 KOps/s $\color{#d91a1a}-0.68\%$
test_values 6.7770μs 1.8935μs 528.1150 KOps/s 525.8280 KOps/s $\color{#35bf28}+0.43\%$
test_values_nested 62.8210μs 45.7524μs 21.8568 KOps/s 22.2018 KOps/s $\color{#d91a1a}-1.55\%$
test_values_nested_locked 68.1810μs 47.9627μs 20.8495 KOps/s 21.0245 KOps/s $\color{#d91a1a}-0.83\%$
test_values_nested_leaf 64.4710μs 40.4794μs 24.7039 KOps/s 25.2865 KOps/s $\color{#d91a1a}-2.30\%$
test_values_stack_nested 77.5510μs 46.7981μs 21.3684 KOps/s 21.6025 KOps/s $\color{#d91a1a}-1.08\%$
test_values_stack_nested_leaf 70.4010μs 39.8250μs 25.1098 KOps/s 25.0498 KOps/s $\color{#35bf28}+0.24\%$
test_values_stack_nested_locked 74.4510μs 48.7459μs 20.5146 KOps/s 20.6504 KOps/s $\color{#d91a1a}-0.66\%$
test_membership 5.7002μs 0.9412μs 1.0624 MOps/s 1.0499 MOps/s $\color{#35bf28}+1.19\%$
test_membership_nested 19.5310μs 2.8703μs 348.3965 KOps/s 350.4382 KOps/s $\color{#d91a1a}-0.58\%$
test_membership_nested_leaf 19.6100μs 2.8916μs 345.8315 KOps/s 350.6785 KOps/s $\color{#d91a1a}-1.38\%$
test_membership_stacked_nested 25.0110μs 2.8640μs 349.1639 KOps/s 347.3823 KOps/s $\color{#35bf28}+0.51\%$
test_membership_stacked_nested_leaf 20.9400μs 2.8817μs 347.0168 KOps/s 351.0904 KOps/s $\color{#d91a1a}-1.16\%$
test_membership_nested_last 37.9510μs 5.2366μs 190.9640 KOps/s 188.4665 KOps/s $\color{#35bf28}+1.33\%$
test_membership_nested_leaf_last 33.0110μs 5.2065μs 192.0687 KOps/s 188.8729 KOps/s $\color{#35bf28}+1.69\%$
test_membership_stacked_nested_last 41.3410μs 12.7678μs 78.3223 KOps/s 188.0490 KOps/s $\textbf{\color{#d91a1a}-58.35\%}$
test_membership_stacked_nested_leaf_last 47.5110μs 12.7673μs 78.3253 KOps/s 190.2792 KOps/s $\textbf{\color{#d91a1a}-58.84\%}$
test_nested_getleaf 42.6510μs 8.5794μs 116.5584 KOps/s 117.8900 KOps/s $\color{#d91a1a}-1.13\%$
test_nested_get 23.1900μs 8.1068μs 123.3531 KOps/s 125.0768 KOps/s $\color{#d91a1a}-1.38\%$
test_stacked_getleaf 47.9310μs 8.5987μs 116.2967 KOps/s 118.6292 KOps/s $\color{#d91a1a}-1.97\%$
test_stacked_get 29.0810μs 8.0996μs 123.4625 KOps/s 124.8717 KOps/s $\color{#d91a1a}-1.13\%$
test_nested_getitemleaf 42.5700μs 9.9534μs 100.4681 KOps/s 101.7665 KOps/s $\color{#d91a1a}-1.28\%$
test_nested_getitem 36.4910μs 9.5101μs 105.1517 KOps/s 106.2492 KOps/s $\color{#d91a1a}-1.03\%$
test_stacked_getitemleaf 29.9610μs 9.8929μs 101.0822 KOps/s 100.7837 KOps/s $\color{#35bf28}+0.30\%$
test_stacked_getitem 40.3210μs 9.5249μs 104.9883 KOps/s 106.9228 KOps/s $\color{#d91a1a}-1.81\%$
test_lock_nested 2.0163ms 0.3533ms 2.8303 KOps/s 2.6986 KOps/s $\color{#35bf28}+4.88\%$
test_lock_stack_nested 0.3491ms 0.3019ms 3.3120 KOps/s 3.1649 KOps/s $\color{#35bf28}+4.65\%$
test_unlock_nested 0.7374ms 0.3513ms 2.8464 KOps/s 2.7678 KOps/s $\color{#35bf28}+2.84\%$
test_unlock_stack_nested 0.3652ms 0.3113ms 3.2119 KOps/s 3.0726 KOps/s $\color{#35bf28}+4.53\%$
test_flatten_speed 0.4855ms 0.2621ms 3.8160 KOps/s 3.8730 KOps/s $\color{#d91a1a}-1.47\%$
test_unflatten_speed 0.4282ms 0.3635ms 2.7512 KOps/s 2.7244 KOps/s $\color{#35bf28}+0.98\%$
test_common_ops 1.0695ms 0.6061ms 1.6498 KOps/s 1.5336 KOps/s $\textbf{\color{#35bf28}+7.58\%}$
test_creation 15.7500μs 1.5342μs 651.8124 KOps/s 641.3185 KOps/s $\color{#35bf28}+1.64\%$
test_creation_empty 41.2900μs 8.1257μs 123.0657 KOps/s 112.6580 KOps/s $\textbf{\color{#35bf28}+9.24\%}$
test_creation_nested_1 28.5310μs 9.7887μs 102.1584 KOps/s 93.9955 KOps/s $\textbf{\color{#35bf28}+8.68\%}$
test_creation_nested_2 35.4900μs 12.2342μs 81.7381 KOps/s 75.8667 KOps/s $\textbf{\color{#35bf28}+7.74\%}$
test_clone 40.4700μs 13.5482μs 73.8103 KOps/s 66.9134 KOps/s $\textbf{\color{#35bf28}+10.31\%}$
test_getitem[int] 26.8310μs 11.1422μs 89.7487 KOps/s 91.7757 KOps/s $\color{#d91a1a}-2.21\%$
test_getitem[slice_int] 1.7660ms 21.1516μs 47.2778 KOps/s 43.7253 KOps/s $\textbf{\color{#35bf28}+8.12\%}$
test_getitem[range] 69.0810μs 50.3308μs 19.8685 KOps/s 19.5860 KOps/s $\color{#35bf28}+1.44\%$
test_getitem[tuple] 49.8600μs 18.8631μs 53.0136 KOps/s 52.7486 KOps/s $\color{#35bf28}+0.50\%$
test_getitem[list] 0.1554ms 40.6268μs 24.6143 KOps/s 25.1078 KOps/s $\color{#d91a1a}-1.97\%$
test_setitem_dim[int] 55.7610μs 28.9778μs 34.5091 KOps/s 32.7893 KOps/s $\textbf{\color{#35bf28}+5.25\%}$
test_setitem_dim[slice_int] 84.0610μs 51.5093μs 19.4140 KOps/s 19.2608 KOps/s $\color{#35bf28}+0.80\%$
test_setitem_dim[range] 0.1101ms 72.9320μs 13.7114 KOps/s 13.7927 KOps/s $\color{#d91a1a}-0.59\%$
test_setitem_dim[tuple] 73.8110μs 44.2423μs 22.6028 KOps/s 22.4667 KOps/s $\color{#35bf28}+0.61\%$
test_setitem 51.8510μs 20.3912μs 49.0409 KOps/s 49.4029 KOps/s $\color{#d91a1a}-0.73\%$
test_set 53.9000μs 18.4843μs 54.1000 KOps/s 48.8741 KOps/s $\textbf{\color{#35bf28}+10.69\%}$
test_set_shared 0.1263s 0.1290ms 7.7543 KOps/s 7.4864 KOps/s $\color{#35bf28}+3.58\%$
test_update 66.0610μs 20.4451μs 48.9114 KOps/s 44.1505 KOps/s $\textbf{\color{#35bf28}+10.78\%}$
test_update_nested 59.6910μs 27.2225μs 36.7343 KOps/s 34.1155 KOps/s $\textbf{\color{#35bf28}+7.68\%}$
test_set_nested 54.8210μs 19.5285μs 51.2072 KOps/s 47.5250 KOps/s $\textbf{\color{#35bf28}+7.75\%}$
test_set_nested_new 53.2520μs 22.6209μs 44.2069 KOps/s 42.5763 KOps/s $\color{#35bf28}+3.83\%$
test_select 67.2910μs 35.9996μs 27.7781 KOps/s 26.9649 KOps/s $\color{#35bf28}+3.02\%$
test_select_nested 83.9810μs 53.7185μs 18.6155 KOps/s 18.5006 KOps/s $\color{#35bf28}+0.62\%$
test_exclude_nested 0.1614ms 0.1150ms 8.6968 KOps/s 8.5457 KOps/s $\color{#35bf28}+1.77\%$
test_empty[True] 0.8881ms 0.3884ms 2.5749 KOps/s 2.5325 KOps/s $\color{#35bf28}+1.67\%$
test_empty[False] 3.0611μs 0.8476μs 1.1798 MOps/s 1.1772 MOps/s $\color{#35bf28}+0.22\%$
test_to 76.2510μs 57.1948μs 17.4841 KOps/s 17.5244 KOps/s $\color{#d91a1a}-0.23\%$
test_to_nonblocking 63.7510μs 37.1837μs 26.8935 KOps/s 27.0143 KOps/s $\color{#d91a1a}-0.45\%$
test_unbind_speed 0.3289ms 0.2683ms 3.7270 KOps/s 3.6385 KOps/s $\color{#35bf28}+2.43\%$
test_unbind_speed_stack0 0.4644ms 0.2653ms 3.7687 KOps/s 3.6682 KOps/s $\color{#35bf28}+2.74\%$
test_unbind_speed_stack1 0.1271s 0.7615ms 1.3131 KOps/s 1.2815 KOps/s $\color{#35bf28}+2.47\%$
test_split 1.7437ms 1.5577ms 641.9651 Ops/s 654.9050 Ops/s $\color{#d91a1a}-1.98\%$
test_chunk 1.7408ms 1.5509ms 644.7718 Ops/s 655.7338 Ops/s $\color{#d91a1a}-1.67\%$
test_creation[device0] 0.1255ms 73.0466μs 13.6899 KOps/s 13.7220 KOps/s $\color{#d91a1a}-0.23\%$
test_creation_from_tensor 0.2491ms 54.1163μs 18.4787 KOps/s 17.4149 KOps/s $\textbf{\color{#35bf28}+6.11\%}$
test_add_one[memmap_tensor0] 94.2020μs 7.1547μs 139.7676 KOps/s 132.1211 KOps/s $\textbf{\color{#35bf28}+5.79\%}$
test_contiguous[memmap_tensor0] 13.5300μs 0.6583μs 1.5192 MOps/s 1.5535 MOps/s $\color{#d91a1a}-2.21\%$
test_stack[memmap_tensor0] 0.2042ms 4.5053μs 221.9585 KOps/s 211.3626 KOps/s $\textbf{\color{#35bf28}+5.01\%}$
test_memmaptd_index 1.1172ms 0.2583ms 3.8715 KOps/s 3.7205 KOps/s $\color{#35bf28}+4.06\%$
test_memmaptd_index_astensor 0.5647ms 0.3199ms 3.1262 KOps/s 3.0379 KOps/s $\color{#35bf28}+2.91\%$
test_memmaptd_index_op 0.9596ms 0.6037ms 1.6563 KOps/s 1.5285 KOps/s $\textbf{\color{#35bf28}+8.36\%}$
test_serialize_model 92.1032ms 89.0968ms 11.2237 Ops/s 9.1598 Ops/s $\textbf{\color{#35bf28}+22.53\%}$
test_serialize_model_pickle 1.3489s 1.2354s 0.8095 Ops/s 0.8085 Ops/s $\color{#35bf28}+0.13\%$
test_serialize_weights 89.8150ms 86.4473ms 11.5677 Ops/s 10.9038 Ops/s $\textbf{\color{#35bf28}+6.09\%}$
test_serialize_weights_returnearly 0.3652s 88.9718ms 11.2395 Ops/s 11.6345 Ops/s $\color{#d91a1a}-3.39\%$
test_serialize_weights_pickle 1.3530s 1.2481s 0.8012 Ops/s 0.8016 Ops/s $\color{#d91a1a}-0.04\%$
test_reshape_pytree 56.4810μs 25.6308μs 39.0156 KOps/s 36.5404 KOps/s $\textbf{\color{#35bf28}+6.77\%}$
test_reshape_td 69.1410μs 31.5905μs 31.6551 KOps/s 31.5843 KOps/s $\color{#35bf28}+0.22\%$
test_view_pytree 93.7210μs 25.3838μs 39.3951 KOps/s 38.4291 KOps/s $\color{#35bf28}+2.51\%$
test_view_td 0.5754ms 45.7968μs 21.8356 KOps/s 21.2797 KOps/s $\color{#35bf28}+2.61\%$
test_unbind_pytree 0.2902ms 30.2011μs 33.1114 KOps/s 31.8359 KOps/s $\color{#35bf28}+4.01\%$
test_unbind_td 93.5020μs 39.9368μs 25.0396 KOps/s 23.9977 KOps/s $\color{#35bf28}+4.34\%$
test_split_pytree 59.3110μs 29.0034μs 34.4787 KOps/s 33.7996 KOps/s $\color{#35bf28}+2.01\%$
test_split_td 0.1193ms 39.5641μs 25.2754 KOps/s 25.2163 KOps/s $\color{#35bf28}+0.23\%$
test_add_pytree 70.1410μs 36.4602μs 27.4272 KOps/s 26.0056 KOps/s $\textbf{\color{#35bf28}+5.47\%}$
test_add_td 0.1060ms 50.5871μs 19.7679 KOps/s 18.2450 KOps/s $\textbf{\color{#35bf28}+8.35\%}$
test_distributed 1.8506ms 71.5889μs 13.9686 KOps/s 14.0862 KOps/s $\color{#d91a1a}-0.83\%$
test_tdmodule 34.0010μs 17.6341μs 56.7084 KOps/s 54.9834 KOps/s $\color{#35bf28}+3.14\%$
test_tdmodule_dispatch 0.1443ms 36.1861μs 27.6349 KOps/s 26.6449 KOps/s $\color{#35bf28}+3.72\%$
test_tdseq 38.2510μs 20.4888μs 48.8071 KOps/s 46.6388 KOps/s $\color{#35bf28}+4.65\%$
test_tdseq_dispatch 54.2910μs 38.8866μs 25.7158 KOps/s 25.0259 KOps/s $\color{#35bf28}+2.76\%$
test_instantiation_functorch 2.0077ms 1.6885ms 592.2383 Ops/s 593.4126 Ops/s $\color{#d91a1a}-0.20\%$
test_instantiation_td 1.7069ms 1.1600ms 862.0878 Ops/s 841.5889 Ops/s $\color{#35bf28}+2.44\%$
test_exec_functorch 0.2105ms 0.1607ms 6.2224 KOps/s 5.9995 KOps/s $\color{#35bf28}+3.72\%$
test_exec_functional_call 0.2289ms 0.1616ms 6.1870 KOps/s 6.0205 KOps/s $\color{#35bf28}+2.77\%$
test_exec_td 0.1880ms 0.1528ms 6.5447 KOps/s 6.5273 KOps/s $\color{#35bf28}+0.27\%$
test_exec_td_decorator 0.8544ms 0.1995ms 5.0137 KOps/s 4.9183 KOps/s $\color{#35bf28}+1.94\%$
test_vmap_mlp_speed[True-True] 0.7153ms 0.6237ms 1.6034 KOps/s 1.6073 KOps/s $\color{#d91a1a}-0.24\%$
test_vmap_mlp_speed[True-False] 0.7375ms 0.6218ms 1.6083 KOps/s 1.6079 KOps/s $\color{#35bf28}+0.03\%$
test_vmap_mlp_speed[False-True] 0.6169ms 0.5508ms 1.8156 KOps/s 1.8195 KOps/s $\color{#d91a1a}-0.21\%$
test_vmap_mlp_speed[False-False] 0.5933ms 0.5476ms 1.8262 KOps/s 1.8272 KOps/s $\color{#d91a1a}-0.05\%$
test_vmap_mlp_speed_decorator[True-True] 0.7786ms 0.6631ms 1.5082 KOps/s 1.5120 KOps/s $\color{#d91a1a}-0.25\%$
test_vmap_mlp_speed_decorator[True-False] 0.9627ms 0.6633ms 1.5076 KOps/s 1.5037 KOps/s $\color{#35bf28}+0.26\%$
test_vmap_mlp_speed_decorator[False-True] 0.6938ms 0.5662ms 1.7661 KOps/s 1.7575 KOps/s $\color{#35bf28}+0.49\%$
test_vmap_mlp_speed_decorator[False-False] 0.8203ms 0.5677ms 1.7614 KOps/s 1.7598 KOps/s $\color{#35bf28}+0.09\%$
test_vmap_transformer_speed[True-True] 8.4118ms 8.3327ms 120.0097 Ops/s 119.3527 Ops/s $\color{#35bf28}+0.55\%$
test_vmap_transformer_speed[True-False] 8.4992ms 8.3005ms 120.4748 Ops/s 119.3624 Ops/s $\color{#35bf28}+0.93\%$
test_vmap_transformer_speed[False-True] 8.5322ms 8.2759ms 120.8333 Ops/s 120.8053 Ops/s $\color{#35bf28}+0.02\%$
test_vmap_transformer_speed[False-False] 8.5834ms 8.2431ms 121.3142 Ops/s 120.6824 Ops/s $\color{#35bf28}+0.52\%$
test_vmap_transformer_speed_decorator[True-True] 20.0254ms 19.8423ms 50.3975 Ops/s 50.0619 Ops/s $\color{#35bf28}+0.67\%$
test_vmap_transformer_speed_decorator[True-False] 19.8893ms 19.8260ms 50.4387 Ops/s 50.0059 Ops/s $\color{#35bf28}+0.87\%$
test_vmap_transformer_speed_decorator[False-True] 20.3524ms 19.5015ms 51.2782 Ops/s 51.1858 Ops/s $\color{#35bf28}+0.18\%$
test_vmap_transformer_speed_decorator[False-False] 19.5278ms 19.4197ms 51.4940 Ops/s 51.3421 Ops/s $\color{#35bf28}+0.30\%$
test_to_module_speed[True] 2.9334ms 1.2740ms 784.9511 Ops/s 786.7889 Ops/s $\color{#d91a1a}-0.23\%$
test_to_module_speed[False] 1.3890ms 1.2310ms 812.3267 Ops/s 814.3781 Ops/s $\color{#d91a1a}-0.25\%$
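
For readers checking the tables by hand: the Change column appears to be the relative difference between Ops and Ops on Repo HEAD, and the Improved/Worsened totals appear to count only the bold rows, which all sit at roughly ±5% or beyond. The sketch below reproduces that arithmetic on three rows copied from the tables. The function names and the 5% threshold are assumptions inferred from the reported numbers, not taken from the benchmark bot's source.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative change in throughput versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row; threshold_pct=5.0 is an assumption inferred from the bold rows."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "within noise threshold"


# (Ops, Ops on Repo HEAD) in KOps/s, copied from the tables above.
rows = {
    "test_keys (CPU)": (238.4084, 261.3547),
    "test_values (CPU)": (869.0528, 692.1186),
    "test_membership_stacked_nested_last (GPU)": (78.3223, 188.0490),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Both units must match before comparing (the tables mix Ops/s, KOps/s and MOps/s across rows, but each row uses one unit for both columns).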
