[BugFix] Delete parameter/buffer before setting it with regular setattr in to_module #583

Merged
vmoens merged 3 commits into main from del-param-before-set on Nov 29, 2023

vmoens (Contributor) commented on Nov 28, 2023

No description provided.
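Since the PR carries no description, here is a minimal sketch of the pattern the title names. It does not reproduce the actual `to_module` code from this PR. `ToyModule`, `Parameter` and `set_plain_value` are hypothetical stand-ins written for illustration; they imitate one behaviour of `torch.nn.Module.__setattr__`, which raises a `TypeError` when a plain (non-`Parameter`, non-`None`) value is assigned to a name that is still registered as a parameter. Deleting the registered entry first lets a regular `setattr` go through.

```python
class Parameter:
    """Hypothetical stand-in for torch.nn.Parameter (illustration only)."""

    def __init__(self, data):
        self.data = data


class ToyModule:
    """Hypothetical stand-in that imitates nn.Module's parameter registry."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the registry.
        object.__setattr__(self, "_parameters", {})

    def __setattr__(self, name, value):
        params = self._parameters
        if isinstance(value, Parameter):
            # Parameters live in the registry, not in the instance dict.
            self.__dict__.pop(name, None)
            params[name] = value
        elif name in params:
            # A registered name only accepts a Parameter or None.
            if value is not None:
                raise TypeError(
                    f"cannot assign {type(value).__name__!r} as parameter {name!r}"
                )
            params[name] = None
        else:
            object.__setattr__(self, name, value)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        params = self.__dict__.get("_parameters", {})
        if name in params:
            return params[name]
        raise AttributeError(name)

    def __delattr__(self, name):
        if name in self._parameters:
            del self._parameters[name]
        else:
            object.__delattr__(self, name)


def set_plain_value(module, name, value):
    """Delete a registered parameter, if any, then use a regular setattr."""
    if name in module._parameters:
        delattr(module, name)
    setattr(module, name, value)
```

With this sketch, `module.weight = [2.0]` fails while `weight` is registered, whereas `set_plain_value(module, "weight", [2.0])` succeeds and leaves `weight` as an ordinary attribute.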

facebook-github-bot added the CLA Signed label on Nov 28, 2023
vmoens added the bug label on Nov 28, 2023
vmoens marked this pull request as ready for review on November 28, 2023, 17:58

github-actions bot commented Nov 28, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 113. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}1$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 46.6580μs 16.0336μs 62.3689 KOps/s 63.3153 KOps/s $\color{#d91a1a}-1.49\%$
test_plain_set_stack_nested 0.2898ms 0.1437ms 6.9575 KOps/s 7.0040 KOps/s $\color{#d91a1a}-0.66\%$
test_plain_set_nested_inplace 48.6720μs 19.3258μs 51.7442 KOps/s 52.8186 KOps/s $\color{#d91a1a}-2.03\%$
test_plain_set_stack_nested_inplace 0.3283ms 0.1742ms 5.7404 KOps/s 5.7701 KOps/s $\color{#d91a1a}-0.51\%$
test_items 32.8420μs 2.3743μs 421.1702 KOps/s 409.4041 KOps/s $\color{#35bf28}+2.87\%$
test_items_nested 0.4719ms 0.2711ms 3.6885 KOps/s 3.7229 KOps/s $\color{#d91a1a}-0.93\%$
test_items_nested_locked 1.4173ms 0.2693ms 3.7129 KOps/s 3.6920 KOps/s $\color{#35bf28}+0.57\%$
test_items_nested_leaf 0.5250ms 0.1668ms 5.9951 KOps/s 5.9967 KOps/s $\color{#d91a1a}-0.03\%$
test_items_stack_nested 2.3182ms 1.4983ms 667.4413 Ops/s 677.1571 Ops/s $\color{#d91a1a}-1.43\%$
test_items_stack_nested_leaf 1.8305ms 1.3548ms 738.1203 Ops/s 737.8051 Ops/s $\color{#35bf28}+0.04\%$
test_items_stack_nested_locked 1.9110ms 0.7710ms 1.2970 KOps/s 1.2794 KOps/s $\color{#35bf28}+1.38\%$
test_keys 15.7700μs 3.8748μs 258.0775 KOps/s 260.6745 KOps/s $\color{#d91a1a}-1.00\%$
test_keys_nested 0.5842ms 0.1405ms 7.1151 KOps/s 6.7315 KOps/s $\textbf{\color{#35bf28}+5.70\%}$
test_keys_nested_locked 0.2759ms 0.1405ms 7.1181 KOps/s 7.1281 KOps/s $\color{#d91a1a}-0.14\%$
test_keys_nested_leaf 0.3851ms 0.1403ms 7.1263 KOps/s 7.0547 KOps/s $\color{#35bf28}+1.01\%$
test_keys_stack_nested 2.2819ms 1.4072ms 710.6465 Ops/s 712.8428 Ops/s $\color{#d91a1a}-0.31\%$
test_keys_stack_nested_leaf 1.5428ms 1.4009ms 713.8190 Ops/s 703.9803 Ops/s $\color{#35bf28}+1.40\%$
test_keys_stack_nested_locked 1.1215ms 0.6795ms 1.4716 KOps/s 1.4481 KOps/s $\color{#35bf28}+1.63\%$
test_values 7.1736μs 1.1518μs 868.2127 KOps/s 857.6413 KOps/s $\color{#35bf28}+1.23\%$
test_values_nested 96.4910μs 49.7998μs 20.0804 KOps/s 19.9907 KOps/s $\color{#35bf28}+0.45\%$
test_values_nested_locked 89.7980μs 49.8085μs 20.0769 KOps/s 20.0105 KOps/s $\color{#35bf28}+0.33\%$
test_values_nested_leaf 61.5960μs 44.5909μs 22.4261 KOps/s 22.2458 KOps/s $\color{#35bf28}+0.81\%$
test_values_stack_nested 1.9646ms 1.1983ms 834.5499 Ops/s 829.3637 Ops/s $\color{#35bf28}+0.63\%$
test_values_stack_nested_leaf 1.3323ms 1.1827ms 845.5309 Ops/s 828.3555 Ops/s $\color{#35bf28}+2.07\%$
test_values_stack_nested_locked 0.9120ms 0.5145ms 1.9436 KOps/s 1.8838 KOps/s $\color{#35bf28}+3.17\%$
test_membership 32.3390μs 1.3652μs 732.4944 KOps/s 714.6323 KOps/s $\color{#35bf28}+2.50\%$
test_membership_nested 37.8910μs 2.8988μs 344.9664 KOps/s 348.0276 KOps/s $\color{#d91a1a}-0.88\%$
test_membership_nested_leaf 19.3870μs 2.7920μs 358.1711 KOps/s 340.4370 KOps/s $\textbf{\color{#35bf28}+5.21\%}$
test_membership_stacked_nested 54.4930μs 11.7327μs 85.2317 KOps/s 83.9380 KOps/s $\color{#35bf28}+1.54\%$
test_membership_stacked_nested_leaf 55.0730μs 11.9364μs 83.7775 KOps/s 82.6966 KOps/s $\color{#35bf28}+1.31\%$
test_membership_nested_last 24.3250μs 5.9106μs 169.1879 KOps/s 165.6180 KOps/s $\color{#35bf28}+2.16\%$
test_membership_nested_leaf_last 39.1830μs 5.9106μs 169.1871 KOps/s 170.2078 KOps/s $\color{#d91a1a}-0.60\%$
test_membership_stacked_nested_last 0.2368ms 0.1676ms 5.9653 KOps/s 5.9905 KOps/s $\color{#d91a1a}-0.42\%$
test_membership_stacked_nested_leaf_last 42.7910μs 13.8904μs 71.9922 KOps/s 72.1322 KOps/s $\color{#d91a1a}-0.19\%$
test_nested_getleaf 40.0050μs 10.8950μs 91.7854 KOps/s 93.1165 KOps/s $\color{#d91a1a}-1.43\%$
test_nested_get 47.9900μs 10.3292μs 96.8134 KOps/s 98.1904 KOps/s $\color{#d91a1a}-1.40\%$
test_stacked_getleaf 1.4932ms 0.6424ms 1.5568 KOps/s 1.5493 KOps/s $\color{#35bf28}+0.48\%$
test_stacked_get 1.0497ms 0.6073ms 1.6466 KOps/s 1.6471 KOps/s $\color{#d91a1a}-0.03\%$
test_nested_getitemleaf 37.3900μs 10.8009μs 92.5853 KOps/s 93.4748 KOps/s $\color{#d91a1a}-0.95\%$
test_nested_getitem 28.5940μs 10.2523μs 97.5389 KOps/s 98.1461 KOps/s $\color{#d91a1a}-0.62\%$
test_stacked_getitemleaf 1.2279ms 0.6420ms 1.5576 KOps/s 1.5574 KOps/s $\color{#35bf28}+0.01\%$
test_stacked_getitem 0.7739ms 0.6111ms 1.6364 KOps/s 1.6251 KOps/s $\color{#35bf28}+0.70\%$
test_lock_nested 60.8213ms 0.6251ms 1.5997 KOps/s 1.7759 KOps/s $\textbf{\color{#d91a1a}-9.92\%}$
test_lock_stack_nested 9.0674ms 5.0231ms 199.0813 Ops/s 197.2143 Ops/s $\color{#35bf28}+0.95\%$
test_unlock_nested 0.8712ms 0.4419ms 2.2630 KOps/s 2.2755 KOps/s $\color{#d91a1a}-0.55\%$
test_unlock_stack_nested 77.2180ms 6.8049ms 146.9525 Ops/s 138.5125 Ops/s $\textbf{\color{#35bf28}+6.09\%}$
test_flatten_speed 0.5687ms 0.2693ms 3.7139 KOps/s 3.7487 KOps/s $\color{#d91a1a}-0.93\%$
test_unflatten_speed 0.5283ms 0.4571ms 2.1875 KOps/s 2.1933 KOps/s $\color{#d91a1a}-0.26\%$
test_common_ops 3.4829ms 0.6749ms 1.4818 KOps/s 1.5172 KOps/s $\color{#d91a1a}-2.33\%$
test_creation 96.4610μs 2.4965μs 400.5546 KOps/s 408.8331 KOps/s $\color{#d91a1a}-2.02\%$
test_creation_empty 26.8000μs 8.3161μs 120.2486 KOps/s 124.6525 KOps/s $\color{#d91a1a}-3.53\%$
test_creation_nested_1 86.5800μs 11.5027μs 86.9361 KOps/s 87.7200 KOps/s $\color{#d91a1a}-0.89\%$
test_creation_nested_2 37.3300μs 15.0895μs 66.2714 KOps/s 67.3417 KOps/s $\color{#d91a1a}-1.59\%$
test_clone 86.7030μs 13.3787μs 74.7458 KOps/s 74.2311 KOps/s $\color{#35bf28}+0.69\%$
test_getitem[int] 42.6810μs 13.2795μs 75.3038 KOps/s 76.4661 KOps/s $\color{#d91a1a}-1.52\%$
test_getitem[slice_int] 62.2370μs 25.6034μs 39.0573 KOps/s 39.4434 KOps/s $\color{#d91a1a}-0.98\%$
test_getitem[range] 98.7260μs 44.1377μs 22.6564 KOps/s 21.2099 KOps/s $\textbf{\color{#35bf28}+6.82\%}$
test_getitem[tuple] 70.2920μs 21.0209μs 47.5717 KOps/s 48.5014 KOps/s $\color{#d91a1a}-1.92\%$
test_getitem[list] 0.1023ms 40.7654μs 24.5306 KOps/s 23.8538 KOps/s $\color{#35bf28}+2.84\%$
test_setitem_dim[int] 53.7820μs 27.5048μs 36.3573 KOps/s 34.8703 KOps/s $\color{#35bf28}+4.26\%$
test_setitem_dim[slice_int] 92.9950μs 51.9506μs 19.2490 KOps/s 18.7133 KOps/s $\color{#35bf28}+2.86\%$
test_setitem_dim[range] 0.1254ms 71.1737μs 14.0501 KOps/s 13.5328 KOps/s $\color{#35bf28}+3.82\%$
test_setitem_dim[tuple] 73.5690μs 40.1853μs 24.8847 KOps/s 23.2887 KOps/s $\textbf{\color{#35bf28}+6.85\%}$
test_setitem 0.1129ms 18.3958μs 54.3603 KOps/s 54.7953 KOps/s $\color{#d91a1a}-0.79\%$
test_set 0.3382ms 19.0319μs 52.5433 KOps/s 55.0001 KOps/s $\color{#d91a1a}-4.47\%$
test_set_shared 4.6430ms 0.1409ms 7.0962 KOps/s 6.8724 KOps/s $\color{#35bf28}+3.26\%$
test_update 99.3360μs 18.8431μs 53.0699 KOps/s 53.9167 KOps/s $\color{#d91a1a}-1.57\%$
test_update_nested 89.9790μs 26.5651μs 37.6434 KOps/s 38.5113 KOps/s $\color{#d91a1a}-2.25\%$
test_set_nested 82.7660μs 19.7670μs 50.5894 KOps/s 51.2129 KOps/s $\color{#d91a1a}-1.22\%$
test_set_nested_new 80.5110μs 25.2609μs 39.5869 KOps/s 40.8924 KOps/s $\color{#d91a1a}-3.19\%$
test_select 0.1056ms 50.2257μs 19.9101 KOps/s 19.9314 KOps/s $\color{#d91a1a}-0.11\%$
test_unbind_speed 0.6026ms 0.3734ms 2.6783 KOps/s 2.6714 KOps/s $\color{#35bf28}+0.26\%$
test_unbind_speed_stack0 66.2064ms 4.6167ms 216.6052 Ops/s 223.4116 Ops/s $\color{#d91a1a}-3.05\%$
test_unbind_speed_stack1 11.6193μs 0.6315μs 1.5836 MOps/s 1.5615 MOps/s $\color{#35bf28}+1.41\%$
test_split 56.7392ms 1.7638ms 566.9554 Ops/s 554.0046 Ops/s $\color{#35bf28}+2.34\%$
test_chunk 58.3602ms 1.7341ms 576.6748 Ops/s 571.7330 Ops/s $\color{#35bf28}+0.86\%$
test_creation[device0] 3.2633ms 0.2970ms 3.3667 KOps/s 3.3388 KOps/s $\color{#35bf28}+0.84\%$
test_creation_from_tensor 3.5651ms 0.3285ms 3.0437 KOps/s 2.9746 KOps/s $\color{#35bf28}+2.32\%$
test_add_one[memmap_tensor0] 77.2350μs 24.3043μs 41.1451 KOps/s 39.1945 KOps/s $\color{#35bf28}+4.98\%$
test_contiguous[memmap_tensor0] 36.9190μs 5.6063μs 178.3719 KOps/s 175.3356 KOps/s $\color{#35bf28}+1.73\%$
test_stack[memmap_tensor0] 68.7090μs 18.8263μs 53.1173 KOps/s 50.4602 KOps/s $\textbf{\color{#35bf28}+5.27\%}$
test_memmaptd_index 0.7365ms 0.3971ms 2.5185 KOps/s 2.4609 KOps/s $\color{#35bf28}+2.34\%$
test_memmaptd_index_astensor 0.9055ms 0.4570ms 2.1881 KOps/s 2.1497 KOps/s $\color{#35bf28}+1.79\%$
test_memmaptd_index_op 1.2307ms 0.7097ms 1.4091 KOps/s 1.3814 KOps/s $\color{#35bf28}+2.01\%$
test_reshape_pytree 64.0000μs 23.3459μs 42.8340 KOps/s 41.6284 KOps/s $\color{#35bf28}+2.90\%$
test_reshape_td 73.9690μs 31.3324μs 31.9159 KOps/s 32.4930 KOps/s $\color{#d91a1a}-1.78\%$
test_view_pytree 56.4060μs 23.4499μs 42.6441 KOps/s 42.1456 KOps/s $\color{#35bf28}+1.18\%$
test_view_td 22.3730μs 4.8897μs 204.5112 KOps/s 201.7704 KOps/s $\color{#35bf28}+1.36\%$
test_unbind_pytree 59.2820μs 26.5796μs 37.6229 KOps/s 36.4826 KOps/s $\color{#35bf28}+3.13\%$
test_unbind_td 0.1147ms 59.0175μs 16.9441 KOps/s 16.9258 KOps/s $\color{#35bf28}+0.11\%$
test_split_pytree 59.0420μs 26.6021μs 37.5911 KOps/s 37.2655 KOps/s $\color{#35bf28}+0.87\%$
test_split_td 1.5137ms 46.5382μs 21.4877 KOps/s 21.7929 KOps/s $\color{#d91a1a}-1.40\%$
test_add_pytree 79.7100μs 31.4375μs 31.8091 KOps/s 30.9401 KOps/s $\color{#35bf28}+2.81\%$
test_add_td 93.5260μs 44.8094μs 22.3167 KOps/s 22.6548 KOps/s $\color{#d91a1a}-1.49\%$
test_distributed 22.4920μs 6.0373μs 165.6370 KOps/s 145.3303 KOps/s $\textbf{\color{#35bf28}+13.97\%}$
test_tdmodule 0.3533ms 21.5637μs 46.3742 KOps/s 47.4389 KOps/s $\color{#d91a1a}-2.24\%$
test_tdmodule_dispatch 0.1816ms 39.2720μs 25.4634 KOps/s 25.2619 KOps/s $\color{#35bf28}+0.80\%$
test_tdseq 0.3570ms 24.3106μs 41.1343 KOps/s 41.5742 KOps/s $\color{#d91a1a}-1.06\%$
test_tdseq_dispatch 0.4135ms 43.7867μs 22.8380 KOps/s 23.5394 KOps/s $\color{#d91a1a}-2.98\%$
test_instantiation_functorch 2.8027ms 1.3293ms 752.2906 Ops/s 751.8029 Ops/s $\color{#35bf28}+0.06\%$
test_instantiation_td 1.5857ms 1.0272ms 973.5027 Ops/s 966.5329 Ops/s $\color{#35bf28}+0.72\%$
test_exec_functorch 0.2440ms 0.1580ms 6.3289 KOps/s 6.1582 KOps/s $\color{#35bf28}+2.77\%$
test_exec_functional_call 0.2898ms 0.1479ms 6.7612 KOps/s 6.6595 KOps/s $\color{#35bf28}+1.53\%$
test_exec_td 0.2144ms 0.1450ms 6.8988 KOps/s 6.8219 KOps/s $\color{#35bf28}+1.13\%$
test_exec_td_decorator 0.9160ms 0.1788ms 5.5921 KOps/s 5.5249 KOps/s $\color{#35bf28}+1.22\%$
test_vmap_mlp_speed[True-True] 1.0242ms 0.9056ms 1.1042 KOps/s 1.0980 KOps/s $\color{#35bf28}+0.57\%$
test_vmap_mlp_speed[True-False] 0.6941ms 0.4778ms 2.0928 KOps/s 2.1099 KOps/s $\color{#d91a1a}-0.81\%$
test_vmap_mlp_speed[False-True] 1.5299ms 0.8014ms 1.2479 KOps/s 1.2565 KOps/s $\color{#d91a1a}-0.69\%$
test_vmap_mlp_speed[False-False] 0.6474ms 0.3971ms 2.5186 KOps/s 2.5574 KOps/s $\color{#d91a1a}-1.52\%$
test_vmap_mlp_speed_decorator[True-True] 2.9626ms 1.8422ms 542.8383 Ops/s 543.4785 Ops/s $\color{#d91a1a}-0.12\%$
test_vmap_mlp_speed_decorator[True-False] 1.1510ms 0.5323ms 1.8785 KOps/s 1.9239 KOps/s $\color{#d91a1a}-2.36\%$
test_vmap_mlp_speed_decorator[False-True] 2.1632ms 1.5243ms 656.0475 Ops/s 642.1068 Ops/s $\color{#35bf28}+2.17\%$
test_vmap_mlp_speed_decorator[False-False] 1.0173ms 0.4129ms 2.4218 KOps/s 2.4565 KOps/s $\color{#d91a1a}-1.42\%$
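For readers checking the table, the Change column appears to be the relative difference between Ops and Ops on Repo HEAD, with both values in the same unit. The sketch below reproduces a few rows under that reading. The 5% cutoff for counting a benchmark as improved or worsened is an assumption inferred from which rows are bold; the bot's comment does not state it, and the function names are illustrative.

```python
# Assumption: rows are flagged when |change| reaches 5%, inferred from the bold rows.
ASSUMED_THRESHOLD_PCT = 5.0


def relative_change(ops, ops_on_head):
    """Percent change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_on_head) / ops_on_head * 100.0


def classify(change_pct, threshold=ASSUMED_THRESHOLD_PCT):
    """Label a benchmark using the assumed threshold."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# Values copied from the CPU table above (both columns in KOps/s).
rows = {
    "test_plain_set_nested": (62.3689, 63.3153),
    "test_lock_nested": (1.5997, 1.7759),
    "test_distributed": (165.6370, 145.3303),
}

for name, (ops, head) in rows.items():
    change = relative_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, `test_lock_nested` is the single worsened CPU benchmark and `test_distributed` is one of the seven improved ones, which matches the summary counts.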


github-actions bot commented Nov 28, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 127. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}1$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.4660ms 12.7087μs 78.6863 KOps/s 78.8241 KOps/s $\color{#d91a1a}-0.17\%$
test_plain_set_stack_nested 0.1839ms 0.1147ms 8.7183 KOps/s 8.5935 KOps/s $\color{#35bf28}+1.45\%$
test_plain_set_nested_inplace 31.1810μs 14.9296μs 66.9811 KOps/s 66.2674 KOps/s $\color{#35bf28}+1.08\%$
test_plain_set_stack_nested_inplace 0.1670ms 0.1397ms 7.1580 KOps/s 7.1189 KOps/s $\color{#35bf28}+0.55\%$
test_items 26.1400μs 4.6532μs 214.9057 KOps/s 212.3300 KOps/s $\color{#35bf28}+1.21\%$
test_items_nested 0.3846ms 0.3359ms 2.9767 KOps/s 2.9618 KOps/s $\color{#35bf28}+0.51\%$
test_items_nested_locked 0.3654ms 0.3405ms 2.9368 KOps/s 2.9469 KOps/s $\color{#d91a1a}-0.34\%$
test_items_nested_leaf 0.2198ms 0.1986ms 5.0365 KOps/s 4.9970 KOps/s $\color{#35bf28}+0.79\%$
test_items_stack_nested 1.5342ms 1.4770ms 677.0430 Ops/s 678.5988 Ops/s $\color{#d91a1a}-0.23\%$
test_items_stack_nested_leaf 1.3498ms 1.2929ms 773.4341 Ops/s 769.1278 Ops/s $\color{#35bf28}+0.56\%$
test_items_stack_nested_locked 0.9170ms 0.8051ms 1.2421 KOps/s 1.2265 KOps/s $\color{#35bf28}+1.28\%$
test_keys 21.5000μs 4.5602μs 219.2904 KOps/s 216.5825 KOps/s $\color{#35bf28}+1.25\%$
test_keys_nested 3.5877ms 90.7865μs 11.0149 KOps/s 11.0785 KOps/s $\color{#d91a1a}-0.57\%$
test_keys_nested_locked 0.1164ms 90.4124μs 11.0604 KOps/s 11.1054 KOps/s $\color{#d91a1a}-0.41\%$
test_keys_nested_leaf 41.2164ms 86.6785μs 11.5369 KOps/s 12.2262 KOps/s $\textbf{\color{#d91a1a}-5.64\%}$
test_keys_stack_nested 1.3374ms 1.2878ms 776.5132 Ops/s 782.4429 Ops/s $\color{#d91a1a}-0.76\%$
test_keys_stack_nested_leaf 1.3013ms 1.2630ms 791.7857 Ops/s 793.4818 Ops/s $\color{#d91a1a}-0.21\%$
test_keys_stack_nested_locked 0.6925ms 0.6086ms 1.6431 KOps/s 1.6518 KOps/s $\color{#d91a1a}-0.53\%$
test_values 10.0870μs 1.9153μs 522.0993 KOps/s 527.3017 KOps/s $\color{#d91a1a}-0.99\%$
test_values_nested 58.9400μs 43.1530μs 23.1733 KOps/s 23.3812 KOps/s $\color{#d91a1a}-0.89\%$
test_values_nested_locked 68.7020μs 45.2886μs 22.0806 KOps/s 22.3051 KOps/s $\color{#d91a1a}-1.01\%$
test_values_nested_leaf 0.1661ms 37.5521μs 26.6297 KOps/s 26.9031 KOps/s $\color{#d91a1a}-1.02\%$
test_values_stack_nested 1.1715ms 1.1195ms 893.2838 Ops/s 883.8733 Ops/s $\color{#35bf28}+1.06\%$
test_values_stack_nested_leaf 1.1633ms 1.1059ms 904.2602 Ops/s 909.5201 Ops/s $\color{#d91a1a}-0.58\%$
test_values_stack_nested_locked 0.5611ms 0.4911ms 2.0362 KOps/s 2.0435 KOps/s $\color{#d91a1a}-0.36\%$
test_membership 5.3222μs 0.9582μs 1.0436 MOps/s 1.0648 MOps/s $\color{#d91a1a}-1.98\%$
test_membership_nested 26.7200μs 2.1952μs 455.5343 KOps/s 467.2436 KOps/s $\color{#d91a1a}-2.51\%$
test_membership_nested_leaf 9.8250μs 2.1387μs 467.5703 KOps/s 469.8476 KOps/s $\color{#d91a1a}-0.48\%$
test_membership_stacked_nested 43.9610μs 11.0646μs 90.3782 KOps/s 91.1122 KOps/s $\color{#d91a1a}-0.81\%$
test_membership_stacked_nested_leaf 26.9890μs 10.9698μs 91.1590 KOps/s 90.4659 KOps/s $\color{#35bf28}+0.77\%$
test_membership_nested_last 33.1200μs 4.5840μs 218.1511 KOps/s 216.2835 KOps/s $\color{#35bf28}+0.86\%$
test_membership_nested_leaf_last 18.4500μs 4.5658μs 219.0173 KOps/s 216.6153 KOps/s $\color{#35bf28}+1.11\%$
test_membership_stacked_nested_last 0.1681ms 0.1340ms 7.4634 KOps/s 7.4716 KOps/s $\color{#d91a1a}-0.11\%$
test_membership_stacked_nested_leaf_last 42.5410μs 12.9174μs 77.4151 KOps/s 78.3261 KOps/s $\color{#d91a1a}-1.16\%$
test_nested_getleaf 37.6800μs 8.4070μs 118.9492 KOps/s 118.5257 KOps/s $\color{#35bf28}+0.36\%$
test_nested_get 36.8910μs 8.0085μs 124.8669 KOps/s 125.1630 KOps/s $\color{#d91a1a}-0.24\%$
test_stacked_getleaf 0.6255ms 0.5623ms 1.7784 KOps/s 1.7668 KOps/s $\color{#35bf28}+0.65\%$
test_stacked_get 0.6037ms 0.5246ms 1.9062 KOps/s 1.8569 KOps/s $\color{#35bf28}+2.66\%$
test_nested_getitemleaf 34.8000μs 8.4596μs 118.2084 KOps/s 117.9929 KOps/s $\color{#35bf28}+0.18\%$
test_nested_getitem 29.5090μs 7.9871μs 125.2020 KOps/s 125.1197 KOps/s $\color{#35bf28}+0.07\%$
test_stacked_getitemleaf 0.8192ms 0.5718ms 1.7488 KOps/s 1.7717 KOps/s $\color{#d91a1a}-1.29\%$
test_stacked_getitem 0.6144ms 0.5365ms 1.8640 KOps/s 1.8731 KOps/s $\color{#d91a1a}-0.49\%$
test_lock_nested 3.2941ms 0.5554ms 1.8005 KOps/s 1.8076 KOps/s $\color{#d91a1a}-0.39\%$
test_lock_stack_nested 81.2377ms 7.1909ms 139.0656 Ops/s 137.3647 Ops/s $\color{#35bf28}+1.24\%$
test_unlock_nested 2.4037ms 0.4290ms 2.3309 KOps/s 2.3106 KOps/s $\color{#35bf28}+0.88\%$
test_unlock_stack_nested 66.6046ms 6.2095ms 161.0433 Ops/s 161.7890 Ops/s $\color{#d91a1a}-0.46\%$
test_flatten_speed 0.2403ms 0.1874ms 5.3357 KOps/s 5.3224 KOps/s $\color{#35bf28}+0.25\%$
test_unflatten_speed 0.4178ms 0.3652ms 2.7381 KOps/s 2.7872 KOps/s $\color{#d91a1a}-1.76\%$
test_common_ops 1.1028ms 0.5843ms 1.7115 KOps/s 1.6765 KOps/s $\color{#35bf28}+2.09\%$
test_creation 26.1200μs 2.0990μs 476.4251 KOps/s 475.9268 KOps/s $\color{#35bf28}+0.10\%$
test_creation_empty 20.5700μs 7.0713μs 141.4165 KOps/s 138.5136 KOps/s $\color{#35bf28}+2.10\%$
test_creation_nested_1 27.4990μs 9.4910μs 105.3631 KOps/s 105.2868 KOps/s $\color{#35bf28}+0.07\%$
test_creation_nested_2 27.5000μs 12.1066μs 82.5999 KOps/s 82.0937 KOps/s $\color{#35bf28}+0.62\%$
test_clone 91.2910μs 13.8607μs 72.1467 KOps/s 70.7261 KOps/s $\color{#35bf28}+2.01\%$
test_getitem[int] 35.5900μs 12.0558μs 82.9474 KOps/s 81.7214 KOps/s $\color{#35bf28}+1.50\%$
test_getitem[slice_int] 45.3910μs 23.5086μs 42.5377 KOps/s 43.4806 KOps/s $\color{#d91a1a}-2.17\%$
test_getitem[range] 69.9320μs 38.6295μs 25.8870 KOps/s 24.6380 KOps/s $\textbf{\color{#35bf28}+5.07\%}$
test_getitem[tuple] 45.4900μs 19.9301μs 50.1754 KOps/s 50.5934 KOps/s $\color{#d91a1a}-0.83\%$
test_getitem[list] 0.2505ms 35.4755μs 28.1885 KOps/s 27.7218 KOps/s $\color{#35bf28}+1.68\%$
test_setitem_dim[int] 48.6610μs 25.2992μs 39.5270 KOps/s 39.3677 KOps/s $\color{#35bf28}+0.40\%$
test_setitem_dim[slice_int] 69.2900μs 44.3280μs 22.5591 KOps/s 22.3972 KOps/s $\color{#35bf28}+0.72\%$
test_setitem_dim[range] 82.4410μs 60.9822μs 16.3982 KOps/s 16.2700 KOps/s $\color{#35bf28}+0.79\%$
test_setitem_dim[tuple] 54.1400μs 37.9401μs 26.3574 KOps/s 26.1931 KOps/s $\color{#35bf28}+0.63\%$
test_setitem 84.2220μs 17.6317μs 56.7162 KOps/s 56.6408 KOps/s $\color{#35bf28}+0.13\%$
test_set 91.9600μs 17.1969μs 58.1499 KOps/s 58.0761 KOps/s $\color{#35bf28}+0.13\%$
test_set_shared 2.9597ms 0.1013ms 9.8731 KOps/s 8.8245 KOps/s $\textbf{\color{#35bf28}+11.88\%}$
test_update 80.1520μs 18.6421μs 53.6420 KOps/s 53.7030 KOps/s $\color{#d91a1a}-0.11\%$
test_update_nested 85.3210μs 25.1643μs 39.7388 KOps/s 39.6640 KOps/s $\color{#35bf28}+0.19\%$
test_set_nested 85.2420μs 18.4419μs 54.2243 KOps/s 53.3991 KOps/s $\color{#35bf28}+1.55\%$
test_set_nested_new 98.0820μs 22.9179μs 43.6339 KOps/s 43.2024 KOps/s $\color{#35bf28}+1.00\%$
test_select 0.1093ms 44.9574μs 22.2433 KOps/s 22.0536 KOps/s $\color{#35bf28}+0.86\%$
test_to 70.9310μs 51.8273μs 19.2949 KOps/s 19.9216 KOps/s $\color{#d91a1a}-3.15\%$
test_to_nonblocking 53.4710μs 33.1474μs 30.1683 KOps/s 29.3310 KOps/s $\color{#35bf28}+2.85\%$
test_unbind_speed 0.4230ms 0.3550ms 2.8172 KOps/s 2.8210 KOps/s $\color{#d91a1a}-0.13\%$
test_unbind_speed_stack0 62.3056ms 4.3083ms 232.1086 Ops/s 235.6303 Ops/s $\color{#d91a1a}-1.49\%$
test_unbind_speed_stack1 1.5076μs 0.5275μs 1.8958 MOps/s 1.9005 MOps/s $\color{#d91a1a}-0.24\%$
test_split 53.4580ms 1.8160ms 550.6486 Ops/s 549.2980 Ops/s $\color{#35bf28}+0.25\%$
test_chunk 53.3255ms 1.7918ms 558.1016 Ops/s 556.3566 Ops/s $\color{#35bf28}+0.31\%$
test_creation[device0] 0.3641ms 0.3057ms 3.2706 KOps/s 3.2685 KOps/s $\color{#35bf28}+0.07\%$
test_creation[device1] 0.6539ms 0.3086ms 3.2406 KOps/s 3.2458 KOps/s $\color{#d91a1a}-0.16\%$
test_creation_from_tensor 0.5612ms 0.3326ms 3.0068 KOps/s 3.0052 KOps/s $\color{#35bf28}+0.05\%$
test_add_one[memmap_tensor0] 62.2510μs 22.7582μs 43.9401 KOps/s 42.1061 KOps/s $\color{#35bf28}+4.36\%$
test_add_one[memmap_tensor1] 0.2067ms 70.1382μs 14.2576 KOps/s 14.2497 KOps/s $\color{#35bf28}+0.06\%$
test_contiguous[memmap_tensor0] 21.6500μs 5.6817μs 176.0045 KOps/s 167.5912 KOps/s $\textbf{\color{#35bf28}+5.02\%}$
test_contiguous[memmap_tensor1] 50.0700μs 20.7965μs 48.0851 KOps/s 48.1732 KOps/s $\color{#d91a1a}-0.18\%$
test_stack[memmap_tensor0] 37.0810μs 18.9763μs 52.6972 KOps/s 50.6809 KOps/s $\color{#35bf28}+3.98\%$
test_stack[memmap_tensor1] 0.1568ms 70.5758μs 14.1692 KOps/s 14.0483 KOps/s $\color{#35bf28}+0.86\%$
test_memmaptd_index 0.4807ms 0.4169ms 2.3984 KOps/s 2.3280 KOps/s $\color{#35bf28}+3.02\%$
test_memmaptd_index_astensor 0.5347ms 0.4691ms 2.1318 KOps/s 2.0570 KOps/s $\color{#35bf28}+3.64\%$
test_memmaptd_index_op 0.7663ms 0.7186ms 1.3917 KOps/s 1.3503 KOps/s $\color{#35bf28}+3.06\%$
test_reshape_pytree 46.7090μs 20.7263μs 48.2478 KOps/s 47.4069 KOps/s $\color{#35bf28}+1.77\%$
test_reshape_td 47.0110μs 28.9461μs 34.5470 KOps/s 33.2976 KOps/s $\color{#35bf28}+3.75\%$
test_view_pytree 38.7700μs 20.1732μs 49.5706 KOps/s 48.4931 KOps/s $\color{#35bf28}+2.22\%$
test_view_td 19.9310μs 4.0340μs 247.8957 KOps/s 245.1600 KOps/s $\color{#35bf28}+1.12\%$
test_unbind_pytree 49.3310μs 25.2872μs 39.5457 KOps/s 38.9450 KOps/s $\color{#35bf28}+1.54\%$
test_unbind_td 79.9910μs 55.2162μs 18.1106 KOps/s 17.7694 KOps/s $\color{#35bf28}+1.92\%$
test_split_pytree 41.9520μs 23.5802μs 42.4084 KOps/s 41.5280 KOps/s $\color{#35bf28}+2.12\%$
test_split_td 63.4010μs 42.5085μs 23.5247 KOps/s 22.9180 KOps/s $\color{#35bf28}+2.65\%$
test_add_pytree 58.3910μs 30.3632μs 32.9346 KOps/s 32.5502 KOps/s $\color{#35bf28}+1.18\%$
test_add_td 68.5210μs 40.7237μs 24.5557 KOps/s 24.6856 KOps/s $\color{#d91a1a}-0.53\%$
test_distributed 20.1210μs 5.5786μs 179.2577 KOps/s 180.9068 KOps/s $\color{#d91a1a}-0.91\%$
test_tdmodule 30.9610μs 16.3131μs 61.3005 KOps/s 60.8514 KOps/s $\color{#35bf28}+0.74\%$
test_tdmodule_dispatch 0.2196ms 32.3712μs 30.8917 KOps/s 30.6634 KOps/s $\color{#35bf28}+0.74\%$
test_tdseq 37.0410μs 19.5896μs 51.0475 KOps/s 50.9739 KOps/s $\color{#35bf28}+0.14\%$
test_tdseq_dispatch 51.2500μs 35.4122μs 28.2389 KOps/s 28.1713 KOps/s $\color{#35bf28}+0.24\%$
test_instantiation_functorch 1.7097ms 1.6420ms 609.0179 Ops/s 606.1712 Ops/s $\color{#35bf28}+0.47\%$
test_instantiation_td 1.6621ms 1.1695ms 855.0916 Ops/s 856.2561 Ops/s $\color{#d91a1a}-0.14\%$
test_exec_functorch 0.2064ms 0.1532ms 6.5269 KOps/s 6.5674 KOps/s $\color{#d91a1a}-0.62\%$
test_exec_functional_call 0.2115ms 0.1512ms 6.6126 KOps/s 6.7387 KOps/s $\color{#d91a1a}-1.87\%$
test_exec_td 0.1870ms 0.1407ms 7.1096 KOps/s 7.0989 KOps/s $\color{#35bf28}+0.15\%$
test_exec_td_decorator 0.9138ms 0.1800ms 5.5545 KOps/s 5.6504 KOps/s $\color{#d91a1a}-1.70\%$
test_vmap_mlp_speed[True-True] 1.1059ms 1.0299ms 970.9579 Ops/s 964.0342 Ops/s $\color{#35bf28}+0.72\%$
test_vmap_mlp_speed[True-False] 0.6472ms 0.5896ms 1.6959 KOps/s 1.6948 KOps/s $\color{#35bf28}+0.07\%$
test_vmap_mlp_speed[False-True] 1.0977ms 0.9750ms 1.0256 KOps/s 1.0555 KOps/s $\color{#d91a1a}-2.83\%$
test_vmap_mlp_speed[False-False] 0.5859ms 0.5204ms 1.9217 KOps/s 1.9259 KOps/s $\color{#d91a1a}-0.21\%$
test_vmap_mlp_speed_decorator[True-True] 2.8861ms 1.9702ms 507.5733 Ops/s 505.7546 Ops/s $\color{#35bf28}+0.36\%$
test_vmap_mlp_speed_decorator[True-False] 1.1635ms 0.6344ms 1.5762 KOps/s 1.5772 KOps/s $\color{#d91a1a}-0.06\%$
test_vmap_mlp_speed_decorator[False-True] 2.1503ms 1.7076ms 585.6081 Ops/s 584.8303 Ops/s $\color{#35bf28}+0.13\%$
test_vmap_mlp_speed_decorator[False-False] 0.9159ms 0.5363ms 1.8647 KOps/s 1.8636 KOps/s $\color{#35bf28}+0.06\%$
test_vmap_transformer_speed[True-True] 12.6707ms 12.1460ms 82.3318 Ops/s 82.3736 Ops/s $\color{#d91a1a}-0.05\%$
test_vmap_transformer_speed[True-False] 7.9779ms 7.9387ms 125.9656 Ops/s 126.1106 Ops/s $\color{#d91a1a}-0.11\%$
test_vmap_transformer_speed[False-True] 12.0648ms 11.9897ms 83.4050 Ops/s 83.1239 Ops/s $\color{#35bf28}+0.34\%$
test_vmap_transformer_speed[False-False] 8.1924ms 7.8879ms 126.7763 Ops/s 127.1284 Ops/s $\color{#d91a1a}-0.28\%$
test_vmap_transformer_speed_decorator[True-True] 62.9312ms 62.0819ms 16.1077 Ops/s 14.8890 Ops/s $\textbf{\color{#35bf28}+8.19\%}$
test_vmap_transformer_speed_decorator[True-False] 21.2377ms 19.2075ms 52.0629 Ops/s 52.3682 Ops/s $\color{#d91a1a}-0.58\%$
test_vmap_transformer_speed_decorator[False-True] 57.5240ms 56.3764ms 17.7379 Ops/s 17.6902 Ops/s $\color{#35bf28}+0.27\%$
test_vmap_transformer_speed_decorator[False-False] 20.9441ms 18.8011ms 53.1883 Ops/s 53.6179 Ops/s $\color{#d91a1a}-0.80\%$

vmoens merged commit 5bf5eed into main on Nov 29, 2023
vmoens deleted the del-param-before-set branch on November 29, 2023, 07:51