[BugFix] Fix to_module __exit__ update when td is locked #671

Merged 2 commits on Feb 7, 2024

Conversation

@vmoens (Contributor) commented Feb 7, 2024

No description provided.

@facebook-github-bot added the "CLA Signed" label Feb 7, 2024
@vmoens changed the title from "[BugFix] Fix to_module \__exit__\ update when td is locked" to "[BugFix] Fix to_module __exit__ update when td is locked" Feb 7, 2024
@vmoens added the "bug" label Feb 7, 2024

github-actions bot commented Feb 7, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 126. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}7$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 43.5500μs 17.7318μs 56.3957 KOps/s 55.6109 KOps/s $\color{#35bf28}+1.41\%$
test_plain_set_stack_nested 0.1774ms 0.1454ms 6.8780 KOps/s 6.8062 KOps/s $\color{#35bf28}+1.05\%$
test_plain_set_nested_inplace 59.4740μs 20.2383μs 49.4111 KOps/s 47.9261 KOps/s $\color{#35bf28}+3.10\%$
test_plain_set_stack_nested_inplace 0.3059ms 0.1781ms 5.6153 KOps/s 5.6117 KOps/s $\color{#35bf28}+0.06\%$
test_items 16.8510μs 2.4561μs 407.1562 KOps/s 397.8281 KOps/s $\color{#35bf28}+2.34\%$
test_items_nested 0.5851ms 0.2713ms 3.6865 KOps/s 3.6293 KOps/s $\color{#35bf28}+1.57\%$
test_items_nested_locked 0.9238ms 0.2702ms 3.7008 KOps/s 3.6440 KOps/s $\color{#35bf28}+1.56\%$
test_items_nested_leaf 0.3289ms 0.1666ms 6.0016 KOps/s 5.9071 KOps/s $\color{#35bf28}+1.60\%$
test_items_stack_nested 1.5474ms 1.3156ms 760.1172 Ops/s 763.1260 Ops/s $\color{#d91a1a}-0.39\%$
test_items_stack_nested_leaf 1.5836ms 1.1711ms 853.8775 Ops/s 839.0063 Ops/s $\color{#35bf28}+1.77\%$
test_items_stack_nested_locked 1.4158ms 0.9105ms 1.0983 KOps/s 1.1429 KOps/s $\color{#d91a1a}-3.90\%$
test_keys 35.4060μs 3.8175μs 261.9515 KOps/s 258.7895 KOps/s $\color{#35bf28}+1.22\%$
test_keys_nested 1.6166ms 0.1468ms 6.8101 KOps/s 6.9229 KOps/s $\color{#d91a1a}-1.63\%$
test_keys_nested_locked 0.2823ms 0.1501ms 6.6642 KOps/s 6.6729 KOps/s $\color{#d91a1a}-0.13\%$
test_keys_nested_leaf 0.2438ms 0.1275ms 7.8437 KOps/s 7.7923 KOps/s $\color{#35bf28}+0.66\%$
test_keys_stack_nested 1.8826ms 1.2548ms 796.9433 Ops/s 792.9634 Ops/s $\color{#35bf28}+0.50\%$
test_keys_stack_nested_leaf 1.4332ms 1.2543ms 797.2552 Ops/s 794.9531 Ops/s $\color{#35bf28}+0.29\%$
test_keys_stack_nested_locked 0.8980ms 0.7971ms 1.2546 KOps/s 1.2543 KOps/s $\color{#35bf28}+0.03\%$
test_values 5.2295μs 1.1496μs 869.8660 KOps/s 730.6243 KOps/s $\textbf{\color{#35bf28}+19.06\%}$
test_values_nested 96.8500μs 51.2449μs 19.5141 KOps/s 19.0647 KOps/s $\color{#35bf28}+2.36\%$
test_values_nested_locked 0.1041ms 51.7082μs 19.3393 KOps/s 19.1588 KOps/s $\color{#35bf28}+0.94\%$
test_values_nested_leaf 98.4530μs 44.9634μs 22.2403 KOps/s 21.6629 KOps/s $\color{#35bf28}+2.67\%$
test_values_stack_nested 2.1758ms 1.0046ms 995.4415 Ops/s 974.5329 Ops/s $\color{#35bf28}+2.15\%$
test_values_stack_nested_leaf 1.1916ms 1.0109ms 989.2536 Ops/s 963.6581 Ops/s $\color{#35bf28}+2.66\%$
test_values_stack_nested_locked 1.0304ms 0.5898ms 1.6954 KOps/s 1.6690 KOps/s $\color{#35bf28}+1.58\%$
test_membership 14.0960μs 1.3079μs 764.5835 KOps/s 737.1699 KOps/s $\color{#35bf28}+3.72\%$
test_membership_nested 18.0230μs 3.3101μs 302.1027 KOps/s 288.2594 KOps/s $\color{#35bf28}+4.80\%$
test_membership_nested_leaf 23.6640μs 3.3847μs 295.4513 KOps/s 277.8455 KOps/s $\textbf{\color{#35bf28}+6.34\%}$
test_membership_stacked_nested 41.6070μs 11.5775μs 86.3745 KOps/s 80.3630 KOps/s $\textbf{\color{#35bf28}+7.48\%}$
test_membership_stacked_nested_leaf 43.6510μs 11.5677μs 86.4477 KOps/s 84.5926 KOps/s $\color{#35bf28}+2.19\%$
test_membership_nested_last 47.4380μs 6.4031μs 156.1740 KOps/s 151.0435 KOps/s $\color{#35bf28}+3.40\%$
test_membership_nested_leaf_last 36.4980μs 6.4388μs 155.3080 KOps/s 150.7603 KOps/s $\color{#35bf28}+3.02\%$
test_membership_stacked_nested_last 0.3351ms 0.1711ms 5.8445 KOps/s 5.6218 KOps/s $\color{#35bf28}+3.96\%$
test_membership_stacked_nested_leaf_last 42.4690μs 13.8130μs 72.3957 KOps/s 72.5751 KOps/s $\color{#d91a1a}-0.25\%$
test_nested_getleaf 32.0200μs 10.4993μs 95.2444 KOps/s 94.4628 KOps/s $\color{#35bf28}+0.83\%$
test_nested_get 28.8430μs 9.9125μs 100.8825 KOps/s 98.6818 KOps/s $\color{#35bf28}+2.23\%$
test_stacked_getleaf 0.8875ms 0.3932ms 2.5433 KOps/s 2.5030 KOps/s $\color{#35bf28}+1.61\%$
test_stacked_get 0.7478ms 0.3628ms 2.7565 KOps/s 2.7236 KOps/s $\color{#35bf28}+1.21\%$
test_nested_getitemleaf 42.3890μs 11.8634μs 84.2930 KOps/s 81.4591 KOps/s $\color{#35bf28}+3.48\%$
test_nested_getitem 55.9740μs 11.2221μs 89.1096 KOps/s 86.3054 KOps/s $\color{#35bf28}+3.25\%$
test_stacked_getitemleaf 0.6501ms 0.3941ms 2.5376 KOps/s 2.4636 KOps/s $\color{#35bf28}+3.00\%$
test_stacked_getitem 0.7028ms 0.3681ms 2.7169 KOps/s 2.6702 KOps/s $\color{#35bf28}+1.75\%$
test_lock_nested 2.9173ms 0.3329ms 3.0037 KOps/s 2.9352 KOps/s $\color{#35bf28}+2.33\%$
test_lock_stack_nested 81.4233ms 5.6891ms 175.7755 Ops/s 172.2869 Ops/s $\color{#35bf28}+2.02\%$
test_unlock_nested 64.9013ms 0.3903ms 2.5624 KOps/s 2.9828 KOps/s $\textbf{\color{#d91a1a}-14.10\%}$
test_unlock_stack_nested 84.6580ms 5.8052ms 172.2597 Ops/s 168.5473 Ops/s $\color{#35bf28}+2.20\%$
test_flatten_speed 0.6597ms 0.3637ms 2.7494 KOps/s 2.7257 KOps/s $\color{#35bf28}+0.87\%$
test_unflatten_speed 0.7371ms 0.4484ms 2.2300 KOps/s 2.1382 KOps/s $\color{#35bf28}+4.29\%$
test_common_ops 1.3100ms 0.6952ms 1.4384 KOps/s 1.4007 KOps/s $\color{#35bf28}+2.69\%$
test_creation 22.2010μs 1.8075μs 553.2577 KOps/s 544.3875 KOps/s $\color{#35bf28}+1.63\%$
test_creation_empty 36.0070μs 11.4492μs 87.3425 KOps/s 84.3618 KOps/s $\color{#35bf28}+3.53\%$
test_creation_nested_1 35.0560μs 14.0552μs 71.1481 KOps/s 69.2094 KOps/s $\color{#35bf28}+2.80\%$
test_creation_nested_2 45.9660μs 17.1052μs 58.4617 KOps/s 56.3855 KOps/s $\color{#35bf28}+3.68\%$
test_clone 63.2480μs 12.9972μs 76.9398 KOps/s 78.3190 KOps/s $\color{#d91a1a}-1.76\%$
test_getitem[int] 27.8820μs 11.1304μs 89.8443 KOps/s 86.2829 KOps/s $\color{#35bf28}+4.13\%$
test_getitem[slice_int] 52.0870μs 22.4867μs 44.4708 KOps/s 42.8061 KOps/s $\color{#35bf28}+3.89\%$
test_getitem[range] 93.7440μs 39.4498μs 25.3487 KOps/s 24.1118 KOps/s $\textbf{\color{#35bf28}+5.13\%}$
test_getitem[tuple] 70.8410μs 18.7082μs 53.4525 KOps/s 53.5420 KOps/s $\color{#d91a1a}-0.17\%$
test_getitem[list] 0.1421ms 35.9367μs 27.8267 KOps/s 27.1497 KOps/s $\color{#35bf28}+2.49\%$
test_setitem_dim[int] 66.4430μs 31.7712μs 31.4751 KOps/s 30.5290 KOps/s $\color{#35bf28}+3.10\%$
test_setitem_dim[slice_int] 0.1032ms 56.7077μs 17.6343 KOps/s 17.1403 KOps/s $\color{#35bf28}+2.88\%$
test_setitem_dim[range] 4.7432ms 84.7158μs 11.8042 KOps/s 12.8884 KOps/s $\textbf{\color{#d91a1a}-8.41\%}$
test_setitem_dim[tuple] 87.6130μs 46.3291μs 21.5847 KOps/s 20.7393 KOps/s $\color{#35bf28}+4.08\%$
test_setitem 69.4790μs 20.4470μs 48.9069 KOps/s 49.8183 KOps/s $\color{#d91a1a}-1.83\%$
test_set 63.3780μs 19.2270μs 52.0103 KOps/s 50.8529 KOps/s $\color{#35bf28}+2.28\%$
test_set_shared 3.3843ms 0.1343ms 7.4436 KOps/s 7.3741 KOps/s $\color{#35bf28}+0.94\%$
test_update 0.1229ms 22.2112μs 45.0223 KOps/s 43.5983 KOps/s $\color{#35bf28}+3.27\%$
test_update_nested 0.1023ms 31.2654μs 31.9842 KOps/s 31.9924 KOps/s $\color{#d91a1a}-0.03\%$
test_set_nested 0.1495ms 22.1719μs 45.1021 KOps/s 46.5361 KOps/s $\color{#d91a1a}-3.08\%$
test_set_nested_new 84.9290μs 25.4272μs 39.3280 KOps/s 39.5678 KOps/s $\color{#d91a1a}-0.61\%$
test_select 87.0230μs 37.8633μs 26.4108 KOps/s 25.8821 KOps/s $\color{#35bf28}+2.04\%$
test_select_nested 0.1253ms 57.3613μs 17.4334 KOps/s 17.0011 KOps/s $\color{#35bf28}+2.54\%$
test_exclude_nested 0.2296ms 0.1170ms 8.5503 KOps/s 8.4343 KOps/s $\color{#35bf28}+1.37\%$
test_empty[True] 0.6382ms 0.4043ms 2.4733 KOps/s 2.4091 KOps/s $\color{#35bf28}+2.66\%$
test_empty[False] 5.3340μs 1.0471μs 955.0118 KOps/s 950.4587 KOps/s $\color{#35bf28}+0.48\%$
test_unbind_speed 0.3312ms 0.2450ms 4.0815 KOps/s 3.9903 KOps/s $\color{#35bf28}+2.29\%$
test_unbind_speed_stack0 74.1098ms 3.2951ms 303.4828 Ops/s 314.7648 Ops/s $\color{#d91a1a}-3.58\%$
test_unbind_speed_stack1 16.4273μs 1.8600μs 537.6389 KOps/s 521.0016 KOps/s $\color{#35bf28}+3.19\%$
test_split 2.1901ms 1.4573ms 686.2021 Ops/s 607.4180 Ops/s $\textbf{\color{#35bf28}+12.97\%}$
test_chunk 70.5908ms 1.5630ms 639.8002 Ops/s 629.6530 Ops/s $\color{#35bf28}+1.61\%$
test_creation[device0] 0.1863ms 99.8503μs 10.0150 KOps/s 9.6815 KOps/s $\color{#35bf28}+3.44\%$
test_creation_from_tensor 3.4170ms 78.3744μs 12.7593 KOps/s 12.2513 KOps/s $\color{#35bf28}+4.15\%$
test_add_one[memmap_tensor0] 0.1923ms 5.5070μs 181.5869 KOps/s 188.1742 KOps/s $\color{#d91a1a}-3.50\%$
test_contiguous[memmap_tensor0] 22.8920μs 0.6236μs 1.6036 MOps/s 1.5754 MOps/s $\color{#35bf28}+1.79\%$
test_stack[memmap_tensor0] 50.6450μs 3.5467μs 281.9499 KOps/s 282.5630 KOps/s $\color{#d91a1a}-0.22\%$
test_memmaptd_index 1.0032ms 0.2350ms 4.2552 KOps/s 4.0665 KOps/s $\color{#35bf28}+4.64\%$
test_memmaptd_index_astensor 0.6665ms 0.2965ms 3.3729 KOps/s 3.2370 KOps/s $\color{#35bf28}+4.20\%$
test_memmaptd_index_op 0.8716ms 0.6006ms 1.6649 KOps/s 1.6239 KOps/s $\color{#35bf28}+2.53\%$
test_serialize_model 0.1691s 0.1077s 9.2847 Ops/s 8.9855 Ops/s $\color{#35bf28}+3.33\%$
test_serialize_model_pickle 0.4496s 0.3745s 2.6701 Ops/s 2.5766 Ops/s $\color{#35bf28}+3.63\%$
test_serialize_weights 0.1793s 0.1079s 9.2717 Ops/s 9.4227 Ops/s $\color{#d91a1a}-1.60\%$
test_serialize_weights_returnearly 0.1850s 0.1273s 7.8576 Ops/s 7.4788 Ops/s $\textbf{\color{#35bf28}+5.07\%}$
test_serialize_weights_pickle 0.7529s 0.4986s 2.0055 Ops/s 2.3946 Ops/s $\textbf{\color{#d91a1a}-16.25\%}$
test_serialize_weights_filesystem 0.1063s 88.4566ms 11.3050 Ops/s 9.9083 Ops/s $\textbf{\color{#35bf28}+14.10\%}$
test_serialize_model_filesystem 0.1701s 99.1236ms 10.0884 Ops/s 10.5809 Ops/s $\color{#d91a1a}-4.65\%$
test_reshape_pytree 48.8210μs 20.7033μs 48.3015 KOps/s 47.8839 KOps/s $\color{#35bf28}+0.87\%$
test_reshape_td 71.3620μs 31.0813μs 32.1737 KOps/s 31.9615 KOps/s $\color{#35bf28}+0.66\%$
test_view_pytree 59.3900μs 20.4481μs 48.9043 KOps/s 48.4945 KOps/s $\color{#35bf28}+0.85\%$
test_view_td 75.2255ms 10.9540μs 91.2905 KOps/s 129.3743 KOps/s $\textbf{\color{#d91a1a}-29.44\%}$
test_unbind_pytree 0.3764ms 23.8692μs 41.8950 KOps/s 40.7566 KOps/s $\color{#35bf28}+2.79\%$
test_unbind_td 79.7590μs 35.7719μs 27.9549 KOps/s 28.5348 KOps/s $\color{#d91a1a}-2.03\%$
test_split_pytree 58.5590μs 23.8083μs 42.0022 KOps/s 42.8948 KOps/s $\color{#d91a1a}-2.08\%$
test_split_td 0.1169ms 39.6592μs 25.2149 KOps/s 24.7909 KOps/s $\color{#35bf28}+1.71\%$
test_add_pytree 67.1450μs 29.3682μs 34.0505 KOps/s 34.6444 KOps/s $\color{#d91a1a}-1.71\%$
test_add_td 0.1164ms 54.8490μs 18.2319 KOps/s 18.3532 KOps/s $\color{#d91a1a}-0.66\%$
test_distributed 0.2687ms 94.8802μs 10.5396 KOps/s 10.2716 KOps/s $\color{#35bf28}+2.61\%$
test_tdmodule 0.3864ms 21.9948μs 45.4653 KOps/s 42.7838 KOps/s $\textbf{\color{#35bf28}+6.27\%}$
test_tdmodule_dispatch 0.2171ms 44.7114μs 22.3657 KOps/s 21.9316 KOps/s $\color{#35bf28}+1.98\%$
test_tdseq 0.3185ms 26.0577μs 38.3763 KOps/s 37.8638 KOps/s $\color{#35bf28}+1.35\%$
test_tdseq_dispatch 0.1383ms 48.0057μs 20.8309 KOps/s 20.1960 KOps/s $\color{#35bf28}+3.14\%$
test_instantiation_functorch 1.4390ms 1.3001ms 769.1705 Ops/s 764.1760 Ops/s $\color{#35bf28}+0.65\%$
test_instantiation_td 1.5144ms 0.9867ms 1.0135 KOps/s 992.1832 Ops/s $\color{#35bf28}+2.15\%$
test_exec_functorch 0.2878ms 0.1589ms 6.2933 KOps/s 6.3443 KOps/s $\color{#d91a1a}-0.80\%$
test_exec_functional_call 0.2818ms 0.1481ms 6.7528 KOps/s 6.9076 KOps/s $\color{#d91a1a}-2.24\%$
test_exec_td 0.2137ms 0.1458ms 6.8580 KOps/s 7.1108 KOps/s $\color{#d91a1a}-3.56\%$
test_exec_td_decorator 0.6025ms 0.1745ms 5.7309 KOps/s 5.8463 KOps/s $\color{#d91a1a}-1.97\%$
test_vmap_mlp_speed[True-True] 1.2695ms 0.8753ms 1.1424 KOps/s 1.1225 KOps/s $\color{#35bf28}+1.78\%$
test_vmap_mlp_speed[True-False] 0.6577ms 0.4673ms 2.1401 KOps/s 2.0570 KOps/s $\color{#35bf28}+4.04\%$
test_vmap_mlp_speed[False-True] 1.2201ms 0.7694ms 1.2997 KOps/s 1.2944 KOps/s $\color{#35bf28}+0.41\%$
test_vmap_mlp_speed[False-False] 0.7339ms 0.3819ms 2.6188 KOps/s 2.5841 KOps/s $\color{#35bf28}+1.34\%$
test_vmap_mlp_speed_decorator[True-True] 1.7425ms 1.5407ms 649.0407 Ops/s 707.8888 Ops/s $\textbf{\color{#d91a1a}-8.31\%}$
test_vmap_mlp_speed_decorator[True-False] 1.2384ms 0.5067ms 1.9737 KOps/s 1.9022 KOps/s $\color{#35bf28}+3.76\%$
test_vmap_mlp_speed_decorator[False-True] 1.6782ms 1.3013ms 768.4454 Ops/s 889.9931 Ops/s $\textbf{\color{#d91a1a}-13.66\%}$
test_vmap_mlp_speed_decorator[False-False] 0.7215ms 0.3912ms 2.5564 KOps/s 2.5297 KOps/s $\color{#35bf28}+1.05\%$
test_to_module_speed[True] 1.7084ms 1.0922ms 915.5660 Ops/s 907.7039 Ops/s $\color{#35bf28}+0.87\%$
test_to_module_speed[False] 0.1140s 1.2022ms 831.8179 Ops/s 916.3390 Ops/s $\textbf{\color{#d91a1a}-9.22\%}$
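
For readers checking the table by hand, the sketch below shows how the columns appear to relate. It is inferred from the posted numbers, not taken from the benchmark workflow itself: Ops looks like the reciprocal of Mean, Change looks like the relative difference between Ops and Ops on Repo HEAD, and the bold entries counted as Improved or Worsened all exceed 5% in magnitude. The function names and the 5% threshold are assumptions made for illustration.

```python
# Minimal sketch, assuming the relationships inferred from the table above.
# Function names and the 5% threshold are hypothetical, not the CI's code.

def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this PR's Ops versus Ops on Repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row the way the summary line seems to count it (assumed rule)."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) in KOps/s, copied from three CPU rows above.
rows = {
    "test_plain_set_nested": (56.3957, 55.6109),
    "test_values": (869.8660, 730.6243),
    "test_view_td": (91.2905, 129.3743),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Run on these three rows, the computed changes agree with the Change column to two decimals. Some other rows can differ in the last digit because the table shows Ops rounded to four decimals.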


github-actions bot commented Feb 7, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 134. Improved: $\large\color{#35bf28}9$. Worsened: $\large\color{#d91a1a}10$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 0.1285ms 12.9265μs 77.3605 KOps/s 76.1820 KOps/s $\color{#35bf28}+1.55\%$
test_plain_set_stack_nested 0.1763ms 0.1172ms 8.5350 KOps/s 8.3854 KOps/s $\color{#35bf28}+1.78\%$
test_plain_set_nested_inplace 37.9910μs 14.3311μs 69.7783 KOps/s 68.6101 KOps/s $\color{#35bf28}+1.70\%$
test_plain_set_stack_nested_inplace 0.1870ms 0.1459ms 6.8557 KOps/s 6.7737 KOps/s $\color{#35bf28}+1.21\%$
test_items 20.0400μs 4.7722μs 209.5488 KOps/s 208.7898 KOps/s $\color{#35bf28}+0.36\%$
test_items_nested 0.3847ms 0.3396ms 2.9445 KOps/s 2.9314 KOps/s $\color{#35bf28}+0.45\%$
test_items_nested_locked 0.3739ms 0.3445ms 2.9030 KOps/s 2.9113 KOps/s $\color{#d91a1a}-0.28\%$
test_items_nested_leaf 0.2412ms 0.2013ms 4.9672 KOps/s 4.9666 KOps/s $\color{#35bf28}+0.01\%$
test_items_stack_nested 1.4495ms 1.2911ms 774.5409 Ops/s 766.7687 Ops/s $\color{#35bf28}+1.01\%$
test_items_stack_nested_leaf 1.3040ms 1.1340ms 881.8368 Ops/s 876.1814 Ops/s $\color{#35bf28}+0.65\%$
test_items_stack_nested_locked 2.3016ms 0.8908ms 1.1226 KOps/s 1.1220 KOps/s $\color{#35bf28}+0.05\%$
test_keys 49.0610μs 4.5690μs 218.8680 KOps/s 213.3963 KOps/s $\color{#35bf28}+2.56\%$
test_keys_nested 0.4995ms 94.4743μs 10.5849 KOps/s 10.3890 KOps/s $\color{#35bf28}+1.89\%$
test_keys_nested_locked 0.2745ms 97.5531μs 10.2508 KOps/s 10.0513 KOps/s $\color{#35bf28}+1.98\%$
test_keys_nested_leaf 0.2527ms 77.5425μs 12.8962 KOps/s 12.6884 KOps/s $\color{#35bf28}+1.64\%$
test_keys_stack_nested 1.2650ms 1.1238ms 889.8505 Ops/s 866.0466 Ops/s $\color{#35bf28}+2.75\%$
test_keys_stack_nested_leaf 1.2901ms 1.1230ms 890.4977 Ops/s 873.9018 Ops/s $\color{#35bf28}+1.90\%$
test_keys_stack_nested_locked 0.7355ms 0.6996ms 1.4294 KOps/s 1.4093 KOps/s $\color{#35bf28}+1.43\%$
test_values 7.2833μs 1.8797μs 531.9956 KOps/s 532.4967 KOps/s $\color{#d91a1a}-0.09\%$
test_values_nested 0.1306ms 45.0359μs 22.2045 KOps/s 22.1547 KOps/s $\color{#35bf28}+0.22\%$
test_values_nested_locked 0.1033ms 47.4316μs 21.0830 KOps/s 20.9394 KOps/s $\color{#35bf28}+0.69\%$
test_values_nested_leaf 54.2400μs 39.4801μs 25.3292 KOps/s 25.2384 KOps/s $\color{#35bf28}+0.36\%$
test_values_stack_nested 1.0444ms 0.9479ms 1.0549 KOps/s 1.0496 KOps/s $\color{#35bf28}+0.50\%$
test_values_stack_nested_leaf 0.9931ms 0.9398ms 1.0640 KOps/s 1.0565 KOps/s $\color{#35bf28}+0.72\%$
test_values_stack_nested_locked 0.6304ms 0.5567ms 1.7963 KOps/s 1.7786 KOps/s $\color{#35bf28}+1.00\%$
test_membership 3.9980μs 0.9571μs 1.0448 MOps/s 1.0519 MOps/s $\color{#d91a1a}-0.68\%$
test_membership_nested 19.4500μs 2.9494μs 339.0546 KOps/s 341.7259 KOps/s $\color{#d91a1a}-0.78\%$
test_membership_nested_leaf 20.2700μs 2.9240μs 341.9937 KOps/s 342.0200 KOps/s $-0.01\%$
test_membership_stacked_nested 0.1695ms 11.3510μs 88.0976 KOps/s 87.1209 KOps/s $\color{#35bf28}+1.12\%$
test_membership_stacked_nested_leaf 35.5610μs 11.3377μs 88.2016 KOps/s 86.7185 KOps/s $\color{#35bf28}+1.71\%$
test_membership_nested_last 0.1825ms 5.3491μs 186.9489 KOps/s 188.6623 KOps/s $\color{#d91a1a}-0.91\%$
test_membership_nested_leaf_last 0.2002ms 5.3150μs 188.1477 KOps/s 187.2719 KOps/s $\color{#35bf28}+0.47\%$
test_membership_stacked_nested_last 0.3518ms 0.1618ms 6.1820 KOps/s 6.2942 KOps/s $\color{#d91a1a}-1.78\%$
test_membership_stacked_nested_leaf_last 0.1775ms 13.2623μs 75.4020 KOps/s 74.7679 KOps/s $\color{#35bf28}+0.85\%$
test_nested_getleaf 0.1888ms 8.4636μs 118.1535 KOps/s 118.2785 KOps/s $\color{#d91a1a}-0.11\%$
test_nested_get 26.6700μs 8.0161μs 124.7497 KOps/s 125.2481 KOps/s $\color{#d91a1a}-0.40\%$
test_stacked_getleaf 0.4632ms 0.3257ms 3.0703 KOps/s 3.0336 KOps/s $\color{#35bf28}+1.21\%$
test_stacked_get 0.3350ms 0.2934ms 3.4082 KOps/s 3.3487 KOps/s $\color{#35bf28}+1.78\%$
test_nested_getitemleaf 32.3110μs 9.8829μs 101.1846 KOps/s 101.7031 KOps/s $\color{#d91a1a}-0.51\%$
test_nested_getitem 32.2500μs 9.3928μs 106.4644 KOps/s 106.4769 KOps/s $\color{#d91a1a}-0.01\%$
test_stacked_getitemleaf 0.4711ms 0.3324ms 3.0083 KOps/s 2.9921 KOps/s $\color{#35bf28}+0.54\%$
test_stacked_getitem 0.3224ms 0.2985ms 3.3502 KOps/s 3.3553 KOps/s $\color{#d91a1a}-0.15\%$
test_lock_nested 0.7836ms 0.3506ms 2.8522 KOps/s 2.8171 KOps/s $\color{#35bf28}+1.24\%$
test_lock_stack_nested 94.7063ms 6.4050ms 156.1276 Ops/s 154.4006 Ops/s $\color{#35bf28}+1.12\%$
test_unlock_nested 84.6977ms 0.4320ms 2.3151 KOps/s 2.8550 KOps/s $\textbf{\color{#d91a1a}-18.91\%}$
test_unlock_stack_nested 92.7157ms 6.5345ms 153.0340 Ops/s 150.9249 Ops/s $\color{#35bf28}+1.40\%$
test_flatten_speed 0.6481ms 0.2625ms 3.8091 KOps/s 3.8503 KOps/s $\color{#d91a1a}-1.07\%$
test_unflatten_speed 0.4274ms 0.3592ms 2.7836 KOps/s 2.7794 KOps/s $\color{#35bf28}+0.15\%$
test_common_ops 1.0123ms 0.5701ms 1.7540 KOps/s 1.7057 KOps/s $\color{#35bf28}+2.83\%$
test_creation 33.2800μs 1.5614μs 640.4336 KOps/s 643.9554 KOps/s $\color{#d91a1a}-0.55\%$
test_creation_empty 21.3600μs 7.0397μs 142.0520 KOps/s 134.6508 KOps/s $\textbf{\color{#35bf28}+5.50\%}$
test_creation_nested_1 27.8100μs 8.8303μs 113.2468 KOps/s 107.3318 KOps/s $\textbf{\color{#35bf28}+5.51\%}$
test_creation_nested_2 25.2710μs 11.3152μs 88.3770 KOps/s 86.4214 KOps/s $\color{#35bf28}+2.26\%$
test_clone 0.1464ms 13.5852μs 73.6095 KOps/s 73.1872 KOps/s $\color{#35bf28}+0.58\%$
test_getitem[int] 24.5200μs 10.6782μs 93.6486 KOps/s 95.6394 KOps/s $\color{#d91a1a}-2.08\%$
test_getitem[slice_int] 0.1484ms 20.5768μs 48.5983 KOps/s 48.6610 KOps/s $\color{#d91a1a}-0.13\%$
test_getitem[range] 0.1965ms 38.0696μs 26.2677 KOps/s 25.4388 KOps/s $\color{#35bf28}+3.26\%$
test_getitem[tuple] 46.8310μs 18.1842μs 54.9927 KOps/s 55.2060 KOps/s $\color{#d91a1a}-0.39\%$
test_getitem[list] 0.1725ms 33.8483μs 29.5436 KOps/s 28.0884 KOps/s $\textbf{\color{#35bf28}+5.18\%}$
test_setitem_dim[int] 42.4210μs 26.1163μs 38.2902 KOps/s 36.6743 KOps/s $\color{#35bf28}+4.41\%$
test_setitem_dim[slice_int] 64.0100μs 47.0430μs 21.2571 KOps/s 21.4327 KOps/s $\color{#d91a1a}-0.82\%$
test_setitem_dim[range] 83.5600μs 64.4667μs 15.5119 KOps/s 15.1334 KOps/s $\color{#35bf28}+2.50\%$
test_setitem_dim[tuple] 91.6600μs 39.9183μs 25.0512 KOps/s 23.7877 KOps/s $\textbf{\color{#35bf28}+5.31\%}$
test_setitem 0.1583ms 17.5433μs 57.0017 KOps/s 55.4703 KOps/s $\color{#35bf28}+2.76\%$
test_set 94.7210μs 16.9100μs 59.1368 KOps/s 57.5615 KOps/s $\color{#35bf28}+2.74\%$
test_set_shared 2.8427ms 0.1032ms 9.6917 KOps/s 9.7358 KOps/s $\color{#d91a1a}-0.45\%$
test_update 77.4400μs 18.8745μs 52.9815 KOps/s 51.2353 KOps/s $\color{#35bf28}+3.41\%$
test_update_nested 92.0310μs 25.2473μs 39.6082 KOps/s 38.2236 KOps/s $\color{#35bf28}+3.62\%$
test_set_nested 0.1108ms 18.3263μs 54.5663 KOps/s 52.9681 KOps/s $\color{#35bf28}+3.02\%$
test_set_nested_new 95.3000μs 20.6954μs 48.3199 KOps/s 46.7805 KOps/s $\color{#35bf28}+3.29\%$
test_select 0.2232ms 33.5911μs 29.7698 KOps/s 29.8081 KOps/s $\color{#d91a1a}-0.13\%$
test_select_nested 0.2436ms 52.9256μs 18.8945 KOps/s 18.5040 KOps/s $\color{#35bf28}+2.11\%$
test_exclude_nested 0.2800ms 0.1113ms 8.9842 KOps/s 8.7427 KOps/s $\color{#35bf28}+2.76\%$
test_empty[True] 0.5755ms 0.3849ms 2.5977 KOps/s 2.5919 KOps/s $\color{#35bf28}+0.23\%$
test_empty[False] 2.8341μs 0.8512μs 1.1748 MOps/s 1.1614 MOps/s $\color{#35bf28}+1.15\%$
test_to 75.1810μs 54.1906μs 18.4534 KOps/s 17.8010 KOps/s $\color{#35bf28}+3.66\%$
test_to_nonblocking 0.2224ms 33.9069μs 29.4925 KOps/s 28.4729 KOps/s $\color{#35bf28}+3.58\%$
test_unbind_speed 0.3034ms 0.2643ms 3.7843 KOps/s 3.7208 KOps/s $\color{#35bf28}+1.71\%$
test_unbind_speed_stack0 93.3701ms 3.7815ms 264.4445 Ops/s 285.1371 Ops/s $\textbf{\color{#d91a1a}-7.26\%}$
test_unbind_speed_stack1 0.1402ms 1.7969μs 556.5239 KOps/s 544.7176 KOps/s $\color{#35bf28}+2.17\%$
test_split 86.9027ms 1.7062ms 586.0998 Ops/s 659.5895 Ops/s $\textbf{\color{#d91a1a}-11.14\%}$
test_chunk 1.5521ms 1.4966ms 668.1611 Ops/s 609.2623 Ops/s $\textbf{\color{#35bf28}+9.67\%}$
test_creation[device0] 0.1934ms 71.3524μs 14.0149 KOps/s 13.9257 KOps/s $\color{#35bf28}+0.64\%$
test_creation_from_tensor 0.2183ms 56.1856μs 17.7982 KOps/s 18.6992 KOps/s $\color{#d91a1a}-4.82\%$
test_add_one[memmap_tensor0] 0.2146ms 6.8377μs 146.2472 KOps/s 140.2480 KOps/s $\color{#35bf28}+4.28\%$
test_contiguous[memmap_tensor0] 13.8510μs 0.5946μs 1.6819 MOps/s 1.6337 MOps/s $\color{#35bf28}+2.95\%$
test_stack[memmap_tensor0] 53.5300μs 4.2436μs 235.6505 KOps/s 225.5657 KOps/s $\color{#35bf28}+4.47\%$
test_memmaptd_index 1.0030ms 0.2589ms 3.8631 KOps/s 3.9275 KOps/s $\color{#d91a1a}-1.64\%$
test_memmaptd_index_astensor 0.6525ms 0.3150ms 3.1749 KOps/s 3.2138 KOps/s $\color{#d91a1a}-1.21\%$
test_memmaptd_index_op 86.2550ms 0.6433ms 1.5545 KOps/s 1.6720 KOps/s $\textbf{\color{#d91a1a}-7.03\%}$
test_serialize_model 92.0854ms 88.3513ms 11.3184 Ops/s 9.6136 Ops/s $\textbf{\color{#35bf28}+17.73\%}$
test_serialize_model_pickle 1.3494s 1.2362s 0.8089 Ops/s 0.8072 Ops/s $\color{#35bf28}+0.21\%$
test_serialize_weights 90.0739ms 86.5475ms 11.5543 Ops/s 9.6757 Ops/s $\textbf{\color{#35bf28}+19.42\%}$
test_serialize_weights_returnearly 0.2753s 74.4956ms 13.4236 Ops/s 12.4198 Ops/s $\textbf{\color{#35bf28}+8.08\%}$
test_serialize_weights_pickle 1.3531s 1.2364s 0.8088 Ops/s 0.8019 Ops/s $\color{#35bf28}+0.86\%$
test_reshape_pytree 53.4310μs 23.9120μs 41.8199 KOps/s 41.0427 KOps/s $\color{#35bf28}+1.89\%$
test_reshape_td 79.6100μs 30.1323μs 33.1870 KOps/s 33.1706 KOps/s $\color{#35bf28}+0.05\%$
test_view_pytree 0.2271ms 25.9224μs 38.5767 KOps/s 41.8565 KOps/s $\textbf{\color{#d91a1a}-7.84\%}$
test_view_td 95.2121ms 10.7904μs 92.6746 KOps/s 147.7146 KOps/s $\textbf{\color{#d91a1a}-37.26\%}$
test_unbind_pytree 84.1910μs 29.8830μs 33.4639 KOps/s 31.2771 KOps/s $\textbf{\color{#35bf28}+6.99\%}$
test_unbind_td 0.1078ms 39.7853μs 25.1349 KOps/s 24.6898 KOps/s $\color{#35bf28}+1.80\%$
test_split_pytree 53.4600μs 27.5214μs 36.3353 KOps/s 35.7829 KOps/s $\color{#35bf28}+1.54\%$
test_split_td 0.1719ms 38.5642μs 25.9308 KOps/s 26.1441 KOps/s $\color{#d91a1a}-0.82\%$
test_add_pytree 0.1664ms 35.5354μs 28.1409 KOps/s 27.5033 KOps/s $\color{#35bf28}+2.32\%$
test_add_td 0.1157ms 50.9301μs 19.6348 KOps/s 19.1332 KOps/s $\color{#35bf28}+2.62\%$
test_distributed 0.2541ms 69.9812μs 14.2895 KOps/s 14.1941 KOps/s $\color{#35bf28}+0.67\%$
test_tdmodule 42.4090μs 17.1237μs 58.3987 KOps/s 58.5680 KOps/s $\color{#d91a1a}-0.29\%$
test_tdmodule_dispatch 0.1407ms 34.4519μs 29.0260 KOps/s 28.4283 KOps/s $\color{#35bf28}+2.10\%$
test_tdseq 37.1710μs 19.8587μs 50.3556 KOps/s 50.2275 KOps/s $\color{#35bf28}+0.26\%$
test_tdseq_dispatch 0.1486ms 36.9627μs 27.0543 KOps/s 26.5043 KOps/s $\color{#35bf28}+2.08\%$
test_instantiation_functorch 1.7526ms 1.6466ms 607.3007 Ops/s 610.8926 Ops/s $\color{#d91a1a}-0.59\%$
test_instantiation_td 1.7025ms 1.1486ms 870.6247 Ops/s 869.3882 Ops/s $\color{#35bf28}+0.14\%$
test_exec_functorch 0.1946ms 0.1591ms 6.2865 KOps/s 6.2280 KOps/s $\color{#35bf28}+0.94\%$
test_exec_functional_call 0.2238ms 0.1558ms 6.4167 KOps/s 6.2446 KOps/s $\color{#35bf28}+2.76\%$
test_exec_td 0.2472ms 0.1460ms 6.8474 KOps/s 6.6808 KOps/s $\color{#35bf28}+2.49\%$
test_exec_td_decorator 0.8755ms 0.1738ms 5.7530 KOps/s 5.5741 KOps/s $\color{#35bf28}+3.21\%$
test_vmap_mlp_speed[True-True] 1.1905ms 1.0193ms 981.0773 Ops/s 964.7792 Ops/s $\color{#35bf28}+1.69\%$
test_vmap_mlp_speed[True-False] 0.7758ms 0.5880ms 1.7006 KOps/s 1.6931 KOps/s $\color{#35bf28}+0.44\%$
test_vmap_mlp_speed[False-True] 1.1103ms 0.9472ms 1.0557 KOps/s 1.0616 KOps/s $\color{#d91a1a}-0.56\%$
test_vmap_mlp_speed[False-False] 0.7175ms 0.5286ms 1.8919 KOps/s 1.9063 KOps/s $\color{#d91a1a}-0.76\%$
test_vmap_mlp_speed_decorator[True-True] 1.9957ms 1.8231ms 548.5167 Ops/s 650.3598 Ops/s $\textbf{\color{#d91a1a}-15.66\%}$
test_vmap_mlp_speed_decorator[True-False] 1.0080ms 0.6386ms 1.5660 KOps/s 1.5994 KOps/s $\color{#d91a1a}-2.09\%$
test_vmap_mlp_speed_decorator[False-True] 2.4806ms 1.6025ms 624.0069 Ops/s 757.4833 Ops/s $\textbf{\color{#d91a1a}-17.62\%}$
test_vmap_mlp_speed_decorator[False-False] 0.7849ms 0.5569ms 1.7957 KOps/s 1.8644 KOps/s $\color{#d91a1a}-3.68\%$
test_vmap_transformer_speed[True-True] 13.6186ms 12.4227ms 80.4979 Ops/s 81.1870 Ops/s $\color{#d91a1a}-0.85\%$
test_vmap_transformer_speed[True-False] 8.8220ms 8.1497ms 122.7035 Ops/s 123.7049 Ops/s $\color{#d91a1a}-0.81\%$
test_vmap_transformer_speed[False-True] 12.6713ms 12.1789ms 82.1093 Ops/s 82.3627 Ops/s $\color{#d91a1a}-0.31\%$
test_vmap_transformer_speed[False-False] 8.5054ms 8.0727ms 123.8740 Ops/s 124.7876 Ops/s $\color{#d91a1a}-0.73\%$
test_vmap_transformer_speed_decorator[True-True] 59.8010ms 58.6652ms 17.0459 Ops/s 18.6927 Ops/s $\textbf{\color{#d91a1a}-8.81\%}$
test_vmap_transformer_speed_decorator[True-False] 19.9667ms 19.4259ms 51.4776 Ops/s 52.0279 Ops/s $\color{#d91a1a}-1.06\%$
test_vmap_transformer_speed_decorator[False-True] 55.0324ms 53.0563ms 18.8479 Ops/s 21.1854 Ops/s $\textbf{\color{#d91a1a}-11.03\%}$
test_vmap_transformer_speed_decorator[False-False] 20.5744ms 19.4247ms 51.4807 Ops/s 53.0917 Ops/s $\color{#d91a1a}-3.03\%$
test_to_module_speed[True] 1.1894ms 1.0106ms 989.4743 Ops/s 975.4138 Ops/s $\color{#35bf28}+1.44\%$
test_to_module_speed[False] 1.0774ms 0.9786ms 1.0219 KOps/s 1.0104 KOps/s $\color{#35bf28}+1.14\%$

@vmoens merged commit 5987707 into main Feb 7, 2024
48 checks passed
@vmoens deleted the fix-to-module branch October 21, 2024 14:02