[Feature] Expose NestedKey in root #642

Merged: 1 commit, Jan 29, 2024

Conversation

@vmoens (Contributor) commented Jan 29, 2024

No description provided.

@facebook-github-bot added the CLA Signed label Jan 29, 2024
@vmoens added the enhancement label Jan 29, 2024
@vmoens merged commit 96d8dd3 into main Jan 29, 2024
25 of 32 checks passed
@vmoens deleted the expose-nestedkey branch January 29, 2024 13:59

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 124. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 41.3470μs 17.5290μs 57.0484 KOps/s 59.4870 KOps/s $\color{#d91a1a}-4.10\%$
test_plain_set_stack_nested 0.2736ms 0.1527ms 6.5505 KOps/s 6.8962 KOps/s $\textbf{\color{#d91a1a}-5.01\%}$
test_plain_set_nested_inplace 50.0330μs 19.8618μs 50.3478 KOps/s 51.9133 KOps/s $\color{#d91a1a}-3.02\%$
test_plain_set_stack_nested_inplace 0.3211ms 0.1848ms 5.4100 KOps/s 5.5728 KOps/s $\color{#d91a1a}-2.92\%$
test_items 17.6230μs 2.4676μs 405.2537 KOps/s 405.5306 KOps/s $\color{#d91a1a}-0.07\%$
test_items_nested 0.8148ms 0.2706ms 3.6960 KOps/s 3.7146 KOps/s $\color{#d91a1a}-0.50\%$
test_items_nested_locked 0.4802ms 0.2703ms 3.6999 KOps/s 3.7135 KOps/s $\color{#d91a1a}-0.37\%$
test_items_nested_leaf 0.6653ms 0.1693ms 5.9074 KOps/s 5.9895 KOps/s $\color{#d91a1a}-1.37\%$
test_items_stack_nested 2.1388ms 1.3125ms 761.9205 Ops/s 766.0032 Ops/s $\color{#d91a1a}-0.53\%$
test_items_stack_nested_leaf 1.6660ms 1.1842ms 844.4614 Ops/s 858.2881 Ops/s $\color{#d91a1a}-1.61\%$
test_items_stack_nested_locked 1.3512ms 0.8686ms 1.1513 KOps/s 1.1646 KOps/s $\color{#d91a1a}-1.14\%$
test_keys 21.5400μs 3.8666μs 258.6225 KOps/s 257.9062 KOps/s $\color{#35bf28}+0.28\%$
test_keys_nested 49.1088ms 0.1587ms 6.3017 KOps/s 6.8265 KOps/s $\textbf{\color{#d91a1a}-7.69\%}$
test_keys_nested_locked 0.2579ms 0.1511ms 6.6194 KOps/s 6.7525 KOps/s $\color{#d91a1a}-1.97\%$
test_keys_nested_leaf 0.2377ms 0.1295ms 7.7218 KOps/s 6.9737 KOps/s $\textbf{\color{#35bf28}+10.73\%}$
test_keys_stack_nested 1.6822ms 1.2613ms 792.8514 Ops/s 808.4792 Ops/s $\color{#d91a1a}-1.93\%$
test_keys_stack_nested_leaf 1.4503ms 1.2603ms 793.4807 Ops/s 808.7037 Ops/s $\color{#d91a1a}-1.88\%$
test_keys_stack_nested_locked 1.9178ms 0.8408ms 1.1893 KOps/s 1.2622 KOps/s $\textbf{\color{#d91a1a}-5.77\%}$
test_values 7.0737μs 1.1774μs 849.3088 KOps/s 775.6682 KOps/s $\textbf{\color{#35bf28}+9.49\%}$
test_values_nested 93.0240μs 51.2025μs 19.5303 KOps/s 19.4048 KOps/s $\color{#35bf28}+0.65\%$
test_values_nested_locked 0.1020ms 50.9321μs 19.6340 KOps/s 19.4723 KOps/s $\color{#35bf28}+0.83\%$
test_values_nested_leaf 0.1083ms 45.9990μs 21.7396 KOps/s 21.7826 KOps/s $\color{#d91a1a}-0.20\%$
test_values_stack_nested 1.2396ms 1.0130ms 987.1998 Ops/s 992.4467 Ops/s $\color{#d91a1a}-0.53\%$
test_values_stack_nested_leaf 1.1524ms 1.0133ms 986.8418 Ops/s 998.6849 Ops/s $\color{#d91a1a}-1.19\%$
test_values_stack_nested_locked 0.9835ms 0.5941ms 1.6834 KOps/s 1.6924 KOps/s $\color{#d91a1a}-0.53\%$
test_membership 14.8170μs 1.3654μs 732.4107 KOps/s 767.2751 KOps/s $\color{#d91a1a}-4.54\%$
test_membership_nested 27.9010μs 3.4406μs 290.6464 KOps/s 291.3343 KOps/s $\color{#d91a1a}-0.24\%$
test_membership_nested_leaf 38.7900μs 3.4733μs 287.9138 KOps/s 289.2007 KOps/s $\color{#d91a1a}-0.45\%$
test_membership_stacked_nested 32.9100μs 11.7061μs 85.4253 KOps/s 86.1426 KOps/s $\color{#d91a1a}-0.83\%$
test_membership_stacked_nested_leaf 38.0410μs 11.5830μs 86.3337 KOps/s 86.6522 KOps/s $\color{#d91a1a}-0.37\%$
test_membership_nested_last 27.3400μs 6.6687μs 149.9538 KOps/s 152.0661 KOps/s $\color{#d91a1a}-1.39\%$
test_membership_nested_leaf_last 35.9570μs 6.7627μs 147.8709 KOps/s 137.4223 KOps/s $\textbf{\color{#35bf28}+7.60\%}$
test_membership_stacked_nested_last 0.3155ms 0.1814ms 5.5115 KOps/s 5.6642 KOps/s $\color{#d91a1a}-2.70\%$
test_membership_stacked_nested_leaf_last 46.7770μs 13.6202μs 73.4202 KOps/s 72.5434 KOps/s $\color{#35bf28}+1.21\%$
test_nested_getleaf 30.5770μs 11.0192μs 90.7506 KOps/s 93.6926 KOps/s $\color{#d91a1a}-3.14\%$
test_nested_get 33.9930μs 10.4851μs 95.3730 KOps/s 98.0759 KOps/s $\color{#d91a1a}-2.76\%$
test_stacked_getleaf 0.6162ms 0.3936ms 2.5406 KOps/s 2.5324 KOps/s $\color{#35bf28}+0.32\%$
test_stacked_get 0.8293ms 0.3605ms 2.7742 KOps/s 2.7188 KOps/s $\color{#35bf28}+2.04\%$
test_nested_getitemleaf 52.1090μs 12.1509μs 82.2985 KOps/s 81.4714 KOps/s $\color{#35bf28}+1.02\%$
test_nested_getitem 33.5430μs 11.8452μs 84.4226 KOps/s 86.1637 KOps/s $\color{#d91a1a}-2.02\%$
test_stacked_getitemleaf 0.6007ms 0.3941ms 2.5375 KOps/s 2.5282 KOps/s $\color{#35bf28}+0.37\%$
test_stacked_getitem 0.5618ms 0.3633ms 2.7526 KOps/s 2.7096 KOps/s $\color{#35bf28}+1.59\%$
test_lock_nested 0.7500ms 0.3388ms 2.9519 KOps/s 2.9596 KOps/s $\color{#d91a1a}-0.26\%$
test_lock_stack_nested 71.2270ms 5.4120ms 184.7742 Ops/s 187.7095 Ops/s $\color{#d91a1a}-1.56\%$
test_unlock_nested 0.6583ms 0.3430ms 2.9154 KOps/s 2.4981 KOps/s $\textbf{\color{#35bf28}+16.71\%}$
test_unlock_stack_nested 71.2372ms 5.5928ms 178.8014 Ops/s 175.4188 Ops/s $\color{#35bf28}+1.93\%$
test_flatten_speed 0.5359ms 0.3736ms 2.6766 KOps/s 2.6801 KOps/s $\color{#d91a1a}-0.13\%$
test_unflatten_speed 0.5440ms 0.4673ms 2.1399 KOps/s 2.1659 KOps/s $\color{#d91a1a}-1.20\%$
test_common_ops 5.3561ms 0.6918ms 1.4454 KOps/s 1.4689 KOps/s $\color{#d91a1a}-1.60\%$
test_creation 58.6090μs 1.8374μs 544.2340 KOps/s 551.5933 KOps/s $\color{#d91a1a}-1.33\%$
test_creation_empty 35.3960μs 9.9783μs 100.2170 KOps/s 99.0329 KOps/s $\color{#35bf28}+1.20\%$
test_creation_nested_1 31.0980μs 12.5481μs 79.6936 KOps/s 79.6740 KOps/s $\color{#35bf28}+0.02\%$
test_creation_nested_2 39.5940μs 15.7560μs 63.4679 KOps/s 63.1338 KOps/s $\color{#35bf28}+0.53\%$
test_clone 55.6640μs 12.9671μs 77.1184 KOps/s 75.9306 KOps/s $\color{#35bf28}+1.56\%$
test_getitem[int] 41.3470μs 11.0658μs 90.3687 KOps/s 87.2964 KOps/s $\color{#35bf28}+3.52\%$
test_getitem[slice_int] 67.0250μs 21.9524μs 45.5531 KOps/s 42.8057 KOps/s $\textbf{\color{#35bf28}+6.42\%}$
test_getitem[range] 0.1536ms 38.2401μs 26.1505 KOps/s 24.3939 KOps/s $\textbf{\color{#35bf28}+7.20\%}$
test_getitem[tuple] 47.0880μs 17.9227μs 55.7950 KOps/s 53.7026 KOps/s $\color{#35bf28}+3.90\%$
test_getitem[list] 0.1415ms 34.3852μs 29.0822 KOps/s 28.1099 KOps/s $\color{#35bf28}+3.46\%$
test_setitem_dim[int] 51.3550μs 30.7946μs 32.4732 KOps/s 33.5903 KOps/s $\color{#d91a1a}-3.33\%$
test_setitem_dim[slice_int] 0.1044ms 57.1683μs 17.4922 KOps/s 17.8019 KOps/s $\color{#d91a1a}-1.74\%$
test_setitem_dim[range] 0.1041ms 73.9245μs 13.5273 KOps/s 13.9673 KOps/s $\color{#d91a1a}-3.15\%$
test_setitem_dim[tuple] 82.7640μs 44.9158μs 22.2639 KOps/s 22.5113 KOps/s $\color{#d91a1a}-1.10\%$
test_setitem 69.8400μs 19.2808μs 51.8650 KOps/s 52.9040 KOps/s $\color{#d91a1a}-1.96\%$
test_set 60.7130μs 18.5649μs 53.8650 KOps/s 54.3410 KOps/s $\color{#d91a1a}-0.88\%$
test_set_shared 3.1433ms 0.1408ms 7.1038 KOps/s 6.9931 KOps/s $\color{#35bf28}+1.58\%$
test_update 0.1417ms 21.5803μs 46.3386 KOps/s 46.7180 KOps/s $\color{#d91a1a}-0.81\%$
test_update_nested 97.7710μs 28.9328μs 34.5629 KOps/s 34.9022 KOps/s $\color{#d91a1a}-0.97\%$
test_set_nested 68.6070μs 20.5054μs 48.7677 KOps/s 49.0383 KOps/s $\color{#d91a1a}-0.55\%$
test_set_nested_new 90.2460μs 24.2471μs 41.2420 KOps/s 41.2913 KOps/s $\color{#d91a1a}-0.12\%$
test_select 81.9230μs 36.9332μs 27.0759 KOps/s 26.6008 KOps/s $\color{#35bf28}+1.79\%$
test_select_nested 0.1013ms 58.0350μs 17.2310 KOps/s 16.6762 KOps/s $\color{#35bf28}+3.33\%$
test_exclude_nested 1.0204ms 0.1171ms 8.5386 KOps/s 8.3200 KOps/s $\color{#35bf28}+2.63\%$
test_empty[True] 0.5461ms 0.4104ms 2.4368 KOps/s 2.3686 KOps/s $\color{#35bf28}+2.88\%$
test_empty[False] 6.0192μs 1.0654μs 938.6071 KOps/s 941.0240 KOps/s $\color{#d91a1a}-0.26\%$
test_unbind_speed 0.3524ms 0.2518ms 3.9706 KOps/s 4.0827 KOps/s $\color{#d91a1a}-2.75\%$
test_unbind_speed_stack0 75.0308ms 3.5791ms 279.3971 Ops/s 303.4884 Ops/s $\textbf{\color{#d91a1a}-7.94\%}$
test_unbind_speed_stack1 19.9980μs 1.9439μs 514.4402 KOps/s 509.1543 KOps/s $\color{#35bf28}+1.04\%$
test_split 2.2520ms 1.4687ms 680.8660 Ops/s 598.8257 Ops/s $\textbf{\color{#35bf28}+13.70\%}$
test_chunk 67.5078ms 1.5715ms 636.3294 Ops/s 618.7892 Ops/s $\color{#35bf28}+2.83\%$
test_creation[device0] 0.1934ms 0.1010ms 9.9047 KOps/s 10.0483 KOps/s $\color{#d91a1a}-1.43\%$
test_creation_from_tensor 4.4762ms 81.8848μs 12.2123 KOps/s 12.3943 KOps/s $\color{#d91a1a}-1.47\%$
test_add_one[memmap_tensor0] 0.2204ms 5.2584μs 190.1727 KOps/s 190.1521 KOps/s $\color{#35bf28}+0.01\%$
test_contiguous[memmap_tensor0] 10.6500μs 0.6352μs 1.5743 MOps/s 1.5375 MOps/s $\color{#35bf28}+2.39\%$
test_stack[memmap_tensor0] 53.2990μs 3.4842μs 287.0102 KOps/s 284.4389 KOps/s $\color{#35bf28}+0.90\%$
test_memmaptd_index 0.9519ms 0.2150ms 4.6505 KOps/s 4.5707 KOps/s $\color{#35bf28}+1.74\%$
test_memmaptd_index_astensor 0.6089ms 0.2777ms 3.6011 KOps/s 3.6032 KOps/s $\color{#d91a1a}-0.06\%$
test_memmaptd_index_op 1.0635ms 0.5539ms 1.8055 KOps/s 1.8030 KOps/s $\color{#35bf28}+0.14\%$
test_serialize_model 0.1752s 0.1087s 9.2018 Ops/s 9.1099 Ops/s $\color{#35bf28}+1.01\%$
test_serialize_model_pickle 0.4522s 0.3641s 2.7463 Ops/s 2.7425 Ops/s $\color{#35bf28}+0.14\%$
test_serialize_weights 0.1688s 0.1069s 9.3571 Ops/s 10.0364 Ops/s $\textbf{\color{#d91a1a}-6.77\%}$
test_serialize_weights_returnearly 0.1940s 0.1342s 7.4531 Ops/s 8.3144 Ops/s $\textbf{\color{#d91a1a}-10.36\%}$
test_serialize_weights_pickle 1.3023s 0.6673s 1.4985 Ops/s 2.2961 Ops/s $\textbf{\color{#d91a1a}-34.74\%}$
test_serialize_weights_filesystem 95.5051ms 90.5311ms 11.0459 Ops/s 10.0558 Ops/s $\textbf{\color{#35bf28}+9.85\%}$
test_serialize_model_filesystem 0.1610s 98.2632ms 10.1768 Ops/s 10.5855 Ops/s $\color{#d91a1a}-3.86\%$
test_reshape_pytree 50.8340μs 22.7927μs 43.8737 KOps/s 42.9811 KOps/s $\color{#35bf28}+2.08\%$
test_reshape_td 79.0770μs 30.7022μs 32.5710 KOps/s 33.0482 KOps/s $\color{#d91a1a}-1.44\%$
test_view_pytree 59.3410μs 22.8425μs 43.7781 KOps/s 43.5698 KOps/s $\color{#35bf28}+0.48\%$
test_view_td 77.9718ms 10.8302μs 92.3346 KOps/s 92.7444 KOps/s $\color{#d91a1a}-0.44\%$
test_unbind_pytree 0.1004ms 26.8332μs 37.2673 KOps/s 38.0707 KOps/s $\color{#d91a1a}-2.11\%$
test_unbind_td 88.5050μs 36.8488μs 27.1380 KOps/s 28.0585 KOps/s $\color{#d91a1a}-3.28\%$
test_split_pytree 73.5660μs 26.7041μs 37.4474 KOps/s 38.6406 KOps/s $\color{#d91a1a}-3.09\%$
test_split_td 0.1234ms 40.1685μs 24.8951 KOps/s 24.0774 KOps/s $\color{#35bf28}+3.40\%$
test_add_pytree 75.1900μs 31.7082μs 31.5376 KOps/s 32.0099 KOps/s $\color{#d91a1a}-1.48\%$
test_add_td 0.1758ms 52.7915μs 18.9424 KOps/s 20.5611 KOps/s $\textbf{\color{#d91a1a}-7.87\%}$
test_distributed 0.1801ms 96.4672μs 10.3662 KOps/s 9.9649 KOps/s $\color{#35bf28}+4.03\%$
test_tdmodule 0.5908ms 22.7152μs 44.0234 KOps/s 44.4897 KOps/s $\color{#d91a1a}-1.05\%$
test_tdmodule_dispatch 0.1877ms 43.2237μs 23.1355 KOps/s 23.3557 KOps/s $\color{#d91a1a}-0.94\%$
test_tdseq 0.3033ms 25.7216μs 38.8779 KOps/s 39.2048 KOps/s $\color{#d91a1a}-0.83\%$
test_tdseq_dispatch 0.1400ms 47.3775μs 21.1071 KOps/s 20.9627 KOps/s $\color{#35bf28}+0.69\%$
test_instantiation_functorch 1.5515ms 1.3105ms 763.0545 Ops/s 773.5853 Ops/s $\color{#d91a1a}-1.36\%$
test_instantiation_td 1.7461ms 1.0054ms 994.5817 Ops/s 997.5180 Ops/s $\color{#d91a1a}-0.29\%$
test_exec_functorch 0.2756ms 0.1564ms 6.3922 KOps/s 6.4672 KOps/s $\color{#d91a1a}-1.16\%$
test_exec_functional_call 0.2712ms 0.1431ms 6.9861 KOps/s 6.9381 KOps/s $\color{#35bf28}+0.69\%$
test_exec_td 0.2716ms 0.1411ms 7.0897 KOps/s 7.2075 KOps/s $\color{#d91a1a}-1.63\%$
test_exec_td_decorator 0.8042ms 0.1751ms 5.7103 KOps/s 5.5537 KOps/s $\color{#35bf28}+2.82\%$
test_vmap_mlp_speed[True-True] 1.5243ms 0.8814ms 1.1345 KOps/s 1.1377 KOps/s $\color{#d91a1a}-0.28\%$
test_vmap_mlp_speed[True-False] 0.7348ms 0.4657ms 2.1473 KOps/s 2.1599 KOps/s $\color{#d91a1a}-0.58\%$
test_vmap_mlp_speed[False-True] 0.9414ms 0.7511ms 1.3314 KOps/s 1.3190 KOps/s $\color{#35bf28}+0.93\%$
test_vmap_mlp_speed[False-False] 0.5789ms 0.3795ms 2.6353 KOps/s 2.6412 KOps/s $\color{#d91a1a}-0.22\%$
test_vmap_mlp_speed_decorator[True-True] 3.1720ms 2.2784ms 438.9116 Ops/s 435.1417 Ops/s $\color{#35bf28}+0.87\%$
test_vmap_mlp_speed_decorator[True-False] 0.9105ms 0.5141ms 1.9452 KOps/s 1.9174 KOps/s $\color{#35bf28}+1.45\%$
test_vmap_mlp_speed_decorator[False-True] 2.4518ms 1.8767ms 532.8419 Ops/s 536.7774 Ops/s $\color{#d91a1a}-0.73\%$
test_vmap_mlp_speed_decorator[False-False] 0.7596ms 0.3958ms 2.5264 KOps/s 2.4992 KOps/s $\color{#35bf28}+1.09\%$
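
The benchmark comment does not say how the Change column is derived. Judging from the rows above, it looks like the relative difference between the PR's throughput (Ops) and the throughput on repo HEAD. The sketch below reproduces two CPU rows under that assumption; `relative_change` is a hypothetical helper written for illustration, not part of the benchmark tooling.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD.

    Assumed formula, inferred from the table rows rather than from the
    benchmark bot's source: (ops_pr - ops_head) / ops_head * 100.
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# Values copied from the CPU table; both columns are in KOps/s.
rows = {
    "test_plain_set_nested": (57.0484, 59.4870),
    "test_plain_set_stack_nested": (6.5505, 6.8962),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

Both values must be in the same unit before comparing. Some rows mix units, for example KOps/s against MOps/s in the GPU `test_membership` row.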


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 132. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 64.4191ms 17.4189μs 57.4090 KOps/s 77.4343 KOps/s $\textbf{\color{#d91a1a}-25.86\%}$
test_plain_set_stack_nested 0.1404ms 0.1225ms 8.1639 KOps/s 8.2119 KOps/s $\color{#d91a1a}-0.58\%$
test_plain_set_nested_inplace 40.1300μs 14.1198μs 70.8224 KOps/s 69.8719 KOps/s $\color{#35bf28}+1.36\%$
test_plain_set_stack_nested_inplace 0.1828ms 0.1486ms 6.7293 KOps/s 6.7652 KOps/s $\color{#d91a1a}-0.53\%$
test_items 19.3510μs 4.8311μs 206.9912 KOps/s 208.4721 KOps/s $\color{#d91a1a}-0.71\%$
test_items_nested 0.3882ms 0.3401ms 2.9404 KOps/s 2.9250 KOps/s $\color{#35bf28}+0.53\%$
test_items_nested_locked 0.3948ms 0.3457ms 2.8926 KOps/s 2.8933 KOps/s $\color{#d91a1a}-0.02\%$
test_items_nested_leaf 0.2298ms 0.2004ms 4.9907 KOps/s 4.9327 KOps/s $\color{#35bf28}+1.18\%$
test_items_stack_nested 1.4319ms 1.3340ms 749.6242 Ops/s 750.1631 Ops/s $\color{#d91a1a}-0.07\%$
test_items_stack_nested_leaf 1.2411ms 1.1783ms 848.6596 Ops/s 852.0370 Ops/s $\color{#d91a1a}-0.40\%$
test_items_stack_nested_locked 1.9686ms 0.9135ms 1.0946 KOps/s 1.1038 KOps/s $\color{#d91a1a}-0.83\%$
test_keys 26.0710μs 4.6212μs 216.3946 KOps/s 216.9151 KOps/s $\color{#d91a1a}-0.24\%$
test_keys_nested 0.5216ms 94.4637μs 10.5861 KOps/s 10.4444 KOps/s $\color{#35bf28}+1.36\%$
test_keys_nested_locked 0.1249ms 97.3150μs 10.2759 KOps/s 10.1641 KOps/s $\color{#35bf28}+1.10\%$
test_keys_nested_leaf 0.1799ms 78.1688μs 12.7928 KOps/s 12.6144 KOps/s $\color{#35bf28}+1.41\%$
test_keys_stack_nested 1.2315ms 1.1726ms 852.7928 Ops/s 833.0540 Ops/s $\color{#35bf28}+2.37\%$
test_keys_stack_nested_leaf 1.4545ms 1.1681ms 856.1162 Ops/s 855.8785 Ops/s $\color{#35bf28}+0.03\%$
test_keys_stack_nested_locked 0.8706ms 0.7447ms 1.3429 KOps/s 1.3297 KOps/s $\color{#35bf28}+0.99\%$
test_values 8.7237μs 1.8816μs 531.4685 KOps/s 526.3299 KOps/s $\color{#35bf28}+0.98\%$
test_values_nested 68.0330μs 45.2019μs 22.1229 KOps/s 21.9005 KOps/s $\color{#35bf28}+1.02\%$
test_values_nested_locked 67.2810μs 47.8704μs 20.8897 KOps/s 20.9459 KOps/s $\color{#d91a1a}-0.27\%$
test_values_nested_leaf 63.9210μs 39.7031μs 25.1869 KOps/s 24.9252 KOps/s $\color{#35bf28}+1.05\%$
test_values_stack_nested 1.0637ms 0.9746ms 1.0261 KOps/s 1.0397 KOps/s $\color{#d91a1a}-1.31\%$
test_values_stack_nested_leaf 1.1129ms 0.9762ms 1.0244 KOps/s 1.0247 KOps/s $\color{#d91a1a}-0.02\%$
test_values_stack_nested_locked 0.7028ms 0.5950ms 1.6807 KOps/s 1.7252 KOps/s $\color{#d91a1a}-2.57\%$
test_membership 14.3010μs 1.0836μs 922.8290 KOps/s 1.0294 MOps/s $\textbf{\color{#d91a1a}-10.35\%}$
test_membership_nested 33.7800μs 2.9179μs 342.7095 KOps/s 339.3288 KOps/s $\color{#35bf28}+1.00\%$
test_membership_nested_leaf 31.9410μs 2.9098μs 343.6633 KOps/s 339.5281 KOps/s $\color{#35bf28}+1.22\%$
test_membership_stacked_nested 30.8310μs 11.2530μs 88.8655 KOps/s 88.8979 KOps/s $\color{#d91a1a}-0.04\%$
test_membership_stacked_nested_leaf 43.8410μs 11.2508μs 88.8824 KOps/s 87.8084 KOps/s $\color{#35bf28}+1.22\%$
test_membership_nested_last 23.3400μs 5.3401μs 187.2625 KOps/s 187.3277 KOps/s $\color{#d91a1a}-0.03\%$
test_membership_nested_leaf_last 34.6710μs 5.2887μs 189.0812 KOps/s 185.8081 KOps/s $\color{#35bf28}+1.76\%$
test_membership_stacked_nested_last 0.1957ms 0.1557ms 6.4231 KOps/s 6.3980 KOps/s $\color{#35bf28}+0.39\%$
test_membership_stacked_nested_leaf_last 28.0910μs 13.1783μs 75.8823 KOps/s 76.0380 KOps/s $\color{#d91a1a}-0.20\%$
test_nested_getleaf 35.5710μs 8.4462μs 118.3970 KOps/s 118.7457 KOps/s $\color{#d91a1a}-0.29\%$
test_nested_get 32.3800μs 7.9762μs 125.3722 KOps/s 125.6427 KOps/s $\color{#d91a1a}-0.22\%$
test_stacked_getleaf 0.3970ms 0.3290ms 3.0391 KOps/s 3.0476 KOps/s $\color{#d91a1a}-0.28\%$
test_stacked_get 0.3471ms 0.2948ms 3.3920 KOps/s 3.4100 KOps/s $\color{#d91a1a}-0.53\%$
test_nested_getitemleaf 23.3100μs 9.7980μs 102.0616 KOps/s 100.0737 KOps/s $\color{#35bf28}+1.99\%$
test_nested_getitem 32.6710μs 9.3535μs 106.9118 KOps/s 103.7015 KOps/s $\color{#35bf28}+3.10\%$
test_stacked_getitemleaf 0.4002ms 0.3314ms 3.0174 KOps/s 3.0129 KOps/s $\color{#35bf28}+0.15\%$
test_stacked_getitem 0.3485ms 0.2969ms 3.3686 KOps/s 3.3638 KOps/s $\color{#35bf28}+0.14\%$
test_lock_nested 0.8619ms 0.3587ms 2.7878 KOps/s 2.8220 KOps/s $\color{#d91a1a}-1.21\%$
test_lock_stack_nested 87.8260ms 6.3843ms 156.6352 Ops/s 154.9059 Ops/s $\color{#35bf28}+1.12\%$
test_unlock_nested 81.6206ms 0.4398ms 2.2740 KOps/s 2.8670 KOps/s $\textbf{\color{#d91a1a}-20.68\%}$
test_unlock_stack_nested 87.7991ms 6.4635ms 154.7159 Ops/s 153.7943 Ops/s $\color{#35bf28}+0.60\%$
test_flatten_speed 0.6669ms 0.2611ms 3.8295 KOps/s 3.8048 KOps/s $\color{#35bf28}+0.65\%$
test_unflatten_speed 0.4113ms 0.3590ms 2.7855 KOps/s 2.7418 KOps/s $\color{#35bf28}+1.59\%$
test_common_ops 1.0241ms 0.5809ms 1.7215 KOps/s 1.6116 KOps/s $\textbf{\color{#35bf28}+6.82\%}$
test_creation 11.7910μs 1.5643μs 639.2698 KOps/s 629.6739 KOps/s $\color{#35bf28}+1.52\%$
test_creation_empty 44.5810μs 6.4108μs 155.9879 KOps/s 146.9228 KOps/s $\textbf{\color{#35bf28}+6.17\%}$
test_creation_nested_1 29.3810μs 8.2069μs 121.8487 KOps/s 116.2548 KOps/s $\color{#35bf28}+4.81\%$
test_creation_nested_2 36.9900μs 10.6181μs 94.1790 KOps/s 90.2890 KOps/s $\color{#35bf28}+4.31\%$
test_clone 62.3410μs 14.7668μs 67.7196 KOps/s 70.0789 KOps/s $\color{#d91a1a}-3.37\%$
test_getitem[int] 26.3810μs 11.0999μs 90.0906 KOps/s 85.9283 KOps/s $\color{#35bf28}+4.84\%$
test_getitem[slice_int] 40.0700μs 21.9728μs 45.5109 KOps/s 42.1874 KOps/s $\textbf{\color{#35bf28}+7.88\%}$
test_getitem[range] 66.3910μs 37.8558μs 26.4161 KOps/s 22.6463 KOps/s $\textbf{\color{#35bf28}+16.65\%}$
test_getitem[tuple] 39.8700μs 18.9763μs 52.6974 KOps/s 48.5131 KOps/s $\textbf{\color{#35bf28}+8.62\%}$
test_getitem[list] 0.1504ms 34.5412μs 28.9509 KOps/s 26.1164 KOps/s $\textbf{\color{#35bf28}+10.85\%}$
test_setitem_dim[int] 45.6310μs 27.1378μs 36.8489 KOps/s 35.9765 KOps/s $\color{#35bf28}+2.42\%$
test_setitem_dim[slice_int] 66.2110μs 47.4708μs 21.0656 KOps/s 19.6588 KOps/s $\textbf{\color{#35bf28}+7.16\%}$
test_setitem_dim[range] 80.0010μs 60.6821μs 16.4793 KOps/s 14.6202 KOps/s $\textbf{\color{#35bf28}+12.72\%}$
test_setitem_dim[tuple] 72.3810μs 41.5193μs 24.0852 KOps/s 23.2935 KOps/s $\color{#35bf28}+3.40\%$
test_setitem 54.7710μs 18.7443μs 53.3496 KOps/s 52.2511 KOps/s $\color{#35bf28}+2.10\%$
test_set 52.8110μs 17.9967μs 55.5658 KOps/s 53.6323 KOps/s $\color{#35bf28}+3.61\%$
test_set_shared 3.0217ms 0.1030ms 9.7048 KOps/s 9.4171 KOps/s $\color{#35bf28}+3.06\%$
test_update 78.3210μs 19.2192μs 52.0313 KOps/s 52.3129 KOps/s $\color{#d91a1a}-0.54\%$
test_update_nested 87.1320μs 25.7825μs 38.7861 KOps/s 39.2904 KOps/s $\color{#d91a1a}-1.28\%$
test_set_nested 69.9820μs 19.2701μs 51.8939 KOps/s 53.8073 KOps/s $\color{#d91a1a}-3.56\%$
test_set_nested_new 78.4230μs 21.7698μs 45.9351 KOps/s 47.5324 KOps/s $\color{#d91a1a}-3.36\%$
test_select 85.8430μs 34.6903μs 28.8265 KOps/s 29.5527 KOps/s $\color{#d91a1a}-2.46\%$
test_select_nested 95.8520μs 53.2337μs 18.7851 KOps/s 18.8402 KOps/s $\color{#d91a1a}-0.29\%$
test_exclude_nested 0.1428ms 0.1160ms 8.6241 KOps/s 8.6564 KOps/s $\color{#d91a1a}-0.37\%$
test_empty[True] 1.2806ms 0.3872ms 2.5827 KOps/s 2.5830 KOps/s $\color{#d91a1a}-0.01\%$
test_empty[False] 2.5680μs 0.8464μs 1.1815 MOps/s 1.1681 MOps/s $\color{#35bf28}+1.14\%$
test_to 75.1110μs 55.5990μs 17.9859 KOps/s 18.5217 KOps/s $\color{#d91a1a}-2.89\%$
test_to_nonblocking 83.0710μs 37.9350μs 26.3609 KOps/s 28.1744 KOps/s $\textbf{\color{#d91a1a}-6.44\%}$
test_unbind_speed 0.3000ms 0.2757ms 3.6270 KOps/s 3.7937 KOps/s $\color{#d91a1a}-4.39\%$
test_unbind_speed_stack0 89.4713ms 3.7797ms 264.5681 Ops/s 264.7719 Ops/s $\color{#d91a1a}-0.08\%$
test_unbind_speed_stack1 7.1333μs 1.7301μs 578.0045 KOps/s 547.0364 KOps/s $\textbf{\color{#35bf28}+5.66\%}$
test_split 82.1252ms 1.7402ms 574.6428 Ops/s 645.2935 Ops/s $\textbf{\color{#d91a1a}-10.95\%}$
test_chunk 84.4859ms 1.6891ms 592.0348 Ops/s 597.6806 Ops/s $\color{#d91a1a}-0.94\%$
test_creation[device0] 0.1294ms 73.5427μs 13.5975 KOps/s 13.2211 KOps/s $\color{#35bf28}+2.85\%$
test_creation_from_tensor 0.1351ms 53.1236μs 18.8240 KOps/s 18.5066 KOps/s $\color{#35bf28}+1.72\%$
test_add_one[memmap_tensor0] 0.1663ms 7.8720μs 127.0325 KOps/s 135.8182 KOps/s $\textbf{\color{#d91a1a}-6.47\%}$
test_contiguous[memmap_tensor0] 23.4600μs 0.6390μs 1.5650 MOps/s 1.5517 MOps/s $\color{#35bf28}+0.86\%$
test_stack[memmap_tensor0] 38.5300μs 4.8434μs 206.4652 KOps/s 209.7370 KOps/s $\color{#d91a1a}-1.56\%$
test_memmaptd_index 1.0053ms 0.2666ms 3.7514 KOps/s 3.8033 KOps/s $\color{#d91a1a}-1.36\%$
test_memmaptd_index_astensor 0.6253ms 0.3222ms 3.1040 KOps/s 3.1224 KOps/s $\color{#d91a1a}-0.59\%$
test_memmaptd_index_op 0.9404ms 0.6187ms 1.6162 KOps/s 1.6273 KOps/s $\color{#d91a1a}-0.68\%$
test_serialize_model 0.1759s 98.0728ms 10.1965 Ops/s 9.6093 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_serialize_model_pickle 1.3740s 1.2380s 0.8077 Ops/s 0.8066 Ops/s $\color{#35bf28}+0.14\%$
test_serialize_weights 90.2261ms 86.7503ms 11.5273 Ops/s 10.0864 Ops/s $\textbf{\color{#35bf28}+14.29\%}$
test_serialize_weights_returnearly 0.2808s 74.0320ms 13.5077 Ops/s 12.0321 Ops/s $\textbf{\color{#35bf28}+12.26\%}$
test_serialize_weights_pickle 1.3636s 1.2372s 0.8083 Ops/s 0.8086 Ops/s $\color{#d91a1a}-0.04\%$
test_reshape_pytree 44.5910μs 24.8304μs 40.2732 KOps/s 38.4000 KOps/s $\color{#35bf28}+4.88\%$
test_reshape_td 60.4620μs 29.7497μs 33.6138 KOps/s 33.3168 KOps/s $\color{#35bf28}+0.89\%$
test_view_pytree 0.2239ms 24.1977μs 41.3262 KOps/s 40.2889 KOps/s $\color{#35bf28}+2.57\%$
test_view_td 91.7152ms 10.5985μs 94.3533 KOps/s 146.0636 KOps/s $\textbf{\color{#d91a1a}-35.40\%}$
test_unbind_pytree 63.8610μs 30.5956μs 32.6844 KOps/s 33.4134 KOps/s $\color{#d91a1a}-2.18\%$
test_unbind_td 0.2641ms 40.7403μs 24.5457 KOps/s 25.0095 KOps/s $\color{#d91a1a}-1.85\%$
test_split_pytree 93.8810μs 29.0686μs 34.4014 KOps/s 34.6516 KOps/s $\color{#d91a1a}-0.72\%$
test_split_td 0.3856ms 38.9900μs 25.6476 KOps/s 25.5760 KOps/s $\color{#35bf28}+0.28\%$
test_add_pytree 0.1908ms 40.8247μs 24.4950 KOps/s 26.1408 KOps/s $\textbf{\color{#d91a1a}-6.30\%}$
test_add_td 0.2792ms 51.2301μs 19.5198 KOps/s 19.7372 KOps/s $\color{#d91a1a}-1.10\%$
test_distributed 1.8423ms 72.3187μs 13.8277 KOps/s 12.9788 KOps/s $\textbf{\color{#35bf28}+6.54\%}$
test_tdmodule 30.8900μs 16.5289μs 60.5001 KOps/s 58.8056 KOps/s $\color{#35bf28}+2.88\%$
test_tdmodule_dispatch 0.1350ms 33.8607μs 29.5328 KOps/s 28.5738 KOps/s $\color{#35bf28}+3.36\%$
test_tdseq 37.7010μs 19.7874μs 50.5372 KOps/s 49.7634 KOps/s $\color{#35bf28}+1.56\%$
test_tdseq_dispatch 61.6220μs 36.5879μs 27.3314 KOps/s 26.7011 KOps/s $\color{#35bf28}+2.36\%$
test_instantiation_functorch 1.8275ms 1.6838ms 593.8944 Ops/s 590.8391 Ops/s $\color{#35bf28}+0.52\%$
test_instantiation_td 1.7190ms 1.1550ms 865.8143 Ops/s 857.6877 Ops/s $\color{#35bf28}+0.95\%$
test_exec_functorch 0.2125ms 0.1651ms 6.0571 KOps/s 6.2032 KOps/s $\color{#d91a1a}-2.35\%$
test_exec_functional_call 0.2212ms 0.1662ms 6.0151 KOps/s 6.1535 KOps/s $\color{#d91a1a}-2.25\%$
test_exec_td 0.2372ms 0.1576ms 6.3463 KOps/s 6.4945 KOps/s $\color{#d91a1a}-2.28\%$
test_exec_td_decorator 0.8712ms 0.1951ms 5.1256 KOps/s 5.2524 KOps/s $\color{#d91a1a}-2.41\%$
test_vmap_mlp_speed[True-True] 1.4628ms 1.0845ms 922.1099 Ops/s 950.7254 Ops/s $\color{#d91a1a}-3.01\%$
test_vmap_mlp_speed[True-False] 0.6979ms 0.6233ms 1.6043 KOps/s 1.6477 KOps/s $\color{#d91a1a}-2.63\%$
test_vmap_mlp_speed[False-True] 1.0938ms 0.9956ms 1.0044 KOps/s 1.0294 KOps/s $\color{#d91a1a}-2.43\%$
test_vmap_mlp_speed[False-False] 0.6206ms 0.5514ms 1.8134 KOps/s 1.8543 KOps/s $\color{#d91a1a}-2.20\%$
test_vmap_mlp_speed_decorator[True-True] 2.9138ms 2.3321ms 428.7912 Ops/s 426.9129 Ops/s $\color{#35bf28}+0.44\%$
test_vmap_mlp_speed_decorator[True-False] 1.0425ms 0.6710ms 1.4904 KOps/s 1.5361 KOps/s $\color{#d91a1a}-2.98\%$
test_vmap_mlp_speed_decorator[False-True] 2.4118ms 1.9946ms 501.3447 Ops/s 508.3972 Ops/s $\color{#d91a1a}-1.39\%$
test_vmap_mlp_speed_decorator[False-False] 0.9347ms 0.5626ms 1.7774 KOps/s 1.7979 KOps/s $\color{#d91a1a}-1.14\%$
test_vmap_transformer_speed[True-True] 13.0999ms 12.5791ms 79.4972 Ops/s 80.1905 Ops/s $\color{#d91a1a}-0.86\%$
test_vmap_transformer_speed[True-False] 8.5577ms 8.2565ms 121.1172 Ops/s 122.2056 Ops/s $\color{#d91a1a}-0.89\%$
test_vmap_transformer_speed[False-True] 13.0740ms 12.6056ms 79.3297 Ops/s 80.8299 Ops/s $\color{#d91a1a}-1.86\%$
test_vmap_transformer_speed[False-False] 8.4381ms 8.1764ms 122.3029 Ops/s 123.6395 Ops/s $\color{#d91a1a}-1.08\%$
test_vmap_transformer_speed_decorator[True-True] 73.8537ms 73.0662ms 13.6862 Ops/s 13.6332 Ops/s $\color{#35bf28}+0.39\%$
test_vmap_transformer_speed_decorator[True-False] 21.3117ms 19.7340ms 50.6740 Ops/s 50.9930 Ops/s $\color{#d91a1a}-0.63\%$
test_vmap_transformer_speed_decorator[False-True] 68.5021ms 65.9195ms 15.1700 Ops/s 13.4405 Ops/s $\textbf{\color{#35bf28}+12.87\%}$
test_vmap_transformer_speed_decorator[False-False] 21.1680ms 19.3078ms 51.7924 Ops/s 52.0043 Ops/s $\color{#d91a1a}-0.41\%$

2 participants