[BE] No warning if user sets the log_prob_key explicitly and only one variable is sampled from the ProbTDMod #1209

Merged (3 commits) on Feb 5, 2025

@vmoens (Contributor) commented Feb 5, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 5, 2025
… variable is sampled from the ProbTDMod

ghstack-source-id: 26073eefcaa86418a15ddac101187bf055a8c087
Pull Request resolved: #1209
@facebook-github-bot added the CLA Signed label Feb 5, 2025
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 5, 2025
… variable is sampled from the ProbTDMod

ghstack-source-id: 90621391506cdd67563a1e791ade093e7db5f5df
Pull Request resolved: #1209
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 5, 2025
… variable is sampled from the ProbTDMod

ghstack-source-id: ccc966d5e698a4fb394081a92bafc31649951ab7
Pull Request resolved: #1209
@vmoens vmoens merged commit 1dc52b3 into gh/vmoens/47/base Feb 5, 2025
5 of 7 checks passed
vmoens added a commit that referenced this pull request Feb 5, 2025
… variable is sampled from the ProbTDMod

ghstack-source-id: ccc966d5e698a4fb394081a92bafc31649951ab7
Pull Request resolved: #1209
@vmoens vmoens deleted the gh/vmoens/47/head branch February 5, 2025 11:47

github-actions bot commented Feb 5, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}22$.

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 41.4980μs 20.3688μs 49.0946 KOps/s 48.7646 KOps/s $\color{#35bf28}+0.68\%$
test_plain_set_stack_nested 45.3250μs 20.8237μs 48.0222 KOps/s 48.2563 KOps/s $\color{#d91a1a}-0.49\%$
test_plain_set_nested_inplace 55.8350μs 22.5280μs 44.3891 KOps/s 44.2619 KOps/s $\color{#35bf28}+0.29\%$
test_plain_set_stack_nested_inplace 52.3170μs 22.4509μs 44.5417 KOps/s 44.0231 KOps/s $\color{#35bf28}+1.18\%$
test_items 24.2450μs 4.2335μs 236.2121 KOps/s 232.5607 KOps/s $\color{#35bf28}+1.57\%$
test_items_nested 0.5685ms 0.4011ms 2.4931 KOps/s 2.5118 KOps/s $\color{#d91a1a}-0.74\%$
test_items_nested_locked 0.6769ms 0.4000ms 2.4999 KOps/s 2.5093 KOps/s $\color{#d91a1a}-0.38\%$
test_items_nested_leaf 0.1455ms 75.7211μs 13.2064 KOps/s 12.9614 KOps/s $\color{#35bf28}+1.89\%$
test_items_stack_nested 0.5106ms 0.4032ms 2.4802 KOps/s 2.5096 KOps/s $\color{#d91a1a}-1.17\%$
test_items_stack_nested_leaf 0.1403ms 78.1533μs 12.7954 KOps/s 12.4754 KOps/s $\color{#35bf28}+2.57\%$
test_items_stack_nested_locked 0.5601ms 0.4051ms 2.4688 KOps/s 2.4845 KOps/s $\color{#d91a1a}-0.63\%$
test_keys 20.4480μs 3.4816μs 287.2241 KOps/s 287.4908 KOps/s $\color{#d91a1a}-0.09\%$
test_keys_nested 0.2611ms 0.1628ms 6.1424 KOps/s 6.1912 KOps/s $\color{#d91a1a}-0.79\%$
test_keys_nested_locked 1.6732ms 0.1699ms 5.8857 KOps/s 5.9610 KOps/s $\color{#d91a1a}-1.26\%$
test_keys_nested_leaf 0.2354ms 0.1415ms 7.0656 KOps/s 7.1264 KOps/s $\color{#d91a1a}-0.85\%$
test_keys_stack_nested 0.2529ms 0.1594ms 6.2738 KOps/s 6.2575 KOps/s $\color{#35bf28}+0.26\%$
test_keys_stack_nested_leaf 0.2383ms 0.1365ms 7.3243 KOps/s 7.2725 KOps/s $\color{#35bf28}+0.71\%$
test_keys_stack_nested_locked 0.3550ms 0.1654ms 6.0464 KOps/s 5.9765 KOps/s $\color{#35bf28}+1.17\%$
test_values 8.5460μs 1.0386μs 962.7910 KOps/s 965.0839 KOps/s $\color{#d91a1a}-0.24\%$
test_values_nested 0.1068ms 61.2283μs 16.3323 KOps/s 16.3916 KOps/s $\color{#d91a1a}-0.36\%$
test_values_nested_locked 0.1161ms 61.7350μs 16.1983 KOps/s 15.7093 KOps/s $\color{#35bf28}+3.11\%$
test_values_nested_leaf 0.1922ms 71.2228μs 14.0405 KOps/s 14.2688 KOps/s $\color{#d91a1a}-1.60\%$
test_values_stack_nested 0.1197ms 63.7942μs 15.6754 KOps/s 15.7497 KOps/s $\color{#d91a1a}-0.47\%$
test_values_stack_nested_leaf 0.1139ms 68.9366μs 14.5061 KOps/s 14.3584 KOps/s $\color{#35bf28}+1.03\%$
test_values_stack_nested_locked 0.1123ms 64.5246μs 15.4980 KOps/s 15.8354 KOps/s $\color{#d91a1a}-2.13\%$
test_membership 16.7720μs 0.8424μs 1.1871 MOps/s 1.4548 MOps/s $\textbf{\color{#d91a1a}-18.40\%}$
test_membership_nested 40.9570μs 2.8330μs 352.9771 KOps/s 350.8771 KOps/s $\color{#35bf28}+0.60\%$
test_membership_nested_leaf 37.5200μs 2.9001μs 344.8207 KOps/s 344.4987 KOps/s $\color{#35bf28}+0.09\%$
test_membership_stacked_nested 40.3760μs 2.8475μs 351.1877 KOps/s 348.5529 KOps/s $\color{#35bf28}+0.76\%$
test_membership_stacked_nested_leaf 23.9050μs 2.8436μs 351.6649 KOps/s 349.5029 KOps/s $\color{#35bf28}+0.62\%$
test_membership_nested_last 40.4150μs 4.3297μs 230.9630 KOps/s 234.9936 KOps/s $\color{#d91a1a}-1.72\%$
test_membership_nested_leaf_last 27.6320μs 4.3614μs 229.2853 KOps/s 232.5854 KOps/s $\color{#d91a1a}-1.42\%$
test_membership_stacked_nested_last 57.9280μs 13.3915μs 74.6741 KOps/s 231.9183 KOps/s $\textbf{\color{#d91a1a}-67.80\%}$
test_membership_stacked_nested_leaf_last 45.1640μs 13.4398μs 74.4057 KOps/s 231.3993 KOps/s $\textbf{\color{#d91a1a}-67.85\%}$
test_nested_getleaf 51.4160μs 10.4453μs 95.7367 KOps/s 95.0244 KOps/s $\color{#35bf28}+0.75\%$
test_nested_get 45.7860μs 9.9198μs 100.8088 KOps/s 99.8623 KOps/s $\color{#35bf28}+0.95\%$
test_stacked_getleaf 36.3380μs 10.4196μs 95.9732 KOps/s 95.5538 KOps/s $\color{#35bf28}+0.44\%$
test_stacked_get 32.2910μs 10.0223μs 99.7775 KOps/s 99.4003 KOps/s $\color{#35bf28}+0.38\%$
test_nested_getitemleaf 48.0000μs 11.0661μs 90.3661 KOps/s 88.9972 KOps/s $\color{#35bf28}+1.54\%$
test_nested_getitem 48.1400μs 10.6449μs 93.9421 KOps/s 93.1796 KOps/s $\color{#35bf28}+0.82\%$
test_stacked_getitemleaf 47.6090μs 11.0004μs 90.9056 KOps/s 89.4832 KOps/s $\color{#35bf28}+1.59\%$
test_stacked_getitem 33.2230μs 10.6162μs 94.1960 KOps/s 93.6417 KOps/s $\color{#35bf28}+0.59\%$
test_lock_nested 0.5112ms 0.4098ms 2.4402 KOps/s 2.4839 KOps/s $\color{#d91a1a}-1.76\%$
test_lock_stack_nested 0.5613ms 0.4089ms 2.4454 KOps/s 2.4311 KOps/s $\color{#35bf28}+0.59\%$
test_unlock_nested 0.5069ms 0.3402ms 2.9393 KOps/s 3.0471 KOps/s $\color{#d91a1a}-3.54\%$
test_unlock_stack_nested 0.3989ms 0.3284ms 3.0447 KOps/s 3.0166 KOps/s $\color{#35bf28}+0.93\%$
test_flatten_speed 0.1789ms 97.7585μs 10.2293 KOps/s 9.9207 KOps/s $\color{#35bf28}+3.11\%$
test_unflatten_speed 0.7587ms 0.5122ms 1.9525 KOps/s 1.9246 KOps/s $\color{#35bf28}+1.45\%$
test_common_ops 4.1068ms 0.8167ms 1.2244 KOps/s 1.2250 KOps/s $\color{#d91a1a}-0.05\%$
test_creation 36.3480μs 2.4626μs 406.0799 KOps/s 401.0001 KOps/s $\color{#35bf28}+1.27\%$
test_creation_empty 44.6630μs 12.6394μs 79.1179 KOps/s 82.4070 KOps/s $\color{#d91a1a}-3.99\%$
test_creation_nested_1 66.4440μs 15.6819μs 63.7676 KOps/s 66.2186 KOps/s $\color{#d91a1a}-3.70\%$
test_creation_nested_2 53.5100μs 19.9603μs 50.0994 KOps/s 51.1050 KOps/s $\color{#d91a1a}-1.97\%$
test_clone 52.7990μs 13.3699μs 74.7951 KOps/s 72.8221 KOps/s $\color{#35bf28}+2.71\%$
test_getitem[int] 0.8554ms 12.6593μs 78.9932 KOps/s 76.7900 KOps/s $\color{#35bf28}+2.87\%$
test_getitem[slice_int] 0.1433ms 23.9632μs 41.7306 KOps/s 40.5241 KOps/s $\color{#35bf28}+2.98\%$
test_getitem[range] 0.1645ms 49.4555μs 20.2202 KOps/s 20.1488 KOps/s $\color{#35bf28}+0.35\%$
test_getitem[tuple] 0.1278ms 20.0934μs 49.7675 KOps/s 48.3026 KOps/s $\color{#35bf28}+3.03\%$
test_getitem[list] 0.1583ms 44.0097μs 22.7223 KOps/s 22.1203 KOps/s $\color{#35bf28}+2.72\%$
test_setitem_dim[int] 60.2430μs 25.3743μs 39.4099 KOps/s 38.6648 KOps/s $\color{#35bf28}+1.93\%$
test_setitem_dim[slice_int] 83.2560μs 50.3067μs 19.8781 KOps/s 19.6838 KOps/s $\color{#35bf28}+0.99\%$
test_setitem_dim[range] 0.1144ms 74.8466μs 13.3607 KOps/s 12.8564 KOps/s $\color{#35bf28}+3.92\%$
test_setitem_dim[tuple] 87.7340μs 40.2594μs 24.8389 KOps/s 25.3339 KOps/s $\color{#d91a1a}-1.95\%$
test_setitem 57.0960μs 20.9299μs 47.7785 KOps/s 47.4639 KOps/s $\color{#35bf28}+0.66\%$
test_set 55.3940μs 20.5578μs 48.6433 KOps/s 48.7575 KOps/s $\color{#d91a1a}-0.23\%$
test_set_shared 3.9170ms 0.1802ms 5.5495 KOps/s 5.4641 KOps/s $\color{#35bf28}+1.56\%$
test_update 0.1060ms 24.0439μs 41.5905 KOps/s 42.1161 KOps/s $\color{#d91a1a}-1.25\%$
test_update_nested 80.9810μs 34.2576μs 29.1906 KOps/s 29.8866 KOps/s $\color{#d91a1a}-2.33\%$
test_update__nested 0.4330ms 33.7745μs 29.6081 KOps/s 29.5288 KOps/s $\color{#35bf28}+0.27\%$
test_set_nested 70.7220μs 22.3289μs 44.7850 KOps/s 44.8501 KOps/s $\color{#d91a1a}-0.15\%$
test_set_nested_new 64.0700μs 27.2576μs 36.6871 KOps/s 37.3264 KOps/s $\color{#d91a1a}-1.71\%$
test_select 0.1004ms 42.9630μs 23.2759 KOps/s 23.4856 KOps/s $\color{#d91a1a}-0.89\%$
test_select_nested 0.1647ms 62.5952μs 15.9757 KOps/s 16.0268 KOps/s $\color{#d91a1a}-0.32\%$
test_exclude_nested 0.3490ms 80.1671μs 12.4739 KOps/s 12.5716 KOps/s $\color{#d91a1a}-0.78\%$
test_empty[True] 0.5606ms 0.4032ms 2.4803 KOps/s 2.5045 KOps/s $\color{#d91a1a}-0.97\%$
test_empty[False] 10.0013μs 1.3649μs 732.6311 KOps/s 740.2046 KOps/s $\color{#d91a1a}-1.02\%$
test_unbind_speed 0.4752ms 0.2728ms 3.6651 KOps/s 3.7154 KOps/s $\color{#d91a1a}-1.35\%$
test_unbind_speed_stack0 0.3270ms 0.2614ms 3.8261 KOps/s 3.8058 KOps/s $\color{#35bf28}+0.53\%$
test_unbind_speed_stack1 93.6311ms 0.6965ms 1.4358 KOps/s 1.3975 KOps/s $\color{#35bf28}+2.73\%$
test_split 95.9934ms 1.7449ms 573.1140 Ops/s 563.2928 Ops/s $\color{#35bf28}+1.74\%$
test_chunk 95.2760ms 1.7380ms 575.3696 Ops/s 562.6959 Ops/s $\color{#35bf28}+2.25\%$
test_consolidate_njt[False-None] 8.4529ms 8.2084ms 121.8259 Ops/s 122.4890 Ops/s $\color{#d91a1a}-0.54\%$
test_creation[device0] 0.2159ms 92.2416μs 10.8411 KOps/s 11.0554 KOps/s $\color{#d91a1a}-1.94\%$
test_creation_from_tensor 3.4504ms 96.1276μs 10.4028 KOps/s 10.0892 KOps/s $\color{#35bf28}+3.11\%$
test_add_one[memmap_tensor0] 0.1348ms 5.0832μs 196.7264 KOps/s 196.0014 KOps/s $\color{#35bf28}+0.37\%$
test_contiguous[memmap_tensor0] 40.5520μs 0.5062μs 1.9756 MOps/s 1.9799 MOps/s $\color{#d91a1a}-0.22\%$
test_stack[memmap_tensor0] 19.4360μs 3.4385μs 290.8220 KOps/s 288.3529 KOps/s $\color{#35bf28}+0.86\%$
test_memmaptd_index 1.0449ms 0.2294ms 4.3583 KOps/s 4.3534 KOps/s $\color{#35bf28}+0.11\%$
test_memmaptd_index_astensor 0.5152ms 0.3103ms 3.2229 KOps/s 3.1681 KOps/s $\color{#35bf28}+1.73\%$
test_memmaptd_index_op 0.7960ms 0.5998ms 1.6671 KOps/s 1.7114 KOps/s $\color{#d91a1a}-2.59\%$
test_serialize_model 0.2148s 0.1311s 7.6304 Ops/s 8.8633 Ops/s $\textbf{\color{#d91a1a}-13.91\%}$
test_serialize_model_pickle 0.4753s 0.3929s 2.5449 Ops/s 2.5249 Ops/s $\color{#35bf28}+0.79\%$
test_serialize_weights 0.1275s 0.1120s 8.9311 Ops/s 8.7614 Ops/s $\color{#35bf28}+1.94\%$
test_serialize_weights_returnearly 0.1848s 0.1572s 6.3613 Ops/s 6.3182 Ops/s $\color{#35bf28}+0.68\%$
test_serialize_weights_pickle 0.9440s 0.6979s 1.4328 Ops/s 1.1786 Ops/s $\textbf{\color{#35bf28}+21.56\%}$
test_serialize_weights_filesystem 0.1527s 0.1417s 7.0585 Ops/s 6.3967 Ops/s $\textbf{\color{#35bf28}+10.35\%}$
test_serialize_model_filesystem 0.1470s 0.1403s 7.1258 Ops/s 7.0203 Ops/s $\color{#35bf28}+1.50\%$
test_reshape_pytree 62.4870μs 25.9275μs 38.5690 KOps/s 38.4168 KOps/s $\color{#35bf28}+0.40\%$
test_reshape_td 65.4020μs 33.4496μs 29.8957 KOps/s 30.1548 KOps/s $\color{#d91a1a}-0.86\%$
test_view_pytree 97.9730μs 25.8195μs 38.7304 KOps/s 38.2047 KOps/s $\color{#35bf28}+1.38\%$
test_view_td 77.9960μs 38.5398μs 25.9472 KOps/s 25.3218 KOps/s $\color{#35bf28}+2.47\%$
test_unbind_pytree 62.7580μs 29.1724μs 34.2789 KOps/s 34.2290 KOps/s $\color{#35bf28}+0.15\%$
test_unbind_td 0.3191ms 40.0331μs 24.9793 KOps/s 25.3501 KOps/s $\color{#d91a1a}-1.46\%$
test_split_pytree 64.3300μs 29.3197μs 34.1068 KOps/s 34.6569 KOps/s $\color{#d91a1a}-1.59\%$
test_split_td 0.2064ms 45.4058μs 22.0236 KOps/s 22.1837 KOps/s $\color{#d91a1a}-0.72\%$
test_add_pytree 92.9870μs 35.8142μs 27.9219 KOps/s 27.9785 KOps/s $\color{#d91a1a}-0.20\%$
test_add_td 0.1089ms 58.4497μs 17.1087 KOps/s 17.7485 KOps/s $\color{#d91a1a}-3.60\%$
test_compile_add_one_nested[tensordict-compile] 0.1587ms 68.2454μs 14.6530 KOps/s 15.2563 KOps/s $\color{#d91a1a}-3.95\%$
test_compile_add_one_nested[tensordict-eager] 0.5201ms 0.1719ms 5.8182 KOps/s 5.8247 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_add_one_nested[pytree-compile] 0.1106ms 47.0117μs 21.2713 KOps/s 22.4296 KOps/s $\textbf{\color{#d91a1a}-5.16\%}$
test_compile_add_one_nested[pytree-eager] 0.2130ms 0.1190ms 8.4062 KOps/s 8.4462 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_copy_nested[tensordict-compile] 0.1053ms 29.1777μs 34.2728 KOps/s 35.7128 KOps/s $\color{#d91a1a}-4.03\%$
test_compile_copy_nested[tensordict-eager] 0.1320ms 57.3140μs 17.4478 KOps/s 17.0724 KOps/s $\color{#35bf28}+2.20\%$
test_compile_copy_nested[pytree-compile] 0.1565ms 79.4050μs 12.5937 KOps/s 12.5213 KOps/s $\color{#35bf28}+0.58\%$
test_compile_copy_nested[pytree-eager] 0.1299ms 66.6324μs 15.0077 KOps/s 14.8025 KOps/s $\color{#35bf28}+1.39\%$
test_compile_add_one_flat[tensordict-compile] 0.2267ms 0.1086ms 9.2070 KOps/s 9.5084 KOps/s $\color{#d91a1a}-3.17\%$
test_compile_add_one_flat[tensordict-eager] 0.4390ms 0.2168ms 4.6128 KOps/s 4.6551 KOps/s $\color{#d91a1a}-0.91\%$
test_compile_add_one_flat[tensorclass-compile] 0.1160ms 47.2971μs 21.1429 KOps/s 21.7736 KOps/s $\color{#d91a1a}-2.90\%$
test_compile_add_one_flat[tensorclass-eager] 0.1613ms 67.4085μs 14.8349 KOps/s 14.8789 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_add_one_flat[pytree-compile] 0.1795ms 0.1004ms 9.9603 KOps/s 10.0483 KOps/s $\color{#d91a1a}-0.88\%$
test_compile_add_one_flat[pytree-eager] 0.4214ms 0.2021ms 4.9487 KOps/s 4.9991 KOps/s $\color{#d91a1a}-1.01\%$
test_compile_add_self_flat[tensordict-eager] 0.3306ms 0.2305ms 4.3384 KOps/s 4.2918 KOps/s $\color{#35bf28}+1.09\%$
test_compile_add_self_flat[tensordict-compile] 0.2504ms 0.1140ms 8.7698 KOps/s 9.3441 KOps/s $\textbf{\color{#d91a1a}-6.15\%}$
test_compile_add_self_flat[tensorclass-eager] 0.3345ms 63.4589μs 15.7582 KOps/s 16.3281 KOps/s $\color{#d91a1a}-3.49\%$
test_compile_add_self_flat[tensorclass-compile] 0.1001ms 50.6243μs 19.7533 KOps/s 20.8666 KOps/s $\textbf{\color{#d91a1a}-5.34\%}$
test_compile_add_self_flat[pytree-eager] 0.2993ms 0.1562ms 6.4034 KOps/s 6.3391 KOps/s $\color{#35bf28}+1.01\%$
test_compile_add_self_flat[pytree-compile] 0.1746ms 0.1003ms 9.9745 KOps/s 10.2206 KOps/s $\color{#d91a1a}-2.41\%$
test_compile_copy_flat[tensordict-compile] 64.7710μs 21.4787μs 46.5578 KOps/s 46.8074 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_copy_flat[tensordict-eager] 0.1289ms 64.9236μs 15.4027 KOps/s 15.0243 KOps/s $\color{#35bf28}+2.52\%$
test_compile_copy_flat[pytree-compile] 0.1799ms 81.2734μs 12.3042 KOps/s 12.2816 KOps/s $\color{#35bf28}+0.18\%$
test_compile_copy_flat[pytree-eager] 0.1438ms 66.5876μs 15.0178 KOps/s 14.8609 KOps/s $\color{#35bf28}+1.06\%$
test_compile_assign_and_add[tensordict-compile] 0.3128ms 0.2203ms 4.5395 KOps/s 4.7511 KOps/s $\color{#d91a1a}-4.46\%$
test_compile_assign_and_add[tensordict-eager] 1.6086ms 1.3867ms 721.1367 Ops/s 738.4752 Ops/s $\color{#d91a1a}-2.35\%$
test_compile_assign_and_add[pytree-compile] 0.3726ms 0.2134ms 4.6867 KOps/s 4.8440 KOps/s $\color{#d91a1a}-3.25\%$
test_compile_assign_and_add[pytree-eager] 1.0191ms 0.8122ms 1.2312 KOps/s 1.2336 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_assign_and_add_stack[compile] 0.5998ms 0.4673ms 2.1399 KOps/s 2.2240 KOps/s $\color{#d91a1a}-3.78\%$
test_compile_assign_and_add_stack[eager] 4.5391ms 2.7776ms 360.0278 Ops/s 375.8295 Ops/s $\color{#d91a1a}-4.20\%$
test_compile_indexing[tensor-tensordict-compile] 0.1362ms 40.6161μs 24.6208 KOps/s 26.9731 KOps/s $\textbf{\color{#d91a1a}-8.72\%}$
test_compile_indexing[tensor-tensordict-eager] 0.1962s 40.9622μs 24.4128 KOps/s 30.1169 KOps/s $\textbf{\color{#d91a1a}-18.94\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.1065ms 32.1813μs 31.0740 KOps/s 33.8750 KOps/s $\textbf{\color{#d91a1a}-8.27\%}$
test_compile_indexing[tensor-tensorclass-eager] 86.7130μs 23.5347μs 42.4904 KOps/s 43.4735 KOps/s $\color{#d91a1a}-2.26\%$
test_compile_indexing[tensor-pytree-compile] 89.4270μs 33.3395μs 29.9945 KOps/s 33.0251 KOps/s $\textbf{\color{#d91a1a}-9.18\%}$
test_compile_indexing[tensor-pytree-eager] 85.3900μs 24.0431μs 41.5920 KOps/s 43.3172 KOps/s $\color{#d91a1a}-3.98\%$
test_compile_indexing[slice-tensordict-compile] 0.1279ms 57.4963μs 17.3924 KOps/s 19.5786 KOps/s $\textbf{\color{#d91a1a}-11.17\%}$
test_compile_indexing[slice-tensordict-eager] 0.3629ms 20.0223μs 49.9442 KOps/s 49.3239 KOps/s $\color{#35bf28}+1.26\%$
test_compile_indexing[slice-tensorclass-compile] 0.1051ms 49.0317μs 20.3950 KOps/s 22.8006 KOps/s $\textbf{\color{#d91a1a}-10.55\%}$
test_compile_indexing[slice-tensorclass-eager] 61.3050μs 18.8520μs 53.0447 KOps/s 53.3443 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_indexing[slice-pytree-compile] 0.1124ms 48.7601μs 20.5086 KOps/s 22.0758 KOps/s $\textbf{\color{#d91a1a}-7.10\%}$
test_compile_indexing[slice-pytree-eager] 50.1740μs 18.8498μs 53.0509 KOps/s 53.6351 KOps/s $\color{#d91a1a}-1.09\%$
test_compile_indexing[int-tensordict-compile] 0.1336ms 58.1866μs 17.1861 KOps/s 19.0157 KOps/s $\textbf{\color{#d91a1a}-9.62\%}$
test_compile_indexing[int-tensordict-eager] 0.8585ms 19.9252μs 50.1878 KOps/s 49.2133 KOps/s $\color{#35bf28}+1.98\%$
test_compile_indexing[int-tensorclass-compile] 0.1327ms 49.5506μs 20.1814 KOps/s 21.8342 KOps/s $\textbf{\color{#d91a1a}-7.57\%}$
test_compile_indexing[int-tensorclass-eager] 50.5950μs 18.6855μs 53.5175 KOps/s 53.8683 KOps/s $\color{#d91a1a}-0.65\%$
test_compile_indexing[int-pytree-compile] 0.1039ms 49.5157μs 20.1956 KOps/s 21.8287 KOps/s $\textbf{\color{#d91a1a}-7.48\%}$
test_compile_indexing[int-pytree-eager] 79.0740μs 18.7380μs 53.3676 KOps/s 53.8485 KOps/s $\color{#d91a1a}-0.89\%$
test_mod_add[eager] 0.1002ms 36.7539μs 27.2080 KOps/s 28.7494 KOps/s $\textbf{\color{#d91a1a}-5.36\%}$
test_mod_add[compile] 0.1398ms 66.6244μs 15.0095 KOps/s 15.6820 KOps/s $\color{#d91a1a}-4.29\%$
test_mod_add[compile-overhead] 0.1378ms 67.1113μs 14.9006 KOps/s 15.9390 KOps/s $\textbf{\color{#d91a1a}-6.51\%}$
test_mod_wrap[eager] 0.3589ms 0.2225ms 4.4934 KOps/s 4.4775 KOps/s $\color{#35bf28}+0.36\%$
test_mod_wrap[compile] 2.0150ms 0.2300ms 4.3486 KOps/s 4.4399 KOps/s $\color{#d91a1a}-2.06\%$
test_mod_wrap[compile-overhead] 0.4206ms 0.2284ms 4.3783 KOps/s 4.4000 KOps/s $\color{#d91a1a}-0.49\%$
test_mod_wrap_and_backward[eager] 17.9969ms 12.7735ms 78.2871 Ops/s 78.4249 Ops/s $\color{#d91a1a}-0.18\%$
test_mod_wrap_and_backward[compile] 12.7309ms 11.0099ms 90.8273 Ops/s 78.7025 Ops/s $\textbf{\color{#35bf28}+15.41\%}$
test_mod_wrap_and_backward[compile-overhead] 14.4410ms 11.4169ms 87.5894 Ops/s 88.9425 Ops/s $\color{#d91a1a}-1.52\%$
test_seq_add[eager] 0.2085ms 0.1189ms 8.4111 KOps/s 8.4991 KOps/s $\color{#d91a1a}-1.03\%$
test_seq_add[compile] 0.1546ms 79.6893μs 12.5487 KOps/s 13.1570 KOps/s $\color{#d91a1a}-4.62\%$
test_seq_add[compile-overhead] 0.1350ms 77.1082μs 12.9688 KOps/s 13.4619 KOps/s $\color{#d91a1a}-3.66\%$
test_seq_wrap[eager] 0.5612ms 0.4527ms 2.2089 KOps/s 2.2312 KOps/s $\color{#d91a1a}-1.00\%$
test_seq_wrap[compile] 0.8947ms 0.2463ms 4.0601 KOps/s 4.1202 KOps/s $\color{#d91a1a}-1.46\%$
test_seq_wrap[compile-overhead] 0.4217ms 0.2417ms 4.1379 KOps/s 4.0905 KOps/s $\color{#35bf28}+1.16\%$
test_func_call_runtime[False-eager] 0.9606ms 0.5410ms 1.8484 KOps/s 1.8329 KOps/s $\color{#35bf28}+0.85\%$
test_func_call_runtime[False-compile] 0.6614ms 0.4385ms 2.2807 KOps/s 2.2520 KOps/s $\color{#35bf28}+1.28\%$
test_func_call_runtime[False-compile-overhead] 0.5337ms 0.4336ms 2.3061 KOps/s 2.2734 KOps/s $\color{#35bf28}+1.44\%$
test_func_call_runtime[True-eager] 1.6064ms 0.7610ms 1.3140 KOps/s 1.3217 KOps/s $\color{#d91a1a}-0.58\%$
test_func_call_runtime[True-compile] 0.6312ms 0.4594ms 2.1769 KOps/s 2.1479 KOps/s $\color{#35bf28}+1.35\%$
test_func_call_runtime[True-compile-overhead] 0.6362ms 0.4581ms 2.1829 KOps/s 2.1431 KOps/s $\color{#35bf28}+1.86\%$
test_func_call_cm_runtime[False-eager] 0.9651ms 0.5379ms 1.8589 KOps/s 1.8358 KOps/s $\color{#35bf28}+1.26\%$
test_func_call_cm_runtime[False-compile] 0.6015ms 0.4351ms 2.2984 KOps/s 2.2486 KOps/s $\color{#35bf28}+2.22\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8688ms 0.4428ms 2.2585 KOps/s 2.2566 KOps/s $\color{#35bf28}+0.08\%$
test_func_call_cm_runtime[True-eager] 1.2391ms 0.9002ms 1.1109 KOps/s 1.1161 KOps/s $\color{#d91a1a}-0.47\%$
test_func_call_cm_runtime[True-compile] 1.5458ms 0.7967ms 1.2552 KOps/s 1.2388 KOps/s $\color{#35bf28}+1.32\%$
test_func_call_cm_runtime[True-compile-overhead] 3.2108ms 0.8031ms 1.2451 KOps/s 1.2349 KOps/s $\color{#35bf28}+0.83\%$
test_vmap_func_call_cm_runtime[eager] 2.6112ms 1.8902ms 529.0584 Ops/s 517.1803 Ops/s $\color{#35bf28}+2.30\%$
test_vmap_func_call_cm_runtime[compile] 0.7753ms 0.5338ms 1.8732 KOps/s 1.8472 KOps/s $\color{#35bf28}+1.41\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.8277ms 0.5358ms 1.8665 KOps/s 1.8478 KOps/s $\color{#35bf28}+1.01\%$
test_distributed 1.4060ms 0.1275ms 7.8447 KOps/s 7.7121 KOps/s $\color{#35bf28}+1.72\%$
test_tdmodule 0.1072ms 27.8545μs 35.9008 KOps/s 36.4972 KOps/s $\color{#d91a1a}-1.63\%$
test_tdmodule_dispatch 68.2880μs 49.7022μs 20.1198 KOps/s 19.9082 KOps/s $\color{#35bf28}+1.06\%$
test_tdseq 45.4050μs 29.4087μs 34.0036 KOps/s 33.6465 KOps/s $\color{#35bf28}+1.06\%$
test_tdseq_dispatch 84.5180μs 55.6672μs 17.9639 KOps/s 18.6039 KOps/s $\color{#d91a1a}-3.44\%$
test_instantiation_functorch 2.4612ms 1.5589ms 641.4772 Ops/s 661.4521 Ops/s $\color{#d91a1a}-3.02\%$
test_exec_functorch 0.4204ms 0.1840ms 5.4334 KOps/s 5.6534 KOps/s $\color{#d91a1a}-3.89\%$
test_exec_functional_call 0.3205ms 0.1659ms 6.0272 KOps/s 5.8923 KOps/s $\color{#35bf28}+2.29\%$
test_exec_td_decorator 0.4723ms 0.2306ms 4.3373 KOps/s 4.3801 KOps/s $\color{#d91a1a}-0.98\%$
test_vmap_mlp_speed_decorator[True-True] 1.0027ms 0.6606ms 1.5138 KOps/s 1.5129 KOps/s $\color{#35bf28}+0.06\%$
test_vmap_mlp_speed_decorator[True-False] 0.9071ms 0.6619ms 1.5108 KOps/s 1.5246 KOps/s $\color{#d91a1a}-0.91\%$
test_vmap_mlp_speed_decorator[False-True] 0.8769ms 0.5298ms 1.8874 KOps/s 1.8805 KOps/s $\color{#35bf28}+0.37\%$
test_vmap_mlp_speed_decorator[False-False] 0.7910ms 0.5311ms 1.8829 KOps/s 1.8924 KOps/s $\color{#d91a1a}-0.51\%$
test_to_module_speed[True] 2.1884ms 1.3337ms 749.7952 Ops/s 753.5538 Ops/s $\color{#d91a1a}-0.50\%$
test_to_module_speed[False] 1.7578ms 1.3000ms 769.2463 Ops/s 767.4038 Ops/s $\color{#35bf28}+0.24\%$
test_tc_init 98.4940μs 50.1085μs 19.9567 KOps/s 20.7791 KOps/s $\color{#d91a1a}-3.96\%$
test_tc_init_nested 0.6183ms 0.1003ms 9.9681 KOps/s 10.5005 KOps/s $\textbf{\color{#d91a1a}-5.07\%}$
test_tc_first_layer_tensor 18.8450μs 1.4828μs 674.4215 KOps/s 653.7516 KOps/s $\color{#35bf28}+3.16\%$
test_tc_first_layer_nontensor 29.3950μs 4.6142μs 216.7242 KOps/s 218.0949 KOps/s $\color{#d91a1a}-0.63\%$
test_tc_second_layer_tensor 21.9510μs 2.7729μs 360.6295 KOps/s 359.3970 KOps/s $\color{#35bf28}+0.34\%$
test_tc_second_layer_nontensor 25.3280μs 5.9388μs 168.3847 KOps/s 170.1058 KOps/s $\color{#d91a1a}-1.01\%$
test_unbind 0.2237s 12.9965ms 76.9438 Ops/s 75.7555 Ops/s $\color{#35bf28}+1.57\%$
test_full_like 8.5831ms 7.3928ms 135.2672 Ops/s 148.8109 Ops/s $\textbf{\color{#d91a1a}-9.10\%}$
test_zeros_like 7.8138ms 4.3883ms 227.8764 Ops/s 373.9712 Ops/s $\textbf{\color{#d91a1a}-39.07\%}$
test_ones_like 4.0420ms 3.0487ms 328.0126 Ops/s 328.4996 Ops/s $\color{#d91a1a}-0.15\%$
test_clone 6.0379ms 4.6210ms 216.4050 Ops/s 213.5378 Ops/s $\color{#35bf28}+1.34\%$
test_squeeze 58.4090μs 12.2714μs 81.4900 KOps/s 81.7201 KOps/s $\color{#d91a1a}-0.28\%$
test_unsqueeze 0.2830ms 94.1774μs 10.6183 KOps/s 10.7688 KOps/s $\color{#d91a1a}-1.40\%$
test_split 0.3259ms 0.1989ms 5.0277 KOps/s 5.1304 KOps/s $\color{#d91a1a}-2.00\%$
test_permute 0.3499ms 0.2010ms 4.9763 KOps/s 5.1280 KOps/s $\color{#d91a1a}-2.96\%$
test_stack 29.3011ms 23.9445ms 41.7632 Ops/s 41.5761 Ops/s $\color{#35bf28}+0.45\%$
test_cat 29.5622ms 23.8551ms 41.9197 Ops/s 41.3823 Ops/s $\color{#35bf28}+1.30\%$
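
For readers checking the table by hand, the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below reproduces that arithmetic for rows taken from the table. The helper names are hypothetical, and the 5% cutoff is an assumption inferred from which rows are bolded, not something the benchmark bot states.

```python
# Multipliers for the throughput suffixes used in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_second(value: float, unit: str) -> float:
    """Convert a table entry such as (1.1871, "MOps/s") to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def percent_change(pr_ops: float, head_ops: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (pr_ops - head_ops) / head_ops * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; threshold_pct is an assumed cutoff, not a documented one."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "within noise"


# Row test_membership: 1.1871 MOps/s on the PR, 1.4548 MOps/s on HEAD.
pr = to_ops_per_second(1.1871, "MOps/s")
head = to_ops_per_second(1.4548, "MOps/s")
change = percent_change(pr, head)
print(f"test_membership: {change:+.2f}% ({classify(change)})")
```

Because the table shows rounded throughput values, a recomputed percentage can differ from the reported one in the last digit.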
