[Feature] named_apply and default value in apply #584

Merged: vmoens merged 2 commits into main from named-apply on Nov 29, 2023

@vmoens (Contributor) commented Nov 29, 2023

Allows the following usages of apply and named_apply:

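from tensordict import TensorDict

td = TensorDict({"1": 1}, [])
other = TensorDict({}, [])  # has no "1" entry
# "1" is missing from `other`, so default=2 stands in for it: the result holds "1" = 1 + 2 = 3
td.apply(lambda x, y: x + y, other, default=2)

# the function also receives the key name: the result holds "1" = 1 + int("1") = 2
td.named_apply(lambda name, x: x + int(name))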
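
The behaviour described above can be modelled without the library. The following is a minimal sketch that uses plain dicts in place of TensorDict; `apply_with_default` and `named_apply` are hypothetical stand-ins written for illustration, not the tensordict implementation, and they ignore nesting, batch sizes and in-place updates.

```python
_MISSING = object()  # sentinel meaning "no default was supplied"


def apply_with_default(fn, td, *others, default=_MISSING):
    """Dict-based stand-in for apply(fn, *others, default=...) (hypothetical)."""
    out = {}
    for key, value in td.items():
        other_values = []
        for other in others:
            if key in other:
                other_values.append(other[key])
            elif default is not _MISSING:
                # The key is absent from `other`: substitute the default.
                other_values.append(default)
            else:
                raise KeyError(key)
        out[key] = fn(value, *other_values)
    return out


def named_apply(fn, td):
    """Dict-based stand-in for named_apply: fn receives (name, value)."""
    return {key: fn(key, value) for key, value in td.items()}


td = {"1": 1}
other = {}

summed = apply_with_default(lambda x, y: x + y, td, other, default=2)
shifted = named_apply(lambda name, x: x + int(name), td)

print(summed)   # {'1': 3}
print(shifted)  # {'1': 2}
```

Without `default`, the same call raises `KeyError` in this sketch, because `other` has no entry for "1".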

@facebook-github-bot added the CLA Signed label Nov 29, 2023

github-actions bot commented Nov 29, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 113. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}4$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 37.3700μs 15.8276μs 63.1807 KOps/s 63.9048 KOps/s $\color{#d91a1a}-1.13\%$
test_plain_set_stack_nested 0.2068ms 0.1426ms 7.0147 KOps/s 7.0921 KOps/s $\color{#d91a1a}-1.09\%$
test_plain_set_nested_inplace 51.7570μs 19.0323μs 52.5423 KOps/s 53.2596 KOps/s $\color{#d91a1a}-1.35\%$
test_plain_set_stack_nested_inplace 0.3109ms 0.1724ms 5.7991 KOps/s 5.8101 KOps/s $\color{#d91a1a}-0.19\%$
test_items 14.6570μs 2.4509μs 408.0153 KOps/s 342.0725 KOps/s $\textbf{\color{#35bf28}+19.28\%}$
test_items_nested 1.4576ms 0.2762ms 3.6208 KOps/s 3.7220 KOps/s $\color{#d91a1a}-2.72\%$
test_items_nested_locked 0.4555ms 0.2662ms 3.7569 KOps/s 3.6792 KOps/s $\color{#35bf28}+2.11\%$
test_items_nested_leaf 0.7155ms 0.1682ms 5.9441 KOps/s 6.0151 KOps/s $\color{#d91a1a}-1.18\%$
test_items_stack_nested 1.8099ms 1.4907ms 670.8150 Ops/s 667.9558 Ops/s $\color{#35bf28}+0.43\%$
test_items_stack_nested_leaf 2.0959ms 1.3598ms 735.3892 Ops/s 733.5194 Ops/s $\color{#35bf28}+0.25\%$
test_items_stack_nested_locked 1.7888ms 0.7753ms 1.2899 KOps/s 1.2723 KOps/s $\color{#35bf28}+1.38\%$
test_keys 27.1810μs 3.8905μs 257.0363 KOps/s 252.3271 KOps/s $\color{#35bf28}+1.87\%$
test_keys_nested 0.5254ms 0.1403ms 7.1266 KOps/s 6.5360 KOps/s $\textbf{\color{#35bf28}+9.04\%}$
test_keys_nested_locked 0.1972ms 0.1407ms 7.1088 KOps/s 6.9928 KOps/s $\color{#35bf28}+1.66\%$
test_keys_nested_leaf 0.3806ms 0.1383ms 7.2331 KOps/s 7.0145 KOps/s $\color{#35bf28}+3.12\%$
test_keys_stack_nested 1.5693ms 1.4161ms 706.1417 Ops/s 703.9586 Ops/s $\color{#35bf28}+0.31\%$
test_keys_stack_nested_leaf 2.1193ms 1.4121ms 708.1619 Ops/s 711.1705 Ops/s $\color{#d91a1a}-0.42\%$
test_keys_stack_nested_locked 1.4743ms 0.6772ms 1.4766 KOps/s 1.4854 KOps/s $\color{#d91a1a}-0.59\%$
test_values 0.2365ms 1.2126μs 824.6643 KOps/s 894.0042 KOps/s $\textbf{\color{#d91a1a}-7.76\%}$
test_values_nested 95.3080μs 49.0099μs 20.4040 KOps/s 20.1661 KOps/s $\color{#35bf28}+1.18\%$
test_values_nested_locked 0.1028ms 49.6645μs 20.1351 KOps/s 20.2913 KOps/s $\color{#d91a1a}-0.77\%$
test_values_nested_leaf 58.7700μs 43.8539μs 22.8030 KOps/s 22.6812 KOps/s $\color{#35bf28}+0.54\%$
test_values_stack_nested 1.6341ms 1.2050ms 829.9093 Ops/s 834.1769 Ops/s $\color{#d91a1a}-0.51\%$
test_values_stack_nested_leaf 1.8601ms 1.2014ms 832.3950 Ops/s 836.4797 Ops/s $\color{#d91a1a}-0.49\%$
test_values_stack_nested_locked 0.7453ms 0.5131ms 1.9491 KOps/s 1.9249 KOps/s $\color{#35bf28}+1.26\%$
test_membership 15.2180μs 1.3757μs 726.9068 KOps/s 750.9056 KOps/s $\color{#d91a1a}-3.20\%$
test_membership_nested 20.0070μs 2.8572μs 349.9941 KOps/s 360.3426 KOps/s $\color{#d91a1a}-2.87\%$
test_membership_nested_leaf 0.1345ms 2.9847μs 335.0396 KOps/s 363.7121 KOps/s $\textbf{\color{#d91a1a}-7.88\%}$
test_membership_stacked_nested 38.7420μs 11.8385μs 84.4704 KOps/s 85.9656 KOps/s $\color{#d91a1a}-1.74\%$
test_membership_stacked_nested_leaf 37.6100μs 11.8658μs 84.2760 KOps/s 85.7770 KOps/s $\color{#d91a1a}-1.75\%$
test_membership_nested_last 29.4550μs 6.0912μs 164.1709 KOps/s 171.2005 KOps/s $\color{#d91a1a}-4.11\%$
test_membership_nested_leaf_last 24.4150μs 6.0545μs 165.1667 KOps/s 170.9430 KOps/s $\color{#d91a1a}-3.38\%$
test_membership_stacked_nested_last 0.2285ms 0.1688ms 5.9228 KOps/s 5.9424 KOps/s $\color{#d91a1a}-0.33\%$
test_membership_stacked_nested_leaf_last 51.9770μs 13.8573μs 72.1641 KOps/s 69.2791 KOps/s $\color{#35bf28}+4.16\%$
test_nested_getleaf 36.2380μs 10.6675μs 93.7429 KOps/s 92.6043 KOps/s $\color{#35bf28}+1.23\%$
test_nested_get 0.2694ms 10.1915μs 98.1211 KOps/s 95.3151 KOps/s $\color{#35bf28}+2.94\%$
test_stacked_getleaf 1.0797ms 0.6479ms 1.5434 KOps/s 1.5676 KOps/s $\color{#d91a1a}-1.54\%$
test_stacked_get 0.7048ms 0.6094ms 1.6409 KOps/s 1.6416 KOps/s $\color{#d91a1a}-0.04\%$
test_nested_getitemleaf 58.3490μs 10.7221μs 93.2651 KOps/s 94.2137 KOps/s $\color{#d91a1a}-1.01\%$
test_nested_getitem 35.4960μs 10.1383μs 98.6363 KOps/s 98.1961 KOps/s $\color{#35bf28}+0.45\%$
test_stacked_getitemleaf 0.7492ms 0.6438ms 1.5533 KOps/s 1.5540 KOps/s $\color{#d91a1a}-0.05\%$
test_stacked_getitem 0.9761ms 0.6111ms 1.6364 KOps/s 1.6240 KOps/s $\color{#35bf28}+0.76\%$
test_lock_nested 60.3839ms 0.6283ms 1.5916 KOps/s 1.7741 KOps/s $\textbf{\color{#d91a1a}-10.28\%}$
test_lock_stack_nested 8.7297ms 5.1653ms 193.6012 Ops/s 196.4809 Ops/s $\color{#d91a1a}-1.47\%$
test_unlock_nested 0.8868ms 0.4459ms 2.2425 KOps/s 2.2636 KOps/s $\color{#d91a1a}-0.93\%$
test_unlock_stack_nested 73.8891ms 6.9973ms 142.9126 Ops/s 142.9378 Ops/s $\color{#d91a1a}-0.02\%$
test_flatten_speed 0.3401ms 0.2668ms 3.7480 KOps/s 3.7053 KOps/s $\color{#35bf28}+1.15\%$
test_unflatten_speed 0.6459ms 0.4626ms 2.1618 KOps/s 2.1668 KOps/s $\color{#d91a1a}-0.23\%$
test_common_ops 3.8773ms 0.6714ms 1.4895 KOps/s 1.4787 KOps/s $\color{#35bf28}+0.73\%$
test_creation 27.5320μs 2.5010μs 399.8476 KOps/s 407.3676 KOps/s $\color{#d91a1a}-1.85\%$
test_creation_empty 24.0440μs 8.0621μs 124.0369 KOps/s 122.7156 KOps/s $\color{#35bf28}+1.08\%$
test_creation_nested_1 37.4800μs 11.4614μs 87.2497 KOps/s 88.8688 KOps/s $\color{#d91a1a}-1.82\%$
test_creation_nested_2 45.0140μs 15.0569μs 66.4149 KOps/s 67.2970 KOps/s $\color{#d91a1a}-1.31\%$
test_clone 89.3160μs 13.5879μs 73.5946 KOps/s 73.2861 KOps/s $\color{#35bf28}+0.42\%$
test_getitem[int] 56.9960μs 13.7111μs 72.9334 KOps/s 74.6402 KOps/s $\color{#d91a1a}-2.29\%$
test_getitem[slice_int] 68.0360μs 25.8999μs 38.6102 KOps/s 38.7923 KOps/s $\color{#d91a1a}-0.47\%$
test_getitem[range] 0.1113ms 44.5064μs 22.4687 KOps/s 22.0172 KOps/s $\color{#35bf28}+2.05\%$
test_getitem[tuple] 52.1370μs 20.5269μs 48.7166 KOps/s 47.5752 KOps/s $\color{#35bf28}+2.40\%$
test_getitem[list] 89.4460μs 39.6087μs 25.2470 KOps/s 24.2070 KOps/s $\color{#35bf28}+4.30\%$
test_setitem_dim[int] 49.0310μs 27.9804μs 35.7392 KOps/s 34.3512 KOps/s $\color{#35bf28}+4.04\%$
test_setitem_dim[slice_int] 95.2180μs 53.6050μs 18.6550 KOps/s 18.6459 KOps/s $\color{#35bf28}+0.05\%$
test_setitem_dim[range] 0.1356ms 71.3980μs 14.0060 KOps/s 13.7133 KOps/s $\color{#35bf28}+2.13\%$
test_setitem_dim[tuple] 60.3320μs 41.2904μs 24.2187 KOps/s 23.6386 KOps/s $\color{#35bf28}+2.45\%$
test_setitem 89.6570μs 18.4432μs 54.2207 KOps/s 53.8962 KOps/s $\color{#35bf28}+0.60\%$
test_set 0.1135ms 17.8320μs 56.0790 KOps/s 55.5243 KOps/s $\color{#35bf28}+1.00\%$
test_set_shared 2.1018ms 0.1414ms 7.0697 KOps/s 6.8602 KOps/s $\color{#35bf28}+3.05\%$
test_update 0.1162ms 18.6829μs 53.5248 KOps/s 53.1755 KOps/s $\color{#35bf28}+0.66\%$
test_update_nested 0.1305ms 27.7187μs 36.0767 KOps/s 36.8739 KOps/s $\color{#d91a1a}-2.16\%$
test_set_nested 82.2530μs 19.9636μs 50.0913 KOps/s 50.3588 KOps/s $\color{#d91a1a}-0.53\%$
test_set_nested_new 0.1200ms 25.6141μs 39.0411 KOps/s 39.8530 KOps/s $\color{#d91a1a}-2.04\%$
test_select 0.1551ms 51.4707μs 19.4285 KOps/s 19.6730 KOps/s $\color{#d91a1a}-1.24\%$
test_unbind_speed 3.3539ms 0.3860ms 2.5907 KOps/s 2.6839 KOps/s $\color{#d91a1a}-3.47\%$
test_unbind_speed_stack0 74.4399ms 4.7293ms 211.4460 Ops/s 210.4938 Ops/s $\color{#35bf28}+0.45\%$
test_unbind_speed_stack1 1.8680μs 0.6397μs 1.5633 MOps/s 1.5579 MOps/s $\color{#35bf28}+0.34\%$
test_split 57.8602ms 1.7992ms 555.8150 Ops/s 603.8570 Ops/s $\textbf{\color{#d91a1a}-7.96\%}$
test_chunk 57.7871ms 1.7719ms 564.3593 Ops/s 568.4741 Ops/s $\color{#d91a1a}-0.72\%$
test_creation[device0] 0.7707ms 0.2989ms 3.3455 KOps/s 3.3159 KOps/s $\color{#35bf28}+0.89\%$
test_creation_from_tensor 2.9849ms 0.3327ms 3.0059 KOps/s 2.7383 KOps/s $\textbf{\color{#35bf28}+9.77\%}$
test_add_one[memmap_tensor0] 0.3253ms 25.3865μs 39.3911 KOps/s 38.8756 KOps/s $\color{#35bf28}+1.33\%$
test_contiguous[memmap_tensor0] 44.9030μs 6.0091μs 166.4154 KOps/s 166.0257 KOps/s $\color{#35bf28}+0.23\%$
test_stack[memmap_tensor0] 77.1630μs 19.6719μs 50.8338 KOps/s 50.6836 KOps/s $\color{#35bf28}+0.30\%$
test_memmaptd_index 0.9465ms 0.4003ms 2.4981 KOps/s 2.4755 KOps/s $\color{#35bf28}+0.91\%$
test_memmaptd_index_astensor 0.5457ms 0.4587ms 2.1801 KOps/s 2.1669 KOps/s $\color{#35bf28}+0.61\%$
test_memmaptd_index_op 0.9228ms 0.7145ms 1.3996 KOps/s 1.4151 KOps/s $\color{#d91a1a}-1.10\%$
test_reshape_pytree 0.3207ms 23.1117μs 43.2680 KOps/s 42.6682 KOps/s $\color{#35bf28}+1.41\%$
test_reshape_td 74.9900μs 32.8357μs 30.4546 KOps/s 31.6949 KOps/s $\color{#d91a1a}-3.91\%$
test_view_pytree 51.9570μs 23.1839μs 43.1334 KOps/s 42.8760 KOps/s $\color{#35bf28}+0.60\%$
test_view_td 21.4400μs 5.0607μs 197.6026 KOps/s 204.6414 KOps/s $\color{#d91a1a}-3.44\%$
test_unbind_pytree 59.7720μs 26.3759μs 37.9134 KOps/s 37.0635 KOps/s $\color{#35bf28}+2.29\%$
test_unbind_td 0.1191ms 60.9989μs 16.3937 KOps/s 16.8402 KOps/s $\color{#d91a1a}-2.65\%$
test_split_pytree 56.5950μs 26.1628μs 38.2222 KOps/s 37.7695 KOps/s $\color{#35bf28}+1.20\%$
test_split_td 94.6970μs 47.6575μs 20.9831 KOps/s 21.3354 KOps/s $\color{#d91a1a}-1.65\%$
test_add_pytree 75.0510μs 31.9101μs 31.3380 KOps/s 31.0562 KOps/s $\color{#35bf28}+0.91\%$
test_add_td 0.1339ms 45.2847μs 22.0825 KOps/s 22.3297 KOps/s $\color{#d91a1a}-1.11\%$
test_distributed 22.6020μs 6.2041μs 161.1847 KOps/s 163.1059 KOps/s $\color{#d91a1a}-1.18\%$
test_tdmodule 0.1744ms 20.8250μs 48.0192 KOps/s 48.4374 KOps/s $\color{#d91a1a}-0.86\%$
test_tdmodule_dispatch 0.2477ms 38.9401μs 25.6805 KOps/s 25.8643 KOps/s $\color{#d91a1a}-0.71\%$
test_tdseq 42.6490μs 23.5892μs 42.3924 KOps/s 41.5351 KOps/s $\color{#35bf28}+2.06\%$
test_tdseq_dispatch 0.1804ms 42.6166μs 23.4650 KOps/s 23.2535 KOps/s $\color{#35bf28}+0.91\%$
test_instantiation_functorch 2.2490ms 1.2893ms 775.5849 Ops/s 766.7032 Ops/s $\color{#35bf28}+1.16\%$
test_instantiation_td 1.6054ms 1.0151ms 985.0951 Ops/s 963.9933 Ops/s $\color{#35bf28}+2.19\%$
test_exec_functorch 0.2420ms 0.1583ms 6.3169 KOps/s 6.2099 KOps/s $\color{#35bf28}+1.72\%$
test_exec_functional_call 0.2431ms 0.1513ms 6.6111 KOps/s 6.7613 KOps/s $\color{#d91a1a}-2.22\%$
test_exec_td 0.2783ms 0.1485ms 6.7349 KOps/s 6.8952 KOps/s $\color{#d91a1a}-2.32\%$
test_exec_td_decorator 0.7991ms 0.1801ms 5.5510 KOps/s 5.6463 KOps/s $\color{#d91a1a}-1.69\%$
test_vmap_mlp_speed[True-True] 1.2773ms 0.8941ms 1.1184 KOps/s 1.1216 KOps/s $\color{#d91a1a}-0.28\%$
test_vmap_mlp_speed[True-False] 1.0472ms 0.4683ms 2.1356 KOps/s 2.0954 KOps/s $\color{#35bf28}+1.92\%$
test_vmap_mlp_speed[False-True] 1.3337ms 0.8003ms 1.2495 KOps/s 1.2959 KOps/s $\color{#d91a1a}-3.58\%$
test_vmap_mlp_speed[False-False] 0.6187ms 0.3875ms 2.5809 KOps/s 2.6141 KOps/s $\color{#d91a1a}-1.27\%$
test_vmap_mlp_speed_decorator[True-True] 2.2796ms 1.7514ms 570.9765 Ops/s 568.9412 Ops/s $\color{#35bf28}+0.36\%$
test_vmap_mlp_speed_decorator[True-False] 1.1339ms 0.5202ms 1.9225 KOps/s 1.9332 KOps/s $\color{#d91a1a}-0.55\%$
test_vmap_mlp_speed_decorator[False-True] 1.9380ms 1.4610ms 684.4537 Ops/s 676.7681 Ops/s $\color{#35bf28}+1.14\%$
test_vmap_mlp_speed_decorator[False-False] 0.9285ms 0.3987ms 2.5083 KOps/s 2.5140 KOps/s $\color{#d91a1a}-0.23\%$
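
The Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns, and the bolded rows seem to be the ones counted as Improved or Worsened in the summary line. The sketch below reproduces that arithmetic for a few rows of the table above. The 5% threshold and the helper names are assumptions inferred from the reported numbers, not something the benchmark bot states.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percentage change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is inferred, not documented."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) in KOps/s, copied from the CPU table above.
rows = {
    "test_items": (408.0153, 342.0725),
    "test_plain_set_nested": (63.1807, 63.9048),
    "test_values": (824.6643, 894.0042),
}

for name, (ops, head) in rows.items():
    change = relative_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table shows rounded Ops values, recomputed percentages can differ from the reported ones in the last digit.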

github-actions bot commented Nov 29, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 127. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}3$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 88.8510μs 12.5041μs 79.9739 KOps/s 78.8226 KOps/s $\color{#35bf28}+1.46\%$
test_plain_set_stack_nested 0.1423ms 0.1148ms 8.7145 KOps/s 8.6295 KOps/s $\color{#35bf28}+0.99\%$
test_plain_set_nested_inplace 31.6900μs 14.8081μs 67.5305 KOps/s 66.3369 KOps/s $\color{#35bf28}+1.80\%$
test_plain_set_stack_nested_inplace 0.1747ms 0.1388ms 7.2028 KOps/s 7.1260 KOps/s $\color{#35bf28}+1.08\%$
test_items 22.4600μs 4.6644μs 214.3906 KOps/s 215.1867 KOps/s $\color{#d91a1a}-0.37\%$
test_items_nested 0.3957ms 0.3393ms 2.9475 KOps/s 2.9780 KOps/s $\color{#d91a1a}-1.02\%$
test_items_nested_locked 0.4028ms 0.3404ms 2.9374 KOps/s 2.9547 KOps/s $\color{#d91a1a}-0.59\%$
test_items_nested_leaf 0.2442ms 0.2000ms 5.0006 KOps/s 5.0192 KOps/s $\color{#d91a1a}-0.37\%$
test_items_stack_nested 1.6209ms 1.5513ms 644.6012 Ops/s 671.0866 Ops/s $\color{#d91a1a}-3.95\%$
test_items_stack_nested_leaf 1.6260ms 1.3817ms 723.7435 Ops/s 751.6990 Ops/s $\color{#d91a1a}-3.72\%$
test_items_stack_nested_locked 2.0963ms 0.8200ms 1.2195 KOps/s 1.1894 KOps/s $\color{#35bf28}+2.53\%$
test_keys 50.9310μs 4.5450μs 220.0200 KOps/s 220.1861 KOps/s $\color{#d91a1a}-0.08\%$
test_keys_nested 0.4761ms 90.8540μs 11.0067 KOps/s 11.0849 KOps/s $\color{#d91a1a}-0.71\%$
test_keys_nested_locked 0.1122ms 90.3132μs 11.0726 KOps/s 11.1554 KOps/s $\color{#d91a1a}-0.74\%$
test_keys_nested_leaf 44.5164ms 87.8362μs 11.3848 KOps/s 12.1856 KOps/s $\textbf{\color{#d91a1a}-6.57\%}$
test_keys_stack_nested 1.6176ms 1.3719ms 728.8922 Ops/s 766.6344 Ops/s $\color{#d91a1a}-4.92\%$
test_keys_stack_nested_leaf 1.5524ms 1.3508ms 740.2976 Ops/s 767.8527 Ops/s $\color{#d91a1a}-3.59\%$
test_keys_stack_nested_locked 0.7700ms 0.6502ms 1.5381 KOps/s 1.5698 KOps/s $\color{#d91a1a}-2.02\%$
test_values 13.0505μs 1.9170μs 521.6382 KOps/s 527.4118 KOps/s $\color{#d91a1a}-1.09\%$
test_values_nested 0.1252ms 43.3068μs 23.0911 KOps/s 23.5603 KOps/s $\color{#d91a1a}-1.99\%$
test_values_nested_locked 70.9110μs 45.5588μs 21.9496 KOps/s 22.2438 KOps/s $\color{#d91a1a}-1.32\%$
test_values_nested_leaf 60.4400μs 37.8117μs 26.4468 KOps/s 27.0707 KOps/s $\color{#d91a1a}-2.30\%$
test_values_stack_nested 1.3378ms 1.1863ms 842.9859 Ops/s 881.4165 Ops/s $\color{#d91a1a}-4.36\%$
test_values_stack_nested_leaf 1.2462ms 1.1648ms 858.4927 Ops/s 894.6978 Ops/s $\color{#d91a1a}-4.05\%$
test_values_stack_nested_locked 0.5311ms 0.4896ms 2.0423 KOps/s 1.9639 KOps/s $\color{#35bf28}+3.99\%$
test_membership 5.3662μs 0.9315μs 1.0735 MOps/s 937.8397 KOps/s $\textbf{\color{#35bf28}+14.47\%}$
test_membership_nested 28.3200μs 2.1925μs 456.0933 KOps/s 465.3626 KOps/s $\color{#d91a1a}-1.99\%$
test_membership_nested_leaf 19.8905μs 2.0888μs 478.7348 KOps/s 483.0522 KOps/s $\color{#d91a1a}-0.89\%$
test_membership_stacked_nested 35.9600μs 10.9243μs 91.5390 KOps/s 93.0189 KOps/s $\color{#d91a1a}-1.59\%$
test_membership_stacked_nested_leaf 54.8620μs 10.9447μs 91.3682 KOps/s 92.7507 KOps/s $\color{#d91a1a}-1.49\%$
test_membership_nested_last 18.3400μs 4.5583μs 219.3788 KOps/s 222.7430 KOps/s $\color{#d91a1a}-1.51\%$
test_membership_nested_leaf_last 30.6710μs 4.5336μs 220.5755 KOps/s 222.1041 KOps/s $\color{#d91a1a}-0.69\%$
test_membership_stacked_nested_last 0.1680ms 0.1341ms 7.4559 KOps/s 7.4143 KOps/s $\color{#35bf28}+0.56\%$
test_membership_stacked_nested_leaf_last 37.1700μs 13.0140μs 76.8401 KOps/s 79.2993 KOps/s $\color{#d91a1a}-3.10\%$
test_nested_getleaf 33.1200μs 8.3951μs 119.1169 KOps/s 119.8868 KOps/s $\color{#d91a1a}-0.64\%$
test_nested_get 27.9500μs 7.9383μs 125.9717 KOps/s 126.4750 KOps/s $\color{#d91a1a}-0.40\%$
test_stacked_getleaf 0.6292ms 0.5655ms 1.7682 KOps/s 1.7723 KOps/s $\color{#d91a1a}-0.23\%$
test_stacked_get 0.5953ms 0.5290ms 1.8902 KOps/s 1.9015 KOps/s $\color{#d91a1a}-0.59\%$
test_nested_getitemleaf 30.3900μs 8.4488μs 118.3601 KOps/s 119.0021 KOps/s $\color{#d91a1a}-0.54\%$
test_nested_getitem 32.0010μs 7.9678μs 125.5052 KOps/s 125.5544 KOps/s $\color{#d91a1a}-0.04\%$
test_stacked_getitemleaf 0.6092ms 0.5657ms 1.7677 KOps/s 1.7937 KOps/s $\color{#d91a1a}-1.45\%$
test_stacked_getitem 0.6130ms 0.5410ms 1.8484 KOps/s 1.9072 KOps/s $\color{#d91a1a}-3.08\%$
test_lock_nested 3.2157ms 0.5555ms 1.8002 KOps/s 1.7428 KOps/s $\color{#35bf28}+3.29\%$
test_lock_stack_nested 81.4624ms 7.1912ms 139.0591 Ops/s 137.2764 Ops/s $\color{#35bf28}+1.30\%$
test_unlock_nested 2.3681ms 0.4287ms 2.3328 KOps/s 2.2840 KOps/s $\color{#35bf28}+2.14\%$
test_unlock_stack_nested 67.2619ms 6.2021ms 161.2363 Ops/s 158.9190 Ops/s $\color{#35bf28}+1.46\%$
test_flatten_speed 0.2331ms 0.1877ms 5.3289 KOps/s 5.3450 KOps/s $\color{#d91a1a}-0.30\%$
test_unflatten_speed 0.4124ms 0.3620ms 2.7625 KOps/s 2.7497 KOps/s $\color{#35bf28}+0.46\%$
test_common_ops 1.1259ms 0.6001ms 1.6663 KOps/s 1.6030 KOps/s $\color{#35bf28}+3.95\%$
test_creation 40.2300μs 2.0960μs 477.0972 KOps/s 486.4524 KOps/s $\color{#d91a1a}-1.92\%$
test_creation_empty 21.9400μs 6.7707μs 147.6943 KOps/s 140.5415 KOps/s $\textbf{\color{#35bf28}+5.09\%}$
test_creation_nested_1 34.8510μs 9.0783μs 110.1525 KOps/s 106.5707 KOps/s $\color{#35bf28}+3.36\%$
test_creation_nested_2 68.7210μs 11.8509μs 84.3819 KOps/s 83.4903 KOps/s $\color{#35bf28}+1.07\%$
test_clone 82.8600μs 15.2959μs 65.3771 KOps/s 68.2094 KOps/s $\color{#d91a1a}-4.15\%$
test_getitem[int] 38.8500μs 12.2385μs 81.7096 KOps/s 79.5961 KOps/s $\color{#35bf28}+2.66\%$
test_getitem[slice_int] 61.0110μs 25.1854μs 39.7056 KOps/s 41.0753 KOps/s $\color{#d91a1a}-3.33\%$
test_getitem[range] 74.7000μs 42.1527μs 23.7233 KOps/s 24.1566 KOps/s $\color{#d91a1a}-1.79\%$
test_getitem[tuple] 63.2610μs 21.3229μs 46.8980 KOps/s 47.7370 KOps/s $\color{#d91a1a}-1.76\%$
test_getitem[list] 0.2526ms 38.7399μs 25.8132 KOps/s 27.0116 KOps/s $\color{#d91a1a}-4.44\%$
test_setitem_dim[int] 50.8000μs 27.0457μs 36.9745 KOps/s 37.4913 KOps/s $\color{#d91a1a}-1.38\%$
test_setitem_dim[slice_int] 65.3900μs 47.8863μs 20.8828 KOps/s 21.2156 KOps/s $\color{#d91a1a}-1.57\%$
test_setitem_dim[range] 0.1014ms 66.6962μs 14.9934 KOps/s 15.4466 KOps/s $\color{#d91a1a}-2.93\%$
test_setitem_dim[tuple] 56.9910μs 39.7577μs 25.1524 KOps/s 24.8216 KOps/s $\color{#35bf28}+1.33\%$
test_setitem 90.9600μs 19.1556μs 52.2041 KOps/s 54.2184 KOps/s $\color{#d91a1a}-3.72\%$
test_set 95.4300μs 18.2362μs 54.8360 KOps/s 56.1195 KOps/s $\color{#d91a1a}-2.29\%$
test_set_shared 2.7074ms 0.1086ms 9.2117 KOps/s 8.5094 KOps/s $\textbf{\color{#35bf28}+8.25\%}$
test_update 93.9110μs 17.7858μs 56.2246 KOps/s 51.0610 KOps/s $\textbf{\color{#35bf28}+10.11\%}$
test_update_nested 97.1210μs 24.6595μs 40.5524 KOps/s 37.8813 KOps/s $\textbf{\color{#35bf28}+7.05\%}$
test_set_nested 88.2610μs 18.1054μs 55.2322 KOps/s 51.0540 KOps/s $\textbf{\color{#35bf28}+8.18\%}$
test_set_nested_new 85.3500μs 22.7138μs 44.0261 KOps/s 41.8916 KOps/s $\textbf{\color{#35bf28}+5.10\%}$
test_select 0.1060ms 46.1253μs 21.6801 KOps/s 21.1232 KOps/s $\color{#35bf28}+2.64\%$
test_to 74.0910μs 53.4160μs 18.7210 KOps/s 19.0012 KOps/s $\color{#d91a1a}-1.47\%$
test_to_nonblocking 57.8510μs 33.8684μs 29.5261 KOps/s 28.3546 KOps/s $\color{#35bf28}+4.13\%$
test_unbind_speed 0.3915ms 0.3586ms 2.7888 KOps/s 2.7130 KOps/s $\color{#35bf28}+2.79\%$
test_unbind_speed_stack0 62.6265ms 4.4703ms 223.6987 Ops/s 243.1203 Ops/s $\textbf{\color{#d91a1a}-7.99\%}$
test_unbind_speed_stack1 1.2115μs 0.5219μs 1.9160 MOps/s 1.9021 MOps/s $\color{#35bf28}+0.73\%$
test_split 54.0334ms 1.7940ms 557.4114 Ops/s 534.9387 Ops/s $\color{#35bf28}+4.20\%$
test_chunk 53.7149ms 1.7777ms 562.5240 Ops/s 539.0194 Ops/s $\color{#35bf28}+4.36\%$
test_creation[device0] 0.3770ms 0.3083ms 3.2433 KOps/s 3.2053 KOps/s $\color{#35bf28}+1.19\%$
test_creation[device1] 0.6709ms 0.3116ms 3.2096 KOps/s 3.1787 KOps/s $\color{#35bf28}+0.97\%$
test_creation_from_tensor 0.6330ms 0.3364ms 2.9722 KOps/s 2.9406 KOps/s $\color{#35bf28}+1.08\%$
test_add_one[memmap_tensor0] 0.1614ms 22.6303μs 44.1885 KOps/s 40.6193 KOps/s $\textbf{\color{#35bf28}+8.79\%}$
test_add_one[memmap_tensor1] 0.2122ms 71.2247μs 14.0401 KOps/s 13.4202 KOps/s $\color{#35bf28}+4.62\%$
test_contiguous[memmap_tensor0] 20.0500μs 5.6526μs 176.9099 KOps/s 163.5155 KOps/s $\textbf{\color{#35bf28}+8.19\%}$
test_contiguous[memmap_tensor1] 51.5900μs 20.8076μs 48.0593 KOps/s 45.6797 KOps/s $\textbf{\color{#35bf28}+5.21\%}$
test_stack[memmap_tensor0] 48.1900μs 18.3598μs 54.4667 KOps/s 49.6126 KOps/s $\textbf{\color{#35bf28}+9.78\%}$
test_stack[memmap_tensor1] 0.1525ms 71.2816μs 14.0289 KOps/s 13.4385 KOps/s $\color{#35bf28}+4.39\%$
test_memmaptd_index 0.4866ms 0.4230ms 2.3642 KOps/s 2.2940 KOps/s $\color{#35bf28}+3.06\%$
test_memmaptd_index_astensor 0.5330ms 0.4771ms 2.0958 KOps/s 1.9836 KOps/s $\textbf{\color{#35bf28}+5.66\%}$
test_memmaptd_index_op 0.8628ms 0.7256ms 1.3782 KOps/s 1.3031 KOps/s $\textbf{\color{#35bf28}+5.77\%}$
test_reshape_pytree 37.9300μs 20.7136μs 48.2775 KOps/s 47.6270 KOps/s $\color{#35bf28}+1.37\%$
test_reshape_td 50.5400μs 30.1362μs 33.1827 KOps/s 33.0071 KOps/s $\color{#35bf28}+0.53\%$
test_view_pytree 38.6010μs 20.5243μs 48.7227 KOps/s 48.1543 KOps/s $\color{#35bf28}+1.18\%$
test_view_td 26.6610μs 4.0817μs 244.9966 KOps/s 251.2570 KOps/s $\color{#d91a1a}-2.49\%$
test_unbind_pytree 43.1700μs 25.0811μs 39.8707 KOps/s 38.8173 KOps/s $\color{#35bf28}+2.71\%$
test_unbind_td 0.1073ms 57.0310μs 17.5343 KOps/s 17.2957 KOps/s $\color{#35bf28}+1.38\%$
test_split_pytree 45.8800μs 24.0685μs 41.5481 KOps/s 41.1791 KOps/s $\color{#35bf28}+0.90\%$
test_split_td 76.9500μs 44.3853μs 22.5300 KOps/s 22.2292 KOps/s $\color{#35bf28}+1.35\%$
test_add_pytree 48.4210μs 29.9414μs 33.3986 KOps/s 32.3975 KOps/s $\color{#35bf28}+3.09\%$
test_add_td 66.4900μs 42.0754μs 23.7669 KOps/s 23.5385 KOps/s $\color{#35bf28}+0.97\%$
test_distributed 18.2010μs 5.4437μs 183.6974 KOps/s 182.4842 KOps/s $\color{#35bf28}+0.66\%$
test_tdmodule 88.9000μs 16.3346μs 61.2198 KOps/s 59.5026 KOps/s $\color{#35bf28}+2.89\%$
test_tdmodule_dispatch 0.1889ms 31.7597μs 31.4864 KOps/s 30.0942 KOps/s $\color{#35bf28}+4.63\%$
test_tdseq 35.1900μs 19.2618μs 51.9161 KOps/s 50.8528 KOps/s $\color{#35bf28}+2.09\%$
test_tdseq_dispatch 53.9710μs 35.0218μs 28.5536 KOps/s 27.8163 KOps/s $\color{#35bf28}+2.65\%$
test_instantiation_functorch 1.7134ms 1.6503ms 605.9353 Ops/s 604.6608 Ops/s $\color{#35bf28}+0.21\%$
test_instantiation_td 1.6861ms 1.1676ms 856.4678 Ops/s 852.8154 Ops/s $\color{#35bf28}+0.43\%$
test_exec_functorch 0.1970ms 0.1524ms 6.5597 KOps/s 6.2363 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_exec_functional_call 0.2102ms 0.1512ms 6.6121 KOps/s 6.4011 KOps/s $\color{#35bf28}+3.30\%$
test_exec_td 0.1989ms 0.1412ms 7.0824 KOps/s 6.8747 KOps/s $\color{#35bf28}+3.02\%$
test_exec_td_decorator 0.8005ms 0.1781ms 5.6158 KOps/s 5.4103 KOps/s $\color{#35bf28}+3.80\%$
test_vmap_mlp_speed[True-True] 1.1531ms 1.0554ms 947.4906 Ops/s 939.4475 Ops/s $\color{#35bf28}+0.86\%$
test_vmap_mlp_speed[True-False] 0.6493ms 0.6000ms 1.6666 KOps/s 1.6507 KOps/s $\color{#35bf28}+0.96\%$
test_vmap_mlp_speed[False-True] 1.0718ms 0.9933ms 1.0067 KOps/s 1.0313 KOps/s $\color{#d91a1a}-2.38\%$
test_vmap_mlp_speed[False-False] 0.5932ms 0.5311ms 1.8827 KOps/s 1.8783 KOps/s $\color{#35bf28}+0.24\%$
test_vmap_mlp_speed_decorator[True-True] 2.7963ms 2.0025ms 499.3714 Ops/s 493.1777 Ops/s $\color{#35bf28}+1.26\%$
test_vmap_mlp_speed_decorator[True-False] 1.0781ms 0.6449ms 1.5507 KOps/s 1.5338 KOps/s $\color{#35bf28}+1.10\%$
test_vmap_mlp_speed_decorator[False-True] 2.1611ms 1.7365ms 575.8793 Ops/s 571.6371 Ops/s $\color{#35bf28}+0.74\%$
test_vmap_mlp_speed_decorator[False-False] 0.9326ms 0.5509ms 1.8151 KOps/s 1.8245 KOps/s $\color{#d91a1a}-0.52\%$
test_vmap_transformer_speed[True-True] 12.3846ms 12.2537ms 81.6083 Ops/s 80.7266 Ops/s $\color{#35bf28}+1.09\%$
test_vmap_transformer_speed[True-False] 8.3653ms 7.9833ms 125.2613 Ops/s 123.9770 Ops/s $\color{#35bf28}+1.04\%$
test_vmap_transformer_speed[False-True] 12.8214ms 12.1761ms 82.1280 Ops/s 81.0798 Ops/s $\color{#35bf28}+1.29\%$
test_vmap_transformer_speed[False-False] 8.2548ms 7.9577ms 125.6639 Ops/s 124.4685 Ops/s $\color{#35bf28}+0.96\%$
test_vmap_transformer_speed_decorator[True-True] 0.1377s 67.6519ms 14.7816 Ops/s 14.6132 Ops/s $\color{#35bf28}+1.15\%$
test_vmap_transformer_speed_decorator[True-False] 21.5640ms 19.3314ms 51.7294 Ops/s 51.1980 Ops/s $\color{#35bf28}+1.04\%$
test_vmap_transformer_speed_decorator[False-True] 58.0689ms 56.9897ms 17.5470 Ops/s 17.2077 Ops/s $\color{#35bf28}+1.97\%$
test_vmap_transformer_speed_decorator[False-False] 0.1039s 20.5405ms 48.6843 Ops/s 52.3704 Ops/s $\textbf{\color{#d91a1a}-7.04\%}$

@vmoens added the enhancement label Nov 29, 2023
@vmoens marked this pull request as ready for review November 29, 2023 09:50
@vmoens merged commit c80078c into main Nov 29, 2023
0 of 3 checks passed
@vmoens deleted the named-apply branch November 29, 2023 09:57