[BugFix,Feature] filter_empty in apply #661

Merged: vmoens merged 4 commits into main from filter-empty on Feb 5, 2024


@vmoens (Contributor) commented Feb 5, 2024

Using `apply` to filter data currently returns an empty tensordict, but one would expect to receive nothing at all when the operation returns nothing.
This PR fixes that with the `filter_empty` option in `apply`.
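
To make the intended behaviour concrete, here is a minimal plain-Python sketch that models it with nested dicts. It is not tensordict's implementation: `apply_nested` and `keep_positive` are hypothetical names used only for illustration, and the rule that a `None` result drops a leaf is an assumption of this model.

```python
from typing import Any, Callable, Optional


def apply_nested(
    fn: Callable[[Any], Any],
    data: dict,
    filter_empty: bool = False,
) -> Optional[dict]:
    """Apply ``fn`` to every leaf of a nested dict (hypothetical helper).

    Leaves for which ``fn`` returns None are dropped. With
    ``filter_empty=True``, sub-dicts that end up empty are dropped too,
    and None is returned when nothing at all is left.
    """
    out = {}
    for key, value in data.items():
        if isinstance(value, dict):
            # Recurse into nested containers with the same setting.
            result = apply_nested(fn, value, filter_empty=filter_empty)
        else:
            result = fn(value)
        if result is not None:
            out[key] = result
    if filter_empty and not out:
        # Nothing survived the filter: return nothing instead of an empty dict.
        return None
    return out


def keep_positive(x):
    """Filter function: keep positive values, drop everything else."""
    return x if x > 0 else None


data = {"a": -1, "nested": {"b": -2}}
kept_empty = apply_nested(keep_positive, data)
filtered = apply_nested(keep_positive, data, filter_empty=True)
mixed = apply_nested(keep_positive, {"a": 1, "nested": {"b": -2}}, filter_empty=True)
```

In this model, `kept_empty` still carries an empty `"nested"` container, `filtered` is `None`, and `mixed` keeps only the surviving leaf `"a"`.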

@facebook-github-bot added the CLA Signed label on Feb 5, 2024
@vmoens added the bug label on Feb 5, 2024

github-actions bot commented Feb 5, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 124. Improved: $\large\color{#35bf28}1$. Worsened: $\large\color{#d91a1a}14$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 37.4000μs 16.0874μs 62.1605 KOps/s 61.4813 KOps/s $\color{#35bf28}+1.10\%$
test_plain_set_stack_nested 0.1944ms 0.1429ms 6.9984 KOps/s 7.0385 KOps/s $\color{#d91a1a}-0.57\%$
test_plain_set_nested_inplace 53.9210μs 18.6673μs 53.5697 KOps/s 53.9683 KOps/s $\color{#d91a1a}-0.74\%$
test_plain_set_stack_nested_inplace 0.3824ms 0.1776ms 5.6293 KOps/s 5.6629 KOps/s $\color{#d91a1a}-0.59\%$
test_items 36.4680μs 2.4701μs 404.8436 KOps/s 408.5484 KOps/s $\color{#d91a1a}-0.91\%$
test_items_nested 0.4279ms 0.2782ms 3.5946 KOps/s 3.6969 KOps/s $\color{#d91a1a}-2.77\%$
test_items_nested_locked 0.8820ms 0.2823ms 3.5428 KOps/s 3.7683 KOps/s $\textbf{\color{#d91a1a}-5.98\%}$
test_items_nested_leaf 0.2986ms 0.1722ms 5.8081 KOps/s 6.0367 KOps/s $\color{#d91a1a}-3.79\%$
test_items_stack_nested 1.6018ms 1.3284ms 752.7741 Ops/s 771.1645 Ops/s $\color{#d91a1a}-2.38\%$
test_items_stack_nested_leaf 1.2991ms 1.2014ms 832.3746 Ops/s 856.2124 Ops/s $\color{#d91a1a}-2.78\%$
test_items_stack_nested_locked 1.2065ms 0.8758ms 1.1419 KOps/s 1.1868 KOps/s $\color{#d91a1a}-3.79\%$
test_keys 36.7890μs 3.8295μs 261.1290 KOps/s 256.1794 KOps/s $\color{#35bf28}+1.93\%$
test_keys_nested 1.7845ms 0.1517ms 6.5911 KOps/s 6.7780 KOps/s $\color{#d91a1a}-2.76\%$
test_keys_nested_locked 0.2238ms 0.1573ms 6.3580 KOps/s 6.5956 KOps/s $\color{#d91a1a}-3.60\%$
test_keys_nested_leaf 0.2523ms 0.1338ms 7.4750 KOps/s 7.7432 KOps/s $\color{#d91a1a}-3.46\%$
test_keys_stack_nested 1.4503ms 1.2616ms 792.6324 Ops/s 803.9246 Ops/s $\color{#d91a1a}-1.40\%$
test_keys_stack_nested_leaf 1.8718ms 1.2590ms 794.2789 Ops/s 805.0733 Ops/s $\color{#d91a1a}-1.34\%$
test_keys_stack_nested_locked 1.0379ms 0.7995ms 1.2508 KOps/s 1.2841 KOps/s $\color{#d91a1a}-2.59\%$
test_values 9.1220μs 1.1604μs 861.7741 KOps/s 857.9846 KOps/s $\color{#35bf28}+0.44\%$
test_values_nested 0.1001ms 51.8360μs 19.2916 KOps/s 19.5868 KOps/s $\color{#d91a1a}-1.51\%$
test_values_nested_locked 0.1007ms 51.9682μs 19.2425 KOps/s 19.3652 KOps/s $\color{#d91a1a}-0.63\%$
test_values_nested_leaf 81.3120μs 46.0768μs 21.7029 KOps/s 21.4939 KOps/s $\color{#35bf28}+0.97\%$
test_values_stack_nested 1.6070ms 1.0207ms 979.7392 Ops/s 993.7174 Ops/s $\color{#d91a1a}-1.41\%$
test_values_stack_nested_leaf 1.2544ms 1.0163ms 983.9316 Ops/s 1.0027 KOps/s $\color{#d91a1a}-1.87\%$
test_values_stack_nested_locked 1.1219ms 0.6064ms 1.6491 KOps/s 1.7044 KOps/s $\color{#d91a1a}-3.24\%$
test_membership 18.4440μs 1.3403μs 746.1218 KOps/s 741.7957 KOps/s $\color{#35bf28}+0.58\%$
test_membership_nested 27.1810μs 3.4267μs 291.8281 KOps/s 275.1532 KOps/s $\textbf{\color{#35bf28}+6.06\%}$
test_membership_nested_leaf 43.8420μs 3.4295μs 291.5858 KOps/s 277.7391 KOps/s $\color{#35bf28}+4.99\%$
test_membership_stacked_nested 51.5060μs 11.8966μs 84.0574 KOps/s 82.1652 KOps/s $\color{#35bf28}+2.30\%$
test_membership_stacked_nested_leaf 36.8790μs 11.9339μs 83.7946 KOps/s 82.6133 KOps/s $\color{#35bf28}+1.43\%$
test_membership_nested_last 43.9820μs 6.7184μs 148.8444 KOps/s 150.7172 KOps/s $\color{#d91a1a}-1.24\%$
test_membership_nested_leaf_last 27.5610μs 6.7607μs 147.9128 KOps/s 150.0744 KOps/s $\color{#d91a1a}-1.44\%$
test_membership_stacked_nested_last 0.3360ms 0.1740ms 5.7469 KOps/s 5.7723 KOps/s $\color{#d91a1a}-0.44\%$
test_membership_stacked_nested_leaf_last 70.5420μs 14.0336μs 71.2575 KOps/s 70.7096 KOps/s $\color{#35bf28}+0.77\%$
test_nested_getleaf 51.3250μs 10.5448μs 94.8334 KOps/s 96.1192 KOps/s $\color{#d91a1a}-1.34\%$
test_nested_get 46.7780μs 9.9799μs 100.2015 KOps/s 100.5372 KOps/s $\color{#d91a1a}-0.33\%$
test_stacked_getleaf 0.5965ms 0.3958ms 2.5266 KOps/s 2.6012 KOps/s $\color{#d91a1a}-2.87\%$
test_stacked_get 0.5749ms 0.3643ms 2.7448 KOps/s 2.8095 KOps/s $\color{#d91a1a}-2.30\%$
test_nested_getitemleaf 50.4740μs 12.0121μs 83.2493 KOps/s 82.4929 KOps/s $\color{#35bf28}+0.92\%$
test_nested_getitem 47.0280μs 11.6865μs 85.5685 KOps/s 86.3529 KOps/s $\color{#d91a1a}-0.91\%$
test_stacked_getitemleaf 0.6953ms 0.3988ms 2.5073 KOps/s 2.5349 KOps/s $\color{#d91a1a}-1.09\%$
test_stacked_getitem 0.5924ms 0.3674ms 2.7219 KOps/s 2.7769 KOps/s $\color{#d91a1a}-1.98\%$
test_lock_nested 0.7001ms 0.3318ms 3.0135 KOps/s 3.0638 KOps/s $\color{#d91a1a}-1.64\%$
test_lock_stack_nested 83.3134ms 5.7698ms 173.3149 Ops/s 185.5933 Ops/s $\textbf{\color{#d91a1a}-6.62\%}$
test_unlock_nested 71.5178ms 0.4061ms 2.4627 KOps/s 3.0307 KOps/s $\textbf{\color{#d91a1a}-18.74\%}$
test_unlock_stack_nested 84.6589ms 5.8619ms 170.5944 Ops/s 184.1504 Ops/s $\textbf{\color{#d91a1a}-7.36\%}$
test_flatten_speed 0.7483ms 0.3718ms 2.6898 KOps/s 2.7736 KOps/s $\color{#d91a1a}-3.02\%$
test_unflatten_speed 0.7718ms 0.4594ms 2.1769 KOps/s 2.2332 KOps/s $\color{#d91a1a}-2.52\%$
test_common_ops 3.8534ms 0.6555ms 1.5256 KOps/s 1.5397 KOps/s $\color{#d91a1a}-0.92\%$
test_creation 45.0740μs 1.8558μs 538.8572 KOps/s 520.8058 KOps/s $\color{#35bf28}+3.47\%$
test_creation_empty 45.6350μs 8.6794μs 115.2157 KOps/s 110.6290 KOps/s $\color{#35bf28}+4.15\%$
test_creation_nested_1 33.8330μs 11.4194μs 87.5702 KOps/s 85.4389 KOps/s $\color{#35bf28}+2.49\%$
test_creation_nested_2 39.2730μs 14.6388μs 68.3116 KOps/s 67.0319 KOps/s $\color{#35bf28}+1.91\%$
test_clone 42.8100μs 13.1668μs 75.9484 KOps/s 78.4480 KOps/s $\color{#d91a1a}-3.19\%$
test_getitem[int] 36.0170μs 11.0514μs 90.4863 KOps/s 92.0827 KOps/s $\color{#d91a1a}-1.73\%$
test_getitem[slice_int] 53.1590μs 22.2209μs 45.0027 KOps/s 45.9709 KOps/s $\color{#d91a1a}-2.11\%$
test_getitem[range] 0.1033ms 41.9052μs 23.8634 KOps/s 25.0409 KOps/s $\color{#d91a1a}-4.70\%$
test_getitem[tuple] 68.6690μs 18.4122μs 54.3118 KOps/s 56.2007 KOps/s $\color{#d91a1a}-3.36\%$
test_getitem[list] 0.1173ms 37.1161μs 26.9425 KOps/s 28.5577 KOps/s $\textbf{\color{#d91a1a}-5.66\%}$
test_setitem_dim[int] 56.3760μs 28.1024μs 35.5841 KOps/s 34.5975 KOps/s $\color{#35bf28}+2.85\%$
test_setitem_dim[slice_int] 88.2250μs 53.5805μs 18.6635 KOps/s 18.1717 KOps/s $\color{#35bf28}+2.71\%$
test_setitem_dim[range] 0.1228ms 73.2582μs 13.6503 KOps/s 13.8885 KOps/s $\color{#d91a1a}-1.71\%$
test_setitem_dim[tuple] 89.0670μs 42.7669μs 23.3826 KOps/s 22.7794 KOps/s $\color{#35bf28}+2.65\%$
test_setitem 67.4960μs 19.3686μs 51.6301 KOps/s 54.4895 KOps/s $\textbf{\color{#d91a1a}-5.25\%}$
test_set 54.0610μs 18.2962μs 54.6560 KOps/s 56.2136 KOps/s $\color{#d91a1a}-2.77\%$
test_set_shared 3.3457ms 0.1376ms 7.2678 KOps/s 7.3137 KOps/s $\color{#d91a1a}-0.63\%$
test_update 0.1046ms 19.9828μs 50.0429 KOps/s 49.8141 KOps/s $\color{#35bf28}+0.46\%$
test_update_nested 0.1017ms 28.6474μs 34.9072 KOps/s 36.6819 KOps/s $\color{#d91a1a}-4.84\%$
test_set_nested 0.1625ms 21.0294μs 47.5524 KOps/s 50.4406 KOps/s $\textbf{\color{#d91a1a}-5.73\%}$
test_set_nested_new 94.1160μs 24.4397μs 40.9170 KOps/s 42.9231 KOps/s $\color{#d91a1a}-4.67\%$
test_select 89.7270μs 36.9150μs 27.0892 KOps/s 27.5511 KOps/s $\color{#d91a1a}-1.68\%$
test_select_nested 0.1392ms 56.8204μs 17.5993 KOps/s 17.7223 KOps/s $\color{#d91a1a}-0.69\%$
test_exclude_nested 0.2273ms 0.1177ms 8.4958 KOps/s 8.6456 KOps/s $\color{#d91a1a}-1.73\%$
test_empty[True] 1.1166ms 0.4179ms 2.3931 KOps/s 2.4648 KOps/s $\color{#d91a1a}-2.91\%$
test_empty[False] 7.5020μs 1.0535μs 949.2128 KOps/s 986.8411 KOps/s $\color{#d91a1a}-3.81\%$
test_unbind_speed 0.4438ms 0.2433ms 4.1102 KOps/s 4.1479 KOps/s $\color{#d91a1a}-0.91\%$
test_unbind_speed_stack0 81.7793ms 3.3140ms 301.7457 Ops/s 309.6160 Ops/s $\color{#d91a1a}-2.54\%$
test_unbind_speed_stack1 36.4780μs 2.0413μs 489.8803 KOps/s 522.8382 KOps/s $\textbf{\color{#d91a1a}-6.30\%}$
test_split 72.7985ms 1.6090ms 621.4929 Ops/s 699.8221 Ops/s $\textbf{\color{#d91a1a}-11.19\%}$
test_chunk 61.6824ms 1.5438ms 647.7695 Ops/s 650.3165 Ops/s $\color{#d91a1a}-0.39\%$
test_creation[device0] 0.1880ms 98.7003μs 10.1317 KOps/s 9.7649 KOps/s $\color{#35bf28}+3.76\%$
test_creation_from_tensor 3.3871ms 81.2487μs 12.3079 KOps/s 12.2171 KOps/s $\color{#35bf28}+0.74\%$
test_add_one[memmap_tensor0] 0.1798ms 5.3716μs 186.1629 KOps/s 190.7634 KOps/s $\color{#d91a1a}-2.41\%$
test_contiguous[memmap_tensor0] 14.2360μs 0.6271μs 1.5945 MOps/s 1.6042 MOps/s $\color{#d91a1a}-0.60\%$
test_stack[memmap_tensor0] 58.9800μs 3.5557μs 281.2378 KOps/s 284.1787 KOps/s $\color{#d91a1a}-1.03\%$
test_memmaptd_index 1.0429ms 0.2354ms 4.2474 KOps/s 4.3298 KOps/s $\color{#d91a1a}-1.90\%$
test_memmaptd_index_astensor 5.0222ms 0.2974ms 3.3621 KOps/s 3.4292 KOps/s $\color{#d91a1a}-1.96\%$
test_memmaptd_index_op 0.9027ms 0.5650ms 1.7698 KOps/s 1.8024 KOps/s $\color{#d91a1a}-1.81\%$
test_serialize_model 0.1724s 0.1108s 9.0281 Ops/s 9.0665 Ops/s $\color{#d91a1a}-0.42\%$
test_serialize_model_pickle 0.4476s 0.3788s 2.6396 Ops/s 2.6266 Ops/s $\color{#35bf28}+0.49\%$
test_serialize_weights 0.1670s 0.1029s 9.7184 Ops/s 9.3607 Ops/s $\color{#35bf28}+3.82\%$
test_serialize_weights_returnearly 0.3526s 0.1444s 6.9274 Ops/s 8.1221 Ops/s $\textbf{\color{#d91a1a}-14.71\%}$
test_serialize_weights_pickle 0.6595s 0.4849s 2.0621 Ops/s 2.4193 Ops/s $\textbf{\color{#d91a1a}-14.76\%}$
test_serialize_weights_filesystem 99.3771ms 92.8895ms 10.7655 Ops/s 10.5395 Ops/s $\color{#35bf28}+2.14\%$
test_serialize_model_filesystem 0.1633s 97.7891ms 10.2261 Ops/s 10.2806 Ops/s $\color{#d91a1a}-0.53\%$
test_reshape_pytree 60.6940μs 20.9108μs 47.8222 KOps/s 48.4423 KOps/s $\color{#d91a1a}-1.28\%$
test_reshape_td 88.0850μs 31.0329μs 32.2239 KOps/s 33.9837 KOps/s $\textbf{\color{#d91a1a}-5.18\%}$
test_view_pytree 72.7360μs 20.6673μs 48.3857 KOps/s 48.5341 KOps/s $\color{#d91a1a}-0.31\%$
test_view_td 81.8820ms 11.2384μs 88.9804 KOps/s 94.0764 KOps/s $\textbf{\color{#d91a1a}-5.42\%}$
test_unbind_pytree 50.7750μs 24.0899μs 41.5111 KOps/s 41.7016 KOps/s $\color{#d91a1a}-0.46\%$
test_unbind_td 0.1369ms 36.0025μs 27.7759 KOps/s 28.6273 KOps/s $\color{#d91a1a}-2.97\%$
test_split_pytree 50.7260μs 23.6998μs 42.1944 KOps/s 42.9812 KOps/s $\color{#d91a1a}-1.83\%$
test_split_td 0.3491ms 40.0524μs 24.9673 KOps/s 25.8843 KOps/s $\color{#d91a1a}-3.54\%$
test_add_pytree 65.0010μs 29.5565μs 33.8335 KOps/s 34.0004 KOps/s $\color{#d91a1a}-0.49\%$
test_add_td 0.1104ms 51.6779μs 19.3506 KOps/s 21.0544 KOps/s $\textbf{\color{#d91a1a}-8.09\%}$
test_distributed 0.2124ms 99.4738μs 10.0529 KOps/s 10.1393 KOps/s $\color{#d91a1a}-0.85\%$
test_tdmodule 97.9830μs 21.0789μs 47.4408 KOps/s 47.0386 KOps/s $\color{#35bf28}+0.85\%$
test_tdmodule_dispatch 0.1512ms 41.4653μs 24.1165 KOps/s 24.0441 KOps/s $\color{#35bf28}+0.30\%$
test_tdseq 51.7770μs 23.9144μs 41.8158 KOps/s 41.5706 KOps/s $\color{#35bf28}+0.59\%$
test_tdseq_dispatch 0.1376ms 44.7582μs 22.3423 KOps/s 22.1215 KOps/s $\color{#35bf28}+1.00\%$
test_instantiation_functorch 1.9271ms 1.3311ms 751.2741 Ops/s 780.1487 Ops/s $\color{#d91a1a}-3.70\%$
test_instantiation_td 1.4927ms 1.0093ms 990.7889 Ops/s 1.0231 KOps/s $\color{#d91a1a}-3.16\%$
test_exec_functorch 0.3368ms 0.1578ms 6.3389 KOps/s 6.3812 KOps/s $\color{#d91a1a}-0.66\%$
test_exec_functional_call 0.2828ms 0.1447ms 6.9088 KOps/s 6.6738 KOps/s $\color{#35bf28}+3.52\%$
test_exec_td 0.2855ms 0.1421ms 7.0370 KOps/s 7.0190 KOps/s $\color{#35bf28}+0.26\%$
test_exec_td_decorator 0.8352ms 0.1978ms 5.0547 KOps/s 5.1532 KOps/s $\color{#d91a1a}-1.91\%$
test_vmap_mlp_speed[True-True] 1.3364ms 0.8719ms 1.1469 KOps/s 1.1460 KOps/s $\color{#35bf28}+0.08\%$
test_vmap_mlp_speed[True-False] 0.6166ms 0.4561ms 2.1923 KOps/s 2.1710 KOps/s $\color{#35bf28}+0.98\%$
test_vmap_mlp_speed[False-True] 0.9518ms 0.7547ms 1.3251 KOps/s 1.3288 KOps/s $\color{#d91a1a}-0.28\%$
test_vmap_mlp_speed[False-False] 0.5788ms 0.3759ms 2.6606 KOps/s 2.5367 KOps/s $\color{#35bf28}+4.88\%$
test_vmap_mlp_speed_decorator[True-True] 2.9998ms 2.2864ms 437.3773 Ops/s 450.5868 Ops/s $\color{#d91a1a}-2.93\%$
test_vmap_mlp_speed_decorator[True-False] 1.0401ms 0.5301ms 1.8863 KOps/s 1.8839 KOps/s $\color{#35bf28}+0.12\%$
test_vmap_mlp_speed_decorator[False-True] 3.5039ms 1.8680ms 535.3236 Ops/s 554.6873 Ops/s $\color{#d91a1a}-3.49\%$
test_vmap_mlp_speed_decorator[False-False] 0.7117ms 0.4090ms 2.4448 KOps/s 2.4528 KOps/s $\color{#d91a1a}-0.32\%$


github-actions bot commented Feb 5, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 132. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}3$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 86.0510μs 14.1413μs 70.7151 KOps/s 69.0971 KOps/s $\color{#35bf28}+2.34\%$
test_plain_set_stack_nested 0.1746ms 0.1204ms 8.3023 KOps/s 8.2513 KOps/s $\color{#35bf28}+0.62\%$
test_plain_set_nested_inplace 50.7110μs 15.4023μs 64.9252 KOps/s 63.6198 KOps/s $\color{#35bf28}+2.05\%$
test_plain_set_stack_nested_inplace 0.2217ms 0.1490ms 6.7121 KOps/s 6.6873 KOps/s $\color{#35bf28}+0.37\%$
test_items 28.2610μs 4.7165μs 212.0207 KOps/s 211.9847 KOps/s $\color{#35bf28}+0.02\%$
test_items_nested 0.3983ms 0.3388ms 2.9514 KOps/s 2.9477 KOps/s $\color{#35bf28}+0.12\%$
test_items_nested_locked 0.4157ms 0.3445ms 2.9026 KOps/s 2.9203 KOps/s $\color{#d91a1a}-0.61\%$
test_items_nested_leaf 0.2436ms 0.2007ms 4.9818 KOps/s 4.9906 KOps/s $\color{#d91a1a}-0.18\%$
test_items_stack_nested 1.4225ms 1.3264ms 753.9386 Ops/s 757.6844 Ops/s $\color{#d91a1a}-0.49\%$
test_items_stack_nested_leaf 1.3397ms 1.1738ms 851.9129 Ops/s 862.3032 Ops/s $\color{#d91a1a}-1.20\%$
test_items_stack_nested_locked 0.9973ms 0.9259ms 1.0801 KOps/s 1.1072 KOps/s $\color{#d91a1a}-2.45\%$
test_keys 31.0000μs 4.5721μs 218.7199 KOps/s 220.1305 KOps/s $\color{#d91a1a}-0.64\%$
test_keys_nested 0.8206ms 95.1077μs 10.5144 KOps/s 10.4915 KOps/s $\color{#35bf28}+0.22\%$
test_keys_nested_locked 0.1393ms 98.4410μs 10.1584 KOps/s 10.0496 KOps/s $\color{#35bf28}+1.08\%$
test_keys_nested_leaf 0.1846ms 78.9232μs 12.6705 KOps/s 12.7156 KOps/s $\color{#d91a1a}-0.35\%$
test_keys_stack_nested 1.2792ms 1.1654ms 858.0649 Ops/s 863.2518 Ops/s $\color{#d91a1a}-0.60\%$
test_keys_stack_nested_leaf 1.2697ms 1.1353ms 880.8255 Ops/s 871.7342 Ops/s $\color{#35bf28}+1.04\%$
test_keys_stack_nested_locked 0.8799ms 0.7451ms 1.3421 KOps/s 1.3628 KOps/s $\color{#d91a1a}-1.52\%$
test_values 8.7333μs 1.8832μs 531.0030 KOps/s 529.2852 KOps/s $\color{#35bf28}+0.32\%$
test_values_nested 66.0910μs 45.0602μs 22.1925 KOps/s 22.2459 KOps/s $\color{#d91a1a}-0.24\%$
test_values_nested_locked 70.3910μs 47.7686μs 20.9343 KOps/s 21.1202 KOps/s $\color{#d91a1a}-0.88\%$
test_values_nested_leaf 65.5510μs 39.5489μs 25.2851 KOps/s 25.4353 KOps/s $\color{#d91a1a}-0.59\%$
test_values_stack_nested 1.0212ms 0.9581ms 1.0438 KOps/s 1.0417 KOps/s $\color{#35bf28}+0.20\%$
test_values_stack_nested_leaf 1.0137ms 0.9538ms 1.0485 KOps/s 1.0423 KOps/s $\color{#35bf28}+0.59\%$
test_values_stack_nested_locked 0.6999ms 0.5660ms 1.7669 KOps/s 1.7396 KOps/s $\color{#35bf28}+1.57\%$
test_membership 5.3020μs 0.9247μs 1.0814 MOps/s 1.0734 MOps/s $\color{#35bf28}+0.75\%$
test_membership_nested 24.0800μs 2.8783μs 347.4222 KOps/s 339.7383 KOps/s $\color{#35bf28}+2.26\%$
test_membership_nested_leaf 33.3910μs 2.9314μs 341.1313 KOps/s 338.6598 KOps/s $\color{#35bf28}+0.73\%$
test_membership_stacked_nested 34.3910μs 11.4074μs 87.6623 KOps/s 87.0180 KOps/s $\color{#35bf28}+0.74\%$
test_membership_stacked_nested_leaf 25.7500μs 11.3923μs 87.7784 KOps/s 87.6523 KOps/s $\color{#35bf28}+0.14\%$
test_membership_nested_last 38.3100μs 5.3951μs 185.3532 KOps/s 190.4159 KOps/s $\color{#d91a1a}-2.66\%$
test_membership_nested_leaf_last 35.3800μs 5.3504μs 186.9033 KOps/s 190.2934 KOps/s $\color{#d91a1a}-1.78\%$
test_membership_stacked_nested_last 0.1981ms 0.1561ms 6.4044 KOps/s 6.3030 KOps/s $\color{#35bf28}+1.61\%$
test_membership_stacked_nested_leaf_last 44.3700μs 13.1955μs 75.7835 KOps/s 75.9568 KOps/s $\color{#d91a1a}-0.23\%$
test_nested_getleaf 32.7310μs 8.4810μs 117.9112 KOps/s 118.5285 KOps/s $\color{#d91a1a}-0.52\%$
test_nested_get 29.4600μs 7.9640μs 125.5645 KOps/s 125.5773 KOps/s $\color{#d91a1a}-0.01\%$
test_stacked_getleaf 0.3781ms 0.3303ms 3.0272 KOps/s 3.0309 KOps/s $\color{#d91a1a}-0.12\%$
test_stacked_get 0.3204ms 0.2990ms 3.3450 KOps/s 3.3628 KOps/s $\color{#d91a1a}-0.53\%$
test_nested_getitemleaf 39.4500μs 9.8321μs 101.7078 KOps/s 102.0088 KOps/s $\color{#d91a1a}-0.30\%$
test_nested_getitem 33.4610μs 9.3645μs 106.7860 KOps/s 106.5922 KOps/s $\color{#35bf28}+0.18\%$
test_stacked_getitemleaf 0.3769ms 0.3330ms 3.0029 KOps/s 3.0001 KOps/s $\color{#35bf28}+0.09\%$
test_stacked_getitem 0.3559ms 0.3001ms 3.3318 KOps/s 3.3612 KOps/s $\color{#d91a1a}-0.87\%$
test_lock_nested 0.7800ms 0.3549ms 2.8175 KOps/s 2.7713 KOps/s $\color{#35bf28}+1.67\%$
test_lock_stack_nested 86.0058ms 6.3848ms 156.6226 Ops/s 158.1819 Ops/s $\color{#d91a1a}-0.99\%$
test_unlock_nested 79.0979ms 0.4330ms 2.3093 KOps/s 2.8079 KOps/s $\textbf{\color{#d91a1a}-17.76\%}$
test_unlock_stack_nested 86.4698ms 6.4731ms 154.4851 Ops/s 154.1654 Ops/s $\color{#35bf28}+0.21\%$
test_flatten_speed 0.3219ms 0.2630ms 3.8018 KOps/s 3.8040 KOps/s $\color{#d91a1a}-0.06\%$
test_unflatten_speed 0.4186ms 0.3640ms 2.7475 KOps/s 2.7540 KOps/s $\color{#d91a1a}-0.24\%$
test_common_ops 1.0511ms 0.6077ms 1.6455 KOps/s 1.5557 KOps/s $\textbf{\color{#35bf28}+5.77\%}$
test_creation 50.1510μs 1.5617μs 640.3394 KOps/s 631.3782 KOps/s $\color{#35bf28}+1.42\%$
test_creation_empty 27.3100μs 9.0555μs 110.4301 KOps/s 99.9986 KOps/s $\textbf{\color{#35bf28}+10.43\%}$
test_creation_nested_1 39.6900μs 10.7760μs 92.7988 KOps/s 85.0153 KOps/s $\textbf{\color{#35bf28}+9.16\%}$
test_creation_nested_2 29.1800μs 13.1417μs 76.0934 KOps/s 69.6412 KOps/s $\textbf{\color{#35bf28}+9.26\%}$
test_clone 57.8310μs 13.4163μs 74.5364 KOps/s 71.7492 KOps/s $\color{#35bf28}+3.88\%$
test_getitem[int] 33.6610μs 10.9059μs 91.6939 KOps/s 90.6532 KOps/s $\color{#35bf28}+1.15\%$
test_getitem[slice_int] 49.6810μs 21.3278μs 46.8871 KOps/s 44.9825 KOps/s $\color{#35bf28}+4.23\%$
test_getitem[range] 0.1641ms 36.9628μs 27.0542 KOps/s 25.6416 KOps/s $\textbf{\color{#35bf28}+5.51\%}$
test_getitem[tuple] 45.5610μs 18.9508μs 52.7683 KOps/s 51.3590 KOps/s $\color{#35bf28}+2.74\%$
test_getitem[list] 0.1903ms 33.5327μs 29.8217 KOps/s 27.7964 KOps/s $\textbf{\color{#35bf28}+7.29\%}$
test_setitem_dim[int] 43.2610μs 27.4830μs 36.3861 KOps/s 35.2656 KOps/s $\color{#35bf28}+3.18\%$
test_setitem_dim[slice_int] 74.7510μs 49.8741μs 20.0505 KOps/s 20.0367 KOps/s $\color{#35bf28}+0.07\%$
test_setitem_dim[range] 0.1009ms 68.3147μs 14.6381 KOps/s 15.3083 KOps/s $\color{#d91a1a}-4.38\%$
test_setitem_dim[tuple] 65.7600μs 43.8556μs 22.8021 KOps/s 23.2732 KOps/s $\color{#d91a1a}-2.02\%$
test_setitem 63.9310μs 19.1395μs 52.2479 KOps/s 50.7790 KOps/s $\color{#35bf28}+2.89\%$
test_set 70.5810μs 18.7647μs 53.2916 KOps/s 52.2449 KOps/s $\color{#35bf28}+2.00\%$
test_set_shared 2.6404ms 0.1032ms 9.6919 KOps/s 9.6607 KOps/s $\color{#35bf28}+0.32\%$
test_update 82.8610μs 21.0824μs 47.4329 KOps/s 44.0904 KOps/s $\textbf{\color{#35bf28}+7.58\%}$
test_update_nested 75.4010μs 27.7152μs 36.0813 KOps/s 34.3752 KOps/s $\color{#35bf28}+4.96\%$
test_set_nested 74.2510μs 19.2147μs 52.0436 KOps/s 48.1027 KOps/s $\textbf{\color{#35bf28}+8.19\%}$
test_set_nested_new 76.4110μs 22.3078μs 44.8273 KOps/s 42.0292 KOps/s $\textbf{\color{#35bf28}+6.66\%}$
test_select 73.1110μs 34.8462μs 28.6975 KOps/s 26.6346 KOps/s $\textbf{\color{#35bf28}+7.75\%}$
test_select_nested 74.7910μs 52.6981μs 18.9760 KOps/s 18.8969 KOps/s $\color{#35bf28}+0.42\%$
test_exclude_nested 0.1461ms 0.1121ms 8.9208 KOps/s 8.7474 KOps/s $\color{#35bf28}+1.98\%$
test_empty[True] 1.2853ms 0.3921ms 2.5502 KOps/s 2.5805 KOps/s $\color{#d91a1a}-1.17\%$
test_empty[False] 2.8681μs 0.8397μs 1.1909 MOps/s 1.1711 MOps/s $\color{#35bf28}+1.68\%$
test_to 74.3610μs 55.3168μs 18.0777 KOps/s 17.9190 KOps/s $\color{#35bf28}+0.89\%$
test_to_nonblocking 59.4000μs 35.4078μs 28.2423 KOps/s 27.7872 KOps/s $\color{#35bf28}+1.64\%$
test_unbind_speed 0.4018ms 0.2681ms 3.7306 KOps/s 3.6800 KOps/s $\color{#35bf28}+1.38\%$
test_unbind_speed_stack0 3.1426ms 3.0177ms 331.3823 Ops/s 285.1018 Ops/s $\textbf{\color{#35bf28}+16.23\%}$
test_unbind_speed_stack1 19.0900μs 1.8090μs 552.7795 KOps/s 548.3014 KOps/s $\color{#35bf28}+0.82\%$
test_split 2.1451ms 1.5296ms 653.7841 Ops/s 629.4648 Ops/s $\color{#35bf28}+3.86\%$
test_chunk 81.5098ms 1.6563ms 603.7680 Ops/s 583.0582 Ops/s $\color{#35bf28}+3.55\%$
test_creation[device0] 0.1366ms 73.1666μs 13.6674 KOps/s 13.5332 KOps/s $\color{#35bf28}+0.99\%$
test_creation_from_tensor 0.1438ms 57.7721μs 17.3094 KOps/s 18.1719 KOps/s $\color{#d91a1a}-4.75\%$
test_add_one[memmap_tensor0] 0.2336ms 6.7493μs 148.1639 KOps/s 145.7685 KOps/s $\color{#35bf28}+1.64\%$
test_contiguous[memmap_tensor0] 11.0500μs 0.6522μs 1.5332 MOps/s 1.5109 MOps/s $\color{#35bf28}+1.47\%$
test_stack[memmap_tensor0] 42.9510μs 4.4676μs 223.8349 KOps/s 216.3017 KOps/s $\color{#35bf28}+3.48\%$
test_memmaptd_index 1.0529ms 0.2644ms 3.7816 KOps/s 3.6979 KOps/s $\color{#35bf28}+2.26\%$
test_memmaptd_index_astensor 0.6349ms 0.3232ms 3.0937 KOps/s 3.0702 KOps/s $\color{#35bf28}+0.77\%$
test_memmaptd_index_op 0.9687ms 0.6262ms 1.5970 KOps/s 1.5354 KOps/s $\color{#35bf28}+4.01\%$
test_serialize_model 0.1742s 97.9928ms 10.2048 Ops/s 9.6443 Ops/s $\textbf{\color{#35bf28}+5.81\%}$
test_serialize_model_pickle 1.3686s 1.2394s 0.8069 Ops/s 0.8077 Ops/s $\color{#d91a1a}-0.10\%$
test_serialize_weights 0.1723s 96.3330ms 10.3807 Ops/s 9.9485 Ops/s $\color{#35bf28}+4.34\%$
test_serialize_weights_returnearly 0.2187s 71.0024ms 14.0840 Ops/s 14.2536 Ops/s $\color{#d91a1a}-1.19\%$
test_serialize_weights_pickle 1.3577s 1.2365s 0.8087 Ops/s 0.8084 Ops/s $\color{#35bf28}+0.04\%$
test_reshape_pytree 46.6110μs 25.3940μs 39.3794 KOps/s 39.1306 KOps/s $\color{#35bf28}+0.64\%$
test_reshape_td 0.1841ms 33.5541μs 29.8027 KOps/s 32.7897 KOps/s $\textbf{\color{#d91a1a}-9.11\%}$
test_view_pytree 0.1538ms 26.5066μs 37.7264 KOps/s 39.6346 KOps/s $\color{#d91a1a}-4.81\%$
test_view_td 0.5272ms 6.9132μs 144.6512 KOps/s 146.9994 KOps/s $\color{#d91a1a}-1.60\%$
test_unbind_pytree 53.6910μs 31.0826μs 32.1724 KOps/s 32.0257 KOps/s $\color{#35bf28}+0.46\%$
test_unbind_td 0.3919ms 40.7836μs 24.5197 KOps/s 22.3938 KOps/s $\textbf{\color{#35bf28}+9.49\%}$
test_split_pytree 53.3800μs 29.6778μs 33.6953 KOps/s 33.4284 KOps/s $\color{#35bf28}+0.80\%$
test_split_td 0.1121ms 39.2471μs 25.4796 KOps/s 24.9549 KOps/s $\color{#35bf28}+2.10\%$
test_add_pytree 58.2110μs 36.8088μs 27.1674 KOps/s 27.4773 KOps/s $\color{#d91a1a}-1.13\%$
test_add_td 92.1210μs 50.4001μs 19.8412 KOps/s 18.6483 KOps/s $\textbf{\color{#35bf28}+6.40\%}$
test_distributed 1.6315ms 71.6282μs 13.9610 KOps/s 13.2522 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_tdmodule 56.0100μs 18.3982μs 54.3531 KOps/s 52.4967 KOps/s $\color{#35bf28}+3.54\%$
test_tdmodule_dispatch 0.2158ms 37.7789μs 26.4698 KOps/s 25.3309 KOps/s $\color{#35bf28}+4.50\%$
test_tdseq 41.5410μs 21.2323μs 47.0982 KOps/s 46.5250 KOps/s $\color{#35bf28}+1.23\%$
test_tdseq_dispatch 57.1410μs 40.2215μs 24.8623 KOps/s 24.2667 KOps/s $\color{#35bf28}+2.45\%$
test_instantiation_functorch 1.7781ms 1.6786ms 595.7299 Ops/s 600.8326 Ops/s $\color{#d91a1a}-0.85\%$
test_instantiation_td 1.6972ms 1.1654ms 858.0837 Ops/s 876.2204 Ops/s $\color{#d91a1a}-2.07\%$
test_exec_functorch 0.2113ms 0.1597ms 6.2619 KOps/s 6.1872 KOps/s $\color{#35bf28}+1.21\%$
test_exec_functional_call 0.2323ms 0.1574ms 6.3538 KOps/s 6.3414 KOps/s $\color{#35bf28}+0.20\%$
test_exec_td 0.1874ms 0.1518ms 6.5871 KOps/s 6.7379 KOps/s $\color{#d91a1a}-2.24\%$
test_exec_td_decorator 0.8466ms 0.2075ms 4.8192 KOps/s 4.8686 KOps/s $\color{#d91a1a}-1.01\%$
test_vmap_mlp_speed[True-True] 1.2300ms 1.0567ms 946.3398 Ops/s 936.0161 Ops/s $\color{#35bf28}+1.10\%$
test_vmap_mlp_speed[True-False] 0.6874ms 0.6159ms 1.6235 KOps/s 1.6133 KOps/s $\color{#35bf28}+0.64\%$
test_vmap_mlp_speed[False-True] 1.0762ms 0.9943ms 1.0058 KOps/s 1.0294 KOps/s $\color{#d91a1a}-2.30\%$
test_vmap_mlp_speed[False-False] 0.6036ms 0.5548ms 1.8024 KOps/s 1.8354 KOps/s $\color{#d91a1a}-1.80\%$
test_vmap_mlp_speed_decorator[True-True] 3.0501ms 2.3836ms 419.5388 Ops/s 413.8199 Ops/s $\color{#35bf28}+1.38\%$
test_vmap_mlp_speed_decorator[True-False] 1.1271ms 0.6803ms 1.4700 KOps/s 1.3085 KOps/s $\textbf{\color{#35bf28}+12.34\%}$
test_vmap_mlp_speed_decorator[False-True] 0.1173s 2.2177ms 450.9093 Ops/s 496.1545 Ops/s $\textbf{\color{#d91a1a}-9.12\%}$
test_vmap_mlp_speed_decorator[False-False] 1.0429ms 0.5821ms 1.7178 KOps/s 1.7311 KOps/s $\color{#d91a1a}-0.77\%$
test_vmap_transformer_speed[True-True] 12.6318ms 12.4799ms 80.1289 Ops/s 79.9200 Ops/s $\color{#35bf28}+0.26\%$
test_vmap_transformer_speed[True-False] 8.4793ms 8.2348ms 121.4358 Ops/s 121.1000 Ops/s $\color{#35bf28}+0.28\%$
test_vmap_transformer_speed[False-True] 12.4949ms 12.3441ms 81.0105 Ops/s 80.5045 Ops/s $\color{#35bf28}+0.63\%$
test_vmap_transformer_speed[False-False] 8.4841ms 8.1638ms 122.4921 Ops/s 121.9523 Ops/s $\color{#35bf28}+0.44\%$
test_vmap_transformer_speed_decorator[True-True] 75.3349ms 74.5797ms 13.4085 Ops/s 13.3465 Ops/s $\color{#35bf28}+0.46\%$
test_vmap_transformer_speed_decorator[True-False] 21.8009ms 20.1021ms 49.7460 Ops/s 49.7970 Ops/s $\color{#d91a1a}-0.10\%$
test_vmap_transformer_speed_decorator[False-True] 67.7733ms 66.9209ms 14.9430 Ops/s 14.8325 Ops/s $\color{#35bf28}+0.75\%$
test_vmap_transformer_speed_decorator[False-False] 0.1538s 22.2432ms 44.9575 Ops/s 44.9148 Ops/s $\color{#35bf28}+0.10\%$

@vmoens merged commit f11eac6 into main on Feb 5, 2024
48 checks passed
@vmoens deleted the filter-empty branch on February 5, 2024 18:16
vmoens added a commit that referenced this pull request Mar 25, 2024