[BE] Better warning for composite_lp_aggregate value #1201
Merged
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 69.8300μs | 20.2566μs | 49.3666 KOps/s | 47.1086 KOps/s | |
test_plain_set_stack_nested | 41.0560μs | 20.6276μs | 48.4788 KOps/s | 46.6557 KOps/s | |
test_plain_set_nested_inplace | 54.0710μs | 22.1401μs | 45.1669 KOps/s | 43.3248 KOps/s | |
test_plain_set_stack_nested_inplace | 60.4430μs | 22.2204μs | 45.0038 KOps/s | 43.5230 KOps/s | |
test_items | 22.3810μs | 4.2421μs | 235.7345 KOps/s | 238.2533 KOps/s | |
test_items_nested | 0.6137ms | 0.3991ms | 2.5055 KOps/s | 2.4310 KOps/s | |
test_items_nested_locked | 0.5736ms | 0.3965ms | 2.5224 KOps/s | 2.4417 KOps/s | |
test_items_nested_leaf | 0.1397ms | 76.2608μs | 13.1129 KOps/s | 13.0911 KOps/s | |
test_items_stack_nested | 0.7166ms | 0.4028ms | 2.4824 KOps/s | 2.4216 KOps/s | |
test_items_stack_nested_leaf | 0.1462ms | 78.6149μs | 12.7202 KOps/s | 13.0746 KOps/s | |
test_items_stack_nested_locked | 0.6942ms | 0.3991ms | 2.5054 KOps/s | 2.4176 KOps/s | |
test_keys | 23.3040μs | 3.4632μs | 288.7485 KOps/s | 288.5053 KOps/s | |
test_keys_nested | 0.2237ms | 0.1629ms | 6.1392 KOps/s | 6.0842 KOps/s | |
test_keys_nested_locked | 3.8872ms | 0.1743ms | 5.7383 KOps/s | 5.8248 KOps/s | |
test_keys_nested_leaf | 0.2362ms | 0.1429ms | 7.0003 KOps/s | 6.9024 KOps/s | |
test_keys_stack_nested | 0.3018ms | 0.1628ms | 6.1436 KOps/s | 5.9894 KOps/s | |
test_keys_stack_nested_leaf | 0.2763ms | 0.1422ms | 7.0303 KOps/s | 6.8801 KOps/s | |
test_keys_stack_nested_locked | 1.0325ms | 0.1711ms | 5.8436 KOps/s | 5.7538 KOps/s | |
test_values | 29.4788μs | 1.0564μs | 946.5801 KOps/s | 939.1601 KOps/s | |
test_values_nested | 0.1146ms | 62.9268μs | 15.8915 KOps/s | 15.2026 KOps/s | |
test_values_nested_locked | 0.5687ms | 64.2595μs | 15.5619 KOps/s | 15.2650 KOps/s | |
test_values_nested_leaf | 0.2355ms | 71.6335μs | 13.9600 KOps/s | 13.4127 KOps/s | |
test_values_stack_nested | 0.1365ms | 63.2779μs | 15.8033 KOps/s | 15.9669 KOps/s | |
test_values_stack_nested_leaf | 0.1273ms | 71.7101μs | 13.9450 KOps/s | 13.7769 KOps/s | |
test_values_stack_nested_locked | 0.1101ms | 62.5714μs | 15.9817 KOps/s | 16.0129 KOps/s | |
test_membership | 31.1780μs | 0.8657μs | 1.1551 MOps/s | 1.1355 MOps/s | |
test_membership_nested | 43.1200μs | 2.8514μs | 350.7038 KOps/s | 328.1171 KOps/s | |
test_membership_nested_leaf | 0.1335ms | 2.8935μs | 345.6044 KOps/s | 341.1862 KOps/s | |
test_membership_stacked_nested | 38.6120μs | 2.8566μs | 350.0705 KOps/s | 342.5015 KOps/s | |
test_membership_stacked_nested_leaf | 46.8770μs | 2.8696μs | 348.4849 KOps/s | 345.5286 KOps/s | |
test_membership_nested_last | 36.5780μs | 4.2530μs | 235.1278 KOps/s | 226.6910 KOps/s | |
test_membership_nested_leaf_last | 39.5130μs | 4.3020μs | 232.4498 KOps/s | 225.4202 KOps/s | |
test_membership_stacked_nested_last | 27.6410μs | 4.2660μs | 234.4126 KOps/s | 227.0042 KOps/s | |
test_membership_stacked_nested_leaf_last | 42.1890μs | 4.2641μs | 234.5142 KOps/s | 225.8147 KOps/s | |
test_nested_getleaf | 64.7310μs | 10.3529μs | 96.5914 KOps/s | 94.0030 KOps/s | |
test_nested_get | 42.5890μs | 9.9022μs | 100.9880 KOps/s | 99.6325 KOps/s | |
test_stacked_getleaf | 44.8730μs | 10.2971μs | 97.1149 KOps/s | 94.6848 KOps/s | |
test_stacked_get | 30.1660μs | 9.7557μs | 102.5042 KOps/s | 98.3351 KOps/s | |
test_nested_getitemleaf | 0.2785ms | 11.8879μs | 84.1192 KOps/s | 90.0850 KOps/s | |
test_nested_getitem | 60.1740μs | 10.2104μs | 97.9390 KOps/s | 94.3636 KOps/s | |
test_stacked_getitemleaf | 36.5580μs | 10.8965μs | 91.7724 KOps/s | 90.4514 KOps/s | |
test_stacked_getitem | 30.1160μs | 10.3944μs | 96.2052 KOps/s | 94.9791 KOps/s | |
test_lock_nested | 0.8614ms | 0.4065ms | 2.4601 KOps/s | 2.4873 KOps/s | |
test_lock_stack_nested | 0.5832ms | 0.4188ms | 2.3878 KOps/s | 2.4137 KOps/s | |
test_unlock_nested | 0.5241ms | 0.3362ms | 2.9743 KOps/s | 3.0445 KOps/s | |
test_unlock_stack_nested | 0.6119ms | 0.3413ms | 2.9304 KOps/s | 2.9707 KOps/s | |
test_flatten_speed | 0.1799ms | 98.3125μs | 10.1716 KOps/s | 10.0295 KOps/s | |
test_unflatten_speed | 0.6812ms | 0.5038ms | 1.9850 KOps/s | 1.9424 KOps/s | |
test_common_ops | 5.3475ms | 0.7983ms | 1.2527 KOps/s | 1.2475 KOps/s | |
test_creation | 25.5580μs | 2.4346μs | 410.7368 KOps/s | 396.8136 KOps/s | |
test_creation_empty | 45.8050μs | 11.9080μs | 83.9769 KOps/s | 77.9587 KOps/s | |
test_creation_nested_1 | 40.2150μs | 14.6407μs | 68.3025 KOps/s | 63.1048 KOps/s | |
test_creation_nested_2 | 45.0840μs | 19.2310μs | 51.9995 KOps/s | 49.4307 KOps/s | |
test_clone | 57.0270μs | 13.3235μs | 75.0553 KOps/s | 74.7731 KOps/s | |
test_getitem[int] | 0.7468ms | 13.0340μs | 76.7222 KOps/s | 79.4960 KOps/s | |
test_getitem[slice_int] | 0.1289ms | 24.5081μs | 40.8029 KOps/s | 42.3812 KOps/s | |
test_getitem[range] | 0.1677ms | 49.5740μs | 20.1719 KOps/s | 19.9415 KOps/s | |
test_getitem[tuple] | 0.1201ms | 19.9739μs | 50.0652 KOps/s | 51.2974 KOps/s | |
test_getitem[list] | 0.1534ms | 45.1459μs | 22.1504 KOps/s | 21.9173 KOps/s | |
test_setitem_dim[int] | 61.2740μs | 26.1958μs | 38.1740 KOps/s | 39.8124 KOps/s | |
test_setitem_dim[slice_int] | 90.4180μs | 51.3031μs | 19.4920 KOps/s | 19.7100 KOps/s | |
test_setitem_dim[range] | 0.1277ms | 74.5475μs | 13.4143 KOps/s | 12.6873 KOps/s | |
test_setitem_dim[tuple] | 73.2670μs | 40.6320μs | 24.6112 KOps/s | 24.4217 KOps/s | |
test_setitem | 62.0260μs | 20.2697μs | 49.3346 KOps/s | 48.4633 KOps/s | |
test_set | 69.4900μs | 19.9025μs | 50.2449 KOps/s | 50.0933 KOps/s | |
test_set_shared | 0.3046ms | 0.1790ms | 5.5879 KOps/s | 5.6206 KOps/s | |
test_update | 0.1217ms | 22.8444μs | 43.7745 KOps/s | 43.2971 KOps/s | |
test_update_nested | 79.7590μs | 32.0217μs | 31.2289 KOps/s | 29.7404 KOps/s | |
test_update__nested | 0.5599ms | 33.0141μs | 30.2901 KOps/s | 30.7594 KOps/s | |
test_set_nested | 83.9970μs | 21.8055μs | 45.8599 KOps/s | 44.3435 KOps/s | |
test_set_nested_new | 70.5010μs | 26.3327μs | 37.9756 KOps/s | 36.5713 KOps/s | |
test_select | 0.1615ms | 42.8274μs | 23.3495 KOps/s | 22.9088 KOps/s | |
test_select_nested | 0.1195ms | 63.3477μs | 15.7859 KOps/s | 16.0169 KOps/s | |
test_exclude_nested | 0.1582ms | 79.9221μs | 12.5122 KOps/s | 12.4028 KOps/s | |
test_empty[True] | 0.6120ms | 0.4029ms | 2.4822 KOps/s | 2.4278 KOps/s | |
test_empty[False] | 8.5410μs | 1.3732μs | 728.2473 KOps/s | 731.0845 KOps/s | |
test_unbind_speed | 0.3205ms | 0.2712ms | 3.6877 KOps/s | 3.7606 KOps/s | |
test_unbind_speed_stack0 | 0.3942ms | 0.2683ms | 3.7277 KOps/s | 3.7956 KOps/s | |
test_unbind_speed_stack1 | 99.5713ms | 0.7334ms | 1.3635 KOps/s | 1.2439 KOps/s | |
test_split | 0.1014s | 1.7367ms | 575.8191 Ops/s | 640.1314 Ops/s | |
test_chunk | 0.1059s | 1.7526ms | 570.5942 Ops/s | 527.7032 Ops/s | |
test_consolidate_njt[False-None] | 8.4711ms | 8.2241ms | 121.5940 Ops/s | 124.6436 Ops/s | |
test_creation[device0] | 0.2174ms | 89.6966μs | 11.1487 KOps/s | 11.2040 KOps/s | |
test_creation_from_tensor | 4.2409ms | 96.8119μs | 10.3293 KOps/s | 10.5989 KOps/s | |
test_add_one[memmap_tensor0] | 0.1079ms | 4.9027μs | 203.9706 KOps/s | 202.3848 KOps/s | |
test_contiguous[memmap_tensor0] | 20.7690μs | 0.5154μs | 1.9402 MOps/s | 1.9344 MOps/s | |
test_stack[memmap_tensor0] | 36.2070μs | 3.2446μs | 308.2085 KOps/s | 297.4687 KOps/s | |
test_memmaptd_index | 1.2545ms | 0.2267ms | 4.4109 KOps/s | 3.5530 KOps/s | |
test_memmaptd_index_astensor | 0.6673ms | 0.3106ms | 3.2191 KOps/s | 3.1999 KOps/s | |
test_memmaptd_index_op | 1.2956ms | 0.5834ms | 1.7142 KOps/s | 1.7032 KOps/s | |
test_serialize_model | 0.2184s | 0.1307s | 7.6516 Ops/s | 8.7348 Ops/s | |
test_serialize_model_pickle | 0.4908s | 0.4020s | 2.4876 Ops/s | 2.5045 Ops/s | |
test_serialize_weights | 0.1165s | 0.1114s | 8.9761 Ops/s | 8.5440 Ops/s | |
test_serialize_weights_returnearly | 0.1861s | 0.1655s | 6.0419 Ops/s | 6.1815 Ops/s | |
test_serialize_weights_pickle | 1.2068s | 0.7024s | 1.4237 Ops/s | 2.4991 Ops/s | |
test_serialize_weights_filesystem | 0.1564s | 0.1428s | 7.0031 Ops/s | 6.7401 Ops/s | |
test_serialize_model_filesystem | 0.2457s | 0.1559s | 6.4155 Ops/s | 6.7567 Ops/s | |
test_reshape_pytree | 69.3400μs | 25.8458μs | 38.6910 KOps/s | 38.5721 KOps/s | |
test_reshape_td | 87.2520μs | 31.8019μs | 31.4447 KOps/s | 31.4531 KOps/s | |
test_view_pytree | 67.1250μs | 25.9470μs | 38.5400 KOps/s | 38.9485 KOps/s | |
test_view_td | 81.9130μs | 37.4003μs | 26.7377 KOps/s | 26.9014 KOps/s | |
test_unbind_pytree | 91.1000μs | 29.1224μs | 34.3378 KOps/s | 34.5368 KOps/s | |
test_unbind_td | 0.3288ms | 39.4711μs | 25.3350 KOps/s | 25.1418 KOps/s | |
test_split_pytree | 79.6990μs | 28.6166μs | 34.9448 KOps/s | 35.0411 KOps/s | |
test_split_td | 0.2054ms | 45.6424μs | 21.9095 KOps/s | 22.7167 KOps/s | |
test_add_pytree | 79.7890μs | 35.8691μs | 27.8791 KOps/s | 29.2534 KOps/s | |
test_add_td | 0.1132ms | 57.3151μs | 17.4474 KOps/s | 17.5396 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1266ms | 66.3180μs | 15.0789 KOps/s | 15.0877 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3746ms | 0.1714ms | 5.8328 KOps/s | 5.7245 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1051ms | 45.3226μs | 22.0641 KOps/s | 22.2090 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2179ms | 0.1188ms | 8.4191 KOps/s | 8.5711 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 78.8470μs | 27.9650μs | 35.7589 KOps/s | 36.1417 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1153ms | 58.3346μs | 17.1425 KOps/s | 17.2820 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1756ms | 79.4765μs | 12.5823 KOps/s | 12.8184 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1311ms | 66.3720μs | 15.0666 KOps/s | 15.2773 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2312ms | 0.1058ms | 9.4522 KOps/s | 8.7975 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4406ms | 0.2157ms | 4.6354 KOps/s | 4.5740 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1398ms | 46.0191μs | 21.7301 KOps/s | 22.0227 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1933ms | 65.0666μs | 15.3689 KOps/s | 15.2190 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2227ms | 0.1004ms | 9.9598 KOps/s | 10.0641 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4030ms | 0.2053ms | 4.8698 KOps/s | 5.0474 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4210ms | 0.2310ms | 4.3289 KOps/s | 4.2768 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3175ms | 0.1105ms | 9.0524 KOps/s | 9.2936 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1988ms | 61.4326μs | 16.2780 KOps/s | 16.0552 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 98.0630μs | 47.9096μs | 20.8727 KOps/s | 21.1308 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2957ms | 0.1601ms | 6.2470 KOps/s | 6.3848 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1891ms | 0.1006ms | 9.9356 KOps/s | 10.0082 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 58.8090μs | 21.1383μs | 47.3076 KOps/s | 48.6310 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1293ms | 67.2783μs | 14.8636 KOps/s | 14.8624 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1523ms | 82.1456μs | 12.1735 KOps/s | 11.9505 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1320ms | 69.2277μs | 14.4451 KOps/s | 14.4028 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3136ms | 0.2173ms | 4.6016 KOps/s | 4.6940 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7110ms | 1.3617ms | 734.3954 Ops/s | 728.0485 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.1629ms | 0.2152ms | 4.6459 KOps/s | 4.7951 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7409ms | 0.8337ms | 1.1995 KOps/s | 1.2215 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.8315ms | 0.4643ms | 2.1537 KOps/s | 2.2463 KOps/s | |
test_compile_assign_and_add_stack[eager] | 2.9598ms | 2.6535ms | 376.8584 Ops/s | 366.9328 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1240ms | 37.0594μs | 26.9837 KOps/s | 27.0560 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5475ms | 32.8864μs | 30.4077 KOps/s | 30.3769 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 89.4260μs | 30.3951μs | 32.9001 KOps/s | 33.1529 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 57.9380μs | 22.3648μs | 44.7132 KOps/s | 44.7095 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 81.6120μs | 31.4961μs | 31.7500 KOps/s | 32.9708 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 59.8620μs | 22.2418μs | 44.9604 KOps/s | 44.9405 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.4385ms | 52.1957μs | 19.1587 KOps/s | 19.4888 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.2248s | 26.3643μs | 37.9301 KOps/s | 49.8311 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1696ms | 46.5204μs | 21.4960 KOps/s | 22.2578 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 97.1490μs | 18.3394μs | 54.5273 KOps/s | 55.0041 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 98.5840μs | 45.4024μs | 22.0253 KOps/s | 21.9508 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 81.3210μs | 18.1911μs | 54.9718 KOps/s | 54.7467 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1107ms | 52.4015μs | 19.0834 KOps/s | 18.9871 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9352ms | 20.1262μs | 49.6864 KOps/s | 50.9355 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 98.2530μs | 45.3637μs | 22.0440 KOps/s | 21.5386 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 66.9550μs | 18.2094μs | 54.9166 KOps/s | 54.6448 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1136ms | 45.8507μs | 21.8099 KOps/s | 22.0132 KOps/s | |
test_compile_indexing[int-pytree-eager] | 57.1660μs | 18.1360μs | 55.1390 KOps/s | 54.9107 KOps/s | |
test_mod_add[eager] | 77.9950μs | 33.0742μs | 30.2351 KOps/s | 29.0243 KOps/s | |
test_mod_add[compile] | 0.1293ms | 64.0990μs | 15.6009 KOps/s | 15.5008 KOps/s | |
test_mod_add[compile-overhead] | 0.1245ms | 63.3433μs | 15.7870 KOps/s | 15.5322 KOps/s | |
test_mod_wrap[eager] | 0.3276ms | 0.2153ms | 4.6442 KOps/s | 4.5850 KOps/s | |
test_mod_wrap[compile] | 1.7390ms | 0.2253ms | 4.4388 KOps/s | 4.4189 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3415ms | 0.2222ms | 4.5009 KOps/s | 4.5107 KOps/s | |
test_mod_wrap_and_backward[eager] | 17.0517ms | 12.3957ms | 80.6732 Ops/s | 91.8233 Ops/s | |
test_mod_wrap_and_backward[compile] | 18.5079ms | 12.8598ms | 77.7615 Ops/s | 91.7155 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 15.2461ms | 12.6567ms | 79.0096 Ops/s | 91.9253 Ops/s | |
test_seq_add[eager] | 0.2199ms | 0.1114ms | 8.9807 KOps/s | 8.4172 KOps/s | |
test_seq_add[compile] | 0.1319ms | 76.4060μs | 13.0880 KOps/s | 13.5189 KOps/s | |
test_seq_add[compile-overhead] | 0.1318ms | 75.0317μs | 13.3277 KOps/s | 13.7446 KOps/s | |
test_seq_wrap[eager] | 0.8414ms | 0.4291ms | 2.3303 KOps/s | 2.2572 KOps/s | |
test_seq_wrap[compile] | 0.4641ms | 0.2384ms | 4.1939 KOps/s | 4.2301 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4054ms | 0.2385ms | 4.1925 KOps/s | 4.2314 KOps/s | |
test_func_call_runtime[False-eager] | 0.6668ms | 0.5129ms | 1.9497 KOps/s | 1.8793 KOps/s | |
test_func_call_runtime[False-compile] | 0.5465ms | 0.4314ms | 2.3182 KOps/s | 2.2635 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 1.6819ms | 0.4510ms | 2.2171 KOps/s | 2.2978 KOps/s | |
test_func_call_runtime[True-eager] | 1.5726ms | 0.7329ms | 1.3644 KOps/s | 1.3480 KOps/s | |
test_func_call_runtime[True-compile] | 0.7418ms | 0.4521ms | 2.2120 KOps/s | 2.1853 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5653ms | 0.4526ms | 2.2096 KOps/s | 2.1874 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9447ms | 0.5135ms | 1.9476 KOps/s | 1.8995 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5688ms | 0.4311ms | 2.3196 KOps/s | 2.2884 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5567ms | 0.4289ms | 2.3313 KOps/s | 2.2931 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4683ms | 0.8857ms | 1.1290 KOps/s | 1.1243 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.2549ms | 0.7698ms | 1.2991 KOps/s | 1.2749 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1545ms | 0.7748ms | 1.2906 KOps/s | 1.2778 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5906ms | 1.8850ms | 530.5178 Ops/s | 523.4671 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.0073ms | 0.5497ms | 1.8191 KOps/s | 1.8902 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7558ms | 0.5320ms | 1.8798 KOps/s | 1.8897 KOps/s | |
test_distributed | 1.2656ms | 0.1296ms | 7.7182 KOps/s | 7.8362 KOps/s | |
test_tdmodule | 0.1207ms | 26.1987μs | 38.1699 KOps/s | 37.0815 KOps/s | |
test_tdmodule_dispatch | 77.9660μs | 47.7383μs | 20.9475 KOps/s | 20.2674 KOps/s | |
test_tdseq | 59.5610μs | 28.1277μs | 35.5522 KOps/s | 34.7276 KOps/s | |
test_tdseq_dispatch | 80.0390μs | 52.4941μs | 19.0498 KOps/s | 18.2508 KOps/s | |
test_instantiation_functorch | 2.3646ms | 1.5300ms | 653.5946 Ops/s | 661.1600 Ops/s | |
test_exec_functorch | 0.2809ms | 0.1740ms | 5.7477 KOps/s | 5.7718 KOps/s | |
test_exec_functional_call | 0.2311ms | 0.1666ms | 6.0020 KOps/s | 6.0219 KOps/s | |
test_exec_td_decorator | 0.4942ms | 0.2279ms | 4.3880 KOps/s | 4.3111 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9974ms | 0.6540ms | 1.5291 KOps/s | 1.5140 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8711ms | 0.6485ms | 1.5421 KOps/s | 1.5250 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7045ms | 0.5216ms | 1.9170 KOps/s | 1.8830 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8475ms | 0.5261ms | 1.9008 KOps/s | 1.8848 KOps/s | |
test_to_module_speed[True] | 2.0862ms | 1.3196ms | 757.8018 Ops/s | 748.6850 Ops/s | |
test_to_module_speed[False] | 1.5376ms | 1.2864ms | 777.3379 Ops/s | 759.9319 Ops/s | |
test_tc_init | 79.3580μs | 47.5019μs | 21.0518 KOps/s | 20.8213 KOps/s | |
test_tc_init_nested | 0.1683ms | 93.0625μs | 10.7455 KOps/s | 10.1461 KOps/s | |
test_tc_first_layer_tensor | 25.9180μs | 1.5082μs | 663.0410 KOps/s | 635.7216 KOps/s | |
test_tc_first_layer_nontensor | 0.2060ms | 5.0076μs | 199.6968 KOps/s | 215.2957 KOps/s | |
test_tc_second_layer_tensor | 0.3684ms | 3.2265μs | 309.9361 KOps/s | 339.9101 KOps/s | |
test_tc_second_layer_nontensor | 30.4770μs | 5.9721μs | 167.4462 KOps/s | 168.2376 KOps/s | |
test_unbind | 0.2546s | 13.5753ms | 73.6631 Ops/s | 81.7649 Ops/s | |
test_full_like | 9.0484ms | 7.1978ms | 138.9305 Ops/s | 142.6106 Ops/s | |
test_zeros_like | 4.9374ms | 2.8657ms | 348.9555 Ops/s | 222.1784 Ops/s | |
test_ones_like | 3.7164ms | 3.2165ms | 310.8947 Ops/s | 204.4477 Ops/s | |
test_clone | 5.5062ms | 4.9358ms | 202.5996 Ops/s | 147.9028 Ops/s | |
test_squeeze | 60.6230μs | 12.2156μs | 81.8628 KOps/s | 81.3757 KOps/s | |
test_unsqueeze | 0.2687ms | 90.2570μs | 11.0795 KOps/s | 11.2015 KOps/s | |
test_split | 0.3084ms | 0.1947ms | 5.1348 KOps/s | 5.2377 KOps/s | |
test_permute | 0.3098ms | 0.1953ms | 5.1191 KOps/s | 5.0376 KOps/s | |
test_stack | 49.9477ms | 26.6900ms | 37.4672 Ops/s | 39.5401 Ops/s | |
test_cat | 25.8069ms | 24.8441ms | 40.2511 Ops/s | 38.4649 Ops/s |
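
The Change column is empty in the table above, so the comparison between Ops and Ops on Repo HEAD has to be worked out by hand. A minimal sketch of that arithmetic follows, assuming that Ops was measured on this PR's branch and Ops on Repo HEAD is the baseline. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
import re

# Multipliers for the throughput units that appear in the tables.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '49.3666 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Fractional change of the PR throughput relative to repo HEAD.

    Positive means the PR run was faster, negative means it was slower.
    """
    head = parse_ops(ops_head)
    return (parse_ops(ops_pr) - head) / head


# Values copied from the test_plain_set_nested row above.
change = relative_change("49.3666 KOps/s", "47.1086 KOps/s")
print(f"test_plain_set_nested: {change:+.2%}")
```

Single-run differences of a few percent are often within benchmark noise, so a figure computed this way is better read as a hint than as evidence of a regression or a speedup.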

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 31.6110μs | 11.4952μs | 86.9928 KOps/s | 78.2552 KOps/s | |
test_plain_set_stack_nested | 31.7000μs | 11.7673μs | 84.9812 KOps/s | 76.6846 KOps/s | |
test_plain_set_nested_inplace | 44.1210μs | 12.6418μs | 79.1029 KOps/s | 72.1756 KOps/s | |
test_plain_set_stack_nested_inplace | 35.7300μs | 12.6987μs | 78.7485 KOps/s | 72.1966 KOps/s | |
test_items | 26.5400μs | 2.8936μs | 345.5852 KOps/s | 342.9162 KOps/s | |
test_items_nested | 0.4191ms | 0.3719ms | 2.6892 KOps/s | 2.7532 KOps/s | |
test_items_nested_locked | 0.4325ms | 0.3722ms | 2.6868 KOps/s | 2.7655 KOps/s | |
test_items_nested_leaf | 84.9510μs | 59.8359μs | 16.7124 KOps/s | 17.2652 KOps/s | |
test_items_stack_nested | 0.4325ms | 0.3765ms | 2.6559 KOps/s | 2.7131 KOps/s | |
test_items_stack_nested_leaf | 92.8320μs | 60.8325μs | 16.4386 KOps/s | 16.6767 KOps/s | |
test_items_stack_nested_locked | 0.4181ms | 0.3776ms | 2.6484 KOps/s | 2.7515 KOps/s | |
test_keys | 28.5600μs | 3.4847μs | 286.9673 KOps/s | 288.7771 KOps/s | |
test_keys_nested | 0.1217ms | 89.4770μs | 11.1761 KOps/s | 11.4351 KOps/s | |
test_keys_nested_locked | 0.6988ms | 95.9537μs | 10.4217 KOps/s | 10.7673 KOps/s | |
test_keys_nested_leaf | 0.1084ms | 80.1683μs | 12.4738 KOps/s | 12.7928 KOps/s | |
test_keys_stack_nested | 0.1248ms | 91.1640μs | 10.9692 KOps/s | 11.2014 KOps/s | |
test_keys_stack_nested_leaf | 0.1182ms | 82.6655μs | 12.0969 KOps/s | 12.5951 KOps/s | |
test_keys_stack_nested_locked | 0.1385ms | 97.8862μs | 10.2159 KOps/s | 10.5581 KOps/s | |
test_values | 6.8485μs | 0.8478μs | 1.1796 MOps/s | 1.0833 MOps/s | |
test_values_nested | 63.1820μs | 38.1095μs | 26.2402 KOps/s | 26.5583 KOps/s | |
test_values_nested_locked | 70.9910μs | 39.4412μs | 25.3542 KOps/s | 25.3033 KOps/s | |
test_values_nested_leaf | 68.9020μs | 42.8616μs | 23.3309 KOps/s | 23.9325 KOps/s | |
test_values_stack_nested | 73.1510μs | 38.4930μs | 25.9787 KOps/s | 26.1109 KOps/s | |
test_values_stack_nested_leaf | 77.7120μs | 43.0410μs | 23.2337 KOps/s | 23.5295 KOps/s | |
test_values_stack_nested_locked | 68.3410μs | 40.1952μs | 24.8786 KOps/s | 24.9477 KOps/s | |
test_membership | 1.9985μs | 0.5070μs | 1.9725 MOps/s | 1.9628 MOps/s | |
test_membership_nested | 17.6650μs | 2.0007μs | 499.8256 KOps/s | 478.2226 KOps/s | |
test_membership_nested_leaf | 21.2855μs | 1.9919μs | 502.0281 KOps/s | 501.1068 KOps/s | |
test_membership_stacked_nested | 34.2710μs | 2.0936μs | 477.6459 KOps/s | 483.9733 KOps/s | |
test_membership_stacked_nested_leaf | 39.4110μs | 2.1031μs | 475.4931 KOps/s | 478.5690 KOps/s | |
test_membership_nested_last | 38.5610μs | 3.0973μs | 322.8573 KOps/s | 320.6019 KOps/s | |
test_membership_nested_leaf_last | 32.6610μs | 3.1050μs | 322.0611 KOps/s | 321.4022 KOps/s | |
test_membership_stacked_nested_last | 39.9610μs | 3.8500μs | 259.7428 KOps/s | 321.3608 KOps/s | |
test_membership_stacked_nested_leaf_last | 28.8400μs | 3.8450μs | 260.0757 KOps/s | 316.6925 KOps/s | |
test_nested_getleaf | 32.1510μs | 6.1388μs | 162.8986 KOps/s | 160.0720 KOps/s | |
test_nested_get | 35.0510μs | 5.9478μs | 168.1283 KOps/s | 171.5358 KOps/s | |
test_stacked_getleaf | 28.2610μs | 6.1664μs | 162.1682 KOps/s | 162.6914 KOps/s | |
test_stacked_get | 48.3610μs | 5.9026μs | 169.4167 KOps/s | 172.2735 KOps/s | |
test_nested_getitemleaf | 48.4100μs | 6.5518μs | 152.6295 KOps/s | 155.6226 KOps/s | |
test_nested_getitem | 77.7520μs | 6.0365μs | 165.6587 KOps/s | 164.2387 KOps/s | |
test_stacked_getitemleaf | 32.8200μs | 6.4217μs | 155.7211 KOps/s | 155.0086 KOps/s | |
test_stacked_getitem | 31.3700μs | 6.1570μs | 162.4162 KOps/s | 164.1631 KOps/s | |
test_lock_nested | 8.7416ms | 0.3445ms | 2.9024 KOps/s | 2.8736 KOps/s | |
test_lock_stack_nested | 0.3842ms | 0.3424ms | 2.9207 KOps/s | 2.8873 KOps/s | |
test_unlock_nested | 0.4207ms | 0.2814ms | 3.5535 KOps/s | 3.5415 KOps/s | |
test_unlock_stack_nested | 0.3520ms | 0.2820ms | 3.5459 KOps/s | 3.5019 KOps/s | |
test_flatten_speed | 0.1090ms | 77.6201μs | 12.8833 KOps/s | 13.0343 KOps/s | |
test_unflatten_speed | 0.4060ms | 0.3281ms | 3.0476 KOps/s | 3.0709 KOps/s | |
test_common_ops | 0.7423ms | 0.5780ms | 1.7301 KOps/s | 1.5978 KOps/s | |
test_creation | 74.0610μs | 1.7935μs | 557.5687 KOps/s | 567.1840 KOps/s | |
test_creation_empty | 34.8810μs | 7.0632μs | 141.5782 KOps/s | 105.3758 KOps/s | |
test_creation_nested_1 | 38.7810μs | 8.7765μs | 113.9407 KOps/s | 90.5613 KOps/s | |
test_creation_nested_2 | 39.9700μs | 11.5689μs | 86.4389 KOps/s | 71.4851 KOps/s | |
test_clone | 51.9710μs | 10.0553μs | 99.4503 KOps/s | 100.0110 KOps/s | |
test_getitem[int] | 1.2791ms | 10.6562μs | 93.8423 KOps/s | 92.8896 KOps/s | |
test_getitem[slice_int] | 0.1133ms | 20.6128μs | 48.5135 KOps/s | 48.8718 KOps/s | |
test_getitem[range] | 0.1315ms | 36.3402μs | 27.5178 KOps/s | 25.2489 KOps/s | |
test_getitem[tuple] | 0.1066ms | 18.1154μs | 55.2016 KOps/s | 53.0295 KOps/s | |
test_getitem[list] | 0.1380ms | 31.8544μs | 31.3928 KOps/s | 27.8096 KOps/s | |
test_setitem_dim[int] | 38.4110μs | 17.6384μs | 56.6944 KOps/s | 49.3068 KOps/s | |
test_setitem_dim[slice_int] | 65.9220μs | 36.6597μs | 27.2779 KOps/s | 24.9532 KOps/s | |
test_setitem_dim[range] | 0.1182ms | 51.1931μs | 19.5339 KOps/s | 17.7962 KOps/s | |
test_setitem_dim[tuple] | 57.5310μs | 31.0413μs | 32.2151 KOps/s | 29.9986 KOps/s | |
test_setitem | 47.5410μs | 13.7232μs | 72.8694 KOps/s | 59.4986 KOps/s | |
test_set | 39.7510μs | 13.2032μs | 75.7394 KOps/s | 62.9779 KOps/s | |
test_set_shared | 0.5056ms | 0.1554ms | 6.4353 KOps/s | 6.2831 KOps/s | |
test_update | 0.2372ms | 15.4405μs | 64.7648 KOps/s | 55.0780 KOps/s | |
test_update_nested | 58.3510μs | 21.0019μs | 47.6148 KOps/s | 41.6231 KOps/s | |
test_update__nested | 0.5183ms | 24.7353μs | 40.4280 KOps/s | 41.1561 KOps/s | |
test_set_nested | 54.0110μs | 14.4115μs | 69.3890 KOps/s | 62.3321 KOps/s | |
test_set_nested_new | 52.6910μs | 17.0830μs | 58.5377 KOps/s | 54.6843 KOps/s | |
test_select | 78.2520μs | 28.7764μs | 34.7507 KOps/s | 32.6749 KOps/s | |
test_select_nested | 74.9810μs | 44.0094μs | 22.7224 KOps/s | 22.7702 KOps/s | |
test_exclude_nested | 95.5720μs | 64.7409μs | 15.4462 KOps/s | 15.7183 KOps/s | |
test_empty[True] | 0.3476ms | 0.2958ms | 3.3803 KOps/s | 3.3840 KOps/s | |
test_empty[False] | 3.5531μs | 0.8417μs | 1.1881 MOps/s | 1.2078 MOps/s | |
test_to | 83.8420μs | 55.5340μs | 18.0070 KOps/s | 18.1792 KOps/s | |
test_to_nonblocking | 0.1090ms | 47.5497μs | 21.0306 KOps/s | 20.1680 KOps/s | |
test_unbind_speed | 0.2873ms | 0.2411ms | 4.1469 KOps/s | 4.2115 KOps/s | |
test_unbind_speed_stack0 | 0.2827ms | 0.2428ms | 4.1183 KOps/s | 4.1918 KOps/s | |
test_unbind_speed_stack1 | 92.1207ms | 0.7298ms | 1.3703 KOps/s | 1.3753 KOps/s | |
test_split | 95.0847ms | 1.6031ms | 623.7805 Ops/s | 623.3996 Ops/s | |
test_chunk | 93.9495ms | 1.5889ms | 629.3695 Ops/s | 626.2557 Ops/s | |
test_consolidate[False-None] | 3.5112ms | 2.6831ms | 372.7075 Ops/s | 373.3947 Ops/s | |
test_consolidate[default-None] | 1.7504ms | 1.6650ms | 600.5834 Ops/s | 596.2902 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.7854ms | 1.7205ms | 581.2391 Ops/s | 584.0982 Ops/s | |
test_consolidate_njt[False-None] | 6.9316ms | 6.7419ms | 148.3252 Ops/s | 151.3883 Ops/s | |
test_to[False-False-None] | 1.9478ms | 1.7200ms | 581.4122 Ops/s | 595.6532 Ops/s | |
test_to[True-False-None] | 1.6488ms | 1.3642ms | 733.0375 Ops/s | 736.4438 Ops/s | |
test_to[within-False-None] | 4.4657ms | 4.2494ms | 235.3249 Ops/s | 243.2118 Ops/s | |
test_to[True-default-None] | 5.7559ms | 5.5330ms | 180.7353 Ops/s | 186.6429 Ops/s | |
test_to_njt[False-False-None] | 7.0492ms | 6.8757ms | 145.4400 Ops/s | 146.3719 Ops/s | |
test_to_njt[True-False-None] | 5.9777ms | 5.5411ms | 180.4711 Ops/s | 181.5499 Ops/s | |
test_to_njt[within-False-None] | 12.3676ms | 12.2379ms | 81.7135 Ops/s | 82.1910 Ops/s | |
test_creation[device0] | 0.4688ms | 78.8328μs | 12.6851 KOps/s | 12.7401 KOps/s | |
test_creation_from_tensor | 0.6558ms | 82.7323μs | 12.0872 KOps/s | 12.0081 KOps/s | |
test_add_one[memmap_tensor0] | 0.3387ms | 6.2882μs | 159.0271 KOps/s | 158.2128 KOps/s | |
test_contiguous[memmap_tensor0] | 2.0275μs | 0.4100μs | 2.4391 MOps/s | 2.4328 MOps/s | |
test_stack[memmap_tensor0] | 41.8310μs | 4.3380μs | 230.5234 KOps/s | 222.4854 KOps/s | |
test_memmaptd_index | 1.6146ms | 0.2362ms | 4.2334 KOps/s | 4.0939 KOps/s | |
test_memmaptd_index_astensor | 0.5827ms | 0.2967ms | 3.3701 KOps/s | 3.2589 KOps/s | |
test_memmaptd_index_op | 0.6743ms | 0.5319ms | 1.8801 KOps/s | 1.7046 KOps/s | |
test_serialize_model | 0.4145s | 0.1710s | 5.8492 Ops/s | 7.6900 Ops/s | |
test_serialize_model_pickle | 1.3461s | 1.1902s | 0.8402 Ops/s | 0.8250 Ops/s | |
test_serialize_weights | 0.1311s | 0.1292s | 7.7413 Ops/s | 7.7381 Ops/s | |
test_serialize_weights_returnearly | 0.3254s | 53.9446ms | 18.5375 Ops/s | 23.2166 Ops/s | |
test_serialize_weights_pickle | 1.3891s | 1.1921s | 0.8389 Ops/s | 0.8222 Ops/s | |
test_reshape_pytree | 61.4020μs | 22.3252μs | 44.7925 KOps/s | 44.6414 KOps/s | |
test_reshape_td | 58.6910μs | 26.9340μs | 37.1278 KOps/s | 35.5758 KOps/s | |
test_view_pytree | 50.3710μs | 21.9460μs | 45.5663 KOps/s | 45.6305 KOps/s | |
test_view_td | 57.8310μs | 30.6677μs | 32.6076 KOps/s | 29.8381 KOps/s | |
test_unbind_pytree | 62.1410μs | 27.9630μs | 35.7616 KOps/s | 35.4421 KOps/s | |
test_unbind_td | 0.7527ms | 36.4363μs | 27.4451 KOps/s | 26.7658 KOps/s | |
test_split_pytree | 70.0210μs | 29.8793μs | 33.4680 KOps/s | 33.2707 KOps/s | |
test_split_td | 0.9589ms | 38.2484μs | 26.1449 KOps/s | 25.6035 KOps/s | |
test_add_pytree | 72.6110μs | 32.3405μs | 30.9209 KOps/s | 29.7143 KOps/s | |
test_add_td | 76.5120μs | 45.1087μs | 22.1687 KOps/s | 20.4414 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1761ms | 0.1234ms | 8.1014 KOps/s | 7.7762 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2358ms | 0.1332ms | 7.5069 KOps/s | 7.5055 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1371ms | 96.3429μs | 10.3796 KOps/s | 10.2706 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.3623ms | 0.1514ms | 6.6066 KOps/s | 6.7805 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 64.9620μs | 25.2479μs | 39.6073 KOps/s | 39.3933 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 69.1510μs | 29.5082μs | 33.8889 KOps/s | 33.7027 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3636ms | 64.1934μs | 15.5779 KOps/s | 15.2806 KOps/s | |
test_compile_copy_nested[pytree-eager] | 82.7720μs | 49.0149μs | 20.4019 KOps/s | 20.0124 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1822ms | 0.1414ms | 7.0706 KOps/s | 7.0771 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3108ms | 0.2175ms | 4.5974 KOps/s | 4.6369 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1358ms | 97.8534μs | 10.2194 KOps/s | 9.7650 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1138ms | 56.4800μs | 17.7054 KOps/s | 17.0840 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1842ms | 0.1371ms | 7.2944 KOps/s | 7.5000 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5843ms | 0.4859ms | 2.0580 KOps/s | 2.1345 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3881ms | 0.2616ms | 3.8227 KOps/s | 3.8472 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1977ms | 0.1492ms | 6.7004 KOps/s | 7.0453 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1829ms | 68.5913μs | 14.5791 KOps/s | 14.5744 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1450ms | 99.2578μs | 10.0748 KOps/s | 10.2560 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4619ms | 0.4081ms | 2.4506 KOps/s | 2.5029 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1804ms | 0.1357ms | 7.3713 KOps/s | 7.4810 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 80.9020μs | 18.8017μs | 53.1868 KOps/s | 54.3156 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 76.6320μs | 31.5614μs | 31.6843 KOps/s | 32.0359 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1085ms | 69.8162μs | 14.3233 KOps/s | 14.1446 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1960ms | 50.8200μs | 19.6773 KOps/s | 19.3055 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6234ms | 0.3897ms | 2.5662 KOps/s | 2.1288 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8063ms | 2.6396ms | 378.8394 Ops/s | 386.1760 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5532ms | 0.4223ms | 2.3683 KOps/s | 2.2833 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.6943ms | 2.6057ms | 383.7676 Ops/s | 388.3098 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.6328ms | 0.1099ms | 9.0996 KOps/s | 8.5820 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5516ms | 79.8513μs | 12.5233 KOps/s | 12.2198 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2384ms | 0.1024ms | 9.7690 KOps/s | 9.1196 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1249ms | 66.0018μs | 15.1511 KOps/s | 14.1379 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2047ms | 0.1031ms | 9.7023 KOps/s | 8.9965 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1174ms | 69.0230μs | 14.4879 KOps/s | 14.1033 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1562ms | 99.4616μs | 10.0541 KOps/s | 9.5599 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1385ms | 17.0443μs | 58.6707 KOps/s | 53.0019 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1398ms | 94.5419μs | 10.5773 KOps/s | 10.0045 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 76.3610μs | 15.5642μs | 64.2501 KOps/s | 55.2800 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1382ms | 95.4358μs | 10.4782 KOps/s | 9.7468 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 52.2810μs | 15.5487μs | 64.3140 KOps/s | 60.1505 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1452ms | 99.5912μs | 10.0410 KOps/s | 9.4184 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5768ms | 16.6887μs | 59.9208 KOps/s | 52.8591 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1386ms | 95.1821μs | 10.5062 KOps/s | 9.8307 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 40.8710μs | 15.5960μs | 64.1189 KOps/s | 59.6460 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1669ms | 99.3406μs | 10.0664 KOps/s | 9.7903 KOps/s | |
test_compile_indexing[int-pytree-eager] | 50.2610μs | 15.4049μs | 64.9144 KOps/s | 60.3761 KOps/s | |
test_mod_add[eager] | 0.1975ms | 37.3189μs | 26.7961 KOps/s | 25.2460 KOps/s | |
test_mod_add[compile] | 0.3671ms | 82.2126μs | 12.1636 KOps/s | 11.6489 KOps/s | |
test_mod_add[compile-overhead] | 0.3228ms | 0.1723ms | 5.8033 KOps/s | 5.6895 KOps/s | |
test_mod_wrap[eager] | 0.3267ms | 0.2480ms | 4.0314 KOps/s | 3.9103 KOps/s | |
test_mod_wrap[compile] | 0.3610ms | 0.2826ms | 3.5383 KOps/s | 3.4965 KOps/s | |
test_mod_wrap[compile-overhead] | 7.0015ms | 3.7488ms | 266.7512 Ops/s | 267.8628 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4569ms | 1.3180ms | 758.7348 Ops/s | 712.3669 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3211ms | 1.2562ms | 796.0400 Ops/s | 730.9971 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3551ms | 0.9049ms | 1.1051 KOps/s | 977.5546 Ops/s | |
test_seq_add[eager] | 0.1784ms | 0.1154ms | 8.6675 KOps/s | 8.6103 KOps/s | |
test_seq_add[compile] | 0.1773ms | 86.9463μs | 11.5014 KOps/s | 11.1957 KOps/s | |
test_seq_add[compile-overhead] | 0.1842ms | 0.1274ms | 7.8475 KOps/s | 7.8330 KOps/s | |
test_seq_wrap[eager] | 0.4784ms | 0.4106ms | 2.4356 KOps/s | 2.3735 KOps/s | |
test_seq_wrap[compile] | 0.3525ms | 0.2988ms | 3.3469 KOps/s | 3.3289 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2657ms | 0.2218ms | 4.5088 KOps/s | 4.4703 KOps/s | |
test_func_call_runtime[False-eager] | 0.8067ms | 0.7172ms | 1.3943 KOps/s | 1.3982 KOps/s | |
test_func_call_runtime[False-compile] | 0.8005ms | 0.7400ms | 1.3513 KOps/s | 1.3150 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4170ms | 0.3566ms | 2.8043 KOps/s | 2.7581 KOps/s | |
test_func_call_runtime[True-eager] | 0.9634ms | 0.8768ms | 1.1405 KOps/s | 1.1325 KOps/s | |
test_func_call_runtime[True-compile] | 0.9036ms | 0.7615ms | 1.3132 KOps/s | 1.2804 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4270ms | 0.3789ms | 2.6394 KOps/s | 2.6098 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7789ms | 0.7096ms | 1.4093 KOps/s | 1.3659 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8348ms | 0.7436ms | 1.3449 KOps/s | 1.3434 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5082ms | 0.3597ms | 2.7804 KOps/s | 2.7400 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0974ms | 0.9875ms | 1.0126 KOps/s | 1.0064 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0742ms | 0.9666ms | 1.0345 KOps/s | 1.0401 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0383ms | 0.9653ms | 1.0359 KOps/s | 1.0329 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4270ms | 2.0150ms | 496.2864 Ops/s | 491.4058 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8602ms | 0.8078ms | 1.2379 KOps/s | 1.2281 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5149ms | 0.4107ms | 2.4347 KOps/s | 2.4121 KOps/s | |
test_distributed | 7.8121ms | 0.3294ms | 3.0360 KOps/s | 8.7067 KOps/s | |
test_tdmodule | 0.3373ms | 19.5638μs | 51.1147 KOps/s | 49.2986 KOps/s | |
test_tdmodule_dispatch | 57.5210μs | 34.3521μs | 29.1103 KOps/s | 26.7245 KOps/s | |
test_tdseq | 40.8610μs | 20.1136μs | 49.7176 KOps/s | 46.4033 KOps/s | |
test_tdseq_dispatch | 46.6510μs | 37.1028μs | 26.9521 KOps/s | 24.9414 KOps/s | |
test_instantiation_functorch | 1.6426ms | 1.5528ms | 644.0184 Ops/s | 639.8721 Ops/s | |
test_exec_functorch | 0.1942ms | 0.1414ms | 7.0737 KOps/s | 6.9985 KOps/s | |
test_exec_functional_call | 0.1918ms | 0.1312ms | 7.6239 KOps/s | 7.4549 KOps/s | |
test_exec_td_decorator | 0.3662ms | 0.1803ms | 5.5475 KOps/s | 5.4418 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7826ms | 0.6604ms | 1.5142 KOps/s | 1.4973 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7902ms | 0.6701ms | 1.4923 KOps/s | 1.4882 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7110ms | 0.5995ms | 1.6681 KOps/s | 1.7275 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7294ms | 0.5990ms | 1.6694 KOps/s | 1.7288 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.2348ms | 18.6140ms | 53.7230 Ops/s | 54.0174 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.1635ms | 18.6083ms | 53.7394 Ops/s | 53.9950 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.4969ms | 18.4177ms | 54.2957 Ops/s | 54.5133 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.4899ms | 18.4349ms | 54.2449 Ops/s | 54.5213 Ops/s | |
test_to_module_speed[True] | 1.2405ms | 0.9803ms | 1.0201 KOps/s | 1.0261 KOps/s | |
test_to_module_speed[False] | 1.3001ms | 0.9708ms | 1.0301 KOps/s | 1.0302 KOps/s | |
test_tc_init | 0.1199ms | 34.2876μs | 29.1650 KOps/s | 26.2724 KOps/s | |
test_tc_init_nested | 0.1251ms | 68.3517μs | 14.6302 KOps/s | 12.9305 KOps/s | |
test_tc_first_layer_tensor | 5.5059μs | 0.7168μs | 1.3951 MOps/s | 1.3766 MOps/s | |
test_tc_first_layer_nontensor | 28.0100μs | 2.2569μs | 443.0793 KOps/s | 437.8826 KOps/s | |
test_tc_second_layer_tensor | 16.9203μs | 1.4125μs | 707.9637 KOps/s | 699.2781 KOps/s | |
test_tc_second_layer_nontensor | 30.5110μs | 3.0126μs | 331.9394 KOps/s | 334.1564 KOps/s | |
test_unbind | 0.2232s | 10.0157ms | 99.8435 Ops/s | 144.1046 Ops/s | |
test_full_like | 10.7421ms | 9.1752ms | 108.9900 Ops/s | 105.6926 Ops/s | |
test_zeros_like | 4.9381ms | 4.3161ms | 231.6914 Ops/s | 233.3964 Ops/s | |
test_ones_like | 4.6679ms | 4.3161ms | 231.6921 Ops/s | 96.9818 Ops/s | |
test_clone | 6.5348ms | 6.3704ms | 156.9759 Ops/s | 156.7537 Ops/s | |
test_squeeze | 60.6010μs | 9.8255μs | 101.7763 KOps/s | 103.8359 KOps/s | |
test_unsqueeze | 0.1671ms | 73.1966μs | 13.6618 KOps/s | 13.6498 KOps/s | |
test_split | 0.3554ms | 0.1620ms | 6.1741 KOps/s | 6.2324 KOps/s | |
test_permute | 0.2946ms | 0.1767ms | 5.6578 KOps/s | 5.5025 KOps/s | |
test_stack | 50.4706ms | 50.1410ms | 19.9438 Ops/s | 19.9678 Ops/s | |
test_cat | 50.4184ms | 50.1282ms | 19.9488 Ops/s | 19.9985 Ops/s |
vmoens added a commit that referenced this pull request on Feb 3, 2025:
ghstack-source-id: c0a801e4922878d65b0f81357afd7ecddc6943fb
Pull Request resolved: #1201
Labels: CLA Signed
Stack from ghstack (oldest at bottom):