[Deprecation] Softly change default behavior of auto_unwrap #2793

Merged
3 commits merged on Feb 20, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 1f526de8a22baed9ca378b7c25e42268b293e39a
Pull Request resolved: #2793
pytorch-bot bot commented Feb 19, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2793

Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures, 1 Unrelated Failure

As of commit 73a47c9 with merge base 76aa9bc:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the "CLA Signed" label on Feb 19, 2025
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: a3115798ad5ed176c6411d2021413a73f584709d
Pull Request resolved: #2793
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 20, 2025
ghstack-source-id: c28c11ecf68fba0ffde652205ea8e46f8da07cf1
Pull Request resolved: #2793
$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}19$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.6294s 0.5263s 1.9001 Ops/s 1.9020 Ops/s $\color{#d91a1a}-0.10\%$
test_transformed 1.1388s 1.0302s 0.9707 Ops/s 0.9622 Ops/s $\color{#35bf28}+0.88\%$
test_serial 1.6310s 1.5346s 0.6516 Ops/s 0.6492 Ops/s $\color{#35bf28}+0.38\%$
test_parallel 1.4259s 1.3218s 0.7565 Ops/s 0.7561 Ops/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[True-True-True-True-True] 0.1913ms 33.0509μs 30.2564 KOps/s 33.1350 KOps/s $\textbf{\color{#d91a1a}-8.69\%}$
test_step_mdp_speed[True-True-True-True-False] 50.2840μs 17.9179μs 55.8100 KOps/s 55.3632 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[True-True-True-False-True] 47.9990μs 17.1258μs 58.3913 KOps/s 58.6619 KOps/s $\color{#d91a1a}-0.46\%$
test_step_mdp_speed[True-True-True-False-False] 65.2220μs 10.0530μs 99.4728 KOps/s 98.7455 KOps/s $\color{#35bf28}+0.74\%$
test_step_mdp_speed[True-True-False-True-True] 63.2480μs 31.9984μs 31.2516 KOps/s 30.7033 KOps/s $\color{#35bf28}+1.79\%$
test_step_mdp_speed[True-True-False-True-False] 50.1730μs 19.8935μs 50.2676 KOps/s 49.9965 KOps/s $\color{#35bf28}+0.54\%$
test_step_mdp_speed[True-True-False-False-True] 0.6360ms 18.8929μs 52.9300 KOps/s 52.7025 KOps/s $\color{#35bf28}+0.43\%$
test_step_mdp_speed[True-True-False-False-False] 85.6070μs 11.9436μs 83.7267 KOps/s 82.7376 KOps/s $\color{#35bf28}+1.20\%$
test_step_mdp_speed[True-False-True-True-True] 74.3890μs 33.4295μs 29.9137 KOps/s 29.6280 KOps/s $\color{#35bf28}+0.96\%$
test_step_mdp_speed[True-False-True-True-False] 62.1860μs 21.3476μs 46.8437 KOps/s 45.7023 KOps/s $\color{#35bf28}+2.50\%$
test_step_mdp_speed[True-False-True-False-True] 55.2930μs 18.7944μs 53.2072 KOps/s 52.4752 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[True-False-True-False-False] 52.9590μs 11.8600μs 84.3172 KOps/s 82.8898 KOps/s $\color{#35bf28}+1.72\%$
test_step_mdp_speed[True-False-False-True-True] 76.9840μs 35.3097μs 28.3208 KOps/s 27.7766 KOps/s $\color{#35bf28}+1.96\%$
test_step_mdp_speed[True-False-False-True-False] 73.6970μs 23.4139μs 42.7096 KOps/s 41.9271 KOps/s $\color{#35bf28}+1.87\%$
test_step_mdp_speed[True-False-False-False-True] 61.0140μs 20.4886μs 48.8075 KOps/s 48.4834 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[True-False-False-False-False] 52.9490μs 13.7393μs 72.7837 KOps/s 72.8153 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[False-True-True-True-True] 0.1137ms 33.6872μs 29.6849 KOps/s 29.4459 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[False-True-True-True-False] 0.1194ms 21.3199μs 46.9046 KOps/s 46.1169 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-True-True-False-True] 63.9400μs 22.0895μs 45.2704 KOps/s 46.0728 KOps/s $\color{#d91a1a}-1.74\%$
test_step_mdp_speed[False-True-True-False-False] 74.4590μs 13.7418μs 72.7709 KOps/s 74.5876 KOps/s $\color{#d91a1a}-2.44\%$
test_step_mdp_speed[False-True-False-True-True] 0.1001ms 37.1378μs 26.9267 KOps/s 28.0334 KOps/s $\color{#d91a1a}-3.95\%$
test_step_mdp_speed[False-True-False-True-False] 65.2620μs 23.7684μs 42.0727 KOps/s 42.2017 KOps/s $\color{#d91a1a}-0.31\%$
test_step_mdp_speed[False-True-False-False-True] 2.8183ms 23.1929μs 43.1166 KOps/s 42.5318 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[False-True-False-False-False] 0.6010ms 15.0840μs 66.2955 KOps/s 65.1194 KOps/s $\color{#35bf28}+1.81\%$
test_step_mdp_speed[False-False-True-True-True] 78.0360μs 37.2694μs 26.8316 KOps/s 26.0828 KOps/s $\color{#35bf28}+2.87\%$
test_step_mdp_speed[False-False-True-True-False] 66.1430μs 25.0294μs 39.9529 KOps/s 39.0001 KOps/s $\color{#35bf28}+2.44\%$
test_step_mdp_speed[False-False-True-False-True] 60.6630μs 23.1747μs 43.1504 KOps/s 42.3059 KOps/s $\color{#35bf28}+2.00\%$
test_step_mdp_speed[False-False-True-False-False] 58.2590μs 15.1461μs 66.0237 KOps/s 65.4630 KOps/s $\color{#35bf28}+0.86\%$
test_step_mdp_speed[False-False-False-True-True] 79.2580μs 39.1124μs 25.5673 KOps/s 25.4943 KOps/s $\color{#35bf28}+0.29\%$
test_step_mdp_speed[False-False-False-True-False] 71.2830μs 27.3535μs 36.5584 KOps/s 37.2859 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[False-False-False-False-True] 0.1047ms 25.5882μs 39.0806 KOps/s 38.3449 KOps/s $\color{#35bf28}+1.92\%$
test_step_mdp_speed[False-False-False-False-False] 70.3510μs 17.0737μs 58.5696 KOps/s 59.2578 KOps/s $\color{#d91a1a}-1.16\%$
test_values[generalized_advantage_estimate-True-True] 11.1038ms 10.0445ms 99.5569 Ops/s 99.0498 Ops/s $\color{#35bf28}+0.51\%$
test_values[vec_generalized_advantage_estimate-True-True] 29.5791ms 25.4054ms 39.3617 Ops/s 37.1958 Ops/s $\textbf{\color{#35bf28}+5.82\%}$
test_values[td0_return_estimate-False-False] 0.2830ms 0.2011ms 4.9731 KOps/s 4.5021 KOps/s $\textbf{\color{#35bf28}+10.46\%}$
test_values[td1_return_estimate-False-False] 28.3738ms 25.0043ms 39.9932 Ops/s 39.8992 Ops/s $\color{#35bf28}+0.24\%$
test_values[vec_td1_return_estimate-False-False] 27.0611ms 24.5905ms 40.6662 Ops/s 36.6716 Ops/s $\textbf{\color{#35bf28}+10.89\%}$
test_values[td_lambda_return_estimate-True-False] 38.3832ms 35.6745ms 28.0313 Ops/s 28.1029 Ops/s $\color{#d91a1a}-0.25\%$
test_values[vec_td_lambda_return_estimate-True-False] 26.4788ms 24.4739ms 40.8599 Ops/s 37.0731 Ops/s $\textbf{\color{#35bf28}+10.21\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.7432ms 8.6584ms 115.4949 Ops/s 115.8637 Ops/s $\color{#d91a1a}-0.32\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.6651ms 1.9532ms 511.9753 Ops/s 494.4874 Ops/s $\color{#35bf28}+3.54\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4604ms 0.3720ms 2.6879 KOps/s 2.7024 KOps/s $\color{#d91a1a}-0.54\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 45.3602ms 41.7233ms 23.9674 Ops/s 22.3621 Ops/s $\textbf{\color{#35bf28}+7.18\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.8430ms 3.5302ms 283.2669 Ops/s 281.8030 Ops/s $\color{#35bf28}+0.52\%$
test_dqn_speed[False-None] 1.6693ms 1.4066ms 710.9127 Ops/s 691.5444 Ops/s $\color{#35bf28}+2.80\%$
test_dqn_speed[False-backward] 1.9802ms 1.9255ms 519.3547 Ops/s 514.0161 Ops/s $\color{#35bf28}+1.04\%$
test_dqn_speed[True-None] 0.7780ms 0.4913ms 2.0356 KOps/s 1.9749 KOps/s $\color{#35bf28}+3.08\%$
test_dqn_speed[True-backward] 1.0072ms 0.9373ms 1.0669 KOps/s 803.5164 Ops/s $\textbf{\color{#35bf28}+32.78\%}$
test_dqn_speed[reduce-overhead-None] 0.7775ms 0.4971ms 2.0118 KOps/s 1.9993 KOps/s $\color{#35bf28}+0.62\%$
test_dqn_speed[reduce-overhead-backward] 1.1142ms 0.9461ms 1.0570 KOps/s 1.0397 KOps/s $\color{#35bf28}+1.67\%$
test_ddpg_speed[False-None] 3.3209ms 2.9375ms 340.4229 Ops/s 333.2994 Ops/s $\color{#35bf28}+2.14\%$
test_ddpg_speed[False-backward] 4.4768ms 4.1696ms 239.8309 Ops/s 238.5635 Ops/s $\color{#35bf28}+0.53\%$
test_ddpg_speed[True-None] 1.4511ms 1.2581ms 794.8579 Ops/s 789.8112 Ops/s $\color{#35bf28}+0.64\%$
test_ddpg_speed[True-backward] 2.7339ms 2.3949ms 417.5552 Ops/s 447.2778 Ops/s $\textbf{\color{#d91a1a}-6.65\%}$
test_ddpg_speed[reduce-overhead-None] 0.2590s 1.6388ms 610.2157 Ops/s 786.3549 Ops/s $\textbf{\color{#d91a1a}-22.40\%}$
test_ddpg_speed[reduce-overhead-backward] 2.3974ms 2.2146ms 451.5430 Ops/s 439.7679 Ops/s $\color{#35bf28}+2.68\%$
test_sac_speed[False-None] 9.4637ms 8.5868ms 116.4577 Ops/s 117.1255 Ops/s $\color{#d91a1a}-0.57\%$
test_sac_speed[False-backward] 13.7845ms 11.9149ms 83.9288 Ops/s 87.7303 Ops/s $\color{#d91a1a}-4.33\%$
test_sac_speed[True-None] 3.1913ms 2.2652ms 441.4693 Ops/s 429.7642 Ops/s $\color{#35bf28}+2.72\%$
test_sac_speed[True-backward] 4.8146ms 4.3234ms 231.3008 Ops/s 236.7568 Ops/s $\color{#d91a1a}-2.30\%$
test_sac_speed[reduce-overhead-None] 5.0510ms 2.4115ms 414.6821 Ops/s 405.8477 Ops/s $\color{#35bf28}+2.18\%$
test_sac_speed[reduce-overhead-backward] 5.2193ms 4.3268ms 231.1201 Ops/s 238.3031 Ops/s $\color{#d91a1a}-3.01\%$
test_redq_speed[False-None] 16.1271ms 13.4484ms 74.3581 Ops/s 69.8174 Ops/s $\textbf{\color{#35bf28}+6.50\%}$
test_redq_speed[False-backward] 25.4673ms 23.0797ms 43.3281 Ops/s 42.4155 Ops/s $\color{#35bf28}+2.15\%$
test_redq_speed[True-None] 8.9609ms 6.0516ms 165.2448 Ops/s 164.6844 Ops/s $\color{#35bf28}+0.34\%$
test_redq_speed[True-backward] 14.7027ms 13.4182ms 74.5255 Ops/s 72.3175 Ops/s $\color{#35bf28}+3.05\%$
test_redq_speed[reduce-overhead-None] 6.6638ms 5.9577ms 167.8494 Ops/s 164.0091 Ops/s $\color{#35bf28}+2.34\%$
test_redq_speed[reduce-overhead-backward] 15.0508ms 13.4834ms 74.1655 Ops/s 70.7819 Ops/s $\color{#35bf28}+4.78\%$
test_redq_deprec_speed[False-None] 15.6080ms 14.3424ms 69.7235 Ops/s 67.9109 Ops/s $\color{#35bf28}+2.67\%$
test_redq_deprec_speed[False-backward] 21.7198ms 20.3837ms 49.0587 Ops/s 48.1368 Ops/s $\color{#35bf28}+1.92\%$
test_redq_deprec_speed[True-None] 6.0908ms 4.6489ms 215.1043 Ops/s 207.9289 Ops/s $\color{#35bf28}+3.45\%$
test_redq_deprec_speed[True-backward] 11.3419ms 9.8286ms 101.7434 Ops/s 99.8869 Ops/s $\color{#35bf28}+1.86\%$
test_redq_deprec_speed[reduce-overhead-None] 5.8982ms 4.5863ms 218.0400 Ops/s 222.1264 Ops/s $\color{#d91a1a}-1.84\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.9017ms 9.2421ms 108.2010 Ops/s 99.0303 Ops/s $\textbf{\color{#35bf28}+9.26\%}$
test_td3_speed[False-None] 8.9758ms 8.5151ms 117.4379 Ops/s 113.8782 Ops/s $\color{#35bf28}+3.13\%$
test_td3_speed[False-backward] 11.5545ms 10.9497ms 91.3269 Ops/s 83.2867 Ops/s $\textbf{\color{#35bf28}+9.65\%}$
test_td3_speed[True-None] 2.0564ms 1.8457ms 541.8074 Ops/s 501.3343 Ops/s $\textbf{\color{#35bf28}+8.07\%}$
test_td3_speed[True-backward] 4.2660ms 3.6537ms 273.6955 Ops/s 264.5019 Ops/s $\color{#35bf28}+3.48\%$
test_td3_speed[reduce-overhead-None] 2.0269ms 1.8387ms 543.8547 Ops/s 513.5862 Ops/s $\textbf{\color{#35bf28}+5.89\%}$
test_td3_speed[reduce-overhead-backward] 4.1952ms 3.5311ms 283.1967 Ops/s 240.7211 Ops/s $\textbf{\color{#35bf28}+17.65\%}$
test_cql_speed[False-None] 39.1439ms 37.4509ms 26.7016 Ops/s 25.2037 Ops/s $\textbf{\color{#35bf28}+5.94\%}$
test_cql_speed[False-backward] 53.4104ms 48.0788ms 20.7992 Ops/s 20.5026 Ops/s $\color{#35bf28}+1.45\%$
test_cql_speed[True-None] 17.4371ms 16.5358ms 60.4748 Ops/s 58.4082 Ops/s $\color{#35bf28}+3.54\%$
test_cql_speed[True-backward] 24.6235ms 23.7873ms 42.0392 Ops/s 41.0667 Ops/s $\color{#35bf28}+2.37\%$
test_cql_speed[reduce-overhead-None] 17.5833ms 16.7101ms 59.8440 Ops/s 59.6055 Ops/s $\color{#35bf28}+0.40\%$
test_cql_speed[reduce-overhead-backward] 24.9710ms 23.7848ms 42.0437 Ops/s 41.5171 Ops/s $\color{#35bf28}+1.27\%$
test_a2c_speed[False-None] 8.9499ms 7.5523ms 132.4097 Ops/s 127.3344 Ops/s $\color{#35bf28}+3.99\%$
test_a2c_speed[False-backward] 20.9974ms 15.7855ms 63.3492 Ops/s 63.8790 Ops/s $\color{#d91a1a}-0.83\%$
test_a2c_speed[True-None] 4.8711ms 3.9203ms 255.0846 Ops/s 249.0593 Ops/s $\color{#35bf28}+2.42\%$
test_a2c_speed[True-backward] 11.4183ms 10.9656ms 91.1946 Ops/s 87.5894 Ops/s $\color{#35bf28}+4.12\%$
test_a2c_speed[reduce-overhead-None] 4.9823ms 4.0104ms 249.3487 Ops/s 244.2265 Ops/s $\color{#35bf28}+2.10\%$
test_a2c_speed[reduce-overhead-backward] 12.0731ms 11.1928ms 89.3433 Ops/s 90.5496 Ops/s $\color{#d91a1a}-1.33\%$
test_ppo_speed[False-None] 9.0231ms 8.1390ms 122.8651 Ops/s 123.6375 Ops/s $\color{#d91a1a}-0.62\%$
test_ppo_speed[False-backward] 16.4625ms 15.7615ms 63.4458 Ops/s 63.2658 Ops/s $\color{#35bf28}+0.28\%$
test_ppo_speed[True-None] 5.0665ms 4.4559ms 224.4203 Ops/s 220.9273 Ops/s $\color{#35bf28}+1.58\%$
test_ppo_speed[True-backward] 11.2639ms 10.7730ms 92.8244 Ops/s 90.3887 Ops/s $\color{#35bf28}+2.69\%$
test_ppo_speed[reduce-overhead-None] 4.9695ms 4.3940ms 227.5807 Ops/s 220.8518 Ops/s $\color{#35bf28}+3.05\%$
test_ppo_speed[reduce-overhead-backward] 12.8440ms 11.0915ms 90.1592 Ops/s 91.5168 Ops/s $\color{#d91a1a}-1.48\%$
test_reinforce_speed[False-None] 9.2252ms 7.2319ms 138.2769 Ops/s 143.7327 Ops/s $\color{#d91a1a}-3.80\%$
test_reinforce_speed[False-backward] 11.6097ms 10.5644ms 94.6578 Ops/s 96.7106 Ops/s $\color{#d91a1a}-2.12\%$
test_reinforce_speed[True-None] 3.9500ms 3.3051ms 302.5632 Ops/s 301.2085 Ops/s $\color{#35bf28}+0.45\%$
test_reinforce_speed[True-backward] 10.9391ms 9.6923ms 103.1742 Ops/s 103.0229 Ops/s $\color{#35bf28}+0.15\%$
test_reinforce_speed[reduce-overhead-None] 3.9326ms 3.4178ms 292.5834 Ops/s 306.3423 Ops/s $\color{#d91a1a}-4.49\%$
test_reinforce_speed[reduce-overhead-backward] 11.3391ms 10.0761ms 99.2443 Ops/s 100.2509 Ops/s $\color{#d91a1a}-1.00\%$
test_iql_speed[False-None] 35.1257ms 33.7855ms 29.5985 Ops/s 28.9944 Ops/s $\color{#35bf28}+2.08\%$
test_iql_speed[False-backward] 54.4707ms 47.1076ms 21.2280 Ops/s 21.1416 Ops/s $\color{#35bf28}+0.41\%$
test_iql_speed[True-None] 13.0186ms 11.8711ms 84.2380 Ops/s 82.8456 Ops/s $\color{#35bf28}+1.68\%$
test_iql_speed[True-backward] 24.9641ms 23.6903ms 42.2113 Ops/s 41.2391 Ops/s $\color{#35bf28}+2.36\%$
test_iql_speed[reduce-overhead-None] 12.8032ms 12.0059ms 83.2923 Ops/s 80.1144 Ops/s $\color{#35bf28}+3.97\%$
test_iql_speed[reduce-overhead-backward] 25.3616ms 23.6155ms 42.3451 Ops/s 41.2698 Ops/s $\color{#35bf28}+2.61\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.7040ms 5.2742ms 189.6024 Ops/s 178.1625 Ops/s $\textbf{\color{#35bf28}+6.42\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.8935ms 0.5661ms 1.7665 KOps/s 1.7977 KOps/s $\color{#d91a1a}-1.74\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8451ms 0.5374ms 1.8609 KOps/s 1.8729 KOps/s $\color{#d91a1a}-0.64\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.9353ms 5.1069ms 195.8129 Ops/s 193.4104 Ops/s $\color{#35bf28}+1.24\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1147ms 0.5542ms 1.8044 KOps/s 1.7753 KOps/s $\color{#35bf28}+1.64\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7975ms 0.5276ms 1.8953 KOps/s 1.8941 KOps/s $\color{#35bf28}+0.06\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.3928ms 1.7359ms 576.0642 Ops/s 556.8703 Ops/s $\color{#35bf28}+3.45\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.5449ms 1.6482ms 606.7191 Ops/s 585.6123 Ops/s $\color{#35bf28}+3.60\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4061ms 5.1531ms 194.0575 Ops/s 186.0887 Ops/s $\color{#35bf28}+4.28\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1378ms 0.6879ms 1.4537 KOps/s 1.3668 KOps/s $\textbf{\color{#35bf28}+6.36\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0924ms 0.6719ms 1.4884 KOps/s 1.4295 KOps/s $\color{#35bf28}+4.12\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0095ms 5.0069ms 199.7233 Ops/s 189.5447 Ops/s $\textbf{\color{#35bf28}+5.37\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.4937ms 0.5636ms 1.7743 KOps/s 1.7392 KOps/s $\color{#35bf28}+2.02\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7382ms 0.5252ms 1.9040 KOps/s 1.7893 KOps/s $\textbf{\color{#35bf28}+6.41\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.7441ms 5.0039ms 199.8445 Ops/s 192.0841 Ops/s $\color{#35bf28}+4.04\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.0594ms 0.5523ms 1.8105 KOps/s 1.7559 KOps/s $\color{#35bf28}+3.11\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8532ms 0.5335ms 1.8745 KOps/s 1.8314 KOps/s $\color{#35bf28}+2.36\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.3548ms 5.5134ms 181.3775 Ops/s 180.2839 Ops/s $\color{#35bf28}+0.61\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.3950ms 0.7598ms 1.3162 KOps/s 1.4144 KOps/s $\textbf{\color{#d91a1a}-6.94\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9384ms 0.6625ms 1.5094 KOps/s 1.4450 KOps/s $\color{#35bf28}+4.46\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.2053ms 4.4747ms 223.4808 Ops/s 228.9337 Ops/s $\color{#d91a1a}-2.38\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.0824ms 2.4368ms 410.3786 Ops/s 401.0211 Ops/s $\color{#35bf28}+2.33\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 5.7332ms 1.4389ms 694.9720 Ops/s 710.3574 Ops/s $\color{#d91a1a}-2.17\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5339s 15.0025ms 66.6554 Ops/s 230.9560 Ops/s $\textbf{\color{#d91a1a}-71.14\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.3236ms 2.4742ms 404.1701 Ops/s 409.4862 Ops/s $\color{#d91a1a}-1.30\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.7999ms 1.3856ms 721.7064 Ops/s 642.2632 Ops/s $\textbf{\color{#35bf28}+12.37\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.3088ms 4.5600ms 219.2987 Ops/s 29.9763 Ops/s $\textbf{\color{#35bf28}+631.57\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.6691ms 2.7368ms 365.3920 Ops/s 389.9222 Ops/s $\textbf{\color{#d91a1a}-6.29\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.8137ms 1.6404ms 609.5963 Ops/s 616.0268 Ops/s $\color{#d91a1a}-1.04\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.3945ms 11.7283ms 85.2639 Ops/s 81.8462 Ops/s $\color{#35bf28}+4.18\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 0.4531s 23.7768ms 42.0578 Ops/s 68.2113 Ops/s $\textbf{\color{#d91a1a}-38.34\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.8980ms 20.7256ms 48.2494 Ops/s 47.3598 Ops/s $\color{#35bf28}+1.88\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.9086ms 15.0966ms 66.2400 Ops/s 67.7553 Ops/s $\color{#d91a1a}-2.24\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 22.7137ms 20.8083ms 48.0577 Ops/s 47.1367 Ops/s $\color{#35bf28}+1.95\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.5389ms 16.4594ms 60.7556 Ops/s 60.9990 Ops/s $\color{#d91a1a}-0.40\%$
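For readers checking the table by hand: the Change column appears to be the relative difference between the "Ops" and "Ops on Repo HEAD" columns, and the Improved/Worsened totals appear to count only the bolded rows, which in this table are the ones beyond roughly ±5%. The sketch below reproduces that arithmetic. The helper names and the 5% threshold are assumptions inferred from the numbers above, not taken from the benchmark tooling itself.

```python
# Hypothetical helpers for sanity-checking the benchmark table.
# They are not part of torchrl or its CI scripts.

# Units seen in the table above, converted to operations per second.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def to_ops_per_sec(value: str) -> float:
    """Parse a cell such as '1.0669 KOps/s' or '803.5164 Ops/s' into Ops/s."""
    number, unit = value.split()
    return float(number) * _UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD."""
    new = to_ops_per_sec(ops)
    base = to_ops_per_sec(ops_head)
    return (new - base) / base * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


if __name__ == "__main__":
    # test_dqn_speed[True-backward] from the CPU table (mixed units).
    change = relative_change("1.0669 KOps/s", "803.5164 Ops/s")
    print(f"{change:+.2f}% -> {classify(change)}")
```

Note that the two Ops columns can use different units in the same row (KOps/s against Ops/s), so they need to be normalised before comparing.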

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}11$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.9146s 0.8241s 1.2134 Ops/s 1.2728 Ops/s $\color{#d91a1a}-4.66\%$
test_transformed 1.5363s 1.4415s 0.6937 Ops/s 0.7129 Ops/s $\color{#d91a1a}-2.69\%$
test_serial 2.2949s 2.2908s 0.4365 Ops/s 0.4393 Ops/s $\color{#d91a1a}-0.63\%$
test_parallel 1.8774s 1.8570s 0.5385 Ops/s 0.5417 Ops/s $\color{#d91a1a}-0.59\%$
test_step_mdp_speed[True-True-True-True-True] 0.2354ms 39.6443μs 25.2243 KOps/s 24.9451 KOps/s $\color{#35bf28}+1.12\%$
test_step_mdp_speed[True-True-True-True-False] 52.4310μs 23.3722μs 42.7859 KOps/s 43.2573 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[True-True-True-False-True] 54.8610μs 22.3976μs 44.6476 KOps/s 45.5812 KOps/s $\color{#d91a1a}-2.05\%$
test_step_mdp_speed[True-True-True-False-False] 43.2410μs 13.1222μs 76.2070 KOps/s 78.5710 KOps/s $\color{#d91a1a}-3.01\%$
test_step_mdp_speed[True-True-False-True-True] 89.4710μs 42.5093μs 23.5243 KOps/s 23.3999 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[True-True-False-True-False] 61.2810μs 25.6870μs 38.9302 KOps/s 39.6541 KOps/s $\color{#d91a1a}-1.83\%$
test_step_mdp_speed[True-True-False-False-True] 76.3410μs 24.5124μs 40.7957 KOps/s 40.8852 KOps/s $\color{#d91a1a}-0.22\%$
test_step_mdp_speed[True-True-False-False-False] 42.3810μs 15.3082μs 65.3246 KOps/s 65.3938 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[True-False-True-True-True] 80.2010μs 44.3535μs 22.5462 KOps/s 22.4396 KOps/s $\color{#35bf28}+0.47\%$
test_step_mdp_speed[True-False-True-True-False] 61.2610μs 27.8423μs 35.9166 KOps/s 35.7989 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[True-False-True-False-True] 61.4410μs 24.4054μs 40.9745 KOps/s 40.7477 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[True-False-True-False-False] 76.0910μs 14.6921μs 68.0638 KOps/s 65.6220 KOps/s $\color{#35bf28}+3.72\%$
test_step_mdp_speed[True-False-False-True-True] 83.4510μs 46.7205μs 21.4039 KOps/s 21.3726 KOps/s $\color{#35bf28}+0.15\%$
test_step_mdp_speed[True-False-False-True-False] 65.3810μs 30.0682μs 33.2578 KOps/s 33.1501 KOps/s $\color{#35bf28}+0.32\%$
test_step_mdp_speed[True-False-False-False-True] 0.1512ms 26.3365μs 37.9701 KOps/s 37.7880 KOps/s $\color{#35bf28}+0.48\%$
test_step_mdp_speed[True-False-False-False-False] 43.9500μs 17.7460μs 56.3507 KOps/s 57.1924 KOps/s $\color{#d91a1a}-1.47\%$
test_step_mdp_speed[False-True-True-True-True] 84.1020μs 45.1493μs 22.1487 KOps/s 22.3950 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[False-True-True-True-False] 64.8910μs 28.3592μs 35.2619 KOps/s 36.1730 KOps/s $\color{#d91a1a}-2.52\%$
test_step_mdp_speed[False-True-True-False-True] 2.7215ms 29.0070μs 34.4745 KOps/s 35.8611 KOps/s $\color{#d91a1a}-3.87\%$
test_step_mdp_speed[False-True-True-False-False] 42.9710μs 17.2748μs 57.8878 KOps/s 58.7671 KOps/s $\color{#d91a1a}-1.50\%$
test_step_mdp_speed[False-True-False-True-True] 86.7110μs 47.0880μs 21.2368 KOps/s 21.5061 KOps/s $\color{#d91a1a}-1.25\%$
test_step_mdp_speed[False-True-False-True-False] 64.9810μs 30.5133μs 32.7725 KOps/s 33.2507 KOps/s $\color{#d91a1a}-1.44\%$
test_step_mdp_speed[False-True-False-False-True] 72.5820μs 30.4218μs 32.8712 KOps/s 32.4167 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[False-True-False-False-False] 84.8920μs 19.4514μs 51.4101 KOps/s 52.0723 KOps/s $\color{#d91a1a}-1.27\%$
test_step_mdp_speed[False-False-True-True-True] 87.6620μs 48.9339μs 20.4357 KOps/s 20.3385 KOps/s $\color{#35bf28}+0.48\%$
test_step_mdp_speed[False-False-True-True-False] 70.6010μs 32.1649μs 31.0898 KOps/s 31.0714 KOps/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[False-False-True-False-True] 65.9310μs 29.9502μs 33.3887 KOps/s 32.8813 KOps/s $\color{#35bf28}+1.54\%$
test_step_mdp_speed[False-False-True-False-False] 50.4910μs 19.0963μs 52.3660 KOps/s 52.4302 KOps/s $\color{#d91a1a}-0.12\%$
test_step_mdp_speed[False-False-False-True-True] 88.2310μs 50.6347μs 19.7493 KOps/s 19.6846 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[False-False-False-True-False] 0.2136ms 34.7550μs 28.7728 KOps/s 28.9380 KOps/s $\color{#d91a1a}-0.57\%$
test_step_mdp_speed[False-False-False-False-True] 62.8510μs 31.4389μs 31.8077 KOps/s 31.2754 KOps/s $\color{#35bf28}+1.70\%$
test_step_mdp_speed[False-False-False-False-False] 58.4210μs 21.2592μs 47.0385 KOps/s 46.7358 KOps/s $\color{#35bf28}+0.65\%$
test_values[generalized_advantage_estimate-True-True] 27.4332ms 25.7378ms 38.8534 Ops/s 39.9985 Ops/s $\color{#d91a1a}-2.86\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1100s 3.0966ms 322.9373 Ops/s 335.6762 Ops/s $\color{#d91a1a}-3.80\%$
test_values[td0_return_estimate-False-False] 0.1111ms 80.7509μs 12.3838 KOps/s 12.6081 KOps/s $\color{#d91a1a}-1.78\%$
test_values[td1_return_estimate-False-False] 61.6204ms 58.9081ms 16.9756 Ops/s 17.3177 Ops/s $\color{#d91a1a}-1.98\%$
test_values[vec_td1_return_estimate-False-False] 1.4432ms 1.0988ms 910.1036 Ops/s 922.6124 Ops/s $\color{#d91a1a}-1.36\%$
test_values[td_lambda_return_estimate-True-False] 98.4775ms 93.9970ms 10.6386 Ops/s 10.6502 Ops/s $\color{#d91a1a}-0.11\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4082ms 1.0845ms 922.0530 Ops/s 924.8258 Ops/s $\color{#d91a1a}-0.30\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 26.9885ms 25.9254ms 38.5722 Ops/s 38.2766 Ops/s $\color{#35bf28}+0.77\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0410ms 0.7542ms 1.3259 KOps/s 1.3132 KOps/s $\color{#35bf28}+0.97\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8105ms 0.7061ms 1.4162 KOps/s 1.4495 KOps/s $\color{#d91a1a}-2.30\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5820ms 1.4882ms 671.9336 Ops/s 672.0465 Ops/s $\color{#d91a1a}-0.02\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7308ms 0.6803ms 1.4700 KOps/s 1.4222 KOps/s $\color{#35bf28}+3.36\%$
test_dqn_speed[False-None] 8.4371ms 1.5853ms 630.7902 Ops/s 663.2121 Ops/s $\color{#d91a1a}-4.89\%$
test_dqn_speed[False-backward] 2.3826ms 2.1488ms 465.3745 Ops/s 475.6100 Ops/s $\color{#d91a1a}-2.15\%$
test_dqn_speed[True-None] 0.1630s 0.6766ms 1.4781 KOps/s 1.7015 KOps/s $\textbf{\color{#d91a1a}-13.13\%}$
test_dqn_speed[True-backward] 1.3023ms 1.2517ms 798.9382 Ops/s 838.7756 Ops/s $\color{#d91a1a}-4.75\%$
test_dqn_speed[reduce-overhead-None] 0.7371ms 0.5954ms 1.6797 KOps/s 1.6548 KOps/s $\color{#35bf28}+1.50\%$
test_dqn_speed[reduce-overhead-backward] 1.1745ms 1.0910ms 916.5599 Ops/s 1.0014 KOps/s $\textbf{\color{#d91a1a}-8.48\%}$
test_ddpg_speed[False-None] 3.1854ms 2.8585ms 349.8371 Ops/s 348.6839 Ops/s $\color{#35bf28}+0.33\%$
test_ddpg_speed[False-backward] 4.7665ms 4.2450ms 235.5732 Ops/s 245.5826 Ops/s $\color{#d91a1a}-4.08\%$
test_ddpg_speed[True-None] 1.5679ms 1.3815ms 723.8626 Ops/s 717.0498 Ops/s $\color{#35bf28}+0.95\%$
test_ddpg_speed[True-backward] 2.5418ms 2.4804ms 403.1607 Ops/s 399.7557 Ops/s $\color{#35bf28}+0.85\%$
test_ddpg_speed[reduce-overhead-None] 1.5349ms 1.3923ms 718.2547 Ops/s 700.9857 Ops/s $\color{#35bf28}+2.46\%$
test_ddpg_speed[reduce-overhead-backward] 1.9882ms 1.9397ms 515.5390 Ops/s 473.8198 Ops/s $\textbf{\color{#35bf28}+8.80\%}$
test_sac_speed[False-None] 8.4875ms 8.0615ms 124.0470 Ops/s 121.0304 Ops/s $\color{#35bf28}+2.49\%$
test_sac_speed[False-backward] 11.3803ms 10.8987ms 91.7542 Ops/s 89.0174 Ops/s $\color{#35bf28}+3.07\%$
test_sac_speed[True-None] 2.0936ms 1.9138ms 522.5165 Ops/s 514.5466 Ops/s $\color{#35bf28}+1.55\%$
test_sac_speed[True-backward] 3.8164ms 3.6840ms 271.4465 Ops/s 259.4894 Ops/s $\color{#35bf28}+4.61\%$
test_sac_speed[reduce-overhead-None] 20.9522ms 12.0479ms 83.0017 Ops/s 82.4881 Ops/s $\color{#35bf28}+0.62\%$
test_sac_speed[reduce-overhead-backward] 1.7456ms 1.6688ms 599.2470 Ops/s 535.1602 Ops/s $\textbf{\color{#35bf28}+11.98\%}$
test_redq_speed[False-None] 7.9306ms 7.4913ms 133.4875 Ops/s 132.1685 Ops/s $\color{#35bf28}+1.00\%$
test_redq_speed[False-backward] 11.7004ms 11.2161ms 89.1578 Ops/s 85.1958 Ops/s $\color{#35bf28}+4.65\%$
test_redq_speed[True-None] 2.7081ms 2.3776ms 420.6006 Ops/s 409.9490 Ops/s $\color{#35bf28}+2.60\%$
test_redq_speed[True-backward] 4.6167ms 4.2862ms 233.3057 Ops/s 240.8305 Ops/s $\color{#d91a1a}-3.12\%$
test_redq_speed[reduce-overhead-None] 2.5502ms 2.3961ms 417.3375 Ops/s 408.9403 Ops/s $\color{#35bf28}+2.05\%$
test_redq_speed[reduce-overhead-backward] 4.7194ms 4.3086ms 232.0932 Ops/s 237.2572 Ops/s $\color{#d91a1a}-2.18\%$
test_redq_deprec_speed[False-None] 9.4454ms 9.0674ms 110.2855 Ops/s 110.7762 Ops/s $\color{#d91a1a}-0.44\%$
test_redq_deprec_speed[False-backward] 12.7267ms 12.2581ms 81.5790 Ops/s 84.4149 Ops/s $\color{#d91a1a}-3.36\%$
test_redq_deprec_speed[True-None] 2.8763ms 2.7169ms 368.0666 Ops/s 357.2940 Ops/s $\color{#35bf28}+3.02\%$
test_redq_deprec_speed[True-backward] 5.0113ms 4.5619ms 219.2049 Ops/s 220.9676 Ops/s $\color{#d91a1a}-0.80\%$
test_redq_deprec_speed[reduce-overhead-None] 2.7776ms 2.7011ms 370.2148 Ops/s 360.0154 Ops/s $\color{#35bf28}+2.83\%$
test_redq_deprec_speed[reduce-overhead-backward] 5.1777ms 4.6042ms 217.1925 Ops/s 214.2208 Ops/s $\color{#35bf28}+1.39\%$
test_td3_speed[False-None] 8.4200ms 8.0657ms 123.9819 Ops/s 126.1978 Ops/s $\color{#d91a1a}-1.76\%$
test_td3_speed[False-backward] 11.0958ms 10.5253ms 95.0088 Ops/s 95.2548 Ops/s $\color{#d91a1a}-0.26\%$
test_td3_speed[True-None] 1.7477ms 1.7300ms 578.0315 Ops/s 562.6178 Ops/s $\color{#35bf28}+2.74\%$
test_td3_speed[True-backward] 3.8855ms 3.4540ms 289.5207 Ops/s 298.1215 Ops/s $\color{#d91a1a}-2.88\%$
test_td3_speed[reduce-overhead-None] 52.7668ms 26.9618ms 37.0895 Ops/s 37.7120 Ops/s $\color{#d91a1a}-1.65\%$
test_td3_speed[reduce-overhead-backward] 1.6923ms 1.5421ms 648.4716 Ops/s 704.6492 Ops/s $\textbf{\color{#d91a1a}-7.97\%}$
test_cql_speed[False-None] 17.2670ms 16.8276ms 59.4261 Ops/s 59.3070 Ops/s $\color{#35bf28}+0.20\%$
test_cql_speed[False-backward] 22.6009ms 22.2228ms 44.9989 Ops/s 45.7682 Ops/s $\color{#d91a1a}-1.68\%$
test_cql_speed[True-None] 3.6653ms 3.3797ms 295.8880 Ops/s 286.9663 Ops/s $\color{#35bf28}+3.11\%$
test_cql_speed[True-backward] 6.1121ms 5.6687ms 176.4083 Ops/s 167.8030 Ops/s $\textbf{\color{#35bf28}+5.13\%}$
test_cql_speed[reduce-overhead-None] 20.9845ms 13.1804ms 75.8700 Ops/s 73.9599 Ops/s $\color{#35bf28}+2.58\%$
test_cql_speed[reduce-overhead-backward] 2.3092ms 1.9263ms 519.1170 Ops/s 486.1357 Ops/s $\textbf{\color{#35bf28}+6.78\%}$
test_a2c_speed[False-None] 3.4340ms 3.2160ms 310.9472 Ops/s 309.9233 Ops/s $\color{#35bf28}+0.33\%$
test_a2c_speed[False-backward] 6.6028ms 6.1148ms 163.5378 Ops/s 157.2946 Ops/s $\color{#35bf28}+3.97\%$
test_a2c_speed[True-None] 1.4750ms 1.3947ms 716.9946 Ops/s 709.9864 Ops/s $\color{#35bf28}+0.99\%$
test_a2c_speed[True-backward] 3.2286ms 3.1395ms 318.5174 Ops/s 316.8290 Ops/s $\color{#35bf28}+0.53\%$
test_a2c_speed[reduce-overhead-None] 16.0772ms 9.0849ms 110.0725 Ops/s 111.6892 Ops/s $\color{#d91a1a}-1.45\%$
test_a2c_speed[reduce-overhead-backward] 1.5983ms 1.4877ms 672.1767 Ops/s 604.3520 Ops/s $\textbf{\color{#35bf28}+11.22\%}$
test_ppo_speed[False-None] 3.8960ms 3.7313ms 268.0022 Ops/s 269.4740 Ops/s $\color{#d91a1a}-0.55\%$
test_ppo_speed[False-backward] 7.3791ms 6.8412ms 146.1740 Ops/s 143.1649 Ops/s $\color{#35bf28}+2.10\%$
test_ppo_speed[True-None] 1.6235ms 1.4590ms 685.3809 Ops/s 673.9274 Ops/s $\color{#35bf28}+1.70\%$
test_ppo_speed[True-backward] 3.1836ms 3.1136ms 321.1738 Ops/s 314.0216 Ops/s $\color{#35bf28}+2.28\%$
test_ppo_speed[reduce-overhead-None] 1.1518ms 0.9991ms 1.0009 KOps/s 1.0008 KOps/s $+0.00\%$
test_ppo_speed[reduce-overhead-backward] 1.5268ms 1.4263ms 701.1163 Ops/s 669.6259 Ops/s $\color{#35bf28}+4.70\%$
test_reinforce_speed[False-None] 2.4457ms 2.2699ms 440.5560 Ops/s 436.9656 Ops/s $\color{#35bf28}+0.82\%$
test_reinforce_speed[False-backward] 3.5187ms 3.2750ms 305.3424 Ops/s 304.6102 Ops/s $\color{#35bf28}+0.24\%$
test_reinforce_speed[True-None] 1.4919ms 1.3366ms 748.1435 Ops/s 734.7471 Ops/s $\color{#35bf28}+1.82\%$
test_reinforce_speed[True-backward] 3.1529ms 3.0101ms 332.2124 Ops/s 336.6177 Ops/s $\color{#d91a1a}-1.31\%$
test_reinforce_speed[reduce-overhead-None] 18.2730ms 10.1191ms 98.8227 Ops/s 99.8325 Ops/s $\color{#d91a1a}-1.01\%$
test_reinforce_speed[reduce-overhead-backward] 1.5905ms 1.5243ms 656.0477 Ops/s 641.2423 Ops/s $\color{#35bf28}+2.31\%$
test_iql_speed[False-None] 9.6203ms 9.1758ms 108.9826 Ops/s 108.1857 Ops/s $\color{#35bf28}+0.74\%$
test_iql_speed[False-backward] 13.2917ms 12.7638ms 78.3467 Ops/s 77.4662 Ops/s $\color{#35bf28}+1.14\%$
test_iql_speed[True-None] 2.5871ms 2.3224ms 430.5866 Ops/s 418.1394 Ops/s $\color{#35bf28}+2.98\%$
test_iql_speed[True-backward] 5.1552ms 4.9144ms 203.4842 Ops/s 195.1751 Ops/s $\color{#35bf28}+4.26\%$
test_iql_speed[reduce-overhead-None] 0.5059s 12.9987ms 76.9309 Ops/s 90.0718 Ops/s $\textbf{\color{#d91a1a}-14.59\%}$
test_iql_speed[reduce-overhead-backward] 2.0618ms 1.9612ms 509.8948 Ops/s 455.8408 Ops/s $\textbf{\color{#35bf28}+11.86\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.7428ms 6.3237ms 158.1354 Ops/s 155.7529 Ops/s $\color{#35bf28}+1.53\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7054ms 0.2704ms 3.6982 KOps/s 3.7611 KOps/s $\color{#d91a1a}-1.67\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5094ms 0.2458ms 4.0684 KOps/s 4.1065 KOps/s $\color{#d91a1a}-0.93\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3399ms 6.0056ms 166.5104 Ops/s 164.2037 Ops/s $\color{#35bf28}+1.40\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.7754ms 0.3437ms 2.9093 KOps/s 3.2768 KOps/s $\textbf{\color{#d91a1a}-11.22\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5839ms 0.3331ms 3.0018 KOps/s 3.5409 KOps/s $\textbf{\color{#d91a1a}-15.22\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6383ms 1.4063ms 711.1068 Ops/s 756.0204 Ops/s $\textbf{\color{#d91a1a}-5.94\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4377ms 1.1743ms 851.5591 Ops/s 702.0670 Ops/s $\textbf{\color{#35bf28}+21.29\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4134ms 6.1931ms 161.4702 Ops/s 157.6744 Ops/s $\color{#35bf28}+2.41\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3266ms 0.4556ms 2.1951 KOps/s 2.0485 KOps/s $\textbf{\color{#35bf28}+7.15\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6109ms 0.3971ms 2.5181 KOps/s 2.2062 KOps/s $\textbf{\color{#35bf28}+14.14\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2438ms 6.0434ms 165.4711 Ops/s 162.4536 Ops/s $\color{#35bf28}+1.86\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8653ms 0.2973ms 3.3639 KOps/s 3.4786 KOps/s $\color{#d91a1a}-3.30\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5539ms 0.3143ms 3.1816 KOps/s 4.1161 KOps/s $\textbf{\color{#d91a1a}-22.70\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3804ms 6.0360ms 165.6726 Ops/s 163.4648 Ops/s $\color{#35bf28}+1.35\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9303ms 0.3295ms 3.0351 KOps/s 3.8300 KOps/s $\textbf{\color{#d91a1a}-20.75\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4948ms 0.2592ms 3.8577 KOps/s 4.1694 KOps/s $\textbf{\color{#d91a1a}-7.47\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4956ms 6.2137ms 160.9335 Ops/s 158.7676 Ops/s $\color{#35bf28}+1.36\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.9423ms 0.4515ms 2.2150 KOps/s 2.1878 KOps/s $\color{#35bf28}+1.24\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6103ms 0.3871ms 2.5833 KOps/s 2.4683 KOps/s $\color{#35bf28}+4.66\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9370ms 5.3992ms 185.2127 Ops/s 181.5360 Ops/s $\color{#35bf28}+2.03\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.3883ms 2.1005ms 476.0833 Ops/s 438.6584 Ops/s $\textbf{\color{#35bf28}+8.53\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.0119ms 1.1458ms 872.7169 Ops/s 788.5067 Ops/s $\textbf{\color{#35bf28}+10.68\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4702s 14.7744ms 67.6846 Ops/s 181.2568 Ops/s $\textbf{\color{#d91a1a}-62.66\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.5471ms 2.2002ms 454.4941 Ops/s 434.8153 Ops/s $\color{#35bf28}+4.53\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.1382ms 1.1030ms 906.6362 Ops/s 846.8658 Ops/s $\textbf{\color{#35bf28}+7.06\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 8.4903ms 5.6433ms 177.2019 Ops/s 30.1456 Ops/s $\textbf{\color{#35bf28}+487.82\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.7407ms 2.1953ms 455.5093 Ops/s 444.8610 Ops/s $\color{#35bf28}+2.39\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 8.4357ms 1.4145ms 706.9564 Ops/s 732.5884 Ops/s $\color{#d91a1a}-3.50\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.3632ms 13.2147ms 75.6730 Ops/s 70.4062 Ops/s $\textbf{\color{#35bf28}+7.48\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.1784ms 16.6891ms 59.9192 Ops/s 59.3194 Ops/s $\color{#35bf28}+1.01\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.7535ms 17.8915ms 55.8924 Ops/s 53.2100 Ops/s $\textbf{\color{#35bf28}+5.04\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.0431ms 17.2045ms 58.1244 Ops/s 58.3268 Ops/s $\color{#d91a1a}-0.35\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.1131ms 17.8223ms 56.1095 Ops/s 54.0638 Ops/s $\color{#35bf28}+3.78\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.5907ms 18.8083ms 53.1679 Ops/s 52.6688 Ops/s $\color{#35bf28}+0.95\%$

@vmoens vmoens merged commit 73a47c9 into gh/vmoens/90/base Feb 20, 2025
64 of 74 checks passed
vmoens added a commit that referenced this pull request Feb 20, 2025
ghstack-source-id: c28c11ecf68fba0ffde652205ea8e46f8da07cf1
Pull Request resolved: #2793
@vmoens vmoens deleted the gh/vmoens/90/head branch February 20, 2025 21:24
Labels: CLA Signed, Deprecation
2 participants