Skip to content

changes for latest versions of BenchMARL #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

karthiks1701
Copy link

@karthiks1701 karthiks1701 commented May 1, 2025

A few adaptations for DiCo to work with the latest BenchMARL codebase. No need to install the specific branches of tensordict, torchRL, and BenchMARL. Let me know if additional testing is required.

SND curves for the sampling environment
image
image

Final learned policy
https://github.com/user-attachments/assets/4ba5ae75-02d2-4d80-868b-03bd538af34a

@karthiks1701 karthiks1701 changed the title changes for new version of BenchMaRL changes for latest versions of BenchMaRL May 1, 2025
@karthiks1701 karthiks1701 changed the title changes for latest versions of BenchMaRL changes for latest versions of BenchMARL May 1, 2025
@@ -143,7 +149,9 @@ def _forward(
else: # Gather outputs for one agent on the obs
# tensor of shape [*batch, n_agents, n_actions], where the outputs
# along the n_agent dimension are taken with the same (agent_index) agent network
agent_out = self.agent_mlps.agent_networks[agent_index].forward(input)
# agent_out = self.agent_mlps.agent_networks[agent_index].forward(input)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the key change as agent_networks is no longer supported in torchRL.

Copy link
Member

@matteobettini matteobettini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a mil, just a few qs

@@ -13,7 +13,7 @@ use_action_loss: True
action_loss_lr: 0.00003

experiment:
max_n_frames: 5_000_000
max_n_frames: 1_000_000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not change the default config for reproducibility

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, about that I didnt want to run for longer, so pushed this by mistake.

Comment on lines +177 to +179
distance = self.estimate_snd(input)
if update_estimate:
self.estimated_snd[:] = distance.detach()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expalin this a bit?

If those conditions are met, we can avoid computing $\widehat{\mathrm{SND}}$

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this to be able to log the estimated_snd during training when the desired snd is -1. Right now it logs Nan's. It was just to be able to see the evolution of snd while training as well. Can be remove if necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, but you can still see it under eval/snd no?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, But if I understand that is only during evaluation right? This was helpful in understanding how the SND evolves while training. But you are right eval/snd is enough. Should we roll back to the previous version?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok got it, i ll take care of things don't worry

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants