Skip to content

[Feature] MinariExperienceReplay now can handle text fields like "mission" #3075

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

marcosgalleterobbva
Copy link

Description

Adds a new argument string_to_tensor_map that allows the user to parse NonTensorData fields like "mission" or any other categorical values inside the observation TensorDict and assign it any Tensor value they might prefer, be it a one hot encoding version of the categorical field or a BERT-like embedding of the text fields.

Technical Details

I would need @vmoens to take a small look to this specific set of lines. It works, but I don't know if this specific implementation is correct.
https://github.com/marcosgalleterobbva/rl/blob/c5c075d6eb1da858f580392d7567bc78a8679ce0/torchrl/data/datasets/minari_data.py#L386

Motivation and Context

The motivation is specified in issue #3071
close #3071

  • [ x] I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

  • [ x] New feature (non-breaking change which adds core functionality)
  • [ x] Documentation (update in the documentation)
  • [ x] Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • [ x] I have read the CONTRIBUTION guide (required)
  • [ x] My change requires a change to the documentation.
  • [ x] I have updated the tests accordingly (required for a bug fix or a new feature).
  • [ x] I have updated the documentation accordingly.

Copy link

pytorch-bot bot commented Jul 16, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3075

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 6 Unrelated Failures

As of commit c5c075d with merge base 77c00b9 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 16, 2025
@vmoens vmoens added the enhancement New feature or request label Jul 16, 2025
Copy link
Collaborator

@vmoens vmoens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work thanks
Overall LGTM but I left a few comments.
My biggest question is why don't we simply use the NonTensorData feature from the tensordict library to store the text as non-tensor?

@@ -32,6 +32,9 @@ If the generation of this artifact in MacOs M1 doesn't work correctly or in the
ARCHFLAGS="-arch arm64" python setup.py develop
```

In some MacOs devices, the installation of the required libraries errors if the correct version of
clang is not used. Using `llvm@16` (installable with brew), may fix your issues.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@@ -243,38 +234,72 @@ def _download_and_preproc(self):
minari.download_dataset(dataset_id=self.dataset_id)
parent_dir = Path(tmpdir) / self.dataset_id / "data"

td_data = TensorDict()
td_data = TensorDict({}, batch_size=[])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those two are equivalent, and I personally like TensorDict() more :)

Suggested change
td_data = TensorDict({}, batch_size=[])
td_data = TensorDict()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tutorials have a specific formatting, need to be registered etc.
If it's an example, it should go in examples/

steps = val.shape[0]
else:
if steps != val.shape[0]:
if key == "observations":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you comment on the change of logic here?
Hard for me to grasp precisely what is going on

Comment on lines +386 to +404
# TODO: Unfortunately the copy_ method fais when dealing with
# subvals of NonTensorData. It fails with this
# RuntimeError: Cannot update a leaf NonTensorDataBase from a memmaped
# parent NonTensorStack. To update this leaf node, please update the
# NonTensorStack with the proper index.
# Unfortunately, this following method also fails, as lists do not
# have copy_ method
# data_view["observation", subkey].copy_(subval[:-1])
# The only approach that seems to be working it unlocking the
# Tensordict. I would prefer something like the following:
# for i in range(len(subval) - 1):
# data_view[i].set(("observation", subkey), subval[i])
# data_view[i].set(("next", "observation", subkey), subval[i + 1])
# But this three previous lines give this error:
# RuntimeError: Cannot modify locked TensorDict. For in-place
# modification, consider using the `set_()` method and make
# sure the key is present.
# But this current approach takes incredibly long to complete, maybe
# I should do something different?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think anything you think should work should work :)

I will give it a stab, but you can also write the code as you'd expect it to work and I'll try to patch tensordict / torchrl to make the necessary amendments!

@vmoens vmoens added Environments Adds or modifies an environment wrapper Data Data-related PR, will launch data-related jobs labels Jul 16, 2025
@marcosgalleterobbva
Copy link
Author

Great work thanks Overall LGTM but I left a few comments. My biggest question is why don't we simply use the NonTensorData feature from the tensordict library to store the text as non-tensor?

Hi, @vmoens . Love to hear that this PR looks good.

The reason to move from NonTensorData to Tensor is mainly due to datasets similar to minigrid and BabyAI.

In these environments, the 'mission' key of a given observation is absolutely esential for the agent to correctly assess which action to take. And we need to make sure we have a method for integrating these observations in the overall process. Prior to this PR, if you downloaded a dataset_id like minigrid/BabyAI-PutNextS7N4/optimal-v0, the mission key would be downloaded as a NonTensorData.

In order to be able to include this field in the agents Q-function approximation, the natural step forward would be to try to leverage the transform argument of the MinariExperienceReplay to apply a transformation to these categorical observations (mainly "mission" and "direction") but, unfortunately, if we tried to apply a Compose transformation to this observation/mission key, we would get that the mission key is not loaded as an observation in the loading process.

I also thought about a different approach. Maybe I could try to modify the loading process and, instead of changing the download process (like this PR does), just modify the data that is loaded from disk with a transform Compose. But this would not work.

If you downloaded a dataset_id with mission observations (like minigrid/BabyAI-PutNextS7N4/optimal-v0) and took a look at the values that are stored for this mission observation, you would see that MinariExperienceReplay is not downloading to disk the correct missions for each episode. Instead, it is just taking the mission for the first episode and copying it to all the other episodes of the dataset.

This PR allows us to transform the incoming NonTensorData at the moment of saving it to disk, so that, when we might want to apply a transformation to this field, we actually get the correct data. The modification to the env_metadata.json is included in the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Data Data-related PR, will launch data-related jobs enhancement New feature or request Environments Adds or modifies an environment wrapper
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants