[Feature] MinariExperienceReplay now can handle text fields like "mission" #3075

marcosgalleterobbva · 2025-07-16T10:12:58Z

Description

Adds a new argument, string_to_tensor_map, that lets the user parse NonTensorData fields such as "mission", or any other categorical value inside the observation TensorDict, and assign each one a Tensor value of their choice, for example a one-hot encoding of a categorical field or a BERT-like embedding of a text field.
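
As a rough illustration of the kind of mapping described above, the sketch below builds a one-hot tensor for each distinct mission string. The mission strings are made up, and whether string_to_tensor_map accepts exactly this dict structure is an assumption; only plain PyTorch calls are used.

```python
import torch

# Hypothetical mission strings; a real dataset supplies its own vocabulary.
missions = [
    "put the red ball next to the blue key",
    "put the green box next to the red ball",
    "put the blue key next to the green box",
]

# Give each distinct string an index, then use the matching identity-matrix row.
vocab = {text: idx for idx, text in enumerate(sorted(set(missions)))}
one_hot = torch.eye(len(vocab))

# Plain dict from string to tensor. That the PR's argument takes this exact
# structure is an assumption made for illustration only.
string_to_tensor = {text: one_hot[idx] for text, idx in vocab.items()}


def encode(texts):
    """Stack the tensor assigned to each string into a [len(texts), vocab] batch."""
    return torch.stack([string_to_tensor[t] for t in texts])


batch = encode([missions[0], missions[2], missions[0]])
```

The same dict could hold sentence embeddings instead of one-hot rows, as long as every value has the same shape so the results can be stacked.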

Technical Details

I would like @vmoens to take a quick look at this specific set of lines. It works, but I am not sure this implementation is correct.
https://github.com/marcosgalleterobbva/rl/blob/c5c075d6eb1da858f580392d7567bc78a8679ce0/torchrl/data/datasets/minari_data.py#L386

Motivation and Context

The motivation is described in issue #3071.
close #3071

[x] I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

[x] New feature (non-breaking change which adds core functionality)
[x] Documentation (update in the documentation)
[x] Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

[x] I have read the CONTRIBUTION guide (required)
[x] My change requires a change to the documentation.
[x] I have updated the tests accordingly (required for a bug fix or a new feature).
[x] I have updated the documentation accordingly.

pytorch-bot · 2025-07-16T10:13:01Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3075

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 6 Unrelated Failures

As of commit c5c075d with merge base 77c00b9:

NEW FAILURES - The following jobs have failed:

Generate documentation / build-docs (3.9, 12.8) / linux-job (gh)
RuntimeError: Command docker exec -t 55186aad52f8ae5d7a6563330219c28ec30fb6c60f4b4e90a27b48fef97882e8 /exec failed with exit code 2
Habitat Tests on Linux / tests (3.9, 12.8) / linux-job (gh)
RuntimeError: Command docker exec -t b89d7b1cb8188dca63cb4bd3bd320d9ae70d9deb78e916187fd9faa17cbee2c5 /exec failed with exit code 1
Unit-tests on Linux / tests-cpu (3.9) / linux-job (gh)
test/test_env.py::TestNonTensorEnv::test_parallel[False-False]

FLAKY - The following job failed but was likely due to flakiness present on trunk:

Continuous Benchmark (PR) / GPU Pytest benchmark (gh) (similar failure)
RuntimeError: PassManager::run failed

BROKEN TRUNK - The following jobs failed but were already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Continuous Benchmark / CPU Pytest benchmark (gh) (trunk failure)
Continuous Benchmark / GPU Pytest benchmark (gh) (trunk failure)
RuntimeError: PassManager::run failed
Libs Tests on Linux / unittests-gym (3.9, 12.8) / linux-job (gh) (trunk failure)
test/test_libs.py::TestGym::test_gym_fake_td[True-False-3-HalfCheetah-v2]
LLM Tests on Linux / unittests (3.9, 12.8) / linux-job (gh) (trunk failure)
##[error]The operation was canceled.
Unit-tests on Linux / tests-olddeps (3.9, 11.6) / linux-job (gh) (trunk failure)
test/test_transforms.py::TestActionDiscretizer::test_transform_env[pendulum-SamplingStrategy.RANDOM-False-True]

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens

Great work thanks
Overall LGTM but I left a few comments.
My biggest question is why don't we simply use the NonTensorData feature from the tensordict library to store the text as non-tensor?

vmoens · 2025-07-16T10:39:40Z

CONTRIBUTING.md

@@ -32,6 +32,9 @@ If the generation of this artifact in MacOs M1 doesn't work correctly or in the
 ARCHFLAGS="-arch arm64" python setup.py develop
 ```

+In some MacOs devices, the installation of the required libraries errors if the correct version of
+clang is not used. Using `llvm@16` (installable with brew), may fix your issues.


vmoens · 2025-07-16T10:40:43Z

torchrl/data/datasets/minari_data.py

@@ -243,38 +234,72 @@ def _download_and_preproc(self):
            minari.download_dataset(dataset_id=self.dataset_id)
            parent_dir = Path(tmpdir) / self.dataset_id / "data"

-            td_data = TensorDict()
+            td_data = TensorDict({}, batch_size=[])


those two are equivalent, and I personally like TensorDict() more :)

Suggested change

-            td_data = TensorDict({}, batch_size=[])
+            td_data = TensorDict()

vmoens · 2025-07-16T10:46:10Z

tutorials/sphinx-tutorials/minari_data_loading.py

Tutorials have a specific formatting, need to be registered etc.
If it's an example, it should go in examples/

vmoens · 2025-07-16T10:47:54Z

torchrl/data/datasets/minari_data.py

-                                steps = val.shape[0]
-                            else:
-                                if steps != val.shape[0]:
+                        if key == "observations":


Can you comment on the change of logic here?
Hard for me to grasp precisely what is going on

vmoens · 2025-07-16T10:51:02Z

torchrl/data/datasets/minari_data.py

+                                            # TODO: Unfortunately the copy_ method fais when dealing with
+                                            #       subvals of NonTensorData. It fails with this
+                                            #       RuntimeError: Cannot update a leaf NonTensorDataBase from a memmaped
+                                            #       parent NonTensorStack. To update this leaf node, please update the
+                                            #       NonTensorStack with the proper index.
+                                            #       Unfortunately, this following method also fails, as lists do not
+                                            #       have copy_ method
+                                            #           data_view["observation", subkey].copy_(subval[:-1])
+                                            #       The only approach that seems to be working it unlocking the
+                                            #       Tensordict. I would prefer something like the following:
+                                            #           for i in range(len(subval) - 1):
+                                            #               data_view[i].set(("observation", subkey), subval[i])
+                                            #               data_view[i].set(("next", "observation", subkey), subval[i + 1])
+                                            #       But this three previous lines give this error:
+                                            #           RuntimeError: Cannot modify locked TensorDict. For in-place
+                                            #           modification, consider using the `set_()` method and make
+                                            #           sure the key is present.
+                                            #       But this current approach takes incredibly long to complete, maybe
+                                            #       I should do something different?


I think anything you think should work should work :)

I will give it a stab, but you can also write the code as you'd expect it to work and I'll try to patch tensordict / torchrl to make the necessary amendments!
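
The indexing in the quoted comment (`subval[:-1]`, and `subval[i]` paired with `subval[i + 1]`) reflects how a per-episode sequence is split into current and next observations. A minimal pure-Python sketch of that split follows; the function name and sample values are hypothetical, and it does not touch TensorDict or memory-mapped storage.

```python
def split_transitions(values):
    """Pair each step's value with its successor.

    `values` holds T + 1 entries for an episode of T transitions, so the
    current-step view drops the last entry and the next-step view drops
    the first one.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations to form a transition")
    return values[:-1], values[1:]


# Hypothetical episode with 3 transitions, hence 4 stored observations.
missions = ["go to the door"] * 4
directions = [0, 1, 1, 2]

obs_dir, next_dir = split_transitions(directions)
obs_mission, next_mission = split_transitions(missions)
```

Slicing the whole sequence once avoids the per-index loop that the comment reports as very slow, although whether the same slices can be written into a locked, memory-mapped TensorDict is exactly the open question raised above.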

marcosgalleterobbva · 2025-07-17T08:46:23Z

> Great work thanks
> Overall LGTM but I left a few comments.
> My biggest question is why don't we simply use the NonTensorData feature from the tensordict library to store the text as non-tensor?

Hi @vmoens, glad to hear that this PR looks good.

The reason for converting NonTensorData to Tensor comes mainly from datasets such as minigrid and BabyAI.

In these environments, the "mission" key of an observation is essential for the agent to decide which action to take, so we need a way to integrate these observations into the overall pipeline. Prior to this PR, if you downloaded a dataset_id like minigrid/BabyAI-PutNextS7N4/optimal-v0, the mission key would be downloaded as NonTensorData.

To include this field in the agent's Q-function approximation, the natural next step would be to use the transform argument of MinariExperienceReplay to transform these categorical observations (mainly "mission" and "direction"). Unfortunately, if we try to apply a Compose transform to the observation/mission key, we find that the mission key is not loaded as an observation during the loading process.

I also considered a different approach: instead of changing the download process (as this PR does), only modify the data loaded from disk with a Compose transform. But this would not work.

If you download a dataset_id with mission observations (like minigrid/BabyAI-PutNextS7N4/optimal-v0) and look at the values stored for the mission observation, you will see that MinariExperienceReplay does not write the correct mission for each episode to disk. Instead, it takes the mission from the first episode and copies it to every other episode in the dataset.

This PR lets us transform the incoming NonTensorData at the moment it is saved to disk, so that any transform later applied to this field operates on the correct data. The modification to env_metadata.json is included in the PR.
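
The symptom described above (the first episode's mission repeated across the whole dataset) can be checked with a small comparison against the per-episode source data. The sketch below is pure Python with made-up episodes and helper names; it does not read a real Minari dataset.

```python
def expected_missions(episodes):
    """Flatten per-episode mission lists, one entry per transition."""
    flat = []
    for episode in episodes:
        # Each episode stores T + 1 observations for T transitions.
        flat.extend(episode["mission"][:-1])
    return flat


def first_mismatch(stored, episodes):
    """Return the index of the first step whose stored mission differs, or None."""
    for i, (got, want) in enumerate(zip(stored, expected_missions(episodes))):
        if got != want:
            return i
    return None


# Hypothetical episodes: 2 transitions, then 3 transitions.
episodes = [
    {"mission": ["go to the red door"] * 3},
    {"mission": ["pick up the blue key"] * 4},
]

# Simulates the reported symptom: the first episode's text repeated everywhere.
buggy = ["go to the red door"] * 5
correct = expected_missions(episodes)
```

With this data, `first_mismatch(buggy, episodes)` points at the first step of the second episode, while the correctly assembled column yields `None`.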

marcosgalleterobbva added 14 commits July 9, 2025 17:04

[Feature] MinariExperienceReplay now can load text data as tensors

63b79ba

Merge branch 'pytorch:main' into main

80313bd

Small dumb change

256aeeb

Documentation and tests

bc075b1

Merge branch 'pytorch:main' into main

268d32e

Merge branch 'main' of https://github.com/marcosgalleterobbva/rl

a23570b

Final changes

6cec6bc

Adapted for non tensordict observations

ab31a2b

Merge branch 'pytorch:main' into main

b4d9b31

Merge branch 'pytorch:main' into main

52955b8

Final adaptation for NonTensorData population

9ec0427

Merge branch 'main' of https://github.com/marcosgalleterobbva/rl

dbac267

Reversed small test change and fixed patching

7e53301

Pre-commit linting changes

c5c075d

facebook-github-bot added the CLA Signed label Jul 16, 2025

vmoens added the enhancement label Jul 16, 2025

vmoens reviewed Jul 16, 2025


vmoens added the Environments and Data labels Jul 16, 2025
