[Feature] MinariExperienceReplay now can handle text fields like "mission" #3075
base: main
Conversation
🔗 Helpful Links — 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3075
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 New Failures, 6 Unrelated Failures as of commit c5c075d with merge base 77c00b9.
NEW FAILURES — The following jobs have failed:
FLAKY — The following job failed, but was likely due to flakiness present on trunk:
BROKEN TRUNK — The following jobs failed but were already failing on the merge base: 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Great work, thanks!
Overall LGTM, but I left a few comments.
My biggest question is why don't we simply use the NonTensorData feature from the tensordict library to store the text as non-tensor?
@@ -32,6 +32,9 @@ If the generation of this artifact in MacOs M1 doesn't work correctly or in the
ARCHFLAGS="-arch arm64" python setup.py develop
```

On some macOS devices, the installation of the required libraries errors if the correct version of
clang is not used. Using `llvm@16` (installable with brew) may fix your issues.
Thanks!
@@ -243,38 +234,72 @@ def _download_and_preproc(self):
        minari.download_dataset(dataset_id=self.dataset_id)
        parent_dir = Path(tmpdir) / self.dataset_id / "data"

-       td_data = TensorDict()
+       td_data = TensorDict({}, batch_size=[])
Those two are equivalent, and I personally like `TensorDict()` more :)
Suggested change:
- td_data = TensorDict({}, batch_size=[])
+ td_data = TensorDict()
Tutorials have a specific format, need to be registered, etc.
If it's an example, it should go in `examples/`.
steps = val.shape[0]
else:
    if steps != val.shape[0]:
if key == "observations":
Can you comment on the change of logic here?
It's hard for me to grasp precisely what is going on.
# TODO: Unfortunately the copy_ method fails when dealing with
# subvals of NonTensorData. It fails with this
# RuntimeError: Cannot update a leaf NonTensorDataBase from a memmaped
# parent NonTensorStack. To update this leaf node, please update the
# NonTensorStack with the proper index.
# Unfortunately, the following method also fails, as lists do not
# have a copy_ method:
# data_view["observation", subkey].copy_(subval[:-1])
# The only approach that seems to be working is unlocking the
# TensorDict. I would prefer something like the following:
# for i in range(len(subval) - 1):
#     data_view[i].set(("observation", subkey), subval[i])
#     data_view[i].set(("next", "observation", subkey), subval[i + 1])
# But these three previous lines give this error:
# RuntimeError: Cannot modify locked TensorDict. For in-place
# modification, consider using the `set_()` method and make
# sure the key is present.
# But the current approach takes incredibly long to complete; maybe
# I should do something different?
I think anything you think should work should work :)
I will give it a stab, but you can also write the code as you'd expect it to work and I'll try to patch tensordict / torchrl to make the necessary amendments!
Hi @vmoens. Love to hear that this PR looks good. The reason to move from NonTensorData to Tensor is mainly due to datasets similar to minigrid and BabyAI. In these environments, the "mission" key of a given observation is absolutely essential for the agent to correctly assess which action to take, and we need to make sure we have a method for integrating these observations into the overall process.

Prior to this PR, if you downloaded a dataset_id like minigrid/BabyAI-PutNextS7N4/optimal-v0, the mission key would be downloaded as a NonTensorData. To include this field in the agent's Q-function approximation, the natural step forward would be to leverage the transform argument of MinariExperienceReplay to apply a transformation to these categorical observations (mainly "mission" and "direction"). Unfortunately, if we tried to apply a Compose transformation to the observation/mission key, we would find that the mission key is not loaded as an observation in the loading process.

I also thought about a different approach: instead of changing the download process (like this PR does), modify the data that is loaded from disk with a transform Compose. But this would not work either. If you downloaded a dataset_id with mission observations (like minigrid/BabyAI-PutNextS7N4/optimal-v0) and looked at the values stored for the mission observation, you would see that MinariExperienceReplay is not writing the correct missions for each episode to disk. Instead, it just takes the mission of the first episode and copies it to all the other episodes of the dataset.

This PR allows us to transform the incoming NonTensorData at the moment of saving it to disk, so that, when we later want to apply a transformation to this field, we actually get the correct data. The modification to the env_metadata.json is included in the PR.
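A minimal pure-Python sketch of the kind of duplication bug described above (the names and plain dicts are illustrative stand-ins, not the actual torchrl loader, which works on TensorDicts): one version reuses the first episode's mission for every episode, the other stores each episode's own mission.

```python
# Hypothetical, simplified stand-in for the dataset preprocessing step.

def buggy_store(episodes):
    # Bug: takes the first episode's mission and copies it everywhere.
    first_mission = episodes[0]["mission"]
    return [{"mission": first_mission} for _ in episodes]

def fixed_store(episodes):
    # Correct: stores each episode's own mission.
    return [{"mission": ep["mission"]} for ep in episodes]

episodes = [
    {"mission": "put the red ball next to the green box"},
    {"mission": "open the yellow door"},
]

# The buggy version loses the second episode's mission string.
assert buggy_store(episodes)[1]["mission"] == "put the red ball next to the green box"
assert fixed_store(episodes)[1]["mission"] == "open the yellow door"
```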
Description
Adds a new argument, string_to_tensor_map, that allows the user to parse NonTensorData fields like "mission" (or any other categorical values inside the observation TensorDict) and assign them any Tensor value they might prefer, be it a one-hot encoded version of the categorical field or a BERT-like embedding of the text field.
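To illustrate the idea (this is not the exact interface of this PR's string_to_tensor_map — the real argument maps to torch Tensors; plain Python lists stand in here to keep the sketch self-contained), a user-supplied map from a string field to a one-hot vector over a small vocabulary could look like:

```python
# Illustrative sketch only; vocabulary and keys are hypothetical.

def make_one_hot_encoder(vocabulary):
    """Return a callable mapping a known string to a one-hot list."""
    index = {word: i for i, word in enumerate(vocabulary)}

    def encode(text):
        vec = [0.0] * len(vocabulary)
        vec[index[text]] = 1.0
        return vec

    return encode

# Hypothetical mission vocabulary for a BabyAI-style dataset.
encode_mission = make_one_hot_encoder(
    ["open the yellow door", "put the red ball next to the green box"]
)

# A user-supplied map from observation key to encoding function, in the
# spirit of the string_to_tensor_map argument described above.
string_to_tensor_map = {"mission": encode_mission}

print(string_to_tensor_map["mission"]("open the yellow door"))  # [1.0, 0.0]
```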
Technical Details
I would need @vmoens to take a quick look at this specific set of lines. It works, but I don't know whether this specific implementation is correct.
https://github.com/marcosgalleterobbva/rl/blob/c5c075d6eb1da858f580392d7567bc78a8679ce0/torchrl/data/datasets/minari_data.py#L386
Motivation and Context
The motivation is specified in issue #3071
Closes #3071
Types of changes
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an `x` in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!