[BUG] QValueActor does not respect action_spec

## Describe the bug

When QValueActor is used with an action spec that has a singleton dimension (1,) for the action, the action shape instead becomes ().

## To Reproduce

```python
from tensordict import TensorDict
from tensordict.nn import TensorDictModule
from torch import nn
import torch
from torchrl.modules.tensordict_module import QValueActor
from torchrl.data import Categorical

action_spec = Categorical(4, shape=torch.Size((1, 1)), dtype=torch.int64)

module = TensorDictModule(
    module=nn.Linear(3, 1), in_keys=("observation"), out_keys=("action_value")
)

qvalue_actor = QValueActor(
    module=module,
    in_keys=["observation"],
    spec=action_spec,
)

td = TensorDict({"observation": torch.randn(12, 3)})
qvalue_actor(td)
print(td)
```

```bash
TensorDict(
    fields={
        action: Tensor(shape=torch.Size([12]), device=cpu, dtype=torch.int64, is_shared=False),
        action_value: Tensor(shape=torch.Size([12, 1]), device=cpu, dtype=torch.float32, is_shared=False),
        chosen_action_value: Tensor(shape=torch.Size([12, 1]), device=cpu, dtype=torch.float32, is_shared=False),
        observation: Tensor(shape=torch.Size([12, 3]), device=cpu, dtype=torch.float32, is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
```

## Expected behavior

`action.shape` should be `torch.Size([12, 1])`

## System info

Packages:
```
asttokens==3.0.0
cloudpickle==3.1.1
decorator==5.2.1
executing==2.2.0
filelock==3.18.0
fsspec==2025.5.1
importlib-metadata==8.7.0
ipython==9.4.0
ipython-pygments-lexers==1.1.1
jedi==0.19.2
jinja2==3.1.6
markupsafe==3.0.2
matplotlib-inline==0.1.7
mpmath==1.3.0
networkx==3.5
numpy==2.3.1
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-cusparselt-cu12==0.6.2
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.4.127
orjson==3.10.18
packaging==25.0
parso==0.8.4
pexpect==4.9.0
prompt-toolkit==3.0.51
ptyprocess==0.7.0
pure-eval==0.2.3
pygments==2.19.2
stack-data==0.6.3
sympy==1.13.1
tensordict==0.8.3
torch==2.6.0
torchrl==0.7.2
traitlets==5.14.3
triton==3.2.0
typing-extensions==4.14.1
wcwidth==0.2.13
zipp==3.23.0
```

```python
import torchrl, numpy, sys
print(torchrl.__version__, numpy.__version__, sys.version, sys.platform)
```
outputs
```
0.7.2 2.3.1 3.11.9 (main, Aug 14 2024, 05:07:28) [Clang 18.1.8 ] linux
```

## Additional context

This problem could of course be avoided by simply removing the last dimension, but I like the idea of letting all tensors in the tensordict have the same number of dimensions. This makes it easier to reason about dimensions once you start adding batches and multiple agents.

## Reason and Possible fixes

No clue.

## Checklist

- [X] I have checked that there is no similar issue in the repo (**required**)
- [X] I have read the [documentation](https://github.com/pytorch/rl/tree/main/docs/) (**required**)
- [X] I have provided a minimal working example to reproduce the bug (**required**)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[BUG] QValueActor does not respect action_spec #3059

Describe the bug

To Reproduce

Expected behavior

System info

Additional context

Reason and Possible fixes

Checklist

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[BUG] QValueActor does not respect action_spec #3059

Description

Describe the bug

To Reproduce

Expected behavior

System info

Additional context

Reason and Possible fixes

Checklist

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions