VR Stuff

TODO:

Data:
- See `owl_vaes/data/video_dir_loader.py`
- Create an equivalent for VR data
- - I'm currently assuming each VR data instance is a folder containing a recording and its controls
- New loader should go in `owl_vaes/data/vr_video_dir_loader.py`
- Some specs for new loader
- - It should have randomization to ensure different workers get different videos (existing loader takes `rank` and `world_size`)
- - It should have `target_size` as `Tuple[int,int]` to control the per-eye resolution from the video.
- - `window_length` as `int` to control the number of frames that are sampled (for a video autoencoder)
- - Note that the return shape is `[b,c,h,w]` in the `window_length=1` case and `[b,n,c,h,w]` otherwise, with `c=6`. This assumes a channel-wise concat of both eyes' views, which I think is fine. Feel free to push back on this if you disagree.
- - There may be a need for other VR-specific kwargs for sanity/debugging purposes. If you feel these are needed, feel free to add them
- Once the dataloader and a `get_loader` function exist, add them to `owl_vaes/data/__init__.py`
- In `owl_vaes/data/vr_video_dir_loader.py` you can add a testing function to check that the loader works
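
A minimal sketch of two of the loader specs above: giving each worker a different set of recordings, and the per-sample shape the loader should produce. The names `shard_videos` and `sample_shape` are hypothetical, and so are the shared `seed` and the `(h, w)` ordering of `target_size`. How the existing loader actually uses `rank` and `world_size` may differ, so check `video_dir_loader.py` first.

```python
import random
from typing import List, Tuple


def shard_videos(video_dirs: List[str], rank: int, world_size: int, seed: int = 0) -> List[str]:
    """Return the recording folders assigned to one worker (hypothetical helper)."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")
    dirs = sorted(video_dirs)          # identical starting order on every worker
    random.Random(seed).shuffle(dirs)  # identical shuffle on every worker (shared seed)
    return dirs[rank::world_size]      # strided slice, so shards never overlap


def sample_shape(window_length: int, target_size: Tuple[int, int], channels: int = 6) -> Tuple[int, ...]:
    """Per-sample shape before batching; 6 channels = two RGB eye views concatenated.

    Assumes target_size is (h, w), which is a guess, not something the spec states.
    """
    h, w = target_size
    if window_length == 1:
        return (channels, h, w)                 # batches to [b,c,h,w]
    return (window_length, channels, h, w)      # batches to [b,n,c,h,w]
```

With a shared seed every rank computes the same permutation, so the strided slices cover every folder exactly once across workers.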

Logging:
- See `owl_vaes/utils/logging.py`
- Admittedly it's getting a bit cluttered, so it's best to put new logging code in `owl_vaes/utils/vr_logging.py`
- You can see how these are used in example trainers (will list later)
- `to_wandb_vr` should be a function similar to `to_wandb`, which takes two `[b,6,h,w]` images and puts them side by side (original and reconstruction)
- `to_wandb_vr_video` should be a function similar to `to_wandb_video_sidebyside`
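
One possible layout for the side-by-side image, sketched with NumPy so it runs without the training stack (the same slicing and concatenation work on torch tensors with `torch.cat`). It assumes channels 0-2 are the left eye and 3-5 the right eye, which the spec does not state. The helper names `eyes_to_rgb` and `vr_side_by_side` are hypothetical, and the wandb call itself is left out.

```python
import numpy as np


def eyes_to_rgb(x: np.ndarray) -> np.ndarray:
    """Unpack [b,6,h,w] into [b,3,h,2w]: left eye and right eye next to each other.

    Assumes channels 0-2 are the left eye and 3-5 the right eye.
    """
    if x.ndim != 4 or x.shape[1] != 6:
        raise ValueError(f"expected [b,6,h,w], got shape {x.shape}")
    left, right = x[:, :3], x[:, 3:]
    return np.concatenate([left, right], axis=-1)  # concat along width


def vr_side_by_side(original: np.ndarray, reconstruction: np.ndarray) -> np.ndarray:
    """Return [b,3,h,4w]: original (both eyes) on the left, reconstruction on the right."""
    if original.shape != reconstruction.shape:
        raise ValueError("original and reconstruction must have the same shape")
    return np.concatenate(
        [eyes_to_rgb(original), eyes_to_rgb(reconstruction)], axis=-1
    )
```

Stacking the eyes vertically instead would give a `[b,3,2h,2w]` grid; which reads better in wandb is a judgment call.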

Modelling:
- For the time being nothing fancy is needed on the modelling side. Just use `dcae` with a channel count of 6 to cover both eyes. This is mostly intended to get you started with the codebase

Trainer:
- You should be able to copy `owl_vaes/trainers/rec.py` or `owl_vaes/trainers/video_rec.py` and just replace the logging function. It might also make sense to just add logging information to the config so that this can all be wrapped into existing trainers.
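
If the logging choice moves into the config, one way to wire it up is a small name-to-function registry that the trainer consults. This is only a sketch: the `log_fn` config key, the `LOG_FNS` registry and the stub functions are all hypothetical, and the real entries would be `to_wandb`, `to_wandb_vr` and so on.

```python
from typing import Any, Callable, Dict

LogFn = Callable[[Any, Any], Dict[str, Any]]


def _log_image_stub(original: Any, reconstruction: Any) -> Dict[str, Any]:
    # Stand-in for the existing image logger; returns a marker instead of logging.
    return {"kind": "image"}


def _log_vr_stub(original: Any, reconstruction: Any) -> Dict[str, Any]:
    # Stand-in for the new VR logger.
    return {"kind": "vr"}


# Hypothetical registry: config value -> logging function.
LOG_FNS: Dict[str, LogFn] = {
    "image": _log_image_stub,
    "vr": _log_vr_stub,
}


def get_log_fn(cfg: Dict[str, Any]) -> LogFn:
    """Pick the logging function named by the (hypothetical) `log_fn` config key."""
    name = cfg.get("log_fn", "image")  # default keeps existing configs working
    try:
        return LOG_FNS[name]
    except KeyError:
        raise ValueError(
            f"unknown log_fn {name!r}; expected one of {sorted(LOG_FNS)}"
        ) from None
```

A trainer would call `get_log_fn(cfg)` once at setup and use the returned function wherever it currently calls the hard-coded logger.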

Configs and Launching a Training Job:
- See the example config at `configs/waypoint_1/owl_vae_f16_c16.yml`
- You launch train runs with `python -m train --config_path path_to_config.yml` or `torchrun --nproc_per_node=8 -m train --config_path path_to_config.yml`
- Use the SkyPilot configs for multinode jobs, but keep in mind that an image VAE doesn't need more than one node

VR Stuff #51
