VR Stuff #51

@shahbuland

TODO:

Data:

  • See owl_vaes/data/video_dir_loader.py
  • Create an equivalent for VR data
    • I'm currently assuming that each VR data instance is a folder containing a recording and its controls
  • New loader should go in owl_vaes/data/vr_video_dir_loader.py
  • Some specs for new loader
    • It should have randomization to ensure different workers get different videos (existing loader takes rank and world_size)
    • It should have target_size as Tuple[int,int] to control the per-eye resolution from the video.
    • window_length as int to control the number of frames that are sampled (for a video autoencoder)
    • Note that the return shape is [b,c,h,w] in the window_length=1 case and [b,n,c,h,w] otherwise, with c=6. This assumes channel-wise concatenation of the two eye views, which I think is fine. Feel free to push back on this if you disagree.
    • There may be a need for other VR-specific kwargs for sanity/debugging purposes. If you feel these are needed, feel free to add them.
  • Once a dataloader is created, and a get_loader function is created, add it to owl_vaes/data/__init__.py
  • In owl_vaes/data/vr_video_dir_loader.py you can add a testing function to ensure loader works
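
The loader specs above can be sketched in plain Python. This is only an illustration of the rank-based sharding, window sampling, and return shapes described in the list. The helper names (shard_for_rank, sample_window, batch_shape) and the seed argument are hypothetical, and the way the existing loader actually uses rank and world_size may differ.

```python
import random
from typing import List, Tuple


def shard_for_rank(paths: List[str], rank: int, world_size: int, seed: int = 0) -> List[str]:
    """Shuffle identically on every rank, then keep every world_size-th path.

    Hypothetical helper: using the same seed everywhere makes the shards disjoint.
    """
    shuffled = sorted(paths)
    random.Random(seed).shuffle(shuffled)
    return shuffled[rank::world_size]


def sample_window(num_frames: int, window_length: int, rng: random.Random) -> Tuple[int, int]:
    """Pick a random [start, stop) range of window_length consecutive frames."""
    if num_frames < window_length:
        raise ValueError("video is shorter than window_length")
    start = rng.randint(0, num_frames - window_length)  # randint is inclusive
    return start, start + window_length


def batch_shape(batch_size: int, window_length: int,
                target_size: Tuple[int, int], channels: int = 6) -> Tuple[int, ...]:
    """Expected batch shape: [b,c,h,w] for single frames, [b,n,c,h,w] for windows."""
    h, w = target_size
    if window_length == 1:
        return (batch_size, channels, h, w)
    return (batch_size, window_length, channels, h, w)
```

A testing function in the new loader file could check the real batches against batch_shape for both window_length=1 and window_length>1.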

Logging:

  • See owl_vaes/utils/logging.py
  • Admittedly it's getting a bit cluttered, so it's best to put new logging code in owl_vaes/utils/vr_logging.py
  • You can see how these are used in example trainers (will list later)
  • to_wandb_vr should be a function similar to to_wandb, which takes two [b,6,h,w] images and puts them side by side (original and reconstruction)
  • to_wandb_vr_video should be a function similar to to_wandb_video_sidebyside
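
A rough sketch of the image layout for to_wandb_vr, using NumPy so it stays independent of the training code. It assumes channels 0-2 are the left eye and 3-5 the right eye (the issue does not fix an order), and the helper names are hypothetical. The wandb call itself and any value-range conversion are left out.

```python
import numpy as np


def eyes_side_by_side(x: np.ndarray) -> np.ndarray:
    """[b,6,h,w] -> [b,3,h,2w], with one eye next to the other."""
    if x.ndim != 4 or x.shape[1] != 6:
        raise ValueError(f"expected [b,6,h,w], got {x.shape}")
    # Assumed channel order: first three channels left eye, last three right eye.
    left, right = x[:, :3], x[:, 3:]
    return np.concatenate([left, right], axis=-1)


def vr_comparison_grid(original: np.ndarray, reconstruction: np.ndarray) -> np.ndarray:
    """Place original and reconstruction side by side: [b,3,h,4w]."""
    if original.shape != reconstruction.shape:
        raise ValueError("original and reconstruction must have the same shape")
    return np.concatenate(
        [eyes_side_by_side(original), eyes_side_by_side(reconstruction)], axis=-1
    )
```

The video variant could apply the same layout per frame on [b,n,6,h,w] inputs.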

Modelling:

  • For the time being nothing fancy is needed on the modelling side: just use dcae with a channel count of 6 to cover both eyes. This is mostly intended to get you started with the codebase

Trainer:

  • You should be able to copy owl_vaes/trainers/rec.py or owl_vaes/trainers/video_rec.py and just replace the logging function. It might also make sense to just add logging information to the config so that this can all be wrapped into existing trainers.

Configs and Launching a Training Job:

  • See example configs in configs/waypoint_1/owl_vae_f16_c16.yml
  • You launch train runs with python -m train --config_path path_to_config.yml or torchrun --nproc_per_node=8 -m train --config_path path_to_config.yml
  • Use skypilot configs for multinode jobs, but keep in mind that for an image VAE you don't need more than one node
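
For reference, torchrun exports RANK and WORLD_SIZE as environment variables for every process it starts, which is one way to obtain the rank and world_size values the loaders take. The helper below is a hypothetical sketch, not necessarily how the train module reads them. It falls back to a single-process setup when launched with plain python.

```python
import os
from typing import Mapping, Tuple


def get_rank_and_world_size(env: Mapping[str, str] = os.environ) -> Tuple[int, int]:
    """Read the process rank and world size set by torchrun.

    Defaults to (0, 1) when the variables are absent, e.g. under `python -m train`.
    """
    rank = int(env.get("RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    return rank, world_size
```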
