The issue of the input of caption #18

@liuxuannan

I have a question about where the caption goes, and in what format, in the input of the instruction data. Take the sentence from the paper, "A video of a Super-hero Movie." Is this sentence part of the text prompt, or does it need to be embedded by the ImageBind model before being fed into the LLM?
