The implementation of “MR2V: Customized Storytelling” #20

@lianqi1008

Hi, thanks for releasing StoryMem.
I am trying to reproduce the MR2V customized storytelling examples shown on the project page, such as the clumsy man / young lovers / cafe examples. From the paper/project page, it seems that MR2V uses one or more reference images as memory and then generates a longer story video while preserving the referenced subject identity. I would like to understand how the long project-page videos were generated.

Could you please clarify the intended reproduction workflow?

  1. Were the long MR2V project-page videos generated shot-by-shot using multiple text prompts, with memory updated after each generated shot? Or were they generated from a single long story-level prompt?
  2. If they were generated shot-by-shot, what is the expected input format for the story script? For example, do you use a JSON file containing story_overview, video_prompts, first_frame_prompt, and scene-cut indicators?
  3. How are the reference images provided as the initial memory for MR2V? Are they saved as initial keyframes before running the pipeline?
  4. Are the exact prompts used for the project-page MR2V demos available, or could you share an example configuration?
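To make questions 1 to 3 concrete, here is a minimal sketch of the workflow I am guessing at. It is not taken from the StoryMem code. Only the field names `story_overview`, `video_prompts`, and `first_frame_prompt` come from my question above. The `scenes` nesting, the `cut` flags, the placeholder prompt text, and the `generate_shot` callable are all my own assumptions, standing in for whatever the real pipeline expects.

```python
import json

# Hypothetical story script. The "scenes" nesting and the "cut" flags are
# guesses; the prompt strings are placeholders, not the project-page prompts.
story_script = {
    "story_overview": "placeholder: one-paragraph summary of the story",
    "scenes": [
        {
            "first_frame_prompt": "placeholder: opening frame of scene 1",
            "video_prompts": [
                "placeholder: shot 1 description",
                "placeholder: shot 2 description",
            ],
            # One flag per shot: True if the shot starts with a scene cut.
            "cut": [True, False],
        }
    ],
}


def iter_shots(script):
    """Yield (shot_index, prompt, is_scene_cut) in story order."""
    idx = 0
    for scene in script["scenes"]:
        for prompt, cut in zip(scene["video_prompts"], scene["cut"]):
            yield idx, prompt, cut
            idx += 1


def run_story(script, reference_images, generate_shot):
    """Generate a story shot by shot, growing the memory after each shot.

    `generate_shot(prompt, memory, cut)` is a hypothetical callable that
    returns (clip, new_keyframes); it stands in for the real model call.
    """
    # Question 3: reference images seed the memory before the first shot.
    memory = list(reference_images)
    clips = []
    for _, prompt, cut in iter_shots(script):
        clip, keyframes = generate_shot(prompt, memory, cut)
        clips.append(clip)
        # Question 1: memory is updated after every generated shot.
        memory.extend(keyframes)
    return clips, memory


if __name__ == "__main__":
    # Print the script in the JSON form I would expect to pass to the pipeline.
    print(json.dumps(story_script, indent=2))
```

If the actual pipeline works differently (for example, a single story-level prompt, or reference images passed through a dedicated argument rather than saved as keyframes), that is exactly what I would like to learn.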

I am currently able to generate a short single M2V clip, but I am not sure how to reproduce the longer MR2V story videos shown on the project page.

Any guidance or example command would be very helpful. Thanks!
