Image Input (Multi-modal models) #134

Open
@Legerdo

Describe the feature

Hello.
llamafile appears to support image input in formats such as JPG, PNG, GIF, and BMP.

Example:
llamafile -ngl 9999 --temp 0 \
    --image ~/Pictures/lemurs.jpg \
    -m llava-v1.5-7b-Q4_K.gguf \
    --mmproj llava-v1.5-7b-mmproj-Q4_0.gguf \
    -e -p '### User: What do you see?\n### Assistant: ' \
    --no-display-prompt 2>/dev/null

Is it possible to implement this feature in the future?
Or is there some problem that makes it impossible?
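
For illustration, here is a minimal sketch of how a wrapper might drive the invocation above from Python. It is only an assumption about how such a feature could be wired up, not a description of any existing code: the helper names `build_llamafile_image_command` and `describe_image` are hypothetical, and the only flags used are the ones from the example command.

```python
import os
import subprocess


def build_llamafile_image_command(image, model, mmproj, question, binary="llamafile"):
    """Assemble the argv list for the image-prompt invocation quoted above.

    Hypothetical helper: only the flags are taken from the example command.
    """
    # The prompt keeps a literal backslash + "n", exactly as the single-quoted
    # shell string does; the -e flag is then expected to turn it into a newline.
    prompt = f"### User: {question}\\n### Assistant: "
    return [
        binary,
        "-ngl", "9999",
        "--temp", "0",
        # The shell expands "~" in the original example; here it is done by hand.
        "--image", os.path.expanduser(str(image)),
        "-m", str(model),
        "--mmproj", str(mmproj),
        "-e", "-p", prompt,
        "--no-display-prompt",
    ]


def describe_image(image, model, mmproj, question="What do you see?"):
    """Run llamafile on one image and return whatever it prints to stdout."""
    cmd = build_llamafile_image_command(image, model, mmproj, question)
    # stderr is discarded, mirroring the 2>/dev/null in the shell example.
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with the prompt and with image paths that contain spaces. Calling `describe_image("~/Pictures/lemurs.jpg", "llava-v1.5-7b-Q4_K.gguf", "llava-v1.5-7b-mmproj-Q4_0.gguf")` would reproduce the example, assuming `llamafile` is on the `PATH` and both model files are in the working directory.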
