[OPENAI] Support image edits with gpt-image-1 #152

sbounmy · 2025-05-05T05:33:37Z

Still a draft but I use it in production in my app https://github.com/sbounmy/hongbao_bitcoin

Usage

avatar = User.last.avatar
image = RubyLLM.edit(
  "Transform into ghibli style",
  model: "gpt-image-1",
  with: { image: [ActiveStorage::Blob.service.path_for(avatar.key)] }, # accepts a path or remote URL
  options: { size: '1024x1024', quality: 'medium' }
)

image.to_blob # image data to store
image.usage # {'input_tokens' => 362, 'input_tokens_details' => { 'image_tokens' => 323, 'text_tokens' => 39 }, 'output_tokens' => 4160, 'total_tokens' => 4522 }
image.total_cost # 0.17002
image.input_cost # 0.00362
image.output_cost # 0.1664
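
The cost figures in the usage example are consistent with flat per-token rates. Here is a minimal sketch of that arithmetic, assuming 10 USD per million input tokens and 40 USD per million output tokens. These rates are inferred from the numbers above, not read from the PR's code or from models.json, and `image_costs` is a hypothetical helper that is not part of RubyLLM:

```ruby
# Hypothetical helper, not part of RubyLLM: reproduces the cost figures above.
# Rates are ASSUMED (USD per million tokens), inferred from the example numbers.
INPUT_RATE_PER_MILLION = 10
OUTPUT_RATE_PER_MILLION = 40

def image_costs(usage)
  # tokens * (USD per million tokens) gives millionths of a dollar.
  # Staying in integers until the final division avoids float drift.
  input_micro = usage.fetch('input_tokens') * INPUT_RATE_PER_MILLION
  output_micro = usage.fetch('output_tokens') * OUTPUT_RATE_PER_MILLION
  {
    input_cost: input_micro / 1_000_000.0,
    output_cost: output_micro / 1_000_000.0,
    total_cost: (input_micro + output_micro) / 1_000_000.0
  }
end

usage = { 'input_tokens' => 362, 'output_tokens' => 4160, 'total_tokens' => 4522 }
costs = image_costs(usage)
puts costs
```

With the usage hash from the example, the three values match `input_cost`, `output_cost`, and `total_cost` as reported. Note that under this assumption text and image input tokens are billed at the same rate.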

Todo:

fix the multipart connection code vs JSON
allow customizing the image output https://platform.openai.com/docs/guides/image-generation?image-generation-model=gpt-image-1#customize-image-output
remove some deprecated code (errors)
check whether models.json is correct (the rake task didn't import gpt-image-1 when I ran it, so I added it manually)
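
For the first Todo item, the difference between the two request shapes can be sketched with Ruby's standard library alone. This is illustrative only: the PR goes through RubyLLM's own connection layer, which is not shown here. The helper names `build_generation_request` and `build_edit_request` are hypothetical, the `image/png` content type is an assumption, and the requests are built but never sent (no `Authorization` header is set).

```ruby
require 'net/http'
require 'json'
require 'uri'
require 'stringio'

# Hypothetical helpers for illustration; not RubyLLM code.

# Image generation takes a JSON body.
def build_generation_request(prompt, model: 'gpt-image-1')
  req = Net::HTTP::Post.new(URI('https://api.openai.com/v1/images/generations'))
  req['Content-Type'] = 'application/json'
  req.body = JSON.generate(model: model, prompt: prompt)
  req
end

# Image edits upload the source image, so the body is multipart/form-data.
def build_edit_request(prompt, image_io, filename:, model: 'gpt-image-1')
  req = Net::HTTP::Post.new(URI('https://api.openai.com/v1/images/edits'))
  form = [
    ['model', model],
    ['prompt', prompt],
    # content_type is ASSUMED to be PNG for this sketch.
    ['image', image_io, { filename: filename, content_type: 'image/png' }]
  ]
  # Net::HTTP encodes the parts and adds the boundary when the request is sent.
  req.set_form(form, 'multipart/form-data')
  req
end

generation = build_generation_request('a sunset over mountains')
edit = build_edit_request('Transform into ghibli style', StringIO.new('fake image bytes'), filename: 'avatar.png')
puts generation.content_type
puts edit.content_type
```

The two builders differ only in endpoint and body encoding, which is why a single connection that always sends JSON cannot serve the edit call.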


sbounmy · 2025-05-08T04:57:49Z

issue #138

…ith-image * 'main' of github.com:crmne/ruby_llm: (24 commits) Enhance Rails guide with detailed persistence flow explanation and setup instructions Remove work-in-progress warning from models documentation generation Add validation considerations for Message model and update persistence flow documentation Add note about upcoming OpenAI headers support in v1.3.0 Handle OpenAI organization and project IDs (crmne#162) Refactor acts_as_message and acts_as_tool_call methods to improve parameter handling and default values Remove reasoning section from available models documentation and rake task Remove debug logging for pricing in OpenRouter models Updated models page Fixed pricing parsing for OpenRouter Updated models Add warning about work in progress for Parsera integration in available models documentation Major refactoring of ModelInfo and Parsera API support for listing LLM capabilities and pricing. Fix inflector (crmne#159) Use foreign_key instead of to_s for acts_as methods (crmne#157) Fixes #embed fails when using default embedding model Add support for logging to file via configuration (crmne#148) Updated acts_as_* helpers to use canonical 'rails-style' foreign keys (crmne#151) refactor(media): streamline content formatting methods across providers Fixed Calling `chat.to_llm` keeps appending messages to the message array ...

sbounmy · 2025-05-12T09:35:52Z

would be great to have your feedback @crmne

I am struggling a bit with the capabilities for gpt-image-1 and the models.json generation.


sbounmy · 2025-06-05T04:02:34Z

@crmne it would be great not to ghost PRs. At least point out what could be done better if you don't agree.

crmne · 2025-06-05T08:35:16Z

Thanks for the work on this. Image editing is definitely in scope, but I'd prefer extending the existing paint method rather than adding a separate edit method:

# Generate from scratch (current behavior)
RubyLLM.paint("a sunset over mountains")

# Edit existing image (new behavior)  
RubyLLM.paint("make it more vibrant", with: "path/to/image.png")

This keeps the API consistent with how chat.ask handles attachments.

On "ghosting": I respond when I can. This is unpaid work I do between running my business and other priorities. Characterizing my delayed responses as "ghosting" is inappropriate and creates a toxic environment.

I'll review this properly when I have time.

sbounmy · 2025-06-05T09:18:56Z

@crmne that's the feeling I had. My bad if you were hurt by "ghosting".

As you know, we also contribute for free. When a PR doesn't get merged, we keep hitting conflicts as we try to catch up with the main branch, which is frustrating.

Regarding the PR: I wanted to use #paint, but the edit API call involved different changes (multipart/form-data etc.), so I thought it might be cleaner to have a separate method. #138

Thanks for maintaining this gem

crmne · 2025-06-05T09:43:55Z

I appreciate the apology. The merge conflict frustration is understandable.

I still prefer extending paint rather than adding edit. The technical complexity (multipart vs JSON) should be hidden from users - that's an implementation detail. Having both methods for what's essentially the same operation (image generation) is confusing.

The API should be:

RubyLLM.paint("prompt") # generate
RubyLLM.paint("prompt", with: "path") # edit

This matches how chat.ask handles attachments and keeps the interface clean.
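
One way to read this proposal is that `paint` picks the endpoint and body encoding from the presence of `with:`, so callers never see the multipart vs JSON split. A minimal sketch under that reading follows. `ImageRequest` and `plan_paint` are made-up names for illustration, not RubyLLM's implementation, and the sketch only plans the request rather than sending it.

```ruby
# Hypothetical sketch of the dispatch; not RubyLLM's implementation.
ImageRequest = Struct.new(:endpoint, :encoding, :prompt, :images, keyword_init: true)

def plan_paint(prompt, with: nil)
  # Array(nil) => [], Array("path") => ["path"], Array(["a", "b"]) => ["a", "b"]
  images = Array(with)
  if images.empty?
    # No attachment: plain generation, JSON body.
    ImageRequest.new(endpoint: 'images/generations', encoding: :json, prompt: prompt, images: [])
  else
    # Attachment present: edit call, multipart/form-data body.
    ImageRequest.new(endpoint: 'images/edits', encoding: :multipart, prompt: prompt, images: images)
  end
end

generate = plan_paint('a sunset over mountains')
edit = plan_paint('make it more vibrant', with: 'path/to/image.png')
puts generate.endpoint
puts edit.endpoint
```

Accepting either a single path or an array through `Array(with)` would also cover the multiple-image case that the PR's commits mention.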

sbounmy added 6 commits April 27, 2025 15:19

merged main but still need to handle multipart connection for image e…

d917ed7

…dits to work

support multiple images

f002e00

added image attachments

4d113ae

fixed specs

fd85df5

store tokens on image

f5c0c81

pass model so we can compute image#cost

bdb16cb

sbounmy marked this pull request as draft May 5, 2025 05:33

This was referenced May 5, 2025

[OPENAI] Support Image edits gpt-image-1 #138

Open

Draft PR ruby_llm sbounmy/hongbao_bitcoin#241

Open

able to specify options in RubyLLM#edit

e765cad

sbounmy added 4 commits May 12, 2025 10:56

update capabilities

7a55a30

removed error

5b93e9d

fix conneciton multipart

a081b90

sbounmy marked this pull request as ready for review May 12, 2025 09:36

sbounmy added 2 commits May 12, 2025 11:53

fix duplicate models json

087a149

set headers content type

1a33ce6

tpaulshippy mentioned this pull request May 22, 2025

Support gpt-image-1 #201

Merged

crmne pushed a commit that referenced this pull request May 22, 2025

Support gpt-image-1 (#201)

213e601

Resolves #200 Seems pretty simple to me. Borrowed the model passing idea from #152
