Geobuddy Gemini image agent does not work via API #1930

Closed
isikepalaku opened this issue Jan 29, 2025 · 3 comments

isikepalaku commented Jan 29, 2025

Hello team, I'm facing a problem. Starting from the phidata Geobuddy agent (the Streamlit example), I tried running it via the phidata agent-api example (FastAPI). My code:

from typing import Optional

from phi.agent import Agent
from phi.model.google import Gemini
from phi.tools.duckduckgo import DuckDuckGo
from phi.knowledge.agent import AgentKnowledge
from phi.storage.agent.postgres import PgAgentStorage
from phi.vectordb.pgvector import PgVector, SearchType

from agents.settings import agent_settings
from db.session import db_url

geo_agent_storage = PgAgentStorage(table_name="geo_agent_sessions", db_url=db_url)
geo_agent_knowledge = AgentKnowledge(
    vector_db=PgVector(table_name="geo_agent_knowledge", db_url=db_url, search_type=SearchType.hybrid)
)

def get_geo_agent(
    model_id: Optional[str] = None,
    user_id: Optional[str] = None,
    session_id: Optional[str] = None,
    debug_mode: bool = False,
) -> Agent:

    return Agent(
        name="Geo Image Agent",
        agent_id="geo-image-agent",
        session_id=session_id,
        user_id=user_id,
        model=Gemini(id="gemini-2.0-flash-exp"),
        tools=[DuckDuckGo()],
        description="You are an AI agent specialized in analyzing images and providing geographical and historical context.",
        instructions=[
            "Analyze images thoroughly to identify landmarks, architectural features, and geographical locations.",
            "Provide historical context and background information about identified locations.",
            "Use web search to find recent news or updates about the locations shown in images.",
            "Cite reliable sources when providing information.",
            "If an image is not provided, inform the user that an image is required for analysis.",
        ],
        markdown=True,
        show_tool_calls=True,
        add_datetime_to_instructions=True,
        storage=geo_agent_storage,
        read_chat_history=True,
        knowledge=geo_agent_knowledge,
        search_knowledge=True,
        monitoring=True,
        debug_mode=debug_mode,
    )

After several attempts to send an image or file through the API template with Swagger UI, the error is still the same.

If I send an image file, I get:

2025-01-29 19:32:10 DEBUG    AgentRunRequest: Please analyze the geographic location and details    
2025-01-29 19:32:10          shown in this image. geo-expert-agent False False string string        
2025-01-29 19:32:10          [UploadFile(filename='unnamed.jpg', size=46993,                        
2025-01-29 19:32:10          headers=Headers({'content-disposition': 'form-data; name="files";      
2025-01-29 19:32:10          filename="unnamed.jpg"', 'content-type': 'image/jpeg'}))]              
2025-01-29 19:32:10 INFO:     172.18.0.1:34748 - "POST /v1/playground/agent/run HTTP/1.1" 404 Not Found
2025-01-29 19:32:11 DEBUG    AgentRunRequest: Please analyze the geographic location and details    
2025-01-29 19:32:11          shown in this image. geo-expert-agent False False string string        
2025-01-29 19:32:11          [UploadFile(filename='unnamed.jpg', size=46993,                        
2025-01-29 19:32:11          headers=Headers({'content-disposition': 'form-data; name="files";      
2025-01-29 19:32:11          filename="unnamed.jpg"', 'content-type': 'image/jpeg'}))]              
2025-01-29 19:32:11 INFO:     **172.18.0.1:34756 - "POST /v1/playground/agent/run HTTP/1.1" 404 Not Found**

When I send it in image form, I get:

2025-01-29 17:52:14          X/OXOH5P5w/Q/gw7x4zv85yf8+M4Z2Zzc4f58WczDrOGGcHxjhyZwfWf9Dwx5P8AnBn8L+T
2025-01-29 17:52:14          P3H85+w4fwz+Zn8x/OfqH8M/b/o/wD+N/Wcvi/wATtnnHnHj6zxjzn/Z'              
2025-01-29 17:52:14 WARNING  Unknown image type: <class 'dict'>                                     
2025-01-29 17:52:16 DEBUG    ============== model ==============                                    
2025-01-29 17:52:16 DEBUG    I'm ready to analyze images. Please provide me with an image.          
2025-01-29 17:52:16                                                                                 
2025-01-29 17:52:16 DEBUG    **************** METRICS START ****************                        
2025-01-29 17:52:16 DEBUG    * Time to first token:         1.7758s                                 
2025-01-29 17:52:16 DEBUG    * Time to generate response:   1.7769s                                 
2025-01-29 17:52:16 DEBUG    * Tokens per second:           9.0045 tokens/s                         
2025-01-29 17:52:16 DEBUG    * Input tokens:                339                                     
2025-01-29 17:52:16 DEBUG    * Output tokens:               16                                      
2025-01-29 17:52:16 DEBUG    * Total tokens:                355                                     
2025-01-29 17:52:16 DEBUG    **************** METRICS END ******************                        
2025-01-29 17:52:16 DEBUG    ---------- Gemini Response End ----------

Does this not work via the API yet?
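
One thing the first log suggests, offered as a guess rather than a confirmed diagnosis: the request carries the agent id `geo-expert-agent`, while the code above sets `agent_id="geo-image-agent"`. If the playground route looks agents up by id and answers 404 when none matches (an assumption about its behaviour, not something the thread confirms), the mismatch alone could explain the `404 Not Found`. A minimal sketch of such a lookup, using a hypothetical helper `find_agent_id` that is not part of phidata:

```python
from typing import Iterable, Optional


def find_agent_id(requested: str, registered: Iterable[str]) -> Optional[str]:
    """Return the registered id matching the request, or None if absent.

    Hypothetical helper: it only mimics an id lookup; it is not phidata code.
    """
    for agent_id in registered:
        if agent_id == requested:
            return agent_id
    return None


registered_ids = ["geo-image-agent"]  # agent_id taken from the code above

# The id seen in the request log is not registered, so the lookup fails.
print(find_agent_id("geo-expert-agent", registered_ids))
# The id actually configured on the agent is found.
print(find_agent_id("geo-image-agent", registered_ids))
```

If this is the cause, sending `geo-image-agent` as the agent id in the Swagger UI form would be the first thing to try.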

@dirkbrnd
Contributor

Hi @isikepalaku
It looks like the image format is incorrect. We are doing a big release today that unifies the interface for uploading images. I recommend upgrading to it once it is out. Also look at the cookbooks for image input with Gemini; that should resolve the problem.
If not, let me know and we can dig in!
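
To illustrate the format problem: the warning `Unknown image type: <class 'dict'>` in the second log suggests the image reached the model wrapped in a dict rather than in a form it recognizes. A minimal sketch, assuming the payload is raw bytes, a base64 string, or a dict holding one of those, is to normalize it to raw bytes before handing it to the agent. The helper `to_image_bytes` and the dict keys `content` and `data` are hypothetical names chosen for illustration, not part of phidata or Agno.

```python
import base64
from typing import Union


def to_image_bytes(payload: Union[bytes, str, dict]) -> bytes:
    """Normalize an image payload to raw bytes.

    Hypothetical helper. Accepts raw bytes, a base64 string (optionally a
    data URL), or a dict holding either under a "content" or "data" key
    (these key names are assumptions for illustration).
    """
    if isinstance(payload, dict):
        for key in ("content", "data"):
            if key in payload:
                return to_image_bytes(payload[key])
        raise ValueError("dict payload has no 'content' or 'data' key")
    if isinstance(payload, (bytes, bytearray)):
        return bytes(payload)
    if isinstance(payload, str):
        # Strip a "data:image/jpeg;base64," style prefix if present.
        if payload.startswith("data:") and "," in payload:
            payload = payload.split(",", 1)[1]
        return base64.b64decode(payload, validate=True)
    raise TypeError(f"unsupported image payload type: {type(payload)!r}")


# Usage: a dict carrying base64 text is reduced to the original bytes.
encoded = base64.b64encode(b"\xff\xd8\xff").decode()
raw = to_image_bytes({"data": encoded})
print(raw == b"\xff\xd8\xff")
```

The resulting bytes can then be passed through whatever image input the new release exposes; the cookbooks mentioned above are the place to check the exact call.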

@isikepalaku
Author

Will phidata change into Agno? What about our old projects?

@ysolanky
Contributor

Hello @isikepalaku! Yes, we have rebranded to Agno starting today. There have been a number of improvements to the multimodal interface. As Dirk suggested, please follow the new format for passing images to a model. Check out the migration guide. Do let us know if you have any questions.
