PgVector embedder parameter is not accepting anything other than its default OpenAI Embeddings. Help needed! #1746

Cipher-unhsiV · 2025-01-10T05:58:54Z

@manthanguptaa, as per your instructions in issue #1736, I tried several ways of using an open-source embedder, but nothing worked. I have tried the following embedders from the phidata docs:

MistralAI
Together
Huggingface
SentenceTransformers

I ran into multiple errors, including an SQLAlchemy dimensionality mismatch, an httpx ReadTimeout, a pydantic_core validation error, and an incompatible NumPy version, to mention just a few. This is a simple agentic RAG setup that should read a PDF from a URL via PDFUrlKnowledgeBase, store it in PgVector2, and answer a predefined user query by accessing the knowledge_base, but it is getting really hectic and involved. Please help me with this! Here is the snippet to better illustrate the scenario:

import typer
from phi.agent import Agent, RunResponse
from typing import Optional,List
from phi.assistant import Assistant
from phi.model.deepseek import DeepSeekChat
from phi.model.groq import Groq
from phi.storage.assistant.postgres import PgAssistantStorage
from phi.knowledge.pdf import PDFUrlKnowledgeBase
from phi.vectordb.pgvector import PgVector2
from phi.embedder.mistral import MistralEmbedder
from phi.embedder.huggingface import HuggingfaceCustomEmbedder
from phi.embedder.together import TogetherEmbedder
from phi.embedder.sentence_transformer import SentenceTransformerEmbedder

import os
from dotenv import load_dotenv
load_dotenv()
os.environ["GROQ_API_KEY"]=os.getenv("GROQ_API_KEY")

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

knowledge_base=PDFUrlKnowledgeBase(
    urls=['https://phi-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf'],
    vector_db=PgVector2(
        collection="recipies",
        db_url=db_url, 
        embedder=SentenceTransformerEmbedder(dimensions=1536),  # issue here
        )
)

knowledge_base.load(recreate=True, upsert=True)
#knowledge_base.load()

storage=PgAssistantStorage(table_name="pdf-assistant",db_url=db_url)

agent = Agent(
    model=Groq(id="llama-3.3-70b-versatile"),
    #model = SentenceTransformer('all-mpnet-base-v2', truncate_dim=384),
    knowledge=knowledge_base,
    storage=storage,
)

response: RunResponse = agent.run("What is the recipe for chicken curry?")
res = response.content
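
One possible cause of the dimensionality error is that the `dimensions=1536` passed above does not match the vector length the chosen model actually returns (for example, the SentenceTransformers model all-MiniLM-L6-v2 produces 384-dimensional vectors and all-mpnet-base-v2 produces 768-dimensional ones). The following is a minimal sketch of a sanity check to run before loading the knowledge base. The helper `check_embedding_dimensions` and the stand-in `fake_embed` are hypothetical names used only for illustration and are not part of phidata; the sketch assumes the real embedder can be wrapped as a function that maps a string to a list of floats.

```python
from typing import Callable, List


def check_embedding_dimensions(
    embed: Callable[[str], List[float]],
    declared_dimensions: int,
    probe_text: str = "dimension probe",
) -> int:
    """Embed a probe string and compare the vector length with the declared size."""
    actual = len(embed(probe_text))
    if actual != declared_dimensions:
        raise ValueError(
            f"Embedder returns {actual}-dimensional vectors, but the "
            f"vector column is declared with {declared_dimensions} dimensions."
        )
    return actual


# Hypothetical stand-in for a real embedder: always returns 384 floats,
# mimicking the output size of a small SentenceTransformers model.
def fake_embed(text: str) -> List[float]:
    return [0.0] * 384


# Matching sizes pass silently and return the measured dimension.
check_embedding_dimensions(fake_embed, 384)

# A mismatch (as with dimensions=1536 above) is reported before any
# rows are written to the database.
try:
    check_embedding_dimensions(fake_embed, 1536)
except ValueError as err:
    print(err)
```

If the check fails, the declared dimension and the model's output size need to agree, and a table created earlier with a different vector size would likely need to be recreated.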

@manthanguptaa Kindly don't close this issue until I have confirmed whether the fix works on my local setup.


dirkbrnd · 2025-01-10T07:41:21Z

Hi @Cipher-unhsiV
I suggest using PgAgentStorage instead of PgAssistantStorage, which is deprecated. Also use PgVector instead of PgVector2, which is deprecated as well.

I'll let @manthanguptaa test after that, if that's OK.
