etanlightstone-domino/etan_genai_qa

OpenAI Chat with your Docs

*Disclaimer - Domino Reference Projects are starter kits built by Domino researchers. They are not officially supported by Domino. Once loaded, they are yours to use or modify as you see fit. We hope they will be a beneficial tool on your journey!

This reference project shows how to use OpenAI's LLMs for Q&A over information that the models were not trained on and therefore cannot answer out of the box. The approach is to create embeddings of the document(s) you want to query, run a semantic search over those embeddings to retrieve relevant passages, and send those passages as context, together with the user's query, in a prompt to the LLM. The project has the following files:

  • OpenAI_QA_FAISS.ipynb : This notebook loads a PDF, converts it to embeddings, stores the embeddings locally in a FAISS index, runs a semantic search against the embeddings, constructs a prompt, and calls OpenAI's models to get a response. You will need your OpenAI API key set in the environment for this example.

  • faiss_ddl_doc_store.pkl : This file contains the FAISS embeddings of Domino's documentation and Zendesk articles.

  • faiss_etf_doc_store.pkl : This file contains the FAISS embeddings of Vanguard's Select Global Value Fund ETF report.

  • app.sh : The shell script needed to run the chat app.

  • app.py : Streamlit app code for the Q&A chatbot.

  • ETF_Docs/Select_Global_Value_Fund.pdf : A sample report you can use to try the flow described above if you want to compute embeddings on a fresh document.
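
The embed, search, and prompt steps described above can be illustrated with a small standard-library sketch. This is not the notebook's code: the word-count `embed` function is a toy stand-in for OpenAI embeddings, the linear scan in `top_k` stands in for a FAISS index lookup, and the sample chunks, function names, and prompt wording are all invented for illustration. The prompt it builds is what would then be sent to the LLM.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query (linear scan in place of FAISS)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Combine retrieved chunks and the user's question into a single prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


# Placeholder chunks, invented for this sketch (not taken from any real document).
chunks = [
    "Workspaces can be started from the project menu.",
    "Embeddings are stored in a FAISS index on disk.",
    "The chat app is launched with a shell script.",
]
question = "Where are embeddings stored?"
prompt = build_prompt(question, top_k(question, chunks, k=1))
print(prompt)
```

In the real project, the toy pieces would be replaced by an embedding model and a FAISS index, and the prompt would be passed to an OpenAI model rather than printed.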

Setup instructions

This project requires the following compute environments to be present:

Rev4_Chapter_1

Environment Base

Domino Standard Environment Py3.9 R4.2

Dockerfile Instructions

USER root

RUN sudo apt-get update
RUN sudo apt -y install tesseract-ocr
RUN sudo apt-get -y install libpoppler-dev
RUN sudo apt-get -y install poppler-utils
RUN sudo apt -y install libmagic-dev

RUN pip install pinecone-client
RUN pip install langchain==0.0.144
RUN pip install unstructured[local-inference]
RUN pip install poppler-utils
RUN pip install openai
RUN pip install faiss-cpu

RUN pip install streamlit && \
    pip install streamlit-chat && \
    pip install tiktoken 


RUN pip install "detectron2@git+https://github.com/facebookresearch/[email protected]#egg=detectron2"


USER ubuntu
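
Once the environment has been built, a quick check like the following can help confirm that the packages installed above are importable before opening the notebook or starting the app. It is a hedged sketch, not part of the project: the module names are the usual import names for the pip packages listed above (for example `faiss` for `faiss-cpu`, `pinecone` for `pinecone-client`, and `streamlit_chat` for `streamlit-chat`), and `OPENAI_API_KEY` is assumed as the variable name because the README does not name it; that is the variable the `openai` Python package reads by default.

```python
import importlib.util
import os

# Import names assumed to correspond to the pip packages in the Dockerfile instructions.
REQUIRED_MODULES = [
    "langchain",
    "openai",
    "faiss",
    "streamlit",
    "streamlit_chat",
    "tiktoken",
    "unstructured",
    "pinecone",
    "detectron2",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def environment_report(names=REQUIRED_MODULES, key_var="OPENAI_API_KEY"):
    """Summarize missing modules and whether the API key variable is set (name is an assumption)."""
    return {
        "missing_modules": missing_modules(names),
        "api_key_set": bool(os.environ.get(key_var)),
    }


print(environment_report())
```

An empty `missing_modules` list and `api_key_set` equal to `True` suggest the environment is ready; anything else points at the package or variable to fix.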
