The Wiki Demo App is a FastAPI application that allows you to perform semantic search on a Wikipedia dataset using mixedbread ai's embedding and reranking models. Check out the blog post.
- Clone the repository:

      git clone https://github.com/mixedbread-ai/wiki_demo_app

- Navigate to the project directory (git names it after the repository):

      cd wiki_demo_app

- Install the required dependencies:

      pip install -r requirements.txt
- Set up the environment variables:
  - Create a copy of the .env.example file and rename it to .env.
  - Open the .env file and fill in the required values for the environment variables.
- Download the indexes:

      python download_index.py

- Start the FastAPI server:

      uvicorn demo.app:app --reload

- Open your web browser and go to http://localhost:8000/docs to access the API documentation and test the endpoints.
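
If the server fails to start, one common cause is a variable left empty in .env. The sketch below is not part of the repository: it assumes both .env.example and .env use plain KEY=VALUE lines, and the helper names `parse_env` and `missing_keys` are made up for illustration. It lists every key named in .env.example that is absent or empty in .env.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(example_text: str, env_text: str) -> list[str]:
    """Return keys from the example file that are absent or empty in the env file."""
    example = parse_env(example_text)
    env = parse_env(env_text)
    return [key for key in example if not env.get(key)]


if __name__ == "__main__":
    example_path, env_path = Path(".env.example"), Path(".env")
    if example_path.exists() and env_path.exists():
        missing = missing_keys(example_path.read_text(), env_path.read_text())
        print("Unset variables:", ", ".join(missing) or "none")
    else:
        print("Run this from the project directory after creating .env")
```

Run it from the project directory. It only reads the two files and never prints the values themselves.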
- POST /search/: Performs a semantic search on the Wikipedia dataset.

  Request body:
  - search_query (string): The search query.
  - search_algorithm (string, optional): The search algorithm to use. Can be "exact" or "approx". Default: "exact".
  - num_documents (integer, optional): The number of top results to return. Default: 100.
  - rescore_multiplier (integer, optional): The multiplier for rescoring. Default: 1.
  - reranking (boolean, optional): Whether to perform reranking of the results. Default: true.
  - rescoring (boolean, optional): Whether to perform rescoring of the results. Default: false.

  Response:
  - results (array): The search results, including the score, title, content, and source URL for each result.
  - embedding_time (float): The time taken for embedding the query.
  - search_time (float): The time taken for searching the index.
  - sort_time (float): The time taken for sorting the results.
  - load_time (float): The time taken for loading the embeddings (only applicable when rescoring is true).
  - reranking_time (float): The time taken for reranking the results (only applicable when reranking is true).
  - total_time (float): The total time taken for the search process.
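
The endpoint can also be called from a script instead of the /docs page. The following is a minimal client sketch using only the Python standard library. It assumes the request body is sent as JSON (the usual FastAPI convention, not confirmed here) and that the server runs at http://localhost:8000. The function names `build_search_payload`, `search`, and `summarize_timings` are illustrative and do not come from the repository.

```python
import json
from urllib import request

TIMING_FIELDS = (
    "embedding_time", "search_time", "sort_time",
    "load_time", "reranking_time", "total_time",
)


def build_search_payload(
    search_query: str,
    search_algorithm: str = "exact",
    num_documents: int = 100,
    rescore_multiplier: int = 1,
    reranking: bool = True,
    rescoring: bool = False,
) -> dict:
    """Build the request body using the documented fields and defaults."""
    if search_algorithm not in ("exact", "approx"):
        raise ValueError('search_algorithm must be "exact" or "approx"')
    return {
        "search_query": search_query,
        "search_algorithm": search_algorithm,
        "num_documents": num_documents,
        "rescore_multiplier": rescore_multiplier,
        "reranking": reranking,
        "rescoring": rescoring,
    }


def search(base_url: str, **params) -> dict:
    """POST the payload to /search/ (assumed to accept JSON) and decode the reply."""
    body = json.dumps(build_search_payload(**params)).encode("utf-8")
    req = request.Request(
        f"{base_url.rstrip('/')}/search/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


def summarize_timings(response: dict) -> dict:
    """Keep only the timing fields that are present and not null."""
    return {k: response[k] for k in TIMING_FIELDS if response.get(k) is not None}


# Usage (requires the server to be running):
# response = search("http://localhost:8000", search_query="history of bread",
#                   num_documents=10)
# print(summarize_timings(response))
```

Building the payload in its own function lets the defaults and the allowed search_algorithm values be checked without a running server.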

This project is licensed under the MIT License. See the LICENSE file for details.