This project demonstrates a hotel search system built with Superlinked and backed by Qdrant. It is a clone of the hotel-search-recipe that uses Qdrant as the vector database.
- Natural Language Queries: Search for hotels using everyday language.
- Multi-modal Semantic Search: Utilize different data types for comprehensive search results.
- Cheap but highly rated hotels in Paris, no children
- No pets, posh hotel in Berlin
- Popular hotels in center of London with free breakfast
- Text: Hotel descriptions.
- Numbers: Price, rating, and number of reviews.
- Location: City.
- Numbers: Price, ratings.
- Amenities: Options for property and room amenities; wellness and spa; accessibility; children.
{
  "id": "Lovely Hotel",
  "country": "Germany",
  "city": "Berlin",
  "accomodation_type": "Hotel",
  "price": 42,
  "image_src": "...",
  "description": "A family hotel close to city center ...",
  "rating_count": 6543,
  "rating": 8.9,
  "property_amenities": ["Free parking", "Breakfast"],
  "room_amenities": ["Air conditioning", "Balcony"],
  "wellness_spa": [],
  "accessibility": ["Wheelchair accessible"],
  "for_children": ["Childcare", "Cot"]
}
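A record in this shape can be loaded and sanity-checked with a few lines of Python. The sketch below is illustrative only: the `Hotel` dataclass and the `parse_hotel` helper are hypothetical names that are not part of the recipe, and the field list simply mirrors the example record. Note that strict JSON parsers such as `json.loads` reject a trailing comma after the last key.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Hotel:
    """Hypothetical container mirroring the example record; not part of the recipe."""

    id: str
    country: str
    city: str
    accomodation_type: str  # spelling follows the field name in the example record
    price: float
    image_src: str
    description: str
    rating_count: int
    rating: float
    property_amenities: list[str] = field(default_factory=list)
    room_amenities: list[str] = field(default_factory=list)
    wellness_spa: list[str] = field(default_factory=list)
    accessibility: list[str] = field(default_factory=list)
    for_children: list[str] = field(default_factory=list)


def parse_hotel(raw: str) -> Hotel:
    """Parse one JSON record; raises TypeError on missing or unexpected keys."""
    return Hotel(**json.loads(raw))


record = """
{
  "id": "Lovely Hotel",
  "country": "Germany",
  "city": "Berlin",
  "accomodation_type": "Hotel",
  "price": 42,
  "image_src": "...",
  "description": "A family hotel close to city center ...",
  "rating_count": 6543,
  "rating": 8.9,
  "property_amenities": ["Free parking", "Breakfast"],
  "room_amenities": ["Air conditioning", "Balcony"],
  "wellness_spa": [],
  "accessibility": ["Wheelchair accessible"],
  "for_children": ["Childcare", "Cot"]
}
"""

hotel = parse_hotel(record)
print(hotel.city, hotel.price, hotel.for_children)
```

Passing the parsed dictionary straight into the dataclass constructor is a deliberate choice here: a misspelled or missing field surfaces immediately instead of during ingestion.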
This section provides a step-by-step guide on how to run the whole system locally.
More details are provided below, in the Tutorial section.
Visit Qdrant Cloud to start working with your own VDB. Local and cloud VDBs are supported.
If you are interested in running the project on our DB, please contact us at.
Use superlinked_app/.env-example as a template to create superlinked_app/.env, then set OPENAI_API_KEY (required for the Natural Query Interface) and QDRANT_URL and QDRANT_API_KEY (required for the Qdrant vector database).
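Before starting the server, it can help to confirm that the three variables are actually set. The following is a minimal preflight sketch using only the standard library; `read_env_file` and `missing_keys` are hypothetical helpers, the parser handles only plain KEY=VALUE lines, and the server itself may load the file differently. The demo values are placeholders written to a temporary file.

```python
import tempfile
from pathlib import Path

# Variable names taken from the setup instructions above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Demo with placeholder values in a temporary file (not a real configuration).
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(
        "OPENAI_API_KEY=placeholder\n"
        "QDRANT_URL=http://localhost:6333\n"
        "# QDRANT_API_KEY not set yet\n"
    )
    demo_missing = missing_keys(read_env_file(env_path))

print(demo_missing)
```

To check the real file, call `missing_keys(read_env_file(Path("superlinked_app/.env")))` from the repository root.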
python3.11 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
APP_MODULE_PATH=superlinked_app python -m superlinked.server
On the very first run, downloading the sentence-transformers model will take some time, depending on your network.
API docs will be available at localhost:8080/docs.
To ingest the dataset, run this command in your terminal:
curl -X 'POST' \
'http://localhost:8080/data-loader/hotel/run' \
-H 'accept: application/json' \
-d ''
Please wait until the ingestion is finished. You will see a message when it completes.
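If you prefer to trigger ingestion from Python rather than curl, the same request can be built with the standard library. This is a sketch under the assumption that the server runs at the local address used above; `build_ingest_request` and `trigger_ingestion` are hypothetical helper names, and the timeout value is arbitrary.

```python
import urllib.request

BASE_URL = "http://localhost:8080"  # local server address used in the curl command


def build_ingest_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request that mirrors the curl command, without sending it."""
    return urllib.request.Request(
        url=f"{base_url}/data-loader/hotel/run",
        data=b"",  # empty body, like `-d ''`
        headers={"accept": "application/json"},
        method="POST",
    )


def trigger_ingestion(base_url: str = BASE_URL, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code (needs a running server)."""
    request = build_ingest_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Building the request is safe offline; only trigger_ingestion() touches the network.
request = build_ingest_request()
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it easy to inspect the URL and headers before anything reaches the server.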
cd frontend_app
python3.11 -m venv .venv-frontend
. .venv-frontend/bin/activate
pip install -e .
python -m streamlit run app/frontend/main.py
The Streamlit UI will be available at localhost:8501.
Attach to the VDB and experiment with different Superlinked queries in the Jupyter notebook superlinked-queries.ipynb.
The superlinked cli is a one-package solution for deploying a Superlinked cluster on your GCP cloud. With superlinked cli you can run your Superlinked application at scale with additional important components such as a batch engine, logging, and more, using the same Superlinked configuration you used in your local setup!
Want to try it now? Contact us at superlinked.com.
To configure your Superlinked application, you need to create a simple Python package with a few files. We will go through them one by one. All files contain the necessary inline comments, so check them out! Also, feel free to read our docs: docs.superlinked.com.
Once you are happy with your local Superlinked setup, you can use the config files without changes for your cloud deployment. To make the transition to the cloud smooth, we provide the Superlinked CLI. Contact us if you want to try it now!
The __init__.py file is needed only to make the directory a Python package; you can keep it empty.
The settings of our application are read from the .env file. You can create one simply by copying .env-example and setting openai_api_key, which is needed for NLQ.
This file defines three important things:
- object schema: declares names and types of raw attributes
- vector spaces: bind embedders to schema fields
- index: combines spaces for multi-modal vector search
In our Superlinked application, we will embed one textual field (hotel description) and three numeric fields (price, rating, rating_count).
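To build intuition for how one text field and three numeric fields can contribute to a single ranking, here is a conceptual sketch in plain Python. It is not Superlinked's implementation and does not use its API: the toy two-dimensional "text embeddings", the value ranges (price 0 to 1000, rating 0 to 10, up to 100,000 reviews), the weights, and the second hotel are all assumptions chosen for illustration.

```python
import math


def min_max_scale(value: float, low: float, high: float) -> float:
    """Map a number into [0, 1], clamping values outside the assumed range."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (value - low) / (high - low)))


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def hotel_vector(text_vec: list[float], price: float, rating: float, rating_count: int) -> list[float]:
    """Concatenate the text embedding with one scaled value per numeric field."""
    return l2_normalize(text_vec) + [
        1.0 - min_max_scale(price, 0.0, 1000.0),  # cheaper hotels score higher
        min_max_scale(rating, 0.0, 10.0),
        min_max_scale(math.log1p(rating_count), 0.0, math.log1p(100_000)),
    ]


def score(query_text_vec: list[float], vec: list[float], weights: dict[str, float]) -> float:
    """Weighted sum of text cosine similarity and the numeric parts."""
    dim = len(query_text_vec)
    query = l2_normalize(query_text_vec)
    text_sim = sum(a * b for a, b in zip(query, vec[:dim]))
    price_part, rating_part, count_part = vec[dim:]
    return (
        weights["description"] * text_sim
        + weights["price"] * price_part
        + weights["rating"] * rating_part
        + weights["rating_count"] * count_part
    )


# Toy data: two made-up hotels with two-dimensional stand-ins for text embeddings.
hotels = {
    "budget": hotel_vector([1.0, 0.0], price=50, rating=8.0, rating_count=2000),
    "luxury": hotel_vector([0.8, 0.6], price=400, rating=9.0, rating_count=5000),
}
weights = {"description": 1.0, "price": 1.0, "rating": 0.5, "rating_count": 0.0}
best = max(hotels, key=lambda name: score([1.0, 0.0], hotels[name], weights))
print(best)
```

Changing the weights changes the ranking without re-embedding anything, which is the practical appeal of keeping each field in its own part of the vector.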
Description is embedded using all-mpnet-base-v2.
If you need a faster model, you can try all-MiniLM-L6-v2.
Or if you are aiming for better retrieval quality, bigger models like gte-large-en-v1.5 are worth checking out.
Note. Apart from texts and numbers, out-of-the-box Superlinked can embed images, categories, recency. It also supports arbitrary embeddings via custom spaces. Learn more about Superlinked embeddings in our github!
Attributes like city, hotel-type, and amenities are used for hard-filtering.
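Hard filtering means a hotel is excluded outright when it fails a constraint, regardless of how well it scores on similarity. The sketch below shows those semantics in plain Python; in the recipe itself the filtering is expressed in the Superlinked queries, so `passes_filters` is a hypothetical helper and the two extra hotels are made-up records for illustration.

```python
from typing import Optional


def passes_filters(
    hotel: dict,
    city: Optional[str] = None,
    accomodation_type: Optional[str] = None,
    required_amenities: Optional[dict] = None,
) -> bool:
    """Return True only if the hotel satisfies every hard constraint that is given."""
    if city is not None and hotel.get("city") != city:
        return False
    if accomodation_type is not None and hotel.get("accomodation_type") != accomodation_type:
        return False
    # Each entry maps an amenity field name to the values that must all be present.
    for field_name, wanted in (required_amenities or {}).items():
        if not set(wanted) <= set(hotel.get(field_name, [])):
            return False
    return True


# "Lovely Hotel" follows the example record; the other two are invented for the demo.
hotels = [
    {"id": "Lovely Hotel", "city": "Berlin", "accomodation_type": "Hotel",
     "property_amenities": ["Free parking", "Breakfast"]},
    {"id": "Example Hostel", "city": "Berlin", "accomodation_type": "Hostel",
     "property_amenities": []},
    {"id": "Example Inn", "city": "Paris", "accomodation_type": "Hotel",
     "property_amenities": ["Breakfast"]},
]

matches = [
    h["id"]
    for h in hotels
    if passes_filters(h, city="Berlin", required_amenities={"property_amenities": ["Breakfast"]})
]
print(matches)
```

Only the hotels that survive this step would then be ranked by vector similarity.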
These two files define the Superlinked queries used for multi-modal semantic search, with a Natural Language Interface (NLI) on top. Our GitHub contains many helpful notebooks that show how to configure Superlinked queries.
This file sets the following components:
- vector database: in the current application we are using Qdrant.
- data loader: our data is ingested from a GCP bucket.
- REST API: our app will provide endpoints for ingestion (bulk and one-by-one) and for querying. More information is in our docs.
We publish our recipes as a starting point for your own projects. There are many things you might want to try:
- Experiment with superlinked queries. Try to come up with more queries focused on different search scenarios fitting your use-case.
- Bring your own dataset. Want to run Natural Language Query with your data? Define your schema, spaces, index, queries, and data-sources based on this recipe. In case of questions, don't hesitate to contact us!
- Try different VDBs. Depending on your needs you can choose one of the VDBs we currently support. More to come!
- Try other text embedding models. There are a ton of different text embedding models out there. Explore sentence-transformers and Hugging Face, and select the models that best suit your use case.
- Explore additional use-cases. Check out our notebooks and docs.