Semantic Search Engine

Semantic Search Engine is a PDF-based search tool: ask a natural-language question and it returns the most semantically relevant passages from your documents.
It uses Sentence Transformers for embeddings and FAISS for vector similarity search.

This project deliberately does not use LangChain. LangChain has a lot of abstractions that would have made this a lot simpler and more efficient, but I didn't want it to be simple, so I went the traditional way.

✨ Features

PDF Reader – Upload and parse text from PDF files.
Text Chunking – Splits large documents into smaller sentence chunks for better embedding and retrieval.
Embeddings – Uses all-MiniLM-L6-v2 from Sentence Transformers to generate semantic embeddings.
Vector Search – Leverages FAISS to store and query embeddings efficiently.
Question Answering – Ask a question like "What is the relevance of Blockchain?" and retrieve the most relevant chunks.
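
To make the Vector Search feature concrete, here is a minimal pure-Python sketch of exact nearest-neighbour search by squared L2 distance, which is the kind of lookup a flat FAISS index performs. The toy 2-dimensional vectors and the helper names `squared_l2` and `nearest` are hypothetical and only for illustration; real embeddings from all-MiniLM-L6-v2 have 384 dimensions.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(query, vectors, k=2):
    """Return the k closest vectors as (distance, index) pairs, closest first."""
    scored = sorted((squared_l2(query, v), i) for i, v in enumerate(vectors))
    return scored[:k]


# Hypothetical toy "embeddings"; index 1 is the closest to the query.
vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
for distance, index in nearest([0.9, 0.1], vectors):
    print(index, distance)
```

A smaller distance means a closer match, which is why results are listed in ascending order of distance.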

⚡ Example

python main.py

Sample output:

Total chunks: 42
2
[[0.23 0.45]] [[12 33]]
['Blockchain enables secure and transparent transactions ...'] distance: 0.23
['Distributed ledgers provide ...'] distance: 0.45
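
One way to read this output, assuming the line `[[0.23 0.45]] [[12 33]]` is the pair of distance and index arrays returned by a FAISS search for a single query: each index points at a chunk, and the distance at the same position says how close that chunk is to the question. The sketch below uses plain nested lists in place of NumPy arrays, and the placeholder chunk texts and the helper name `pair_results` are hypothetical.

```python
def pair_results(distances, indices, chunks):
    """Pair each returned index with its chunk text and distance (first query only)."""
    results = []
    for dist, idx in zip(distances[0], indices[0]):
        if idx == -1:  # FAISS uses -1 when fewer than k neighbours exist
            continue
        results.append((chunks[idx], dist))
    return results


# Placeholder data shaped like the sample output: 42 chunks, 2 hits.
chunks = [f"chunk {i}" for i in range(42)]
distances = [[0.23, 0.45]]
indices = [[12, 33]]

for text, dist in pair_results(distances, indices, chunks):
    print([text], "distance:", dist)
```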

🛠️ How It Works

Extract text from PDFs using PyPDF2.
Tokenize sentences using NLTK.
Split into chunks (approx. 100 words each).
Encode sentences into embeddings using Sentence Transformers.
Build a FAISS index to store embeddings.
Search queries against the index to find relevant chunks.
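
The steps above can be sketched roughly as follows. This is not the code from `main.py`: the function names, the greedy chunking rule, and the choice of `faiss.IndexFlatL2` are assumptions made for illustration. The heavy libraries are imported inside the functions so the chunking helper can be tried without them.

```python
from typing import List, Tuple


def extract_text(pdf_path: str) -> str:
    """Concatenate the text of every page in a PDF using PyPDF2."""
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() can return None or "" for pages without a text layer.
    return " ".join(page.extract_text() or "" for page in reader.pages)


def chunk_sentences(sentences: List[str], max_words: int = 100) -> List[str]:
    """Greedily group whole sentences into chunks of roughly max_words words."""
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def search(pdf_path: str, question: str, k: int = 2) -> List[Tuple[str, float]]:
    """Return the k chunks closest to the question as (chunk, distance) pairs."""
    import faiss
    import nltk
    from sentence_transformers import SentenceTransformer

    # Newer NLTK releases look for "punkt_tab", older ones for "punkt".
    nltk.download("punkt", quiet=True)
    nltk.download("punkt_tab", quiet=True)

    sentences = nltk.sent_tokenize(extract_text(pdf_path))
    chunks = chunk_sentences(sentences)
    print("Total chunks:", len(chunks))

    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(chunks, convert_to_numpy=True).astype("float32")

    # Assumption: an exact (flat) L2 index; distances are squared L2.
    index = faiss.IndexFlatL2(embeddings.shape[1])
    index.add(embeddings)

    query = model.encode([question], convert_to_numpy=True).astype("float32")
    distances, indices = index.search(query, k)
    return [
        (chunks[i], float(d))
        for d, i in zip(distances[0], indices[0])
        if i != -1  # -1 marks a missing neighbour when k exceeds the index size
    ]
```

With the dependencies installed, calling `search("document.pdf", "What is the relevance of Blockchain?")` (the path is a placeholder) would return the two closest chunks, smallest distance first.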

📦 Prerequisites

Python 3.9+ (tested on 3.11 / 3.13)
Virtual environment (venv) recommended

Install dependencies:

pip install PyPDF2 nltk sentence-transformers faiss-cpu numpy
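
After installing, a quick way to confirm that everything is importable is a small check like the one below. It is a sketch that uses only the standard library. The mapping reflects the usual import names for these packages (for example, `faiss-cpu` is imported as `faiss`), and the helper name `missing_packages` is hypothetical.

```python
from importlib.util import find_spec

# pip package name -> module name used in an import statement
REQUIRED = {
    "PyPDF2": "PyPDF2",
    "nltk": "nltk",
    "sentence-transformers": "sentence_transformers",
    "faiss-cpu": "faiss",
    "numpy": "numpy",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All dependencies are installed.")
```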
