Merge pull request #9 from neonwatty/index_delta
Index delta
Showing 28 changed files with 1,144 additions and 123 deletions.
@@ -0,0 +1,16 @@
# Change Log
All notable changes to this project will be documented in this file.


## 2024-07-17

### Added

- Added core tests for the query and imgs modules, and for re-indexing after images are added or removed.

- Added a "refresh index" button that updates the index when images are added to or removed from the data/input image directory. Only the newly added or removed images are processed.


<p align="center">
<img align="center" src="https://github.com/jermwatt/readme_gifs/blob/main/meme_search_refresh_button.gif" height="200">
</p>
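The "refresh index" entry above describes touching only the images that were added or removed. A minimal sketch of that delta computation follows, assuming images are identified by file path and that the set of already-indexed paths can be read from the index. The names `list_image_paths`, `compute_index_delta`, and `IMAGE_EXTENSIONS` are hypothetical and chosen for illustration; they are not taken from the project.

```python
from pathlib import Path

# Assumed set of image extensions; the project's actual list may differ.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_image_paths(img_dir: str) -> set:
    """Return the paths of all image files found directly inside img_dir."""
    return {
        str(p)
        for p in Path(img_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    }


def compute_index_delta(paths_on_disk: set, indexed_paths: set) -> tuple:
    """Split the difference between disk and index into (to_add, to_remove)."""
    to_add = sorted(paths_on_disk - indexed_paths)     # on disk, not yet indexed
    to_remove = sorted(indexed_paths - paths_on_disk)  # indexed, no longer on disk
    return to_add, to_remove
```

Images present in both sets are left alone, which is what keeps a refresh cheap when only a few files change.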
@@ -0,0 +1,61 @@
# Contributing to Meme Search

Welcome to Meme Search! We're stoked that you're interested in contributing.

Before you get started, please take a moment to read through the guidelines below.


# How Can I Contribute?
## Reporting Bugs
If you encounter a bug or unexpected behavior in Meme Search, please help us by creating an issue in our GitHub repository. Be sure to include as much detail as possible to help us reproduce the issue.

## Suggesting Enhancements
Have an idea to improve Meme Search? Bring it on! You can submit your ideas by creating an issue in our GitHub repository and using the `enhancement` label.

## Contributing Code
If you're ready to contribute code to Meme Search, follow these steps:

Fork the Repository: Start by forking the repository to your GitHub account.

Clone the Repository: Clone your fork to your local machine, replacing `<your-username>` with your GitHub username.

```sh
git clone https://github.com/<your-username>/meme_search
```

Create a Branch: Create a new branch for your feature or fix.

```sh
git checkout -b feature-branch
```

Make Changes: Make your changes and ensure they follow the coding style of the project.

Test Your Changes: Test your changes to ensure they work as expected.

Commit Your Changes: Commit your changes with a clear and descriptive commit message.

```sh
git commit -m "Add feature or fix for XYZ"
```

Push Your Changes: Push your branch to your forked repository.

```sh
git push origin feature-branch
```

Create a Pull Request: Create a pull request from your forked repository to the main repository. Be sure to provide a detailed description of your changes.

Review Process: The maintainers will review your pull request and may request changes or provide feedback.

Merge: Once approved, your pull request will be merged into the main repository. Congratulations!

# Code of Conduct

Remember to always be excellent to each other.

# Questions?
If you have any questions that aren't addressed in this guide, feel free to reach out to us by creating an issue in our GitHub repository.

Thank you for contributing to Meme Search!
Binary file not shown.
Binary file not shown.
@@ -1 +0,0 @@
A placeholder file to ensure this directory exists on github
@@ -0,0 +1,67 @@
import os
import sqlite3
import faiss
from meme_search.utilities import model
from meme_search.utilities.text_extraction import extract_text_from_imgs
from meme_search.utilities.chunks import create_all_img_chunks


def add_to_chunk_db(img_chunks: list, sqlite_db_path: str) -> None:
    # Create a lookup table for chunks
    conn = sqlite3.connect(sqlite_db_path)
    cursor = conn.cursor()

    # Create the table
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS chunks_reverse_lookup (
            img_path TEXT,
            chunk TEXT
        );
    """)

    # Insert data into the table
    for entry in img_chunks:
        img_path = entry["img_path"]
        chunk = entry["chunk"]
        cursor.execute(
            "INSERT INTO chunks_reverse_lookup (img_path, chunk) VALUES (?, ?)",
            (img_path, chunk),
        )

    conn.commit()
    conn.close()


def add_to_vector_db(chunks: list, vector_db_path: str) -> None:
    # embed inputs
    embeddings = model.encode(chunks)

    # append the embeddings to the faiss index, creating it if it does not exist
    if os.path.exists(vector_db_path):
        index = faiss.read_index(vector_db_path)
    else:
        index = faiss.IndexFlatL2(embeddings.shape[1])

    index.add(embeddings)
    faiss.write_index(index, vector_db_path)


def add_to_dbs(img_chunks: list, sqlite_db_path: str, vector_db_path: str) -> None:
    try:
        print("STARTING: add_to_dbs")

        # add to db for img_chunks
        add_to_chunk_db(img_chunks, sqlite_db_path)

        # create vector embedding db for chunks
        chunks = [v["chunk"] for v in img_chunks]
        add_to_vector_db(chunks, vector_db_path)
        print("SUCCESS: add_to_dbs succeeded")
    except Exception as e:
        print(f"FAILURE: add_to_dbs failed with exception {e}")


def add(new_imgs_to_be_indexed: list, sqlite_db_path: str, vector_db_path: str) -> None:
    moondream_answers = extract_text_from_imgs(new_imgs_to_be_indexed)
    img_chunks = create_all_img_chunks(new_imgs_to_be_indexed, moondream_answers)
    add_to_dbs(img_chunks, sqlite_db_path, vector_db_path)
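The module above appends chunks to the SQLite table and to the vector index in the same order. A minimal sketch of how such a table could be read back is shown here, using only the standard library and an in-memory database. It assumes that the vector at 0-based position `i` corresponds to the table row with `rowid = i + 1`, which only holds while both stores are appended in lockstep and no rows have been deleted. The helpers `build_lookup` and `img_path_for_position` are hypothetical names for illustration, not functions from the project.

```python
import sqlite3


def build_lookup(conn: sqlite3.Connection, img_chunks: list) -> None:
    """Create the lookup table and append one row per chunk, in order."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks_reverse_lookup (img_path TEXT, chunk TEXT)"
    )
    conn.executemany(
        "INSERT INTO chunks_reverse_lookup (img_path, chunk) VALUES (?, ?)",
        [(entry["img_path"], entry["chunk"]) for entry in img_chunks],
    )
    conn.commit()


def img_path_for_position(conn: sqlite3.Connection, position: int):
    """Map a 0-based vector position to an image path via rowid (an assumption)."""
    row = conn.execute(
        "SELECT img_path FROM chunks_reverse_lookup WHERE rowid = ?",
        (position + 1,),  # SQLite rowids start at 1
    ).fetchone()
    return row[0] if row else None


# Illustrative data only; the paths and captions are made up.
conn = sqlite3.connect(":memory:")
build_lookup(conn, [
    {"img_path": "data/input/a.jpg", "chunk": "first caption"},
    {"img_path": "data/input/b.jpg", "chunk": "second caption"},
])
```

Calling `img_path_for_position(conn, 1)` on this data returns the path of the second inserted row, and a position with no matching row returns `None`.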