-
Notifications
You must be signed in to change notification settings - Fork 5
Amits 2u/dos 5996 meilisearch #280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds Meilisearch support to the infrastructure, introducing Meilisearch as an alternative search backend alongside Elasticsearch. The changes include Ansible roles for deployment, utility scripts for index management, Docker configurations, and integration with the edxapp configuration.
Key Changes
- Added Ansible role for Meilisearch installation and configuration
- Created utility scripts for index copying, verification, and incremental reindexing
- Integrated Meilisearch configuration into edxapp and Docker environments
Reviewed Changes
Copilot reviewed 22 out of 24 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| vagrant/release/analyticstack/Vagrantfile | Added port forwarding for Meilisearch (7700) |
| util/meilisearch/verify-index-copy.py | New utility to compare Meilisearch indices |
| util/meilisearch/requirements.txt | Python dependencies for Meilisearch utilities |
| util/meilisearch/forums-incremental-reindex.sh | Shell script for incremental reindexing |
| util/meilisearch/copy-index.sh | Shell script to copy indices from Elasticsearch to Meilisearch |
| requirements/meilisearch.in | Source requirements for Meilisearch utilities |
| playbooks/roles/meilisearch/templates/lib/systemd/system/meilisearch.service.j2 | Systemd service configuration for Meilisearch |
| playbooks/roles/meilisearch/templates/etc/default/meilisearch.j2 | Environment defaults for Meilisearch service |
| playbooks/roles/meilisearch/templates/edx/etc/meilisearch/meilisearch.yml.j2 | Main Meilisearch configuration template |
| playbooks/roles/meilisearch/tasks/main.yml | Ansible tasks for Meilisearch deployment |
| playbooks/roles/meilisearch/meta/main.yml | Ansible role metadata and dependencies |
| playbooks/roles/meilisearch/defaults/main.yml | Default variables for Meilisearch role |
| playbooks/roles/edxapp/defaults/main.yml | Added Meilisearch configuration to edxapp |
| playbooks/roles/common_vars/defaults/main.yml | Added common certificate variables |
| playbooks/meilisearch.yml | Ansible playbook for Meilisearch deployment |
| playbooks/elasticsearch.yml | Commented out Elasticsearch playbook |
| playbooks/edx_continuous_integration.yml | Added Meilisearch to CI roles |
| docker/plays/meilisearch.yml | Docker play for Meilisearch |
| docker/build/meilisearch/Dockerfile | Dockerfile for Meilisearch container |
| docker/build/meilisearch-devstack/Dockerfile | Devstack-specific Meilisearch Dockerfile |
| docker/build/edxapp/studio.yml | Added Meilisearch config to Studio |
| docker/build/edxapp/lms.yml | Added Meilisearch config to LMS |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| diff = DeepDiff(old_doc, new_doc, ignore_order=True) | ||
| for field in ignorable_fields: | ||
| diff.pop(field, None) |
Copilot
AI
Oct 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The diff object from DeepDiff is not a simple dictionary and doesn't support pop() method directly. To ignore fields, pass them to the exclude_paths parameter of DeepDiff or check if the diff contains only ignorable fields using proper DeepDiff methods.
| diff = DeepDiff(old_doc, new_doc, ignore_order=True) | |
| for field in ignorable_fields: | |
| diff.pop(field, None) | |
| # Build exclude_paths for top-level fields | |
| exclude_paths = [f"root['{field}']" for field in ignorable_fields] | |
| diff = DeepDiff(old_doc, new_doc, ignore_order=True, exclude_paths=exclude_paths) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 21 out of 24 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # ./incremental-reindex-meili.sh INDEX [WINDOW] [SLEEP_TIME] [BATCH_SIZE] | ||
| # | ||
| # Example: | ||
| # ./incremental-reindex-meili.sh content 30 |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The usage comment references 'incremental-reindex-meili.sh' but the actual filename is 'forums-incremental-reindex.sh'. Update the usage documentation to match the actual filename.
| # ./incremental-reindex-meili.sh INDEX [WINDOW] [SLEEP_TIME] [BATCH_SIZE] | |
| # | |
| # Example: | |
| # ./incremental-reindex-meili.sh content 30 | |
| # ./forums-incremental-reindex.sh INDEX [WINDOW] [SLEEP_TIME] [BATCH_SIZE] | |
| # | |
| # Example: | |
| # ./forums-incremental-reindex.sh content 30 |
| while : ; do | ||
| echo "Fetching documents newer than $WINDOW minutes..." | ||
|
|
||
| NEW_DOCS=$(./fetch_new_docs.rb "$WINDOW" "$INDEX" "$BATCH_SIZE") |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script references './fetch_new_docs.rb' but this file does not exist in the repository. This will cause the script to fail when executed.
| NEW_DOCS=$(./fetch_new_docs.rb "$WINDOW" "$INDEX" "$BATCH_SIZE") | |
| # Fetch documents newer than $WINDOW minutes using Meilisearch API | |
| SINCE_TIMESTAMP=$(date -u -d "$WINDOW minutes ago" +"%Y-%m-%dT%H:%M:%SZ") | |
| NEW_DOCS=$(curl -s "${AUTH_HEADER[@]}" \ | |
| "$MEILI_URL/indexes/$INDEX/documents?limit=$BATCH_SIZE" \ | |
| | jq --arg since "$SINCE_TIMESTAMP" '[.[] | select(.updated_at >= $since)]') |
util/meilisearch/copy-index.sh
Outdated
| set -euo pipefail | ||
|
|
||
| # Usage: | ||
| # ./copy-index-meili.sh SOURCE_ES SOURCE_INDEX TARGET_MEILI TARGET_INDEX [API_KEY] |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The usage comment references 'copy-index-meili.sh' but the actual filename is 'copy-index.sh'. Update the usage documentation to match the actual filename.
| # ./copy-index-meili.sh SOURCE_ES SOURCE_INDEX TARGET_MEILI TARGET_INDEX [API_KEY] | |
| # ./copy-index.sh SOURCE_ES SOURCE_INDEX TARGET_MEILI TARGET_INDEX [API_KEY] |
| print(f"Checking {sample_size} random documents...") | ||
|
|
||
| # Fetch docs from old index (cap batch size at 1000 for safety) | ||
| old_docs_response = old_idx.get_documents(limit=min(sample_size, 1000)) |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When sample_size exceeds 1000, only the first 1000 documents are fetched, but then random.sample at line 75 attempts to sample from this limited set. This means documents beyond the first 1000 can never be checked even if they should be part of the random sample. Consider fetching all documents needed or adjusting the sampling logic.
| @@ -0,0 +1,13 @@ | |||
| # docker build -f docker/build/meilisearch-devstack/Dockerfile . -t edxops/meilisearch:devstack | |||
|
|
|||
| FROM getmeili/meilisearch:v1.10.0 | |||
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The devstack Dockerfile uses Meilisearch v1.10.0, but other configuration files (playbooks/roles/meilisearch/defaults/main.yml line 20 and playbooks/meilisearch.yml line 11) use v1.11.0. Consider standardizing on a single version across all configurations.
| FROM getmeili/meilisearch:v1.10.0 | |
| FROM getmeili/meilisearch:v1.11.0 |
5 ansible role
26c4ac3 to
8dfefe3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 20 out of 24 changed files in this pull request and generated 14 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| MEILI_SEARCH_CONFIG: | ||
| - host: edx.devstack.meilisearch184 | ||
| port: 7700 | ||
| use_ssl: false |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MEILI_SEARCH_CONFIG is defined as a list with a single dictionary entry (similar to ELASTIC_SEARCH_CONFIG), but it includes a use_ssl property at line 223 that belongs to ELASTIC_SEARCH_CONFIG, not MEILI_SEARCH_CONFIG. The indentation shows use_ssl: false is under ELASTIC_SEARCH_CONFIG. However, based on the structure, MEILI_SEARCH_CONFIG should have its own properties including use_ssl and api_key. Please verify the YAML structure is correctly indented.
| use_ssl: false |
| SINCE_TIMESTAMP=$(date -u -d "$WINDOW minutes ago" +"%Y-%m-%dT%H:%M:%SZ") | ||
| NEW_DOCS=$(curl -s "${AUTH_HEADER[@]}" \ | ||
| "$MEILI_URL/indexes/$INDEX/documents?limit=$BATCH_SIZE" \ | ||
| | jq --arg since "$SINCE_TIMESTAMP" '[.[] | select(.updated_at >= $since)]') |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The jq filter expects the response to be an array, but the Meilisearch API returns an object with a results key containing the array of documents. The filter should be .results[] | select(.updated_at >= $since) to properly access the documents.
| | jq --arg since "$SINCE_TIMESTAMP" '[.[] | select(.updated_at >= $since)]') | |
| | jq --arg since "$SINCE_TIMESTAMP" '[.results[] | select(.updated_at >= $since)]') |
| # docker build -f docker/build/meilisearch-devstack/Dockerfile . -t edxops/meilisearch:devstack | ||
|
|
||
| FROM getmeili/meilisearch:v1.11.0 |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The comment references Meilisearch 1.11.0 but the Dockerfile uses the official Meilisearch image getmeili/meilisearch:v1.11.0. While this is correct, it would be better to use a variable for the version to ensure consistency across the codebase, similar to how meilisearch_version is used in other files.
| # docker build -f docker/build/meilisearch-devstack/Dockerfile . -t edxops/meilisearch:devstack | |
| FROM getmeili/meilisearch:v1.11.0 | |
| # docker build -f docker/build/meilisearch-devstack/Dockerfile . --build-arg MEILISEARCH_VERSION=1.11.0 -t edxops/meilisearch:devstack | |
| ARG MEILISEARCH_VERSION=1.11.0 | |
| FROM getmeili/meilisearch:v${MEILISEARCH_VERSION} |
|
|
||
| WORKDIR /etc/meilisearch | ||
| CMD service meilisearch start && sleep 5 && tail -f /edx/var/log/meilisearch/meilisearch.log | ||
| EXPOSE 7700 7701 |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Dockerfile exposes ports 7700 and 7701, but Meilisearch typically only uses port 7700 for its HTTP API. Port 7701 is not a standard Meilisearch port. If 7701 is intended for a specific purpose, it should be documented; otherwise, it should be removed from the EXPOSE directive.
| EXPOSE 7700 7701 | |
| EXPOSE 7700 |
| diff = DeepDiff(old_doc, new_doc, ignore_order=True) | ||
| for field in ignorable_fields: | ||
| diff.pop(field, None) | ||
| return diff == {} |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comparison diff == {} will not work as expected because DeepDiff returns a DeepDiff object, not a plain dictionary. To check if there are no differences, you should use if not diff or if diff == DeepDiff().
| return diff == {} | |
| return not diff |
| echo "Sleeping $SLEEP_TIME seconds..." | ||
| sleep "$SLEEP_TIME" | ||
|
|
||
| [ "$SLEEP_TIME" -le 0 ] && break |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition [ "$SLEEP_TIME" -le 0 ] will never be true during normal operation since SLEEP_TIME defaults to 60 and is used for sleeping. This line appears to be a misplaced break condition. If the intent is to run indefinitely, this check should be removed, or if it's meant to allow a single run, the logic should be reconsidered.
| [ "$SLEEP_TIME" -le 0 ] && break |
| curl -s -X DELETE "$TARGET_MEILI/indexes/$TARGET_INDEX" "${AUTH_HEADER[@]}" || true | ||
| curl -s -X POST "$TARGET_MEILI/indexes" \ | ||
| -H "Content-Type: application/json" \ | ||
| "${AUTH_HEADER[@]}" \ | ||
| -d "{ \"uid\": \"$TARGET_INDEX\" }" |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the other script, the array expansion "${AUTH_HEADER[@]}" should not be quoted when the array might be empty. This will cause issues when AUTH_HEADER is an empty array. Use ${AUTH_HEADER[@]} without the outer quotes.
| MEILI_HTTP_ADDR={{ meilisearch_host | default("0.0.0.0") }} | ||
|
|
||
| # The port Meilisearch listens on | ||
| MEILI_HTTP_PORT={{ meilisearch_port | default("7700") }} |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable MEILI_HTTP_PORT is set here but the systemd service file uses {{ meilisearch_port }} directly in the ExecStart command. This creates an inconsistency - the port should be consistently sourced from either the environment variable or the Ansible variable, not both. Consider using the environment file variable in the service or removing this environment variable declaration.
| MEILI_HTTP_PORT={{ meilisearch_port | default("7700") }} |
| # See official docs: | ||
| # https://www.meilisearch.com/docs/learn/configuration/instance_options | ||
| # | ||
| # This template generates a .env file that Meilisearch will load at startup. |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment mentions this template generates a .env file, but the actual file being created in the tasks is named just meilisearch (without .env extension). Either the comment should be updated to reflect the actual filename, or the destination filename in the tasks should use a .env extension.
| # This template generates a .env file that Meilisearch will load at startup. | |
| # This template generates a 'meilisearch' file that Meilisearch will load at startup. |
| MEILI_SEARCH_CONFIG: | ||
| - host: edx.devstack.meilisearch184 | ||
| port: 7700 | ||
| use_ssl: false |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MEILI_SEARCH_CONFIG is defined as a list with a single dictionary entry (similar to ELASTIC_SEARCH_CONFIG), but it includes a use_ssl property at line 246 that belongs to ELASTIC_SEARCH_CONFIG, not MEILI_SEARCH_CONFIG. The indentation shows use_ssl: false is under ELASTIC_SEARCH_CONFIG. However, based on the structure, MEILI_SEARCH_CONFIG should have its own properties including use_ssl and api_key. Please verify the YAML structure is correctly indented.
| use_ssl: false | |
| use_ssl: false | |
| api_key: '' |
Make sure that the following steps are done before merging: