Skip to content

Conversation

@amits-2u
Copy link


Make sure that the following steps are done before merging:

  • Have a Site Reliability Engineer review the PR if you don't own all of the services impacted.
  • If you are adding any new default values that need to be overridden when this change goes live, update internal repos and add an entry to the top of the CHANGELOG.
  • Performed the appropriate testing.

Copilot AI review requested due to automatic review settings October 24, 2025 09:04
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds Meilisearch support to the infrastructure, introducing Meilisearch as an alternative search backend alongside Elasticsearch. The changes include Ansible roles for deployment, utility scripts for index management, Docker configurations, and integration with the edxapp configuration.

Key Changes

  • Added Ansible role for Meilisearch installation and configuration
  • Created utility scripts for index copying, verification, and incremental reindexing
  • Integrated Meilisearch configuration into edxapp and Docker environments

Reviewed Changes

Copilot reviewed 22 out of 24 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
vagrant/release/analyticstack/Vagrantfile Added port forwarding for Meilisearch (7700)
util/meilisearch/verify-index-copy.py New utility to compare Meilisearch indices
util/meilisearch/requirements.txt Python dependencies for Meilisearch utilities
util/meilisearch/forums-incremental-reindex.sh Shell script for incremental reindexing
util/meilisearch/copy-index.sh Shell script to copy indices from Elasticsearch to Meilisearch
requirements/meilisearch.in Source requirements for Meilisearch utilities
playbooks/roles/meilisearch/templates/lib/systemd/system/meilisearch.service.j2 Systemd service configuration for Meilisearch
playbooks/roles/meilisearch/templates/etc/default/meilisearch.j2 Environment defaults for Meilisearch service
playbooks/roles/meilisearch/templates/edx/etc/meilisearch/meilisearch.yml.j2 Main Meilisearch configuration template
playbooks/roles/meilisearch/tasks/main.yml Ansible tasks for Meilisearch deployment
playbooks/roles/meilisearch/meta/main.yml Ansible role metadata and dependencies
playbooks/roles/meilisearch/defaults/main.yml Default variables for Meilisearch role
playbooks/roles/edxapp/defaults/main.yml Added Meilisearch configuration to edxapp
playbooks/roles/common_vars/defaults/main.yml Added common certificate variables
playbooks/meilisearch.yml Ansible playbook for Meilisearch deployment
playbooks/elasticsearch.yml Commented out Elasticsearch playbook
playbooks/edx_continuous_integration.yml Added Meilisearch to CI roles
docker/plays/meilisearch.yml Docker play for Meilisearch
docker/build/meilisearch/Dockerfile Dockerfile for Meilisearch container
docker/build/meilisearch-devstack/Dockerfile Devstack-specific Meilisearch Dockerfile
docker/build/edxapp/studio.yml Added Meilisearch config to Studio
docker/build/edxapp/lms.yml Added Meilisearch config to LMS

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +12 to +14
diff = DeepDiff(old_doc, new_doc, ignore_order=True)
for field in ignorable_fields:
diff.pop(field, None)
Copy link

Copilot AI Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The diff object from DeepDiff is not a simple dictionary and doesn't support pop() method directly. To ignore fields, pass them to the exclude_paths parameter of DeepDiff or check if the diff contains only ignorable fields using proper DeepDiff methods.

Suggested change
diff = DeepDiff(old_doc, new_doc, ignore_order=True)
for field in ignorable_fields:
diff.pop(field, None)
# Build exclude_paths for top-level fields
exclude_paths = [f"root['{field}']" for field in ignorable_fields]
diff = DeepDiff(old_doc, new_doc, ignore_order=True, exclude_paths=exclude_paths)

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings November 4, 2025 09:15
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 21 out of 24 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 5 to 8
# ./incremental-reindex-meili.sh INDEX [WINDOW] [SLEEP_TIME] [BATCH_SIZE]
#
# Example:
# ./incremental-reindex-meili.sh content 30
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usage comment references 'incremental-reindex-meili.sh' but the actual filename is 'forums-incremental-reindex.sh'. Update the usage documentation to match the actual filename.

Suggested change
# ./incremental-reindex-meili.sh INDEX [WINDOW] [SLEEP_TIME] [BATCH_SIZE]
#
# Example:
# ./incremental-reindex-meili.sh content 30
# ./forums-incremental-reindex.sh INDEX [WINDOW] [SLEEP_TIME] [BATCH_SIZE]
#
# Example:
# ./forums-incremental-reindex.sh content 30

Copilot uses AI. Check for mistakes.
while : ; do
echo "Fetching documents newer than $WINDOW minutes..."

NEW_DOCS=$(./fetch_new_docs.rb "$WINDOW" "$INDEX" "$BATCH_SIZE")
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script references './fetch_new_docs.rb' but this file does not exist in the repository. This will cause the script to fail when executed.

Suggested change
NEW_DOCS=$(./fetch_new_docs.rb "$WINDOW" "$INDEX" "$BATCH_SIZE")
# Fetch documents newer than $WINDOW minutes using Meilisearch API
SINCE_TIMESTAMP=$(date -u -d "$WINDOW minutes ago" +"%Y-%m-%dT%H:%M:%SZ")
NEW_DOCS=$(curl -s "${AUTH_HEADER[@]}" \
"$MEILI_URL/indexes/$INDEX/documents?limit=$BATCH_SIZE" \
| jq --arg since "$SINCE_TIMESTAMP" '[.[] | select(.updated_at >= $since)]')

Copilot uses AI. Check for mistakes.
set -euo pipefail

# Usage:
# ./copy-index-meili.sh SOURCE_ES SOURCE_INDEX TARGET_MEILI TARGET_INDEX [API_KEY]
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usage comment references 'copy-index-meili.sh' but the actual filename is 'copy-index.sh'. Update the usage documentation to match the actual filename.

Suggested change
# ./copy-index-meili.sh SOURCE_ES SOURCE_INDEX TARGET_MEILI TARGET_INDEX [API_KEY]
# ./copy-index.sh SOURCE_ES SOURCE_INDEX TARGET_MEILI TARGET_INDEX [API_KEY]

Copilot uses AI. Check for mistakes.
print(f"Checking {sample_size} random documents...")

# Fetch docs from old index (cap batch size at 1000 for safety)
old_docs_response = old_idx.get_documents(limit=min(sample_size, 1000))
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When sample_size exceeds 1000, only the first 1000 documents are fetched, but then random.sample at line 75 attempts to sample from this limited set. This means documents beyond the first 1000 can never be checked even if they should be part of the random sample. Consider fetching all documents needed or adjusting the sampling logic.

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,13 @@
# docker build -f docker/build/meilisearch-devstack/Dockerfile . -t edxops/meilisearch:devstack

FROM getmeili/meilisearch:v1.10.0
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The devstack Dockerfile uses Meilisearch v1.10.0, but other configuration files (playbooks/roles/meilisearch/defaults/main.yml line 20 and playbooks/meilisearch.yml line 11) use v1.11.0. Consider standardizing on a single version across all configurations.

Suggested change
FROM getmeili/meilisearch:v1.10.0
FROM getmeili/meilisearch:v1.11.0

Copilot uses AI. Check for mistakes.
@amits-2u amits-2u force-pushed the amits-2u/DOS-5996-meilisearch branch from 26c4ac3 to 8dfefe3 Compare November 17, 2025 14:06
Copilot AI review requested due to automatic review settings November 17, 2025 15:30
Copilot finished reviewing on behalf of amits-2u November 17, 2025 15:33
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 20 out of 24 changed files in this pull request and generated 14 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

MEILI_SEARCH_CONFIG:
- host: edx.devstack.meilisearch184
port: 7700
use_ssl: false
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MEILI_SEARCH_CONFIG is defined as a list with a single dictionary entry (similar to ELASTIC_SEARCH_CONFIG), but it includes a use_ssl property at line 223 that belongs to ELASTIC_SEARCH_CONFIG, not MEILI_SEARCH_CONFIG. The indentation shows use_ssl: false is under ELASTIC_SEARCH_CONFIG. However, based on the structure, MEILI_SEARCH_CONFIG should have its own properties including use_ssl and api_key. Please verify the YAML structure is correctly indented.

Suggested change
use_ssl: false

Copilot uses AI. Check for mistakes.
SINCE_TIMESTAMP=$(date -u -d "$WINDOW minutes ago" +"%Y-%m-%dT%H:%M:%SZ")
NEW_DOCS=$(curl -s "${AUTH_HEADER[@]}" \
"$MEILI_URL/indexes/$INDEX/documents?limit=$BATCH_SIZE" \
| jq --arg since "$SINCE_TIMESTAMP" '[.[] | select(.updated_at >= $since)]')
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The jq filter expects the response to be an array, but the Meilisearch API returns an object with a results key containing the array of documents. The filter should be .results[] | select(.updated_at >= $since) to properly access the documents.

Suggested change
| jq --arg since "$SINCE_TIMESTAMP" '[.[] | select(.updated_at >= $since)]')
| jq --arg since "$SINCE_TIMESTAMP" '[.results[] | select(.updated_at >= $since)]')

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +3
# docker build -f docker/build/meilisearch-devstack/Dockerfile . -t edxops/meilisearch:devstack

FROM getmeili/meilisearch:v1.11.0
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The comment references Meilisearch 1.11.0 but the Dockerfile uses the official Meilisearch image getmeili/meilisearch:v1.11.0. While this is correct, it would be better to use a variable for the version to ensure consistency across the codebase, similar to how meilisearch_version is used in other files.

Suggested change
# docker build -f docker/build/meilisearch-devstack/Dockerfile . -t edxops/meilisearch:devstack
FROM getmeili/meilisearch:v1.11.0
# docker build -f docker/build/meilisearch-devstack/Dockerfile . --build-arg MEILISEARCH_VERSION=1.11.0 -t edxops/meilisearch:devstack
ARG MEILISEARCH_VERSION=1.11.0
FROM getmeili/meilisearch:v${MEILISEARCH_VERSION}

Copilot uses AI. Check for mistakes.

WORKDIR /etc/meilisearch
CMD service meilisearch start && sleep 5 && tail -f /edx/var/log/meilisearch/meilisearch.log
EXPOSE 7700 7701
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Dockerfile exposes ports 7700 and 7701, but Meilisearch typically only uses port 7700 for its HTTP API. Port 7701 is not a standard Meilisearch port. If 7701 is intended for a specific purpose, it should be documented; otherwise, it should be removed from the EXPOSE directive.

Suggested change
EXPOSE 7700 7701
EXPOSE 7700

Copilot uses AI. Check for mistakes.
diff = DeepDiff(old_doc, new_doc, ignore_order=True)
for field in ignorable_fields:
diff.pop(field, None)
return diff == {}
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comparison diff == {} will not work as expected because DeepDiff returns a DeepDiff object, not a plain dictionary. To check if there are no differences, you should use if not diff or if diff == DeepDiff().

Suggested change
return diff == {}
return not diff

Copilot uses AI. Check for mistakes.
echo "Sleeping $SLEEP_TIME seconds..."
sleep "$SLEEP_TIME"

[ "$SLEEP_TIME" -le 0 ] && break
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition [ "$SLEEP_TIME" -le 0 ] will never be true during normal operation since SLEEP_TIME defaults to 60 and is used for sleeping. This line appears to be a misplaced break condition. If the intent is to run indefinitely, this check should be removed, or if it's meant to allow a single run, the logic should be reconsidered.

Suggested change
[ "$SLEEP_TIME" -le 0 ] && break

Copilot uses AI. Check for mistakes.
Comment on lines +19 to +23
curl -s -X DELETE "$TARGET_MEILI/indexes/$TARGET_INDEX" "${AUTH_HEADER[@]}" || true
curl -s -X POST "$TARGET_MEILI/indexes" \
-H "Content-Type: application/json" \
"${AUTH_HEADER[@]}" \
-d "{ \"uid\": \"$TARGET_INDEX\" }"
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the other script, the array expansion "${AUTH_HEADER[@]}" should not be quoted when the array might be empty. This will cause issues when AUTH_HEADER is an empty array. Use ${AUTH_HEADER[@]} without the outer quotes.

Copilot uses AI. Check for mistakes.
MEILI_HTTP_ADDR={{ meilisearch_host | default("0.0.0.0") }}

# The port Meilisearch listens on
MEILI_HTTP_PORT={{ meilisearch_port | default("7700") }}
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable MEILI_HTTP_PORT is set here but the systemd service file uses {{ meilisearch_port }} directly in the ExecStart command. This creates an inconsistency - the port should be consistently sourced from either the environment variable or the Ansible variable, not both. Consider using the environment file variable in the service or removing this environment variable declaration.

Suggested change
MEILI_HTTP_PORT={{ meilisearch_port | default("7700") }}

Copilot uses AI. Check for mistakes.
# See official docs:
# https://www.meilisearch.com/docs/learn/configuration/instance_options
#
# This template generates a .env file that Meilisearch will load at startup.
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment mentions this template generates a .env file, but the actual file being created in the tasks is named just meilisearch (without .env extension). Either the comment should be updated to reflect the actual filename, or the destination filename in the tasks should use a .env extension.

Suggested change
# This template generates a .env file that Meilisearch will load at startup.
# This template generates a 'meilisearch' file that Meilisearch will load at startup.

Copilot uses AI. Check for mistakes.
MEILI_SEARCH_CONFIG:
- host: edx.devstack.meilisearch184
port: 7700
use_ssl: false
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MEILI_SEARCH_CONFIG is defined as a list with a single dictionary entry (similar to ELASTIC_SEARCH_CONFIG), but it includes a use_ssl property at line 246 that belongs to ELASTIC_SEARCH_CONFIG, not MEILI_SEARCH_CONFIG. The indentation shows use_ssl: false is under ELASTIC_SEARCH_CONFIG. However, based on the structure, MEILI_SEARCH_CONFIG should have its own properties including use_ssl and api_key. Please verify the YAML structure is correctly indented.

Suggested change
use_ssl: false
use_ssl: false
api_key: ''

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants