Conversation

Jibola
Contributor

@Jibola Jibola commented Aug 19, 2025

Summary

Defining field mappings for Atlas Search and Vector Search indexes can get complicated. Our initial SearchIndex and VectorSearchIndex solutions provide reasonable defaults for categorized fields. However, a typical MongoDB power user may want more nuanced index definitions than those defaults allow. This PR introduces a way to provide custom field mappings on a per-field basis.

Key changes

  • Added field_mappings parameter to SearchIndex to allow custom Atlas Search field configurations
  • Changed the options returned by get_constraints to also include analyzer and searchAnalyzer

Test Plan

  • Manual Testing
  • Add Test cases

Screenshots

Image of a customized field_mapping added in a migration
image

Its representation in MongoDB Compass
image

Checklist

Checklist for Author

  • Does this require a changelog update?
  • Did you update the changelog (if necessary)?
  • Is this a breaking change?
  • Did you run the tests locally?

Checklist for Reviewer

  • Is the PR in the correct format?
  • Can you explain the PR?
  • Do all TODOs have JIRA tickets?
  • Have you checked for spelling & grammar errors?

Additional Considerations

  • This does not have extensive testing. Tests should include:
    • A field override with multiple mappings
    • A SearchIndex constructed without anything provided in fields
    • A field name present in both fields and field_mappings
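
The three cases listed above could be exercised roughly as follows. This is only a sketch: `resolve_fields` and `DEFAULT_MAPPING` are hypothetical stand-ins for the merge step, not the backend's actual code, and the rule that an explicit mapping wins over the inferred default is an assumption.

```python
import copy

# Hypothetical inferred default for a text field; the real backend infers
# the type from the Django model field.
DEFAULT_MAPPING = {"type": "string"}


def resolve_fields(fields=(), field_mappings=None):
    """Merge inferred defaults with per-field overrides (illustrative only)."""
    field_mappings = field_mappings or {}
    resolved = {name: dict(DEFAULT_MAPPING) for name in fields}
    for name, mapping in field_mappings.items():
        # Assumption: an explicit mapping replaces the inferred default.
        # Deep-copy so the caller's dicts and lists are never mutated.
        resolved[name] = copy.deepcopy(mapping)
    return resolved


# Case 1: one field overridden with multiple mappings (Atlas Search accepts
# a list of mappings for a single field).
multi = resolve_fields(
    fields=["title"],
    field_mappings={"title": [{"type": "string"}, {"type": "autocomplete"}]},
)

# Case 2: nothing provided in fields.
only_mappings = resolve_fields(field_mappings={"sku": {"type": "token"}})

# Case 3: a name present in both fields and field_mappings.
both = resolve_fields(
    fields=["title", "body"],
    field_mappings={"title": {"type": "token"}},
)
```

Real tests would target SearchIndex itself; the helper here only pins down the expected merge behavior.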

@Jibola Jibola requested a review from Copilot August 19, 2025 23:03
@Jibola Jibola marked this pull request as draft August 19, 2025 23:03

@Copilot Copilot AI left a comment

Pull Request Overview

This PR enhances MongoDB Atlas search index functionality by adding field mapping capabilities and index status monitoring. The changes allow developers to specify custom field mappings for search indexes and ensure proper synchronization during index operations.

Key changes:

  • Added field_mappings parameter to SearchIndex to allow custom Atlas Search field configurations
  • Introduced index status monitoring functions to wait for index creation/deletion completion
  • Added DynamicSearchIndex class for dynamic field mapping scenarios

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File                                 Description
django_mongodb_backend/schema.py     Added index status monitoring functions and integrated them into index operations
django_mongodb_backend/indexes.py    Enhanced SearchIndex with field_mappings support and added DynamicSearchIndex class

@Jibola Jibola changed the title from "Allow additional fields_mappings to get added to SearchIndexModel configurations" to "INTPYTHON-729: Improve flexibility and QOL of Atlas/Vector Search Index Configurations" Aug 19, 2025
@Jibola Jibola changed the title from "INTPYTHON-729: Improve flexibility and QOL of Atlas/Vector Search Index Configurations" to "INTPYTHON-729: (PoC) Improve flexibility and QOL of Atlas/Vector Search Index Configurations" Aug 19, 2025
Collaborator

@timgraham timgraham left a comment

I imagined the index type API as subclasses like AutocompleteSearchIndex but I guess that's not flexible enough if an index has multiple fields with different types.

Comment on lines 169 to 171
if field_name in self.field_mappings:
    fields[field_path] = self.field_mappings[field_name].copy()
Collaborator

Is field_mappings really supposed to contain the entire mapping? (e.g. "type" too). I'd think it would be more likely to be interpreted as "extra options to add to the field".

Contributor Author

Yeah, type in the Atlas Search Field Mapping refers to the Atlas Search Field Type. We infer type from our fields, but, for instance, strings can be interpreted as four different types:

  • string (we infer)
  • token
  • stringFacet
  • autocomplete
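
To make the distinction concrete, here is a rough sketch of what each of the four mappings might look like for a single string field. The field name `title` and the helper `mapping_for` are made up for illustration, and the `autocomplete` option values are example settings rather than recommendations.

```python
# Illustrative Atlas Search field mappings for one string field.
STRING_FIELD_TYPES = {
    "string": {"type": "string"},
    "token": {"type": "token"},
    "stringFacet": {"type": "stringFacet"},
    "autocomplete": {
        "type": "autocomplete",
        "tokenization": "edgeGram",
        "minGrams": 2,
        "maxGrams": 15,
    },
}


def mapping_for(field_name, search_type="string"):
    """Return {field_name: mapping}; hypothetical helper that defaults to
    the type the backend would infer for a string field."""
    return {field_name: dict(STRING_FIELD_TYPES[search_type])}


inferred = mapping_for("title")
overridden = mapping_for("title", "autocomplete")
```

Because the same Django field can legitimately map to any of these, the mapping passed in has to carry "type" itself rather than only extra options.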

Collaborator

@timgraham timgraham Sep 23, 2025

Got it. Your original PR combined fields and field_mappings, but I made these arguments mutually exclusive. (Possibly a separate class, e.g. "MappedSearchIndex", would be a better separation of concerns than having mutually exclusive arguments.)
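
A minimal sketch of the mutually exclusive arrangement, assuming a simple constructor check. `SearchIndexSketch`, its error message, and `try_both` are hypothetical and do not reflect the backend's actual SearchIndex.

```python
class SearchIndexSketch:
    """Hypothetical stand-in, not the backend's SearchIndex."""

    def __init__(self, *, fields=(), field_mappings=None, name=None):
        # Assumption: passing both arguments is rejected up front.
        if fields and field_mappings:
            raise ValueError("Cannot provide both 'fields' and 'field_mappings'.")
        self.fields = list(fields)
        self.field_mappings = dict(field_mappings or {})
        self.name = name


def try_both():
    """Return the error message raised when both arguments are supplied."""
    try:
        SearchIndexSketch(
            fields=["title"],
            field_mappings={"title": {"type": "token"}},
        )
    except ValueError as exc:
        return str(exc)
    return None


mapped_only = SearchIndexSketch(field_mappings={"title": {"type": "token"}})
```

A separate class along the lines of "MappedSearchIndex" would remove the check entirely, since each class would accept only one of the two arguments.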

Contributor Author

Oooh, potentially. I actually chose to combine fields and field_mappings because I envisioned folks being fine with the defaults set on a field name unless they wanted to make one small change. It's purely a QOL improvement so folks don't have to commit to writing the entire field mapping, but I'm fine conceding to their separation unless we get requests from developers.

@timgraham timgraham changed the title from "INTPYTHON-729: (PoC) Improve flexibility and QOL of Atlas/Vector Search Index Configurations" to "INTPYTHON-729 Allow creating search indexes with field mappings" Sep 12, 2025
@timgraham timgraham force-pushed the INTPYTHON-729 branch 2 times, most recently from b978a65 to 48c1495 Compare September 12, 2025 20:44
@timgraham
Collaborator

I think this is functionally complete for field_mappings, but that still doesn't support top-level definition options like "analyzer" and "searchAnalyzer" (see example) [not sure if important].

VectorSearchIndex doesn't take mappings in the same way (see syntax). The existing implementation supports some options (numDimensions, similarity) but not others (quantization, hnswOptions). If important to add, let's create a separate issue.

@Jibola
Contributor Author

Jibola commented Sep 24, 2025

> I think this is functionally complete for field_mappings, but that still doesn't support top-level definition options like "analyzer" and "searchAnalyzer" (see example) [not sure if important].

This is a fairly straightforward addition. lucene.standard is used by default if no analyzer is specified. I can add two new arguments, analyzer and searchAnalyzer, and only attach them to the definition when they are not None.
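
Roughly, the conditional attachment could look like the sketch below. `build_definition` and its `search_analyzer` parameter name are hypothetical, and `"dynamic": False` is an assumed setting; only the "analyzer" and "searchAnalyzer" keys come from the Atlas Search definition syntax.

```python
def build_definition(fields, analyzer=None, search_analyzer=None):
    """Build an Atlas Search index definition dict (illustrative helper)."""
    definition = {"mappings": {"dynamic": False, "fields": dict(fields)}}
    # Attach the top-level options only when given, so the default
    # analyzer applies whenever they are omitted.
    if analyzer is not None:
        definition["analyzer"] = analyzer
    if search_analyzer is not None:
        definition["searchAnalyzer"] = search_analyzer
    return definition


plain = build_definition({"title": {"type": "string"}})
custom = build_definition(
    {"title": {"type": "string"}},
    analyzer="lucene.english",
    search_analyzer="lucene.standard",
)
```

Omitting the keys instead of sending None keeps existing index definitions unchanged for anyone who never sets an analyzer.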

> VectorSearchIndex doesn't take mappings in the same way (see syntax). The existing implementation supports some options (numDimensions, similarity) but not others (quantization, hnswOptions). If important to add, let's create a separate issue.

Yeah, quantization and hnswOptions can definitely be split into a separate ticket.

@Jibola Jibola marked this pull request as ready for review September 25, 2025 17:19
@Jibola Jibola requested review from timgraham and WaVEV September 29, 2025 15:57