-
Notifications
You must be signed in to change notification settings - Fork 70
Implement embedders, vector search and similar documents #638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
wow, I also submitted a PR about #625 yesterday. |
8b4f00e
to
44e0109
Compare
44e0109
to
3be19ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 16 out of 19 changed files in this pull request and generated 2 comments.
Files not reviewed (3)
- .env: Language not supported
- src/Meilisearch/Meilisearch.csproj: Language not supported
- tests/Meilisearch.Tests/Datasets/movies_for_vector.json: Language not supported
3be19ed
to
bac47c4
Compare
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdds AI-powered search support: new embedders settings (models, enum, converter, settings APIs), hybrid/vector search parameters, similar-documents endpoint and models, plus tests and datasets. Project file includes new Index partials. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant App
participant Index
participant HTTP as HTTP Client
participant Server as Meilisearch
rect rgba(200,230,255,0.2)
note over App,Server: Manage Embedders Settings
App->>Index: GetEmbeddersAsync()
Index->>HTTP: GET /indexes/:uid/settings/embedders
HTTP->>Server: Request
Server-->>HTTP: 200 JSON { name: Embedder }
HTTP-->>Index: Deserialize Dictionary<string, Embedder>
Index-->>App: Embedders
App->>Index: UpdateEmbeddersAsync(dict)
Index->>HTTP: PATCH /indexes/:uid/settings/embedders (JSON)
HTTP->>Server: Request
Server-->>HTTP: 202 TaskInfo
HTTP-->>Index: TaskInfo
Index-->>App: TaskInfo
App->>Index: ResetEmbeddersAsync()
Index->>HTTP: DELETE /indexes/:uid/settings/embedders
HTTP->>Server: Request
Server-->>HTTP: 202 TaskInfo
HTTP-->>Index: TaskInfo
Index-->>App: TaskInfo
end
sequenceDiagram
autonumber
participant App
participant Index
participant HTTP as HTTP Client
participant Server as Meilisearch
rect rgba(220,255,220,0.2)
note over App,Server: Similar Documents (POST)
App->>Index: SearchSimilarDocumentsAsync<T>(SimilarDocumentsQuery)
Index->>HTTP: POST /indexes/:uid/similar (JSON body)
HTTP->>Server: Request
Server-->>HTTP: 200 JSON SimilarDocumentsResult
HTTP-->>Index: Deserialize SimilarDocumentsResult<T>
Index-->>App: Result<T>
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Assessment against linked issues
Assessment against linked issues: Out-of-scope changes
Poem
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (9)
src/Meilisearch/Settings.cs (1)
106-111
: Typo in class summary and doc comment
The summary above theSettings
class has "Setttings" (three 't's), and the doc comment forEmbedders
reads "embeddings attribute" instead of "embedders attribute." Correct these for clarity and consistency.Apply this diff:
- /// <summary> - /// Setttings of an index. + /// <summary> + /// Settings of an index. @@ - /// <summary> - /// Gets or sets the embeddings attribute. + /// <summary> + /// Gets or sets the embedders attribute.src/Meilisearch/Meilisearch.csproj (1)
71-91
: Indentation inconsistency in project file entriesThe new entries use spaces for indentation while the existing entries use tabs. This is a minor consistency issue that doesn't affect functionality but might be worth standardizing.
- <Compile Update="Index.Faceting.cs"> - <DependentUpon>Index.cs</DependentUpon> - </Compile> + <Compile Update="Index.Faceting.cs"> + <DependentUpon>Index.cs</DependentUpon> + </Compile>(Apply similar changes to all new entries)
src/Meilisearch/HybridSearch.cs (1)
13-18
: Consider adding validation for SemanticRatioThe
SemanticRatio
property typically represents a blend between vector and keyword search and usually should be a value between 0 and 1. Consider adding validation or documentation to clarify the acceptable range of values./// <summary> -/// Gets or sets the semantic ratio. +/// Gets or sets the semantic ratio between 0 and 1, where 0 is full keyword search and 1 is full vector search. /// </summary> [JsonPropertyName("semanticRatio")] public double SemanticRatio { get; set; }src/Meilisearch/Converters/EmbedderSourceConverter.cs (1)
23-49
: Improve error message in ArgumentOutOfRangeExceptionThe exception thrown for unexpected enum values doesn't include a meaningful error message, which will make debugging harder.
case EmbedderSource.Unknown: default: - throw new ArgumentOutOfRangeException(nameof(value), value, null); + throw new ArgumentOutOfRangeException(nameof(value), value, + $"Cannot serialize unsupported EmbedderSource value: {value}");src/Meilisearch/SimilarDocumentsResult.cs (1)
1-72
: Well-designed result class with minor documentation inconsistencyThe
SimilarDocumentsResult<T>
class is well-structured with immutable properties and proper JSON annotations. The class follows good design practices for a data transfer object.The XML documentation for some properties uses "Gets or sets" when they should only use "Gets" since the properties are read-only:
-/// <summary> -/// Gets or sets the id. -/// </summary> +/// <summary> +/// Gets the id. +/// </summary>Similar changes should be applied to other properties that only have getters (lines 43-44, 49-50, 55-56, 61-62, 67-68).
src/Meilisearch/SimilarDocumentsQuery.cs (1)
31-83
: Minor documentation inconsistency in property commentsThe XML documentation for properties uses "Gets or sets" for all properties, including the read-only
Id
property.For the
Id
property, update the documentation to:-/// <summary> -/// Gets or sets the document id. -/// </summary> +/// <summary> +/// Gets the document id. +/// </summary>tests/Meilisearch.Tests/SearchTests.cs (1)
581-597
: Enhanced similar documents test assertionsThe test effectively validates the similar documents functionality by checking that results contain the expected movie titles in the correct order.
Consider adding assertions to verify other aspects of the search result:
var movies = await _indexForVectorSearch.SearchSimilarDocumentsAsync<VectorMovie>(query); +// Verify the result contains the expected number of hits +Assert.Equal(4, movies.Hits.Count); + +// Verify metadata properties +Assert.Equal("143", movies.Id); +Assert.True(movies.ProcessingTimeMs >= 0); +Assert.Equal(0, movies.Offset); +Assert.Equal(20, movies.Limit); + Assert.Collection(movies.Hits, m => Assert.Equal("Escape Room", m.Title), m => Assert.Equal("Captain Marvel", m.Title), m => Assert.Equal("How to Train Your Dragon: The Hidden World", m.Title), m => Assert.Equal("Shazam!", m.Title) );src/Meilisearch/Index.Embedders.cs (1)
32-45
: Consider adding input validation for UpdateEmbeddersAsyncWhile the implementation is solid, adding validation for the
embedders
parameter would improve robustness. For example, checking if the dictionary is null or empty before sending the request.public async Task<TaskInfo> UpdateEmbeddersAsync(Dictionary<string, Embedder> embedders, CancellationToken cancellationToken = default) { + if (embedders == null) + { + throw new ArgumentNullException(nameof(embedders)); + } + var responseMessage = await _http.PatchAsJsonAsync( $"indexes/{Uid}/settings/embedders", embedders, Constants.JsonSerializerOptionsRemoveNulls, cancellationToken: cancellationToken) .ConfigureAwait(false); return await responseMessage.Content .ReadFromJsonAsync<TaskInfo>(cancellationToken: cancellationToken) .ConfigureAwait(false); }src/Meilisearch/Embedder.cs (1)
10-83
: Consider adding convenience methods for creating common embedder configurationsThe Embedder class would benefit from factory methods or a fluent builder pattern to easily create common configurations. This would improve usability and ensure proper initialization.
For example:
/// <summary> /// Creates a new OpenAI embedder configuration. /// </summary> /// <param name="apiKey">OpenAI API key.</param> /// <param name="model">Model name, e.g., "text-embedding-ada-002".</param> /// <returns>A configured Embedder instance.</returns> public static Embedder CreateOpenAIEmbedder(string apiKey, string model = "text-embedding-ada-002") { return new Embedder { Source = EmbedderSource.OpenAI, ApiKey = apiKey, Model = model, // Set other default properties as needed }; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
.env
(1 hunks)src/Meilisearch/Converters/EmbedderSourceConverter.cs
(1 hunks)src/Meilisearch/Embedder.cs
(1 hunks)src/Meilisearch/EmbedderDistribution.cs
(1 hunks)src/Meilisearch/EmbedderSource.cs
(1 hunks)src/Meilisearch/HybridSearch.cs
(1 hunks)src/Meilisearch/Index.Embedders.cs
(1 hunks)src/Meilisearch/Index.SimilarDocuments.cs
(1 hunks)src/Meilisearch/Meilisearch.csproj
(1 hunks)src/Meilisearch/SearchQuery.cs
(1 hunks)src/Meilisearch/Settings.cs
(1 hunks)src/Meilisearch/SimilarDocumentsQuery.cs
(1 hunks)src/Meilisearch/SimilarDocumentsResult.cs
(1 hunks)tests/Meilisearch.Tests/Datasets.cs
(1 hunks)tests/Meilisearch.Tests/Datasets/movies_for_vector.json
(1 hunks)tests/Meilisearch.Tests/IndexFixture.cs
(2 hunks)tests/Meilisearch.Tests/Models/VectorMovie.cs
(1 hunks)tests/Meilisearch.Tests/SearchTests.cs
(4 hunks)tests/Meilisearch.Tests/SettingsTests.cs
(3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (4)
src/Meilisearch/Settings.cs (1)
src/Meilisearch/Embedder.cs (1)
Embedder
(10-83)
src/Meilisearch/SearchQuery.cs (1)
src/Meilisearch/HybridSearch.cs (1)
HybridSearch
(5-18)
src/Meilisearch/HybridSearch.cs (1)
src/Meilisearch/Embedder.cs (1)
Embedder
(10-83)
src/Meilisearch/Embedder.cs (2)
src/Meilisearch/Converters/EmbedderSourceConverter.cs (1)
EmbedderSource
(9-21)src/Meilisearch/EmbedderDistribution.cs (3)
EmbedderDistribution
(9-47)EmbedderDistribution
(17-28)EmbedderDistribution
(46-46)
🔇 Additional comments (15)
.env (1)
1-1
:❓ Verification inconclusive
Version Bump Confirmed
The Meilisearch server version has been updated from v1.9.0 to v1.13 to enable support for embedders, hybrid search, vector search, and similar documents features.Please verify that the
v1.13
server image is available and compatible by running:
🏁 Script executed:
# Check if Meilisearch v1.13 health endpoint responds correctly curl -s http://meilisearch:7700/health
Length of output: 40
Meilisearch v1.13 Health Check Unresponsive
Thecurl -s http://meilisearch:7700/health
command returned no output, indicating the service may not be running or the endpoint is unreachable. Please ensure thegetmeili/meilisearch:v1.13
image is pulled and the container is up and healthy, then retry:# Pull and start the Meilisearch v1.13 container docker-compose pull meilisearch docker-compose up -d meilisearch # Verify the health endpoint responds curl -s http://meilisearch:7700/healthtests/Meilisearch.Tests/Datasets/movies_for_vector.json (1)
1-32
:✅ Verification successful
Validate JSON schema
The newmovies_for_vector.json
contains a_vectors
object with a"manual"
key. Ensure this structure matches whatIndexFixture.SetUpIndexForVectorSearch
and your search/query classes expect for vector ingestion.Quick schema check:
🏁 Script executed:
python - << 'EOF' import json, sys data=json.load(open('tests/Meilisearch.Tests/Datasets/movies_for_vector.json')) for doc in data: if '_vectors' not in doc or 'manual' not in doc['_vectors']: print("Schema mismatch in:", doc) sys.exit(1) print("Schema valid") EOFLength of output: 23
🏁 Script executed:
#!/bin/bash # Locate the definition of SetUpIndexForVectorSearch in C# tests rg -n "SetUpIndexForVectorSearch" -C 5 --glob "*.cs"Length of output: 4285
🏁 Script executed:
#!/bin/bash # Locate the VectorMovie class to verify JSON property mappings rg -n "class VectorMovie" -C 20 --glob "*.cs"Length of output: 1517
🏁 Script executed:
#!/bin/bash # Show the embedder configuration in SetUpIndexForVectorSearch sed -n '114,124p' tests/Meilisearch.Tests/IndexFixture.csLength of output: 559
JSON schema & embedder configuration validated
I’ve verified that every document inmovies_for_vector.json
includes a_vectors.manual
array, thatVectorMovie.Vectors
is correctly mapped to_vectors
, and thatSetUpIndexForVectorSearch
registers an embedder named"manual"
with 3 dimensions. Everything aligns—no changes needed.src/Meilisearch/EmbedderSource.cs (1)
1-42
: Enum Definition Looks Good
TheEmbedderSource
enum and its[JsonConverter]
attribute are correctly defined and documented. The members follow PascalCase conventions and align with your serialization logic.tests/Meilisearch.Tests/Models/VectorMovie.cs (1)
1-20
: Model design looks good!This model class effectively captures the structure needed for vector-based search testing. The
Vectors
property using aDictionary<string, double[]>
allows for multiple named vector embeddings per document, which aligns with Meilisearch's vector search capabilities.src/Meilisearch/Index.SimilarDocuments.cs (1)
9-31
: Implementation looks good!The method properly implements the similar documents search functionality, following the library's patterns for asynchronous API calls. Good use of generics, cancellation token support, and proper
ConfigureAwait(false)
usage.src/Meilisearch/SearchQuery.cs (1)
151-167
: LGTM: Implementation of vector search properties is well-structuredThe additions of
Hybrid
,Vector
andRetrieveVectors
properties are well-documented and correctly annotated with JSON property names, enabling the search functionality for vector and hybrid searches. The property types match what would be expected for these features.tests/Meilisearch.Tests/IndexFixture.cs (1)
114-143
: LGTM: Good implementation of vector search test setupThe
SetUpIndexForVectorSearch
method follows the same pattern as other setup methods in the class, with proper error handling and task waiting. The error messages are accurately referencing the current method context.tests/Meilisearch.Tests/SettingsTests.cs (3)
67-67
: LGTM: Adding Embedders to default settingsThe addition of the Embedders property to the default settings follows the same pattern as other settings properties, initializing it as an empty dictionary.
667-716
: LGTM: Comprehensive test coverage for Embedders settingsThe three new test methods follow the established pattern for settings tests:
GetEmbeddersAsync
verifies retrieval matches default settingsUpdateEmbeddersAsync
tests updating embedders configurationResetEmbeddersAsync
ensures settings can be properly resetThe test demonstrates correct usage of the Embedder model with properties for Source, Model, DocumentTemplate, and DocumentTemplateMaxBytes.
737-738
: LGTM: Updated helper method to include EmbeddersThe
SettingsWithDefaultedNullFields
method correctly includes the Embedders property, ensuring it's handled consistently with other settings properties when merging default and custom settings.src/Meilisearch/SimilarDocumentsQuery.cs (1)
11-23
: LGTM: Constructor with proper validationThe constructor correctly validates that the document ID is not null or empty, throwing an appropriate exception with a clear message.
tests/Meilisearch.Tests/SearchTests.cs (2)
17-17
: LGTM: Added vector search index setupProperly added a field for the vector search index and initialized it in the test fixture setup.
Also applies to: 34-34
562-579
: LGTM: Comprehensive vector search testThe test correctly demonstrates using the hybrid search capability with a vector and semantic ratio. The assertions verify the expected results are returned.
src/Meilisearch/Index.Embedders.cs (2)
17-24
: Good implementation of GetEmbeddersAsync methodThe method follows best practices with proper async/await usage, ConfigureAwait(false), cancellation token support, and clear documentation. The return type Dictionary<string, Embedder> correctly matches the expected response format from the Meilisearch API.
52-61
: ResetEmbeddersAsync method looks goodThe implementation follows the same pattern as other reset methods in the library with proper async/await usage, cancellation token support, and clear documentation.
[JsonPropertyName("apiKey")] | ||
public string ApiKey { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Consider security implications for ApiKey property
The ApiKey property likely contains sensitive information. Consider:
- Adding attributes or methods to prevent accidental logging or serialization in contexts where it might be exposed
- Implementing secure storage mechanisms or providing guidance on how to handle this sensitive data
Also, consider adding documentation to warn users about the sensitivity of this property.
🏁 Script executed:
#!/bin/bash
# Check for any existing security measures for API keys in the codebase
grep -r --include="*.cs" "ApiKey" --include="*.cs" "api key" src/
Length of output: 1890
Secure handling of Embedder.ApiKey
Ensure the ApiKey
in src/Meilisearch/Embedder.cs
cannot be accidentally serialized or logged:
- File:
src/Meilisearch/Embedder.cs
(around lines 27–28)
• Remove the[JsonPropertyName("apiKey")]
attribute.
• Add[JsonIgnore]
to exclude it from JSON serialization and logging.
• Make the setterinternal
orprivate
if external mutation isn’t required. - Add XML documentation on the
ApiKey
property to warn that it holds sensitive credentials and should only be supplied via secure configuration (e.g., environment variables).
Example diff:
--- a/src/Meilisearch/Embedder.cs
+++ b/src/Meilisearch/Embedder.cs
@@ -25,8 +25,11 @@ namespace Meilisearch
- [JsonPropertyName("apiKey")]
- public string ApiKey { get; set; }
+ /// <summary>
+ /// API key for authentication (Sensitive: excluded from JSON; only sent via HTTP header).
+ /// </summary>
+ [JsonIgnore]
+ public string ApiKey { get; internal set; }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
[JsonPropertyName("apiKey")] | |
public string ApiKey { get; set; } | |
/// <summary> | |
/// API key for authentication (Sensitive: excluded from JSON; only sent via HTTP header). | |
/// </summary> | |
[JsonIgnore] | |
public string ApiKey { get; internal set; } |
Hello @philipproplesch and thanks for your PR 🙌 Can you please look into the conflicts and address the feedback from CodeRabbit? Thanks a lot! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (3)
tests/Meilisearch.Tests/Meilisearch.Tests.csproj (1)
28-38
: Datasets are correctly copied to output (including movies_for_vector.json)Good addition. This resolves the earlier missing-dataset issue for vector tests and keeps behavior consistent across all datasets via PreserveNewest.
src/Meilisearch/SimilarDocumentsQuery.cs (1)
76-77
: Resolved: RankingScoreThreshold uses decimal? consistently with SearchQueryThis addresses the earlier inconsistency noted in prior reviews (int? vs decimal?). Looks good.
src/Meilisearch/Converters/EmbedderSourceConverter.cs (1)
9-23
: Forward-compatibility: add sentinel EmbedderSource.Unknown and return it for unrecognized valuesI checked src/Meilisearch/EmbedderSource.cs — the enum contains OpenAi, HuggingFace, Ollama, Rest, UserProvided and does NOT have an Unknown member. Update both the enum and the converter so unknown server values map to a sentinel instead of throwing.
Files to change
- src/Meilisearch/EmbedderSource.cs — add Unknown enum member.
- src/Meilisearch/Converters/EmbedderSourceConverter.cs — return the sentinel on parse failure.
Suggested diffs
EmbedderSourceConverter.cs:
if (Enum.TryParse<EmbedderSource>(enumValue, true, out var embedderSource)) { return embedderSource; } - - throw new JsonException($"Invalid EmbedderSource value: '{enumValue}'."); + return EmbedderSource.Unknown;EmbedderSource.cs:
public enum EmbedderSource { + /// <summary> + /// Unknown / sentinel for unrecognized values from the server. + /// </summary> + Unknown, /// <summary> /// OpenAI /// </summary> OpenAi, @@ /// <summary> /// User-provided /// </summary> UserProvided, }Note: after adding Unknown, review any callers or tests that relied on the previous exception behavior and update documentation to mention the new fallback behavior.
🧹 Nitpick comments (9)
tests/Meilisearch.Tests/Datasets.cs (1)
10-10
: Prefer AppContext.BaseDirectory over Directory.GetCurrentDirectory() for test asset pathsUsing the current directory can vary across runners/IDEs. AppContext.BaseDirectory is more robust for locating content copied to the output folder.
Example:
private static readonly string BasePath = Path.Combine(AppContext.BaseDirectory, "Datasets");tests/Meilisearch.Tests/Meilisearch.Tests.csproj (1)
9-14
: Scope or document project-wide NoWarn for CA1861If only a few tests trigger CA1861, consider suppressing at the file/member level with pragmas or [SuppressMessage] instead of globally, or add a brief comment explaining why this is disabled project-wide to aid future maintainers.
src/Meilisearch/SimilarDocumentsQuery.cs (2)
17-20
: Guard against whitespace-only IDs and throw precise exceptionsUse IsNullOrWhiteSpace, and distinguish null (ArgumentNullException) from empty/whitespace (ArgumentException) for clearer diagnostics.
Apply this diff:
- if (string.IsNullOrEmpty(id)) - { - throw new ArgumentNullException(nameof(id)); - } + if (id is null) + { + throw new ArgumentNullException(nameof(id)); + } + if (string.IsNullOrWhiteSpace(id)) + { + throw new ArgumentException("Value cannot be empty or whitespace.", nameof(id)); + }
14-15
: Document the purpose of the id parameterThe XML doc for the parameter is empty. Briefly clarify what the id represents.
Apply this diff:
- /// <param name="id"></param> + /// <param name="id">Document identifier to find similar documents for. Required.</param>src/Meilisearch/Converters/EmbedderSourceConverter.cs (1)
25-50
: Serialize unknown/new enum values gracefully to avoid runtime exceptionsWrite() throws ArgumentOutOfRangeException for any unhandled value. If you add a non-throwing read fallback, also serialize Unknown to "unknown" to preserve symmetry and prevent crashes if an unknown value flows through.
Proposed change:
- public override void Write(Utf8JsonWriter writer, EmbedderSource value, JsonSerializerOptions options) - { - string source; - switch (value) - { - case EmbedderSource.OpenAi: - source = "openAi"; - break; - case EmbedderSource.HuggingFace: - source = "huggingFace"; - break; - case EmbedderSource.Ollama: - source = "ollama"; - break; - case EmbedderSource.Rest: - source = "rest"; - break; - case EmbedderSource.UserProvided: - source = "userProvided"; - break; - default: - throw new ArgumentOutOfRangeException(nameof(value), value, null); - } - - writer.WriteStringValue(source); - } + public override void Write(Utf8JsonWriter writer, EmbedderSource value, JsonSerializerOptions options) + { + var source = value switch + { + EmbedderSource.OpenAi => "openAi", + EmbedderSource.HuggingFace => "huggingFace", + EmbedderSource.Ollama => "ollama", + EmbedderSource.Rest => "rest", + EmbedderSource.UserProvided => "userProvided", +#if SUPPORTS_UNKNOWN + EmbedderSource.Unknown => "unknown", +#endif + _ => throw new ArgumentOutOfRangeException(nameof(value), value, null) + }; + + writer.WriteStringValue(source); + }src/Meilisearch/Embedder.cs (4)
23-28
: ApiKey: document sensitivity and avoid serializing nullsThis property holds credentials. At minimum, avoid emitting it when null, and document that it’s sensitive to reduce accidental logging.
Apply this diff:
- /// <summary> - /// Gets or sets the API key. - /// </summary> - [JsonPropertyName("apiKey")] - public string ApiKey { get; set; } + /// <summary> + /// Gets or sets the API key. Sensitive: avoid logging or exposing this value. + /// When null, it will not be serialized. + /// </summary> + [JsonPropertyName("apiKey")] + [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] + public string ApiKey { get; set; }
68-76
: Prefer Dictionary<string, JsonElement> to preserve JSON fidelity and avoid boxingUsing object will deserialize values as boxed JsonElement and can lead to ambiguity. Using JsonElement explicitly avoids boxing and clarifies intent.
Apply these diffs:
- [JsonPropertyName("request")] - public Dictionary<string, object> Request { get; set; } + [JsonPropertyName("request")] + public Dictionary<string, System.Text.Json.JsonElement> Request { get; set; }- [JsonPropertyName("response")] - public Dictionary<string, object> Response { get; set; } + [JsonPropertyName("response")] + public Dictionary<string, System.Text.Json.JsonElement> Response { get; set; }Optionally add a using to shorten type names:
-using System.Collections.Generic; -using System.Text.Json.Serialization; +using System.Collections.Generic; +using System.Text.Json; +using System.Text.Json.Serialization;
47-52
: Add basic range validation for Dimensions and DocumentTemplateMaxBytesTo fail fast on clearly invalid configurations (before hitting the server), enforce positive values. This mirrors the validation approach in EmbedderDistribution.
For example:
[JsonPropertyName("documentTemplateMaxBytes")] -public int? DocumentTemplateMaxBytes { get; set; } +public int? DocumentTemplateMaxBytes +{ + get => _documentTemplateMaxBytes; + set + { + if (value is { } v && v <= 0) + throw new ArgumentOutOfRangeException(nameof(DocumentTemplateMaxBytes), "Must be > 0."); + _documentTemplateMaxBytes = value; + } +} +private int? _documentTemplateMaxBytes;[JsonPropertyName("dimensions")] -public int? Dimensions { get; set; } +public int? Dimensions +{ + get => _dimensions; + set + { + if (value is { } v && v <= 0) + throw new ArgumentOutOfRangeException(nameof(Dimensions), "Must be > 0."); + _dimensions = value; + } +} +private int? _dimensions;Also applies to: 44-46
20-22
: Optional: model URL as Uri for early validationUsing Uri instead of string provides built-in validation and clearer intent. This is optional and may be deferred to avoid a breaking change.
Possible change:
- public string Url { get; set; } + public Uri Url { get; set; }If you adopt this, ensure System.Text.Json can handle Uri (it does by default) and update any callers accordingly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
src/Meilisearch/Converters/EmbedderSourceConverter.cs
(1 hunks)src/Meilisearch/Embedder.cs
(1 hunks)src/Meilisearch/EmbedderDistribution.cs
(1 hunks)src/Meilisearch/EmbedderSource.cs
(1 hunks)src/Meilisearch/Meilisearch.csproj
(1 hunks)src/Meilisearch/SearchQuery.cs
(1 hunks)src/Meilisearch/SimilarDocumentsQuery.cs
(1 hunks)tests/Meilisearch.Tests/Datasets.cs
(1 hunks)tests/Meilisearch.Tests/Meilisearch.Tests.csproj
(1 hunks)tests/Meilisearch.Tests/SearchTests.cs
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- src/Meilisearch/EmbedderSource.cs
- src/Meilisearch/Meilisearch.csproj
- tests/Meilisearch.Tests/SearchTests.cs
- src/Meilisearch/SearchQuery.cs
- src/Meilisearch/EmbedderDistribution.cs
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/Meilisearch/SimilarDocumentsQuery.cs (1)
src/Meilisearch/Embedder.cs (1)
Embedder
(9-82)
src/Meilisearch/Embedder.cs (2)
src/Meilisearch/Converters/EmbedderSourceConverter.cs (1)
EmbedderSource
(9-23)src/Meilisearch/EmbedderDistribution.cs (3)
EmbedderDistribution
(9-66)EmbedderDistribution
(19-23)EmbedderDistribution
(65-65)
🔇 Additional comments (3)
tests/Meilisearch.Tests/Datasets.cs (1)
18-18
: LGTM: dataset path added for vector tests
MoviesForVectorJsonPath
aligns with the new dataset and is consistent with the existing pattern. With the csproj now copying the file to the output, this should work at runtime.tests/Meilisearch.Tests/Meilisearch.Tests.csproj (1)
17-21
: Test package references look finePinned versions for FluentAssertions, xUnit, runner, and Test SDK are reasonable and commonly compatible with net8.0 test projects.
src/Meilisearch/Converters/EmbedderSourceConverter.cs (1)
30-44
: Verified — keep the existing casing ("openAi", "huggingFace")Meilisearch v1.13 requires case-sensitive source strings; the docs show "openAi" (not "openAI") and "huggingFace" (not "huggingface").
- File: src/Meilisearch/Converters/EmbedderSourceConverter.cs (lines ~30–44) — current mapping emits "openAi", "huggingFace", "ollama", "rest", "userProvided" — no change required.
a6e1bae
to
70dd740
Compare
@Strift I merged with |
70dd740
to
e1ba052
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (2)
src/Meilisearch/SimilarDocumentsQuery.cs (2)
40-41
: AttributesToRetrieve initializer fix looks good.Thanks for switching to an explicit array initializer; this compiles correctly for auto-properties.
76-77
: RankingScoreThreshold type alignment is correct.Using
decimal?
matchesSearchQuery
and avoids precision issues.
🧹 Nitpick comments (3)
tests/Meilisearch.Tests/Meilisearch.Tests.csproj (1)
9-14
: Deduplicate NoWarn across configurations (keep CA1861 suppression intentional).You’re duplicating the same NoWarn list in both Debug and Release. Prefer a single unconditional PropertyGroup unless you plan to diverge per config. Also confirm CA1861 suppression is intentionally desired for tests.
Apply:
- <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|AnyCPU'"> - <NoWarn>1701;1702;CA1861</NoWarn> - </PropertyGroup> - - <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|AnyCPU'"> - <NoWarn>1701;1702;CA1861</NoWarn> - </PropertyGroup> + <PropertyGroup> + <NoWarn>1701;1702;CA1861</NoWarn> + </PropertyGroup>src/Meilisearch/SimilarDocumentsQuery.cs (2)
14-15
: Complete the constructor XML doc for id.The
<param name="id">
doc is empty. Add a clear description for better IntelliSense.Apply this diff:
- /// <param name="id"></param> + /// <param name="id">The primary key value of the document to anchor similarity (string or number).</param>
64-71
: Document the dependency between score details and score display.Minor doc improvement: clarify that
ShowRankingScoreDetails
is only meaningful whenShowRankingScore
is true.Apply this diff:
/// <summary> /// Gets or sets whether to show the ranking score details. /// </summary> [JsonPropertyName("showRankingScoreDetails")] - public bool ShowRankingScoreDetails { get; set; } + public bool ShowRankingScoreDetails { get; set; } + // Note: Only meaningful when ShowRankingScore is true.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
src/Meilisearch/Converters/EmbedderSourceConverter.cs
(1 hunks)src/Meilisearch/Embedder.cs
(1 hunks)src/Meilisearch/EmbedderDistribution.cs
(1 hunks)src/Meilisearch/EmbedderSource.cs
(1 hunks)src/Meilisearch/SimilarDocumentsQuery.cs
(1 hunks)src/Meilisearch/SimilarDocumentsResult.cs
(1 hunks)tests/Meilisearch.Tests/Meilisearch.Tests.csproj
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- src/Meilisearch/EmbedderSource.cs
- src/Meilisearch/EmbedderDistribution.cs
- src/Meilisearch/Converters/EmbedderSourceConverter.cs
- src/Meilisearch/SimilarDocumentsResult.cs
- src/Meilisearch/Embedder.cs
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/Meilisearch/SimilarDocumentsQuery.cs (1)
src/Meilisearch/Embedder.cs (1)
Embedder
(9-82)
🔇 Additional comments (3)
tests/Meilisearch.Tests/Meilisearch.Tests.csproj (2)
24-24
: LGTM: Project reference change is formatting-only.No functional impact; consistent with self-closing style used elsewhere.
28-37
: Approved — datasets added and verifiedBoth new dataset files were found in the repository; the CopyToOutputDirectory entries are valid.
Files checked:
- tests/Meilisearch.Tests/Datasets/movies_for_vector.json — ✓ exists
- tests/Meilisearch.Tests/Datasets/products_for_distinct_search.json — ✓ exists
src/Meilisearch/SimilarDocumentsQuery.cs (1)
82-83
: RetrieveVectors field matches the API shape.Property name and usage are consistent with vector-enabled endpoints.
/// <summary> | ||
/// Gets or sets the filter. | ||
/// </summary> | ||
[JsonPropertyName("filter")] | ||
public string Filter { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Align Filter type with SearchQuery (support arrays and complex filters).
Filter
in search requests typically accepts string, arrays, and nested structures; using string
here is needlessly restrictive and inconsistent with SearchQuery
.
Apply this diff:
- /// <summary>
- /// Gets or sets the filter.
- /// </summary>
- [JsonPropertyName("filter")]
- public string Filter { get; set; }
+ /// <summary>
+ /// Gets or sets the filter. Accepts the same forms as <c>SearchQuery.Filter</c>
+ /// (e.g., string or array).
+ /// </summary>
+ [JsonPropertyName("filter")]
+ public object Filter { get; set; }
Run this script to verify consistency with SearchQuery
and detect usages expecting non-string filters or numeric IDs:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Verify SearchQuery.Filter type =="
rg -n "class\s+SearchQuery" -A 200 | rg -n -A 2 -B 2 '\[JsonPropertyName\("filter"\)\]'
echo
echo "== Find SimilarDocumentsQuery usages (check if numeric IDs are used) =="
rg -n "new\s+SimilarDocumentsQuery\(" -A 2 -B 1
Length of output: 1061
Align SimilarDocumentsQuery.Filter with SearchQueryBase.Filter (use dynamic)
SimilarDocumentsQuery currently types Filter as string but SearchQueryBase defines Filter as dynamic — change SimilarDocumentsQuery to dynamic so it accepts strings, arrays, and complex filters.
- Files to update:
- src/Meilisearch/SimilarDocumentsQuery.cs (property Filter)
- Reference: src/Meilisearch/SearchQueryBase.cs shows
public dynamic Filter { get; set; }
- Note: tests/Meilisearch.Tests/SearchTests.cs uses
new SimilarDocumentsQuery("143")
(string), which remains valid.
Apply this diff:
- /// <summary>
- /// Gets or sets the filter.
- /// </summary>
- [JsonPropertyName("filter")]
- public string Filter { get; set; }
+ /// <summary>
+ /// Gets or sets the filter. Accepts the same forms as <c>SearchQueryBase.Filter</c>
+ /// (e.g., string, arrays, or complex filter structures).
+ /// </summary>
+ [JsonPropertyName("filter")]
+ public dynamic Filter { get; set; }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
/// <summary> | |
/// Gets or sets the filter. | |
/// </summary> | |
[JsonPropertyName("filter")] | |
public string Filter { get; set; } | |
/// <summary> | |
/// Gets or sets the filter. Accepts the same forms as <c>SearchQueryBase.Filter</c> | |
/// (e.g., string, arrays, or complex filter structures). | |
/// </summary> | |
[JsonPropertyName("filter")] | |
public dynamic Filter { get; set; } |
🤖 Prompt for AI Agents
In src/Meilisearch/SimilarDocumentsQuery.cs around lines 55 to 59, the Filter
property is currently typed as string but must match SearchQueryBase which uses
dynamic; change the Filter property type from string to dynamic so it accepts
strings, arrays, and complex filters, keeping the JsonPropertyName("filter")
attribute and getter/setter unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@philipproplesch this seems relevant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (3)
src/Meilisearch/SimilarDocumentsQuery.cs (3)
37-42
: Good fix: valid array initializer for AttributesToRetrieveUsing new[] { "*" } compiles correctly and matches SDK patterns.
55-60
: Filter typed as dynamic aligns with SearchQueryBaseThis restores consistency and supports complex filter shapes.
73-77
: RankingScoreThreshold as decimal? — consistent with SearchQueryMatches prior guidance and avoids precision issues.
🧹 Nitpick comments (2)
src/Meilisearch/SimilarDocumentsQuery.cs (2)
14-23
: Correct exception semantics for empty/whitespace IDs; add int overload for consistency
- Throw ArgumentNullException only for null; use ArgumentException for empty/whitespace.
- Fill the XML doc for id.
- Per team convention (retrieved learning), consider an int convenience overload that forwards to the string ctor.
Apply within this range:
- /// <param name="id"></param> + /// <param name="id">Primary key of the anchor document.</param> public SimilarDocumentsQuery(string id) { - if (string.IsNullOrEmpty(id)) - { - throw new ArgumentNullException(nameof(id)); - } + if (id is null) + { + throw new ArgumentNullException(nameof(id)); + } + if (string.IsNullOrWhiteSpace(id)) + { + throw new ArgumentException("Id cannot be empty or whitespace.", nameof(id)); + } Id = id; }Add this overload right after the string constructor:
public SimilarDocumentsQuery(int id) : this(id.ToString()) { }
46-53
: Optional: clarify valid ranges for Offset/LimitConsider documenting that Offset must be non-negative and that Limit should be positive and subject to server-side bounds.
- /// <summary> - /// Gets or sets the offset. - /// </summary> + /// <summary> + /// Gets or sets the offset (non-negative). + /// </summary> @@ - /// <summary> - /// Gets or sets the limit. - /// </summary> + /// <summary> + /// Gets or sets the limit (positive; server may enforce maximums). + /// </summary>
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
src/Meilisearch/SimilarDocumentsQuery.cs
(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-15T07:43:05.991Z
Learnt from: Strift
PR: meilisearch/meilisearch-dotnet#638
File: src/Meilisearch/SimilarDocumentsQuery.cs:15-23
Timestamp: 2025-08-15T07:43:05.991Z
Learning: In the Meilisearch .NET SDK, the primary document ID type is `string`. Methods that accept document IDs follow a consistent pattern: the main method uses `string`, with optional convenience overloads for `int` that internally convert to string using `ToString()`. This pattern is used in GetDocumentAsync, DeleteOneDocumentAsync, DeleteDocumentsAsync, and should be followed for consistency in new methods like SimilarDocumentsQuery.
Applied to files:
src/Meilisearch/SimilarDocumentsQuery.cs
🧬 Code graph analysis (1)
src/Meilisearch/SimilarDocumentsQuery.cs (1)
src/Meilisearch/Embedder.cs (1)
Embedder
(9-82)
Pull Request
Related issue
Fixes #625
What does this PR do?
PR checklist
Please check if your PR fulfills the following requirements:
Summary by CodeRabbit
New Features
Tests