-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(mango
): strict index selection
#4710
Conversation
9c77301
to
0dfe891
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better (RESTful, at least as much as anything else in CouchDB is) to enforce this through the url hierarchy, e.g,
/$dbname/_design/$ddoc/_find/$indexname
i.e, the same as search where a specific index must be specified.
we already have I have qualms with adding the URL-way on top of this and later possibly deprecating |
I agree with @janl. I would be happy to work on making the Mango interface to be consistent with that of Search (I like the idea), but not in the scope of this PR. |
I disagree. The mistake when use_index was added, where it was allowed to not use the index requested, is embedded now. Having a "I really mean it" option, that is optional, and off by default, makes this interface worse, not better. We should either fix the associated issue (prevent use_index from not using any index other than the one requested) or not. An optional fudge is worse. |
It is not always beneficial for the performance if the Mango query planner tries to assign an index to the selector. User-specified indexes may save the day, but since they are only hints for the planner, automated overrides can still happen. Introduce the concept of "strict index selection" which lets the user to request the exclusive use of a specific index. When it is not possible, give up on planning and return an HTTP 400 response right away. This way the user has the chance to learn about the missing index, request its creation and try again later. The feature comes with a configuration toggle. By default, the feature is disabled to maintain backward compatibility, but the user may ask for this behavior via the new `use_index_strict` query parameter for a specific query. Note that, similarly, `use_index_strict` could be used to disable strict index selection temporarily when it is enabled on the server. Fixes apache#4511
0dfe891
to
285ad6d
Compare
I see. But we should keep |
Yes.
I wouldn't frame it these strict terms. I see this PR as a step towards making the Mango-query-index-selection API look similar to JS views and search APIs. IF this PR doesn't get us the whole step there, that’s fine by me as long as we arrive there eventually. IF we want to fix this all in one go, that’s fine by me, but for BC reasons we will need a toggle of some sort for a while and time for folks to migrate off of the no longer preferred APIs. |
the behaviour of If there was a roadmap here for couchdb 4.0 (handwaving, but imagine that use_index as an option goes away entirely. you either use Without that, it's hard to decide whether adding more options is taking us somewhere better or just somewhere more confusing. As a user I would be legitimately baffled to be told I should use I'm not sure if I'm -0 or -1 on this, I'm veering to -0 though. I don't like leaving Hence I suggested having That idea also gives users a migration path away from the flawed choosing algorithm. A future (major) release of CouchDB might then drop |
In general, for API migrations we loosely agreed on: introduce a better API, eventually retire the old one, maybe. In that light, adding the REST-URL-selects-index route is adding the better API and we can later remove raw That is indeed cleaner than adding another toggle to change the behaviour of I have no problems with that approach. |
@janl I don't see how this is a step towards "making the Mango-query-index-selection API look similar to JS views and search APIs". my suggestion of adding |
@janl we commented concurrently. I agree with your last. :) |
noting |
Why to drop |
it was a mistake to allow unindexed queries. We are trying to reverse that mistake now :) |
A revisited version of |
sure, but why leave it if we have a better API in place? |
I am not sure I got you. In my understanding, |
removing |
More advanced query engines allow selecting multiple indices to answer a query. It's not impossible, for Mango to be improved to do it as well. For instance, one pain point of Mango is it can't use indices for general disjunctions today; but it might be possible to do that by doing a merge on two indices. If the new Mango query API is to specify only one index, that API choice will work against that improvement. Another, more pessimistic view, I guess is to say Mango is never going to be anything but a layer on top of an mrview index and we shouldn't pretend otherwise. Then selecting an index and forcing that there be an index defined is probably reasonable. |
I agree with Nick that we would not want to put ourselves in the corner by embedding assumption about single index into endpoint name. I also agree with Bob and Jan that adding more option to existing endpoint is not the best choice in this particular case. Specifically because the option would be off by default for compatibility reasons. Also adding new options to existing API endpoints complicates maintaining of client libraries. I think we should come up with a new endpoint name but shouldn't embed index name into it. I am bad at naming and naming is hard.
Let's ask help from @ricellis who has an extensive experience in designing APIs. The decision whether to keep |
The stated purpose of this PR is to allow a user to run a selector against a specific individual index or receive an error stating that the index is not compatible, rather than fallback to a less efficient index. Let us focus on the best way to achieve that specific goal, and whether to then do so. Speculation about the future of For map/reduce views and search indexes (either dreyfus/clouseau or nouveau) this is done having the path to the index in the url. Inventing a third mechanism (as Ilya proposes) for this is not acceptable to me; our API is already complex enough. If we want a facility to run a selector against a user-selected index, then I only see one way: If we don't want that, then we need do nothing. |
I don't have that strong of opinion. My only strong opinion is to involve Rich and his team in the design. |
Agreed, would be good to hear from Rich et al |
Appreciating the difficulties with augmenting existing endpoints, I think
it's worth considering adding "selector" and "fields" support to the
existing _all_docs, view and search endpoints. This would be consistent
with the _changes and _find endpoints and provides most of the benefits of
a single-index _find query, albeit with a more explicit API.
It would also enable the filtering functionality for existing view/search
indexes whereas Mango makes assumptions about the index key structure that
hand-crafted indexes would need to replicate if _find we're to be supported
for them.
…On Tue, 1 Aug 2023, 23:26 Robert Newson, ***@***.***> wrote:
Agreed, would be good to hear from Rich et al
—
Reply to this email directly, view it on GitHub
<#4710 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAX365O3ZXNF4RRRIJZHXLXTFYBHANCNFSM6AAAAAA25IW4QM>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
oooh I like that. |
one further note here. when, in the future, you pass a selector to |
Okay, thanks for the note @rnewson. |
as for comparisons to calling Querying a view without startkey/endkey can make sense and can be efficient as it's quite common for the view to be a subset of the database (any Querying a search index with You can tell I am tired of dealing with users who have been let down by mango and its false promises. |
Controversial I'm sure, but can we go back to basics here? The documentation says:
That doesn't even seem that ambiguous, given computers follow instructions. |
This is completely understandable and comprehensible and yes, you always answer the same questions and misunderstandings ;-). Maybe this is also one of the reasons that I have not used Mango much. I think we shouldn't mix different things here though, nor personal inclinations / dislikes to various functions. I think we can and should improve the current behavior as best as possible (in the sense of the user and operator) (and best of all, without introducing an additional configuration parameter). If we want to preserve a backward-comatable behavior, can't we explicitly tell the user in the result if an index was used or not (more visible than just in _explain) and without interpreting the "execution statistics"? Now I would also have no problem with marking this as backwards incompatible in the changlog! If we follow the documentation, then we just correct the behavior (which was wrong until now). |
@ricellis I agree that @big-r81 how do you you propose to "explicitly tell the user in the result if an index was used"? We already send a warning in every |
@ricellis separately, what's your opinion on mango considering an index usable for a selector for $regex operator? I consider that a bug also. The index isn't usable for that query, it's no better than |
@rnewson Mhhh, then we should not output a result if a user specifies an index that does not exist or cannot be used. Then we should just issue a warning as a result and specify "Use the correct index or do not use |
Good point, @ricellis. This looks more like a bug fix than adding a new feature. Documentation agrees too. Not sure about |
That problem extends beyond
Does brute-forced in this context mean it uses the keys from the view or does it actually pull the docs? |
Thank you all for the comments, it is much appreciated. How about the following:
|
|
@pgj imo we either make |
Curiously, couchdb/src/mango/src/mango_cursor.erl Lines 475 to 509 in 435db51
Is this a bug in the documentation itself? couchdb/src/docs/src/api/database/find.rst Lines 45 to 47 in 83e39b6
The use of "instruct" there might be a poor choice which causes so much pain and misunderstanding. Personally, I can see the intention behind the wording but I can also understand that it is misleading at the first sight. (Being instructed does not necessarily warrant that one will obey. But I am not either a native speaker or a linguist.) |
Instruct
|
Submitted #4774 to fix the documentation. |
Created #4775 for the problem with the |
another option occured to me. Similar to the initial proposal, we could introduce When
If this exists we can encourage users to always set |
I like this |
Do we really need backwards compatability here? Let me give you a practical example: There is a road with allowed max. speed of 50 km/h and a digital display board which shows your current speed and a 😄 if you drive less than or equal to 50 km/h and 😞 if you drive too fast. You drive this way for years and you drive always to fast because there are obviously no dangers. One day I got a speeding ticket because I was going too fast (like always)...
In this case - I think - we should fix
So, just my 2 cents... But, if we go the BC route, than |
@big-r81 the difference between your scenario and this one is that you know you are exceeding a limit. The problem with the fallback behaviour of |
Perhaps some administrative config for both:
would provide enough control to allow admins to enforce no fallbacks when they have no compatibility concerns. That would also give a mechanism of flipping the default config value on the major version boundary. |
@rnewson, @ricellis I like the proposed ideas and I am glad that we could utilize the original proposal but in a better way. Thanks for helping us with getting to this solution! @big-r81 we need the backwards compatibility. Personally, I see at least two reasons for that:
I do not have plans to remove the fallback at the moment. I do not see it an obvious bad thing but at the same time I understand that this has many disadvantages that may confuse or surprise the users. Actually, more "noise" around this in the documentation and in the configuration toggles could help to raise the awareness. I do not see that much associated extra work either (it is a small change if you check the current version), but I am willing to take on those, if needed. |
I agree with changing the behavior of It may also be useful context that Mango did originally prevent the |
@willholley what cluster-wide toggle? I'm talking about |
@willholley aha! I missed it the first time. I agree that those two suggested toggles just complicate things. I suggest it's just the opt-in in the |
Please note that #4792 has been opened as an alternative proposal in response to the comments in this conversation. |
I am closing this in favor of #4792. |
It is not always beneficial for the performance if the Mango query planner tries to assign an index to the selector. User-specified indexes may save the day, but since they are only hints for the planner, automated overrides can still happen.
Introduce the concept of "strict index selection" (h/t @willholley) which lets the user to request the exclusive use of a specific index. When it is not possible, give up on planning and return an HTTP 400 response right away. This way the user has the chance to learn about the missing index, request its creation and try again later.
The feature comes with a configuration toggle. By default, the feature is disabled to maintain backward compatibility, but the user may ask for this behavior via the new
use_index_strict
query parameter for a specific query. Note that, similarly,use_index_strict
could be used to disable strict index selection temporarily when it is enabled on the server.Fixes #4511
Checklist for the review
rel/overlay/etc/default.ini
src/docs
folder