add support for JSONField #47

Merged
merged 1 commit into from
Jul 1, 2024

Conversation

WaVEV
Collaborator

@WaVEV WaVEV commented Jun 9, 2024

fixes #8

@WaVEV WaVEV marked this pull request as draft June 9, 2024 22:34
@WaVEV WaVEV force-pushed the supports-json-field branch from a661dac to f0f46f4 Compare June 9, 2024 22:46
@timgraham timgraham changed the title from "Supports json field" to "add support for JSONField" Jun 10, 2024
Collaborator

@timgraham timgraham left a comment

Some initial comments.

@WaVEV WaVEV force-pushed the supports-json-field branch 2 times, most recently from 9f29d68 to e8d257a Compare June 15, 2024 19:12
@WaVEV WaVEV force-pushed the supports-json-field branch from 617dc03 to 13f58ab Compare June 19, 2024 14:36
@WaVEV WaVEV marked this pull request as ready for review June 19, 2024 14:42
Collaborator

@timgraham timgraham left a comment

This isn't an exhaustive review but here are a few things to work on while I continue reviewing.

@@ -202,6 +204,9 @@ class DatabaseFeatures(BaseDatabaseFeatures):
"annotations.tests.NonAggregateAnnotationTestCase.test_order_by_annotation",
# annotate().filter().count() gives incorrect results.
"db_functions.datetime.test_extract_trunc.DateFunctionTests.test_extract_year_exact_lookup",
"model_fields.test_jsonfield.TestQuerying.test_nested_key_transform_on_subquery",
"model_fields.test_jsonfield.TestQuerying.test_ordering_grouping_by_count",
Collaborator

This test is "FieldDoesNotExist with ordering."

Collaborator

What I meant is that there is already a section (a few lines above) with this name.

@@ -279,6 +284,10 @@ class DatabaseFeatures(BaseDatabaseFeatures):
"update.tests.SimpleTest.test_empty_update_with_inheritance",
"update.tests.SimpleTest.test_foreign_key_update_with_id",
"update.tests.SimpleTest.test_nonempty_update_with_inheritance",
"model_fields.test_jsonfield.TestQuerying.test_join_key_transform_annotation_expression",
"model_fields.test_jsonfield.TestQuerying.test_order_grouping_custom_decoder",
Collaborator

I think these last three don't actually do joins but the logic in SQLCompiler._get_ordering() is naive to the fact that __ is a key transform. You can defer this to a separate commit or PR, but I would put these skips in a separate section that describes the issue.

Collaborator Author

Hmm, you are right, so for this one I think you should create a new PR. The `if` that checks for joins is wonky (it only checks for `__`), so I think fixing it should be another short task. I will move those tests to expected failures.

@@ -8,8 +8,9 @@ class DatabaseFeatures(BaseDatabaseFeatures):
supports_date_lookup_using_string = False
supports_foreign_keys = False
supports_ignore_conflicts = False
# Not implemented: https://github.com/mongodb-labs/django-mongodb/issues/8
- supports_json_field = False
+ supports_json_field = True
Collaborator

You can remove this line since that's already the value in the superclass.

@@ -48,4 +48,6 @@ def process_rhs(node, compiler, connection):
 def regex_match(field, value, regex, *re_args, **re_kwargs):
     regex = re.compile(regex % re.escape(value), *re_args, **re_kwargs)
     options = "i" if regex.flags & re.I else ""
-    return {"$regexMatch": {"input": field, "regex": regex.pattern, "options": options}}
+    return {
+        "$regexMatch": {"input": {"$toString": field}, "regex": regex.pattern, "options": options}
Collaborator

Does this change end up solving anything for JSONField? I reverted and model_fields.test_jsonfield still passed. Maybe there's another affected test. Anyway, I'd make it a separate PR since it fixes other non-JSONField tests.

Collaborator Author

No, it doesn't. It only helps when a JSON field's value is a string, number, or null, but it doesn't make any new test pass.
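For reference, the effect of the `$toString` wrapper discussed in this thread can be sketched as a standalone version of `regex_match`. This is a minimal sketch assembled from the diff hunk above, not the merged code: the closing braces after the last line of the hunk, the field path `"$value"`, and the `"^%s"` pattern are assumptions for illustration.

```python
import re


def regex_match(field, value, regex, *re_args, **re_kwargs):
    # `regex` is a %-style template; the user value is escaped before substitution.
    regex = re.compile(regex % re.escape(value), *re_args, **re_kwargs)
    # Only the case-insensitive flag is translated into a MongoDB option here.
    options = "i" if regex.flags & re.I else ""
    return {
        "$regexMatch": {
            # $toString converts the input so that non-string scalars
            # (for example numbers) can be matched as text.
            "input": {"$toString": field},
            "regex": regex.pattern,
            "options": options,
        }
    }


# Hypothetical field path and pattern, for illustration only.
expr = regex_match("$value", "a.b", "^%s", re.I)
```

The function only builds the aggregation expression as a plain dict; nothing is sent to a server here.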

@WaVEV WaVEV force-pushed the supports-json-field branch 2 times, most recently from 30a3188 to c475a78 Compare June 25, 2024 18:30
@timgraham
Collaborator

This is looking pretty good to me. I added a couple of commits.

Two remaining concerns:

  • Would you like to add some comments to the logic inside json.py's functions? I didn't dive into it to try to understand them, but they look pretty gnarly. Even if you as the author understand it now, I wonder if it's going to be easy to understand if you return to it later.
  • I'm not sure what "MongoDB's null behavior is different from SQL's." means. Only some of the tests there seem to query for None. If this is an issue you don't think we'll be able to solve later, it might be time to document it in some more detail in the README.

@WaVEV
Collaborator Author

WaVEV commented Jun 26, 2024

This is looking pretty good to me. I added a couple of commits.

Two remaining concerns:

  • Would you like to add some comments to the logic inside json.py's functions? I didn't dive into it to try to understand them, but they look pretty gnarly. Even if you as the author understand it now, I wonder if it's going to be easy to understand if you return to it later.

Yes, I can. I will add some comments there. There are two methods that I patched because I wasn't able to find a way to handle them in process_rhs or process_lhs.

  • I'm not sure what "MongoDB's null behavior is different from SQL's." means. Only some of the tests there seem to query for None. If this is an issue you don't think we'll be able to solve later, it might be time to document it in some more detail in the README.

The behavior of NULL in SQL is unique. For instance, NULL is NULL gives True, NULL = NULL is NULL, NULL <> NULL is NULL, and so forth. When a JSONField is indexed with a non-existent field, it returns NULL. This allows queries like:

condition = Q(value__foo="bax")

Such a condition can return fewer rows than expected in the following query (I would expect B or ~B to match everything; that is the question):

NullableJSONModel.objects.filter(condition | ~condition)

In SQL, this query can return unexpected results because a NULL operand makes both a predicate and its negation evaluate to NULL, so a row can match neither of them.

Reproducing this behavior in MongoDB requires extra logic. In MongoDB, null sometimes behaves like False and other times like True, so the queries need careful handling to reproduce these inconsistencies in query results.
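The SQL side of this can be checked with a small script. This is a sketch using SQLite through Python's standard `sqlite3` module; the table `t` and the column `foo` are made up for illustration, with a NULL column standing in for a missing JSON key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Comparisons with NULL yield NULL (None in Python); only IS NULL yields true.
null_eq, null_ne, null_is = cur.execute(
    "SELECT NULL = NULL, NULL <> NULL, NULL IS NULL"
).fetchone()

# Hypothetical table: row 3 has no value, like a document without the key.
cur.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, foo TEXT)")
cur.executemany(
    "INSERT INTO t (id, foo) VALUES (?, ?)",
    [(1, "bax"), (2, "other"), (3, None)],
)

# "B OR NOT B" is NULL for row 3, so WHERE filters that row out.
rows = cur.execute(
    "SELECT id FROM t WHERE foo = 'bax' OR NOT (foo = 'bax') ORDER BY id"
).fetchall()

print(null_eq, null_ne, null_is)  # None None 1
print(rows)  # [(1,), (2,)]
conn.close()
```

The row with the NULL column is absent from the result even though the filter looks like a tautology, which is the behavior the MongoDB queries have to imitate.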

for path in paths:
keys.append(_has_key_predicate(path, lhs))
if self.mongo_operator is None:
assert len(keys) == 1
Collaborator

Was this mainly for debugging? Can we remove it now? We shouldn't rely on assert for anything critical since asserts are removed when running with python -O. If so, I'll make the update along with my next batch of language edits.

Collaborator Author

Yes, will remove. It was for debugging. Shall we also remove this assert: https://github.com/mongodb-labs/django-mongodb/blob/main/django_mongodb/base.py#L121?

Collaborator

I made a brief attempt to refactor the connection code to conform to Django's normal API, but it wasn't successful. I may return to it later. For now I wouldn't bother with a commit just to remove that line.
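On the `assert` point raised earlier in this thread: the check in question was a debugging aid and was slated for removal, but where such an invariant does matter, an explicit exception survives `python -O` while an `assert` does not. A minimal sketch, assuming a hypothetical helper name `single_key_predicate` that is not part of the PR:

```python
def single_key_predicate(keys):
    """Return the only predicate in `keys`, raising if there isn't exactly one."""
    # Unlike `assert len(keys) == 1`, this check is not stripped by `python -O`.
    if len(keys) != 1:
        raise ValueError(f"Expected exactly one key predicate, got {len(keys)}.")
    return keys[0]


try:
    single_key_predicate(["a", "b"])
except ValueError as exc:
    error_message = str(exc)
```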

@timgraham
Collaborator

Thanks for the documentation. I found some of it a little redundant, so I made it more concise, hopefully without losing any important info. I'll check this once more tomorrow, but I think we're close to a merge. 👍

@timgraham timgraham force-pushed the supports-json-field branch from 7aa1ec0 to e509877 Compare June 28, 2024 12:53
result = {
"$and": [
# The path must exist (i.e. not be "missing").
{"$ne": [{"$type": path}, "missing"]},
Contributor

In what case would the path return missing? Wouldn't it always return null?

Member

If the argument is a field that is missing in the input document, $type returns the string "missing".
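The distinction between a missing path and an explicit null can be sketched as two expression builders. This is an illustration only: the helper names and the `"$value.foo"` path are hypothetical, and the functions just build aggregation expressions as plain dicts without contacting a server.

```python
def path_exists_expr(path):
    # $type yields the string "missing" when the field is absent from the document.
    return {"$ne": [{"$type": path}, "missing"]}


def path_is_null_expr(path):
    # An explicitly stored null has the type name "null", not "missing".
    return {"$eq": [{"$type": path}, "null"]}


# Hypothetical path into a JSON value stored under the field "value".
exists = path_exists_expr("$value.foo")
is_null = path_is_null_expr("$value.foo")

# Aggregation expressions like these can be used in a query filter via $expr.
query = {"$expr": {"$and": [exists, {"$not": [is_null]}]}}
```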

@mongodb mongodb deleted a comment from Jibola Jun 28, 2024
@@ -49,6 +51,12 @@ def adapt_decimalfield_value(self, value, max_digits=None, decimal_places=None):
return None
return Decimal128(value)

def adapt_json_value(self, value, encoder):
Member

When is this method called and what's the purpose? I ask because json.loads+json.dumps is expensive and ends up with the same value. Also it won't work with pymongo's extended JSON types like ObjectId, Code, Binary, etc...

Collaborator

@timgraham timgraham Jun 28, 2024

Good point. We can skip it unless the user has specified a custom encoder.

It's called before saving a value to the database. Normally it's just json.dumps() on databases that don't have a json data type.
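The suggestion to skip the round trip unless a custom encoder is given could look roughly like this. It is a sketch, not the merged code: it is written as a standalone function rather than a method, and `SetEncoder` is a made-up encoder used only to show the custom-encoder path.

```python
import json


class SetEncoder(json.JSONEncoder):
    """Hypothetical custom encoder that serializes sets as sorted lists."""

    def default(self, o):
        if isinstance(o, set):
            return sorted(o)
        return super().default(o)


def adapt_json_value(value, encoder=None):
    # Without a custom encoder the value is passed through untouched,
    # which avoids the dumps/loads cost and leaves driver-specific types alone.
    if encoder is None:
        return value
    # With a custom encoder, round-trip so the encoder's conversions are applied.
    return json.loads(json.dumps(value, cls=encoder))


plain = {"tags": ["a", "b"]}
same = adapt_json_value(plain)
converted = adapt_json_value({"tags": {"b", "a"}}, SetEncoder)
print(converted)  # {'tags': ['a', 'b']}
```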

Co-authored-by: Tim Graham <[email protected]>
@timgraham timgraham force-pushed the supports-json-field branch from 226e9f2 to 9019a33 Compare July 1, 2024 13:48
@timgraham timgraham merged commit 9019a33 into mongodb:main Jul 1, 2024
3 checks passed
@WaVEV WaVEV deleted the supports-json-field branch August 26, 2024 15:44
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add JSONField support
4 participants