Type mapping for Collections bbox #319

Open
ccancellieri opened this issue Dec 5, 2024 · 2 comments

@ccancellieri

While developing the collections search, I came across the following schema definition:

"extent.spatial.bbox": {"type": "long"},
 "extent.temporal.interval": {"type": "date"},

This prevents me from applying any spatial filter on this field, so I was wondering whether we could consider changing this schema.

The standard says:
https://github.com/radiantearth/stac-spec/blob/master/collection-spec/collection-spec.md#spatial-extent-object

So it should be a [[number]], which could be mapped to a long or a double rather than only a long, don't you think?
https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html

Moreover, the standard defines interval as [[string|null]] ( https://github.com/radiantearth/stac-spec/blob/master/collection-spec/collection-spec.md#temporal-extent-object ),
and this has been mapped as 'date' instead.

This is great, because it lets us search and filter by date rather than by generic strings.

But then, why don't we map the spatial extent as a geo_shape?

"extent.spatial.bbox": {"type": "geo_shape"},

so that we can properly search collections with S_INTERSECTS filters?
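
One caveat with this mapping: a geo_shape field is indexed from GeoJSON or WKT, so the STAC bbox arrays would need to be converted at index time. A minimal sketch of that conversion, assuming a hypothetical helper named `bbox_to_polygon` (not part of stac-fastapi) and bboxes that do not cross the antimeridian:

```python
from typing import Any, Dict, Sequence


def bbox_to_polygon(bbox: Sequence[float]) -> Dict[str, Any]:
    """Convert a STAC bbox (4 or 6 numbers) into a GeoJSON Polygon.

    Hypothetical helper for illustration; not part of stac-fastapi.
    """
    if len(bbox) == 4:
        west, south, east, north = bbox
    elif len(bbox) == 6:
        # 3D bbox: [west, south, min_elev, east, north, max_elev]; drop elevation.
        west, south, _, east, north, _ = bbox
    else:
        raise ValueError(f"bbox must have 4 or 6 values, got {len(bbox)}")
    # Closed ring, counter-clockwise, with the first point repeated at the end.
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],
    ]
    return {"type": "Polygon", "coordinates": [ring]}
```

The original bbox array could still be kept in the document under its own key, so the API response stays unchanged while the converted shape is only used for searching.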

Thanks

@jamesfisher-geo
Collaborator

This sounds like a good approach to me. Here are a couple thoughts:

@ccancellieri
Author

Thanks @jamesfisher-geo for your reply.

Let me try to address your points:
Point 1: I think we could reuse what has been done for the items, so it may not be a challenge.
Point 2: Same as above.
Point 3: This is actually an interesting point. I think we should also look at how the time is defined, but handling this properly would definitely add a level of complexity:
One option is to create an index that includes nested fields for our catalog structure and geo_shape fields for the geospatial data. This would let us search properly on each bbox in the list, but it is quite a complex approach.
Another option, probably easier, is to encode the multiple extents as a single multi-polygon (one polygon per bbox), so the search is accurate and runs against all of the extents in one shot.
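
The multi-polygon option could look roughly like the sketch below. The function name `bboxes_to_multipolygon` is hypothetical, and the sketch assumes 2D bboxes ([west, south, east, north]) that do not cross the antimeridian:

```python
from typing import Any, Dict, List, Sequence


def bboxes_to_multipolygon(bboxes: Sequence[Sequence[float]]) -> Dict[str, Any]:
    """Encode a list of 2D bboxes as a single GeoJSON MultiPolygon.

    Hypothetical helper for illustration; not part of stac-fastapi.
    """
    polygons: List[List[List[List[float]]]] = []
    for west, south, east, north in bboxes:
        # One closed exterior ring per bbox, no interior rings.
        ring = [
            [west, south],
            [east, south],
            [east, north],
            [west, north],
            [west, south],
        ]
        polygons.append([ring])
    return {"type": "MultiPolygon", "coordinates": polygons}


shape = bboxes_to_multipolygon([[0, 0, 1, 1], [10, 10, 12, 11]])
```

Since the STAC spec says the first bbox describes the overall extent and any following ones refine it, it may be worth indexing only the sub-extents when more than one bbox is present; otherwise the overall bbox would match every query that the detailed ones are meant to exclude.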

Experimenting More:

I'm currently testing my extension with:

{"limit":120,
"filter-lang":"cql2-text",
"filter":"S_INTERSECTS(\"extent.spatial.bbox\", BBOX(\"-73.80811779512624, -43.580390855607845, 80.0639090544287, 43.068887774169625\"))",
"sortBy":[{"field":"id","direction":"asc"}]}

And this is the resulting query:

{
  'geo_shape': {
    'extent.spatial.bbox': {
      'shape': {
        'function': 'bbox',
        'args': [{'property': '-73.80811779512624, -43.580390855607845, 80.0639090544287, 43.068887774169625'}]
      },
      'relation': 'intersects'
    }
  }
}

Unfortunately it does not work, probably because of the type of the extent field (not geo_shape):

File "/usr/local/lib/python3.10/site-packages/stac_fastapi/api/routes.py", line 65, in _endpoint
    return _wrap_response(await func(request_data, request=request))
  File "/app/stac_fastapi/core/stac_fastapi/core/core.py", line 1045, in post_all_collections
    return await self.post_search(search_request=search_request, request=request)
  File "/app/stac_fastapi/core/stac_fastapi/core/core.py", line 632, in post_search
    items, maybe_count, next_token = await self.database.execute_search(
  File "/app/stac_fastapi/elasticsearch/stac_fastapi/elasticsearch/database_logic.py", line 704, in execute_search
    es_response = await search_task
  File "/usr/local/lib/python3.10/site-packages/elasticsearch/_async/client/__init__.py", line 3735, in search
    return await self.perform_request(  # type: ignore[return-value]
  File "/usr/local/lib/python3.10/site-packages/elasticsearch/_async/client/_base.py", line 320, in perform_request
    raise HTTP_EXCEPTIONS.get(meta.status, ApiError)(
elasticsearch.BadRequestError: BadRequestError(400, 'illegal_argument_exception', 'Required [type]')
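
The 'Required [type]' message may also point at the shape object itself rather than only at the mapping: the generated query passes the raw CQL2 function node as `shape`, whereas a geo_shape query expects a GeoJSON-style object with a `type` key. This is a guess from the error text, not something verified against the code. For comparison, a sketch of the query body with the bbox expressed as an Elasticsearch envelope (`bbox_intersects_query` is a hypothetical name):

```python
from typing import Any, Dict, Sequence


def bbox_intersects_query(field: str, bbox: Sequence[float]) -> Dict[str, Any]:
    """Build a geo_shape 'intersects' query from a 2D bbox.

    Hypothetical helper for illustration; not part of stac-fastapi.
    """
    west, south, east, north = bbox
    return {
        "geo_shape": {
            field: {
                "shape": {
                    "type": "envelope",
                    # Envelopes are [[min_lon, max_lat], [max_lon, min_lat]].
                    "coordinates": [[west, north], [east, south]],
                },
                "relation": "intersects",
            }
        }
    }


query = bbox_intersects_query(
    "extent.spatial.bbox",
    [-73.80811779512624, -43.580390855607845, 80.0639090544287, 43.068887774169625],
)
```

Note that in the filter above the BBOX arguments are quoted as a single string, which appears to be why they show up as a `property` in the resulting query instead of four numbers.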
