-
Notifications
You must be signed in to change notification settings - Fork 39
fix type for advanced freetext and allow free-text for Item search #263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -54,8 +54,7 @@ async def all_collections( # noqa: C901 | |||
sortby: Optional[str] = None, | |||
filter_expr: Optional[str] = None, | |||
filter_lang: Optional[str] = None, | |||
q: Optional[List[str]] = None, | |||
**kwargs, | |||
**kwargs: Any, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this PR we changed and we now forward kwargs
to _clean_search_args
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if POST /search
with {"q": 123}
is submitted? Does q
make it all the way to the DB and raises due to invalid types? There won't be any early API model validation of the parameter?
4f90e88
to
c0767f3
Compare
@fmigneault could you check this PR 🙏 Overall is to do as #267 Instead of adding |
tests/conftest.py
Outdated
pgdatabase=database.dbname, | ||
) | ||
logger.info("Creating app Fixture") | ||
time.time() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is time used for anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷
|
||
resp = await app_client.get( | ||
"/collections", | ||
params={"q": "temperature,yo"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no tests for the /search?q=
and /collections/{col}/items?q=...
cases?
Also, there could be POST /search
with {"q": "advanced AND search"}
or {"q": ["basic", "search"]}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe this PR is meant to implement free-text item search, just correct some stuff around free-text collection search.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function self._clean_search_args
calls are updated with the **kwargs
, so q
will now trickle down within these calls as well (which is good), and should be considered (but could be in a separate PR though not to block this one).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
q
won't be in /search
because we don't usually use the free-text extension for items
if you're passing a dict
, pydantic should raise a validation https://github.com/stac-utils/stac-fastapi/blob/fa42985255fad0bab7dbe3aadbf1f74cb1635f3a/stac_fastapi/extensions/stac_fastapi/extensions/core/free_text/request.py#L37-L43
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe this PR is meant to implement free-text item search, just correct some stuff around free-text collection search.
Exactly, but it also enables free-text for items
by using kwargs
, as for other unknown extension people would want to implement. Now they would have to just pass a custom _clean_search_args
to support any kind of input passed through kwargs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Good point about the Pydantic model.
I'm not sure to understand the "q
won't be in /search
".
Isn't it available if the conformance is applied?
https://github.com/stac-utils/stac-fastapi/blob/fa42985255fad0bab7dbe3aadbf1f74cb1635f3a/stac_fastapi/extensions/stac_fastapi/extensions/core/free_text/free_text.py#L27-L40
I was able to activate it is my implementation (https://github.com/crim-ca/stac-app/pull/28/files). It should do the title
/description
/keywords
free-text search across all collections' items, as if filter
or query
was used, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure to understand the "q won't be in /search".
I just mean that q
parameter will ONLY be available if enabled at the application level.
As mentioned in #263 (comment) it works only for Advanced because we don't do str -> list -> str
transformation but keep the values as str
@@ -54,8 +54,7 @@ async def all_collections( # noqa: C901 | |||
sortby: Optional[str] = None, | |||
filter_expr: Optional[str] = None, | |||
filter_lang: Optional[str] = None, | |||
q: Optional[List[str]] = None, | |||
**kwargs, | |||
**kwargs: Any, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if POST /search
with {"q": 123}
is submitted? Does q
make it all the way to the DB and raises due to invalid types? There won't be any early API model validation of the parameter?
😭 Well in fact I added tests for When we are using
|
Exactly. And this is invalid according to Advanced free-text. The comma is plain-text in this variant. It should split and do That being said, I would love for Advanced spec to be updated and align it with Basic to avoid this ambiguity. This is pretty much what every open issues requests: |
There will be no issue when working with |
I think the main issue it that the spec define both I'm going to see if we can do this in pgstac instead of having |
# join the list[str] with ` OR ` | ||
# ref: https://github.com/stac-utils/stac-fastapi-pgstac/pull/263 | ||
if q := clean_args.pop("q", None): | ||
clean_args["q"] = " OR ".join(q) if isinstance(q, list) else q |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need custom code to handle list[str]
passed by collection-search Free-Text extension as pgstac will only accept str
Note: we don't need this in items search because we will use pydantic serialization
] = Field( | ||
None, | ||
description="Parameter to perform free-text queries against STAC metadata", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Custom FreeTextExtensionPostRequest
model which will handle JSON serialization, transforming list[str] to str
@@ -34,6 +34,7 @@ | |||
"type": "Polygon" | |||
}, | |||
"properties": { | |||
"description": "Landat 8 imagery radiometrically calibrated and orthorectified using gound points and Digital Elevation Model (DEM) data to correct relief displacement.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
free-text for items
will only work within properties
(title, description keywords)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not relevant for this PR, but this is why item-level free-text search feels funny to me ... IMO this info should live at the collection level only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case of a collection where each item contains relatively the same information at different place/time, Item-level free-text search is indeed redundant. However, imagine the case of a collection regrouping multiple "conceptual" items, such as many AI models described using MLM extension. In this case, each Item could contain different descriptions and keywords within the same collection, which makes the free-text search very relevant at item-level.
@@ -34,6 +34,7 @@ | |||
"type": "Polygon" | |||
}, | |||
"properties": { | |||
"description": "Landat 8 imagery radiometrically calibrated and orthorectified using gound points and Digital Elevation Model (DEM) data to correct relief displacement.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not relevant for this PR, but this is why item-level free-text search feels funny to me ... IMO this info should live at the collection level only.
are we good to merge this one @bitner @fmigneault ? |
ref: stac-utils/stac-fastapi#849
To Do