vincentsarago (Member) commented Jul 10, 2025

ref: stac-utils/stac-fastapi#849

To Do

  • add tests

@@ -54,8 +54,7 @@ async def all_collections( # noqa: C901
sortby: Optional[str] = None,
filter_expr: Optional[str] = None,
filter_lang: Optional[str] = None,
q: Optional[List[str]] = None,
**kwargs,
**kwargs: Any,
Member Author

In this PR we now forward kwargs to the _clean_search_args method.

Contributor

What happens if POST /search with {"q": 123} is submitted? Does q make it all the way to the DB and raise due to invalid types? There won't be any early API model validation of the parameter?

@vincentsarago vincentsarago force-pushed the patch/allow-advanced-free-text-ext branch from 4f90e88 to c0767f3 Compare July 24, 2025 14:32
@vincentsarago vincentsarago requested review from alukach and gadomski and removed request for alukach July 24, 2025 14:54
@vincentsarago
Member Author

@fmigneault could you check this PR 🙏

Overall, the idea is to do the same as #267.

Instead of adding q to the method annotation, this PR forwards any kwargs to the _clean_search_args function.
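
To illustrate the kwargs forwarding described above, here is a minimal sketch. It is not the code from this PR: `SketchClient` and the body of `_clean_search_args` are hypothetical, and the methods are synchronous here for brevity (the real `all_collections` is async). It only shows how a parameter such as `q` can reach `_clean_search_args` without appearing in the method signature.

```python
from typing import Any, Dict, Optional


class SketchClient:
    """Hypothetical stand-in for the real client; only the kwargs flow is shown."""

    def _clean_search_args(
        self,
        base_args: Dict[str, Any],
        sortby: Optional[str] = None,
        **kwargs: Any,
    ) -> Dict[str, Any]:
        # Start from the known arguments, then merge any extension-specific
        # parameters (for example `q`) that were forwarded through kwargs.
        clean = dict(base_args)
        if sortby is not None:
            clean["sortby"] = sortby
        for key, value in kwargs.items():
            if value is not None:
                clean[key] = value
        return clean

    def all_collections(
        self, sortby: Optional[str] = None, **kwargs: Any
    ) -> Dict[str, Any]:
        # No explicit `q` in the signature: whatever the enabled extensions
        # inject is passed straight through.
        return self._clean_search_args({}, sortby=sortby, **kwargs)


result = SketchClient().all_collections(sortby="+id", q=["temperature", "yo"])
empty = SketchClient().all_collections()
```

With this shape, supporting a new extension parameter would only require overriding `_clean_search_args` in a subclass rather than changing every endpoint signature.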

pgdatabase=database.dbname,
)
logger.info("Creating app Fixture")
time.time()
Member

Is time used for anything?

Member Author

🤷


resp = await app_client.get(
"/collections",
params={"q": "temperature,yo"},
Contributor

There are no tests for the /search?q= and /collections/{col}/items?q=... cases?

Also, there could be POST /search with {"q": "advanced AND search"} or {"q": ["basic", "search"]}.

Member

@gadomski gadomski Jul 24, 2025

I don't believe this PR is meant to implement free-text item search, just correct some stuff around free-text collection search.

Contributor

The calls to self._clean_search_args are updated with **kwargs, so q will now trickle down through these calls as well (which is good). That should be considered too, though it could be done in a separate PR so as not to block this one.

Member Author

q won't be in /search because we don't usually use the free-text extension for items.

If you're passing a dict, pydantic should raise a validation error: https://github.com/stac-utils/stac-fastapi/blob/fa42985255fad0bab7dbe3aadbf1f74cb1635f3a/stac_fastapi/extensions/stac_fastapi/extensions/core/free_text/request.py#L37-L43

Member Author

> I don't believe this PR is meant to implement free-text item search, just correct some stuff around free-text collection search.

Exactly, but it also enables free-text for items by using kwargs, as it would for any other extension people might want to implement. They now just have to provide a custom _clean_search_args to support any kind of input passed through kwargs.

Contributor

OK. Good point about the Pydantic model.

I'm not sure I understand the "q won't be in /search".
Isn't it available if the conformance is applied?
https://github.com/stac-utils/stac-fastapi/blob/fa42985255fad0bab7dbe3aadbf1f74cb1635f3a/stac_fastapi/extensions/stac_fastapi/extensions/core/free_text/free_text.py#L27-L40

I was able to activate it in my implementation (https://github.com/crim-ca/stac-app/pull/28/files). It should do the title/description/keywords free-text search across all collections' items, as if filter or query was used, no?

Member Author

> I'm not sure I understand the "q won't be in /search".

I just mean that the q parameter will ONLY be available if enabled at the application level.

As mentioned in #263 (comment), it works only for Advanced because we don't do the str -> list -> str transformation but keep the values as str.

@vincentsarago
Member Author

😭 Well, in fact I added tests for /search and they show a bug.

When we use _clean_search_args we transform a list into a string, which is apparently what pgstac expects. But for /search what will happen is:

@fmigneault
Contributor

fmigneault commented Jul 25, 2025

> fastapi will convert this string to ["temperature","yo"]

Exactly. And this is invalid according to Advanced free-text. The comma is plain text in this variant. It should split and do OR only for Basic free-text.

That being said, I would love for the Advanced spec to be updated and aligned with Basic to avoid this ambiguity. This is pretty much what every open issue requests:

@vincentsarago
Member Author

> Exactly. And this is invalid according to Advanced free-text. The comma is plain-text in this variant. It should split and do OR only for Basic free-text.

There will be no issue when working with Advanced free-text search: the param will always be a string and no transformation will be done. This bug only happens for Basic free-text search because we have a list as input (as defined by the spec) but we need to pass a string to PgSTAC.
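
The difference between the two variants discussed above can be sketched as follows. This is only an illustration of the behaviour described in the thread, not code from stac-fastapi or pgstac; `parse_q` and its `advanced` flag are hypothetical names.

```python
from typing import List, Union


def parse_q(raw: str, advanced: bool) -> Union[str, List[str]]:
    """Interpret a raw `q` query-string value (hypothetical helper)."""
    if advanced:
        # Advanced free-text: the comma is plain text, keep the string intact.
        return raw
    # Basic free-text: the value is a comma-separated list of terms.
    return [term for term in raw.split(",") if term]


basic = parse_q("temperature,yo", advanced=False)
adv = parse_q("temperature,yo", advanced=True)
```

Here `basic` is a list of two terms while `adv` stays a single string, which is why only the Basic case needs a list-to-string step before the value is handed to PgSTAC.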

@vincentsarago
Member Author

I think the main issue is that the spec defines both string and list of strings for input (GET and POST), so IMO pgstac should be able to handle both.

I'm going to see if we can do this in pgstac instead of having hacks in stac-fastapi-pgstac.

# join the list[str] with ` OR `
# ref: https://github.com/stac-utils/stac-fastapi-pgstac/pull/263
if q := clean_args.pop("q", None):
    clean_args["q"] = " OR ".join(q) if isinstance(q, list) else q
Member Author

We need custom code to handle the list[str] passed by the collection-search Free-Text extension, as pgstac will only accept str.

Note: we don't need this in items search because we will use pydantic serialization.
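
As a standalone illustration of the list-to-string step described in this comment, the snippet from the diff can be wrapped in a small function and exercised directly. The function name `join_free_text` is an assumption made for this sketch; the joining logic mirrors the lines shown in the diff.

```python
from typing import Any, Dict


def join_free_text(clean_args: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of clean_args where a list-valued `q` becomes one string."""
    out = dict(clean_args)
    q = out.pop("q", None)
    if q:
        # A list of terms is joined with ` OR `; a plain string
        # (for example an Advanced free-text expression) is left untouched.
        out["q"] = " OR ".join(q) if isinstance(q, list) else q
    return out


from_list = join_free_text({"q": ["temperature", "yo"]})
from_str = join_free_text({"q": "advanced AND search"})
```

Note that an empty or missing `q` is simply dropped from the result, as in the original snippet.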

] = Field(
None,
description="Parameter to perform free-text queries against STAC metadata",
)
Member Author

Custom FreeTextExtensionPostRequest model, which will handle JSON serialization, transforming list[str] to str.

@@ -34,6 +34,7 @@
"type": "Polygon"
},
"properties": {
"description": "Landat 8 imagery radiometrically calibrated and orthorectified using gound points and Digital Elevation Model (DEM) data to correct relief displacement.",
Member Author

Free-text search for items will only work within properties (title, description, keywords):

https://github.com/stac-utils/pgstac/blob/45ac2478b58946529872ec3feed0ee0c838c4742/src/pgstac/sql/004_search.sql#L225-L227

Member

Not relevant for this PR, but this is why item-level free-text search feels funny to me ... IMO this info should live at the collection level only.

Contributor

In the case of a collection where each item contains roughly the same information at a different place/time, item-level free-text search is indeed redundant. However, imagine a collection grouping multiple "conceptual" items, such as many AI models described using the MLM extension. In this case, each Item could contain different descriptions and keywords within the same collection, which makes free-text search very relevant at the item level.

@vincentsarago vincentsarago changed the title fix type for advanced freetext fix type for advanced freetext and allow free-text for Item search Aug 1, 2025

@vincentsarago
Member Author

are we good to merge this one @bitner @fmigneault ?

@vincentsarago vincentsarago mentioned this pull request Aug 7, 2025
@vincentsarago vincentsarago merged commit 8e5ebfa into main Aug 8, 2025
7 checks passed
@vincentsarago vincentsarago deleted the patch/allow-advanced-free-text-ext branch August 8, 2025 09:04
@fmigneault fmigneault mentioned this pull request Aug 14, 2025
4 tasks