ElasticSearch doSimilaritySearch broken after recent changes in Document  #1936

@jcgouveia

After updating to the most recent version, the similarity search throws an exception because the Document class is no longer compatible with the data returned by the search hits.

  1. Document (org.springframework.ai.document.Document) is deserialized directly from the ES response using standard Jackson map-to-object deserialization:
doSimilaritySearch(SearchRequest searchRequest)
...
			SearchResponse<Document> res = this.elasticsearchClient.search(
					sr -> sr.index(this.options.getIndexName())
						.knn(knn -> knn.queryVector(EmbeddingUtils.toList(vectors))
							.similarity(finalThreshold)
							.k((long) searchRequest.getTopK())
							.field("embedding")
							.numCandidates((long) (1.5 * searchRequest.getTopK()))
							.filter(fl -> fl.queryString(
									qs -> qs.query(getElasticsearchQueryString(searchRequest.getFilterExpression()))))),
					Document.class); // <---- hits are deserialized straight into Document

			return res.hits().hits().stream().map(this::toDocument).collect(Collectors.toList());
  2. The previous Document class was compatible with ES responses:
"_source": {
  "embedding": […],
  "content": "...",
  "media": [],
  "metadata": { .... }
}

After #1794 the Document class no longer matches this shape: content has been renamed to text, and media is no longer an array.

  3. Therefore I get this error:
com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot deserialize value of type `org.springframework.ai.model.Media` from Array value (token `JsonToken.START_ARRAY`)
 at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 1, column: 8428] (through reference chain: org.springframework.ai.document.Document["media"])

As a result, I'm not able to run Elasticsearch queries through the Vector Store.
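
One possible direction for a workaround, sketched here under assumptions rather than taken from the Spring AI code base: read each hit's `_source` as a generic map instead of binding it straight to Document, normalize the legacy field layout, and only then build the Document. The class `LegacySourceAdapter` and its method `toNewShape` are hypothetical names, and the target layout (a `text` field and a single, non-array `media` value) is inferred only from the description and the error message above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Spring AI. */
final class LegacySourceAdapter {

    private LegacySourceAdapter() {
    }

    /**
     * Returns a copy of a legacy _source map ("content" plus a "media" array)
     * rewritten to the assumed newer layout ("text" plus a single "media" value).
     * The input map is not modified.
     */
    static Map<String, Object> toNewShape(Map<String, Object> source) {
        Map<String, Object> result = new LinkedHashMap<>(source);

        // Rename "content" to "text" unless a "text" field is already present.
        if (result.containsKey("content") && !result.containsKey("text")) {
            result.put("text", result.remove("content"));
        }

        // Collapse a "media" array: an empty array is dropped,
        // a single-element array becomes that element.
        Object media = result.get("media");
        if (media instanceof List) {
            List<?> items = (List<?>) media;
            if (items.isEmpty()) {
                result.remove("media");
            } else if (items.size() == 1) {
                result.put("media", items.get(0));
            } else {
                // More than one entry cannot be represented by a single value.
                throw new IllegalArgumentException(
                        "Cannot map " + items.size() + " media entries onto a single media field");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> legacy = new LinkedHashMap<>();
        legacy.put("content", "some text");
        legacy.put("media", new ArrayList<Object>());
        legacy.put("metadata", new LinkedHashMap<String, Object>());

        System.out.println(toNewShape(legacy));
    }
}
```

Whether a mapping like this belongs in the vector store itself or the stored index should be migrated instead is a separate question; the sketch only shows that the two layouts can be bridged without changing what is already indexed.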
