Add reconciliation API endpoint to REST API #734

Draft: wants to merge 7 commits into base: main (showing changes from 2 commits)
245 changes: 245 additions & 0 deletions annif/openapi/annif.yaml
@@ -180,6 +180,152 @@ paths:
        "503":
          $ref: '#/components/responses/ServiceUnavailable'
      x-codegen-request-body-name: documents
  /projects/{project_id}/reconcile:
    get:
      tags:
        - Reconciliation
      summary: get reconciliation service manifest or reconcile against a project
      operationId: annif.rest.reconcile_metadata
      parameters:
        - $ref: '#/components/parameters/project_id'
        - in: query
          description: A call to the reconciliation service API
          name: queries
          required: false
          schema:
            type: string
            additionalProperties:
              type: object
              required:
                - query
              properties:
                query:
                  type: string
                  description: Query string to search for
                limit:
                  type: integer
                  description: Maximum number of results to return
            example:
              '{
                "q0": {
                  "query": "example query",
                  "limit": 10
                },
                "q1": {
                  "query": "another example",
                  "limit": 15
                }
              }'
      responses:
        "200":
          description: successful operation
          content:
            application/json:
              schema:
                oneOf:
                  - $ref: '#/components/schemas/ReconcileMetadata'
                  - $ref: '#/components/schemas/ReconciliationResult'
              examples:
                ReconcileMetadata:
                  summary: Reconciliation service manifest
                  value:
                    {
                      "defaultTypes": [
                        {
                          "id": "default-type",
                          "name": "Default type"
                        }
                      ],
                      "identifierSpace": "",
                      "name": "Annif Reconciliation Service for Dummy Finnish",
                      "schemaSpace": "http://www.w3.org/2004/02/skos/core#Concept",
                      "versions": [
                        "0.2"
                      ],
                      "view": {
                        "url": "{{id}}"
                      }
                    }
                ReconciliationResult:
                  summary: Reconciliation result
                  value:
                    {
                      "q0": {
                        "result": [
                          {
                            "id": "example-id",
                            "name": "example name",
                            "score": 0.5,
                            "match": true
                          }
                        ]
                      },
                      "q1": {
                        "result": [
                          {
                            "id": "another-id",
                            "name": "another name",
                            "score": 0.5,
                            "match": false
                          }
                        ]
                      }
                    }
        "404":
          $ref: '#/components/responses/NotFound'
    post:
      tags:
        - Reconciliation
      summary: reconcile against a project
      operationId: annif.rest.reconcile
      parameters:
        - $ref: '#/components/parameters/project_id'
      requestBody:
        content:
          application/x-www-form-urlencoded:
            encoding:
              queries:
                contentType: application/json
            schema:
              type: object
              required:
                - queries
              properties:
                queries:
                  type: object
                  description: A call to the reconciliation service API
                  additionalProperties:
                    type: object
                    required:
                      - query
                    properties:
                      query:
                        type: string
                        description: Query string to search for
                      limit:
                        type: integer
                        description: Maximum number of results to return
                  example:
                    {
                      "q0": {
                        "query": "example query",
                        "limit": 10
                      },
                      "q1": {
                        "query": "another example",
                        "limit": 15
                      }
                    }
        required: true
      responses:
        "200":
          description: successful operation
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ReconciliationResult'
        "404":
          $ref: '#/components/responses/NotFound'
components:
  schemas:
    ApiInfo:
@@ -314,6 +460,105 @@ components:
                type: string
                example: Vulpes vulpes
      description: A document with attached, known good subjects
    ReconcileMetadata:
Member


It could be better to call this type ReconciliationServiceManifest (although it's pretty long...) so that it matches the reconciliation API spec.

      required:
        - name
        - defaultTypes
        - view
        - identifierSpace
        - schemaSpace
        - versions
      type: object
      properties:
        name:
          type: string
          example: Annif Reconciliation Service
        identifierSpace:
          type: string
          example: ""
        schemaSpace:
          type: string
          example: "http://www.w3.org/2004/02/skos/core#Concept"
        defaultTypes:
          type: array
          items:
            type: object
            required:
              - id
              - name
            properties:
              id:
                type: string
                example: type-id
              name:
                type: string
                example: type name
        view:
          type: object
          required:
            - url
          properties:
            url:
              type: string
              example: "{{id}}"
        versions:
          type: array
          items:
            type: string
            example: "0.2"
      description: Reconciliation service information
    ReconciliationResult:
      type: object
      additionalProperties:
        type: object
        required:
          - result
        properties:
          result:
            type: array
            items:
              type: object
              required:
                - id
                - name
                - score
                - match
              properties:
                id:
                  type: string
                  example: example-id
                name:
                  type: string
                  example: example name
                score:
                  type: number
                  example: 0.5
                match:
                  type: boolean
                  example: true
      example:
        {
          "q0": {
            "result": [
              {
                "id": "example-id",
                "name": "example name",
                "score": 0.5,
                "match": true
              }
            ]
          },
          "q1": {
            "result": [
              {
                "id": "another-id",
                "name": "another name",
                "score": 0.5,
                "match": false
              }
            ]
          }
        }
    Problem:
      type: object
      properties:
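The spec in this file accepts the same `queries` object in two encodings: as a JSON string in the `queries` query parameter for GET, and as a `queries` form field for POST. A minimal sketch of how a client might build both requests with the standard library only. The base URL and helper names are placeholders for illustration, not part of this PR, and nothing is sent over the network.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical values, for illustration only.
BASE_URL = "http://localhost:5000/v1"
PROJECT_ID = "dummy-fi"

queries = {
    "q0": {"query": "example query", "limit": 10},
    "q1": {"query": "another example", "limit": 15},
}


def build_get_url(base_url, project_id, queries):
    """Return the GET URL with `queries` serialized as one JSON string."""
    encoded = urlencode({"queries": json.dumps(queries)})
    return f"{base_url}/projects/{project_id}/reconcile?{encoded}"


def build_post_body(queries):
    """Return an application/x-www-form-urlencoded body for the POST variant."""
    return urlencode({"queries": json.dumps(queries)}).encode("utf-8")


get_url = build_get_url(BASE_URL, PROJECT_ID, queries)
post_body = build_post_body(queries)

# Round trip: decoding the query string gives back the original object.
decoded = json.loads(parse_qs(urlsplit(get_url).query)["queries"][0])
```

Calling the GET URL without any `queries` parameter would, per the spec in this file, return the service manifest instead of results.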
70 changes: 70 additions & 0 deletions annif/rest.py
@@ -3,6 +3,7 @@
from __future__ import annotations

import importlib
import json
from typing import TYPE_CHECKING, Any

import connexion
@@ -214,3 +215,72 @@ def learn(
        return server_error(err)

    return None, 204


def _reconcile(project_id: str, query: dict[str, Any]) -> dict[str, Any]:
Member


The return type hint dict doesn't seem to match the actual returned type, which is a list of dicts.

(found by SonarCloud)

    document = [{"text": query["query"]}]
    parameters = {"limit": query["limit"]} if "limit" in query else {}
    result = _suggest(project_id, document, parameters)

    if _is_error(result):
        return result

    results = [
        {
            "id": res["uri"],
            "name": res["label"],
            "score": res["score"],
            "match": res["label"] == query["query"],
        }
        for res in result[0]["results"]
    ]
    return results
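As the review comment on `_reconcile` notes, the success path returns a list of dicts rather than a dict. A minimal sketch of the mapping step on its own, annotated with the list type. `to_candidates` and the sample suggestions are hypothetical names and data for illustration; the error path (a `ConnexionResponse` in the PR) is left out here.

```python
from typing import Any


def to_candidates(
    query: dict[str, Any], suggestions: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Map suggestion dicts (uri, label, score) to reconciliation candidates."""
    return [
        {
            "id": s["uri"],
            "name": s["label"],
            "score": s["score"],
            # Same rule as in the PR: a match requires an exact label match.
            "match": s["label"] == query["query"],
        }
        for s in suggestions
    ]


# Made-up sample data.
suggestions = [
    {"uri": "http://example.org/fox", "label": "fox", "score": 0.8},
    {"uri": "http://example.org/dog", "label": "dog", "score": 0.3},
]
candidates = to_candidates({"query": "fox"}, suggestions)
```

Because the comparison is an exact string equality, differences in case or whitespace between the query and the label would produce `"match": False`.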


def reconcile_metadata(
    project_id: str, **query_parameters
) -> ConnexionResponse | dict[str, Any]:
    """return service manifest or reconcile against a project and return a dict
    with results formatted according to OpenAPI spec"""

    try:
        project = annif.registry.get_project(project_id, min_access=Access.hidden)
    except ValueError:
        return project_not_found_error(project_id)

    if not query_parameters:
        return {
            "versions": ["0.2"],
            "name": "Annif Reconciliation Service for " + project.name,
            "identifierSpace": "",
            "schemaSpace": "http://www.w3.org/2004/02/skos/core#Concept",
            "view": {"url": "{{id}}"},
            "defaultTypes": [{"id": "default-type", "name": "Default type"}],
        }
    else:
        queries = json.loads(query_parameters["queries"])
        results = {}
        for key, query in queries.items():
            data = _reconcile(project_id, query)
            if _is_error(data):
                return data
            results[key] = {"result": data}

        return results
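`reconcile_metadata` passes the `queries` parameter straight to `json.loads`, so malformed JSON would raise unless it is rejected earlier in the request handling. A hedged sketch of a small validation helper that could sit in front of the loop. `parse_queries` is a hypothetical name, not part of this PR, and how a `ValueError` would be turned into an HTTP error response is left open.

```python
import json
from typing import Any


def parse_queries(raw: str) -> dict[str, dict[str, Any]]:
    """Decode the `queries` parameter and check the shape the spec requires."""
    try:
        queries = json.loads(raw)
    except json.JSONDecodeError as err:
        raise ValueError(f"queries is not valid JSON: {err.msg}") from err

    if not isinstance(queries, dict):
        raise ValueError("queries must be a JSON object")

    for key, query in queries.items():
        # Each entry needs at least the required "query" field.
        if not isinstance(query, dict) or "query" not in query:
            raise ValueError(f"query {key!r} is missing the required 'query' field")

    return queries
```

With this helper, `parse_queries('{"q0": {"query": "example text"}}')` returns the decoded dict, while a string that is not JSON, a JSON array, or an entry without `query` raises `ValueError`.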


def reconcile(
    project_id: str, body: dict[str, Any]
) -> ConnexionResponse | dict[str, Any]:
    """reconcile against a project and return a dict with results
    formatted according to OpenAPI spec"""

    queries = body["queries"]
    results = {}
    for key, query in queries.items():
Member


Using a for loop here means that the reconciliation isn't done in batches internally. It would be better (more efficient) to pass the whole batch to _suggest at once, similar to how the suggest_batch method works.

        data = _reconcile(project_id, query)
        if _is_error(data):
            return data
        results[key] = {"result": data}

    return results
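The review comment on the loop suggests passing the whole batch to the suggestion backend at once. A rough sketch of that idea, under stated assumptions: `suggest_batch` is a hypothetical callable standing in for `_suggest` (it takes a list of documents and a parameters dict and returns one `{"results": [...]}` entry per document, in order, sorted by score), and a single `limit` applies to the whole call. Neither assumption is confirmed by the PR.

```python
from typing import Any, Callable

Suggest = Callable[[list[dict[str, str]], dict[str, Any]], list[dict[str, Any]]]


def reconcile_batch(
    queries: dict[str, dict[str, Any]], suggest_batch: Suggest
) -> dict[str, dict[str, Any]]:
    """Run all queries through one backend call, then split the results per key."""
    keys = list(queries)
    documents = [{"text": queries[key]["query"]} for key in keys]

    # Assumption: one limit per call, so ask for the largest and trim per query.
    # Caveat: queries without their own limit are then also capped at this value.
    limits = [queries[key]["limit"] for key in keys if "limit" in queries[key]]
    parameters = {"limit": max(limits)} if limits else {}

    batch = suggest_batch(documents, parameters)

    results = {}
    for key, hits in zip(keys, batch):
        query = queries[key]
        candidates = [
            {
                "id": res["uri"],
                "name": res["label"],
                "score": res["score"],
                "match": res["label"] == query["query"],
            }
            for res in hits["results"]
        ]
        if "limit" in query:
            candidates = candidates[: query["limit"]]
        results[key] = {"result": candidates}
    return results


calls = []


def fake_suggest_batch(documents, parameters):
    """Stand-in backend: records the call and returns made-up suggestions."""
    calls.append(len(documents))
    limit = parameters.get("limit", 10)
    return [
        {
            "results": [
                {
                    "uri": f"http://example.org/{n}",
                    "label": doc["text"] if n == 0 else f"other {n}",
                    "score": 1.0 / (n + 1),
                }
                for n in range(limit)
            ]
        }
        for doc in documents
    ]


output = reconcile_batch(
    {"q0": {"query": "fox", "limit": 2}, "q1": {"query": "dog", "limit": 3}},
    fake_suggest_batch,
)
```

The stand-in backend is called once for both queries, which is the point of the suggestion; error handling for a failed backend call is omitted from the sketch.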
44 changes: 44 additions & 0 deletions tests/test_rest.py
@@ -233,3 +233,47 @@ def test_rest_learn_not_supported(app):
    with app.app_context():
        result = annif.rest.learn("tfidf-fi", [])
        assert result.status_code == 503


def test_rest_reconcile_metadata(app):
    with app.app_context():
        results = annif.rest.reconcile_metadata("dummy-fi")
        assert results["name"] == "Annif Reconciliation Service for Dummy Finnish"


def test_rest_reconcile_metadata_nonexistent(app):
    with app.app_context():
        result = annif.rest.reconcile_metadata("nonexistent")
        assert result.status_code == 404


def test_rest_reconcile_metadata_queries(app):
    with app.app_context():
        results = annif.rest.reconcile_metadata(
            "dummy-fi", queries='{"q0": {"query": "example text"}}'
        )
        assert "result" in results["q0"]


def test_rest_reconcile_metadata_queries_nonexistent(app):
    with app.app_context():
        result = annif.rest.reconcile_metadata(
            "nonexistent", queries='{"q0": {"query": "example text"}}'
        )
        assert result.status_code == 404


def test_rest_reconcile(app):
    with app.app_context():
        results = annif.rest.reconcile(
            "dummy-fi", {"queries": {"q0": {"query": "example text"}}}
        )
        assert "result" in results["q0"]


def test_rest_reconcile_nonexistent(app):

Check warning (Code scanning / CodeQL): Variable defined multiple times

This assignment to 'test_rest_reconcile_nonexistent' is unnecessary as it is redefined before this value is used.
    with app.app_context():
        result = annif.rest.reconcile(
            "nonexistent", {"queries": {"q0": {"query": "example text"}}}
        )
        assert result.status_code == 404
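The CodeQL warning above says that `test_rest_reconcile_nonexistent` is defined again later in the file (that second definition is not visible in this excerpt). A small standalone sketch of why this matters in Python: a second `def` with the same name rebinds it, so only the later function remains in the module namespace, and a test runner that discovers tests from that namespace would most likely run only that one. The function bodies below are made up for the demonstration.

```python
# Two functions with the same name, as module source text.
source = '''
def test_rest_reconcile_nonexistent():
    return "first definition"

def test_rest_reconcile_nonexistent():
    return "second definition"
'''

# Execute the source in a fresh namespace, as an import would do for a module.
namespace = {}
exec(source, namespace)

# Only one entry survives under that name, and it is the later definition.
test_names = [name for name in namespace if name.startswith("test_")]
surviving = namespace["test_rest_reconcile_nonexistent"]()
```

Renaming one of the two functions is the usual way to keep both tests.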