Improve SPARQLStore type annotations and add optional (opt-in) tests against public endpoints #3125

Open
wants to merge 3 commits into
base: main

@vemonet vemonet commented Apr 28, 2025

Summary of changes

This PR improves SPARQLStore to make it a better and more reliable option for executing SPARQL queries on remote endpoints (instead of using SPARQLWrapper).

  • Add tests of the SPARQLStore on public endpoints, taking inspiration from the SPARQLWrapper tests. Many popular SPARQL endpoints are tested in real-world conditions (GraphDB, QLever, Blazegraph, AllegroGraph, Virtuoso...), which really helps to find out which features actually work and which ones are tricky.

    • There are many comments explaining the current limitations, which will be helpful to people who try to use the store, and for future improvements to it.
    • Since public endpoints are prone to being unavailable and these tests are a bit slower than the rest, they are optional and must be opted into by adding the flag --public-endpoints when running pytest. We do not run those tests as part of the CI/CD; we expect maintainers to run them locally from time to time, or when they are working on improving the SPARQLStore implementation.
    • Documentation about these optional tests has been added to the developer docs.
  • Make the method param available directly on the SPARQLStore constructor so that it is suggested by autocomplete in all IDEs (before, it was passed as kwargs to the SPARQLConnector, so users would not know about it unless they dug into the SPARQLConnector code).

  • Use Literal type annotations for returnFormats and method on the SPARQLConnector and SPARQLStore. This lets users know exactly which values are possible just from basic autocomplete suggestions, instead of having to dig into the code again.

  • Improve the SPARQLStore docstring documentation to provide a short, yet complete, working example for running a query: just copy/paste and it works

  • Also fixed some small type checking errors that were already failing before these changes

    rdflib/store.py:126: error: Unused "type: ignore" comment  [unused-ignore]
    rdflib/compare.py:456: error: Unsupported operand types for + ("str" and "int")  [operator]
    rdflib/compare.py:456: note: Left operand is of type "Union[int, str]"
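
To illustrate the Literal annotations mentioned in the list above, here is a minimal sketch of the general technique. `ExampleConnector`, `HttpMethod` and `ReturnFormat` are hypothetical names, and the listed values are illustrative assumptions; this is not the code from the PR nor rdflib's actual SPARQLConnector.

```python
from typing import Literal, get_args

# Hypothetical aliases; the value sets are assumptions for illustration only.
HttpMethod = Literal["GET", "POST", "POST_FORM"]
ReturnFormat = Literal["xml", "json", "csv", "tsv"]


class ExampleConnector:
    """Stand-in class showing Literal-typed parameters (not rdflib's class)."""

    def __init__(
        self,
        query_endpoint: str,
        returnFormat: ReturnFormat = "xml",
        method: HttpMethod = "GET",
    ) -> None:
        # Literal is only enforced by static type checkers and IDEs,
        # so the values are also validated at runtime.
        if method not in get_args(HttpMethod):
            raise ValueError(f"method must be one of {get_args(HttpMethod)}")
        if returnFormat not in get_args(ReturnFormat):
            raise ValueError(f"returnFormat must be one of {get_args(ReturnFormat)}")
        self.query_endpoint = query_endpoint
        self.returnFormat = returnFormat
        self.method = method


connector = ExampleConnector("https://example.org/sparql", returnFormat="json", method="POST")
```

With annotations like these, an IDE can offer the allowed strings as completions, and a type checker such as mypy can flag a call like `ExampleConnector(..., method="PUT")` before the code runs.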
    

Running the tests on public endpoints right now might give some errors, all related to the Linked Open Vocabulary SPARQL endpoint, which has been down since this morning. It will probably be back up soon.

poetry run pytest --public-endpoints
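
For readers unfamiliar with opt-in pytest flags, a flag like the one in the command above can be wired up in a `conftest.py` roughly as follows. This is a sketch of the common pytest pattern, not necessarily how this PR implements it; the marker name `public_endpoints` is an assumption.

```python
import pytest


def pytest_addoption(parser):
    # Register the opt-in command line flag (off by default).
    parser.addoption(
        "--public-endpoints",
        action="store_true",
        default=False,
        help="run tests that query public SPARQL endpoints",
    )


def pytest_configure(config):
    # Declare the (hypothetical) marker so pytest does not warn about it.
    config.addinivalue_line(
        "markers", "public_endpoints: test depends on a public SPARQL endpoint"
    )


def pytest_collection_modifyitems(config, items):
    # When the flag is given, run everything as collected.
    if config.getoption("--public-endpoints"):
        return
    # Otherwise mark every public-endpoint test as skipped.
    skip_public = pytest.mark.skip(reason="needs --public-endpoints to run")
    for item in items:
        if "public_endpoints" in item.keywords:
            item.add_marker(skip_public)
```

Under these assumptions, a test decorated with `@pytest.mark.public_endpoints` is skipped by a plain `pytest` run and only executes when `--public-endpoints` is passed.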

@nicholascar

Checklist

  • Checked that there aren't other open pull requests for
    the same change.
  • Checked that all tests and type checking passes.
  • If the change adds new features or changes the RDFLib public API:
    • Created an issue to discuss the change and get in-principle agreement.
    • Considered adding an example in ./examples.
  • If the change has a potential impact on users of this project:
    • Added or updated tests that fail without the change.
    • Updated relevant documentation to avoid inaccuracies.
    • Considered adding additional documentation.
  • Considered granting push permissions to the PR branch,
    so maintainers can fix minor issues and keep your PR up to date.

vemonet added 3 commits April 28, 2025 13:51
…t is disabled (slow, and public endpoints love to go down whenever they can, so we don't want to have them run as part of the default CI/CD). We need to add the flag --public-endpoints to the pytest command to enable them. The list of tested endpoints is the same as the one used in SPARQLWrapper tests, with a new entry for a Qlever endpoint. Documentation about this has been added to the developers docs page
…eople using it will get proper autocomplete suggestions for the returnFormats and methods available. Add better docs on how to use SPARQLStore in its docstring
…pported operand types for + ("str" and "int")
vemonet commented May 1, 2025

For now we test directly on public endpoints, which I find really interesting because it tests the store in a real-life setting (with weird proxy rules, etc.) and makes it possible to test proprietary endpoints that can't easily be deployed locally (e.g. Stardog).

But it could be interesting to also add tests on locally deployed endpoints in the future. Here I have a docker compose file to deploy most open source triplestores: https://github.com/vemonet/rdflib-endpoint/blob/main/tests/compose.yml

We could use the testcontainers library to automatically deploy them with Docker, then use the SPARQLUpdateStore to add some triples and query them.

Example of a test that deploys GraphDB with testcontainers:

import os

import pytest
from testcontainers.core.container import DockerContainer
from testcontainers.core.waiting_utils import wait_for_logs

TRIPLESTORE_IMAGE = "ontotext/graphdb:10.8.5"

# NOTE: these credentials are defined but not yet used by the fixture or the test below.
env = os.environ.copy()
env["GRAPHDB_USERNAME"] = "admin"
env["GRAPHDB_PASSWORD"] = "root"


@pytest.fixture(scope="module")
def triplestore():
    """Start a GraphDB container and yield its base URL."""
    container = DockerContainer(TRIPLESTORE_IMAGE)
    container.with_exposed_ports(7200).with_bind_ports(7200, 7200)
    container.with_env("JAVA_OPTS", "-Xms1g -Xmx4g")
    container.start()
    try:
        # Block until GraphDB logs that it has started.
        delay = wait_for_logs(container, "Started GraphDB")
        host = container.get_container_host_ip()
        port = container.get_exposed_port(7200)
        base_url = f"http://{host}:{port}"
        print(f"GraphDB started in {delay:.0f}s at {base_url}")
        # print(container.get_logs())
        yield base_url
    finally:
        # Stop the container once the module's tests are done, even on failure.
        container.stop()


def test_graphdb(triplestore):
    print(f"Triplestore available at URL: {triplestore}")
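
The follow-up step described above (adding triples with SPARQLUpdateStore, then querying them) might look roughly like the sketch below. It assumes a GraphDB repository with the hypothetical id `test` already exists (a fresh container has none), and that GraphDB follows the RDF4J URL layout (`/repositories/{id}` for queries, `/repositories/{id}/statements` for updates). The helper names and example IRIs are illustrative, and the sketch has not been run against a live container.

```python
def repository_endpoints(base_url: str, repository_id: str) -> tuple[str, str]:
    """Return (query_endpoint, update_endpoint) using the RDF4J URL layout."""
    query_endpoint = f"{base_url.rstrip('/')}/repositories/{repository_id}"
    return query_endpoint, f"{query_endpoint}/statements"


def add_and_count(base_url: str, repository_id: str = "test") -> int:
    """Insert one triple into a named graph and count that graph's triples."""
    # Imported lazily so the URL helper above works without rdflib installed.
    from rdflib import Graph, Literal, URIRef
    from rdflib.plugins.stores.sparqlstore import SPARQLUpdateStore

    query_endpoint, update_endpoint = repository_endpoints(base_url, repository_id)
    store = SPARQLUpdateStore(
        query_endpoint=query_endpoint, update_endpoint=update_endpoint
    )
    # Use an IRI as graph identifier: the default blank-node identifier
    # cannot be sent to a remote SPARQL endpoint.
    graph = Graph(store=store, identifier=URIRef("http://example.org/test-graph"))
    graph.add(
        (
            URIRef("http://example.org/subject"),
            URIRef("http://example.org/predicate"),
            Literal("object"),
        )
    )
    return len(list(graph.triples((None, None, None))))
```

A test could then call `add_and_count(triplestore)` with the fixture's base URL and assert that the returned count is at least 1.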
