@edmondchuc
@edmondchuc edmondchuc commented Oct 27, 2025

Summary of changes

Fix: #3102

The RuntimeError: Set changed size during iteration occurs because a Dataset's default graph is lazily evaluated and only added to the store's context dict when accessed.

Since the default graph is always present in a dataset, one option is to add it to the context dict on dataset creation. However, that change breaks a number of tests unrelated to Dataset, such as the namespace manager tests.

We materialise the iterator using a list() call. This way, when the default graph is added to the contexts dict, we don't get the runtime error.
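
The effect of the list() call can be reproduced outside RDFLib. The following is a minimal sketch, assuming a plain Python set stands in for the store's collection of contexts; the function names and the "default" string are hypothetical and not part of RDFLib.

```python
def live_contexts(all_contexts):
    # Generator over the live set: mutating the set while this
    # generator is being consumed raises RuntimeError.
    return (c for c in all_contexts)


def snapshot_contexts(all_contexts):
    # list() copies the members when the generator is created,
    # so later additions to the set do not affect the iteration.
    return (c for c in list(all_contexts))


def consume_and_mutate(gen, all_contexts):
    # Simulates a consumer whose loop body causes a new context
    # (here the string "default") to be added to the set.
    seen = []
    for c in gen:
        seen.append(c)
        all_contexts.add("default")
    return seen
```

Consuming live_contexts() this way fails with "Set changed size during iteration", while snapshot_contexts() yields only the members present when it was called.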

We also avoid calling the superclass (ConjunctiveGraph) methods, as doing so is redundant and the class hierarchy will be removed soon. Dataset.contexts() now calls self.store.contexts directly.

Checklist

  • Checked that there aren't other open pull requests for
    the same change.
  • Checked that all tests and type checking passes.
  • If the change adds new features or changes the RDFLib public API:
    • Created an issue to discuss the change and get in-principle agreement.
    • Considered adding an example in ./examples.
  • If the change has a potential impact on users of this project:
    • Added or updated tests that fail without the change.
    • Updated relevant documentation to avoid inaccuracies.
    • Considered adding additional documentation.
  • Considered granting push permissions to the PR branch,
    so maintainers can fix minor issues and keep your PR up to date.

      ) -> Generator[_ContextType, None, None]:
          if triple is None or triple == (None, None, None):
-             return (context for context in self.__all_contexts)
+             return (context for context in list(self.__all_contexts))

This is the actual fix.

              default |= c.identifier == DATASET_DEFAULT_GRAPH_ID
              yield c
          if not default:
              yield self.graph(DATASET_DEFAULT_GRAPH_ID)

Because the default graph is lazily evaluated, this code calls self.graph, which adds the default graph to the store. That changes the size of the set of contexts while it is still being iterated, which raises the RuntimeError.
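
The interaction described above can be modelled with toy classes. This is a sketch under stated assumptions, not RDFLib's implementation: ToyStore, ToyDataset, nested_walk, the snapshot flag, and the DEFAULT identifier are all hypothetical, and the trigger used here (a second walk over the contexts started while the first is still in progress) is an assumed scenario chosen to show how the trailing yield can mutate the set under a live iterator.

```python
class ToyStore:
    """Hypothetical stand-in for a memory store; not RDFLib's class."""

    def __init__(self):
        self._all_contexts = set()

    def add_graph(self, name):
        self._all_contexts.add(name)

    def contexts(self, snapshot):
        # snapshot=True mirrors the list() fix; False iterates the live set.
        source = list(self._all_contexts) if snapshot else self._all_contexts
        return (c for c in source)


class ToyDataset:
    DEFAULT = "urn:x-toy:default"  # hypothetical default graph identifier

    def __init__(self, store, snapshot):
        self.store = store
        self.snapshot = snapshot

    def graph(self, name):
        # Fetching a graph registers it with the store as a side effect.
        self.store.add_graph(name)
        return name

    def contexts(self):
        default = False
        for c in self.store.contexts(self.snapshot):
            default |= c == self.DEFAULT
            yield c
        if not default:
            # Lazily creates the default graph, mutating the store's set.
            yield self.graph(self.DEFAULT)


def nested_walk(ds):
    # The inner walk finishes by creating the default graph while
    # the outer walk is still part-way through the same set.
    out = []
    for c in ds.contexts():
        list(ds.contexts())
        out.append(c)
    return out
```

With snapshot=False the outer walk raises RuntimeError on its second step; with snapshot=True it completes and includes the default graph.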

@edmondchuc

I'm a bit puzzled as to why this only fails on the Python 3.8 tests in GitHub Actions while it passes fine on Python 3.8 locally.

@edmondchuc

To reproduce the failing test locally, run:

./with-fuseki.sh poetry run pytest -v test/test_dataset/test_dataset.py

@edmondchuc

The current Python 3.8 tests are extensive and include spinning up a Fuseki instance during the test run. These tests conditionally register the SPARQLUpdateStore plugin only if the Fuseki endpoint is reachable.

The issue arises from 63caed5, which modified Dataset.contexts() to call the store’s contexts() method directly. However, not all stores guarantee that their contexts() implementation returns graph objects. For example, SPARQLUpdateStore returns graph names as URIRefs instead. To ensure contexts() always returns graph objects, we delegate to ConjunctiveGraph.contexts(), which converts graph name URIRefs into graph objects.
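
The conversion step can be illustrated with a small sketch. It assumes a store may yield either graph objects or bare graph names; ToyGraph and normalise_contexts are hypothetical names used only for illustration and are not RDFLib's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyGraph:
    """Hypothetical stand-in for a graph object."""

    identifier: str


def normalise_contexts(raw_contexts, get_context):
    # Pass graph objects through unchanged; wrap anything else
    # (for example a bare graph name) via the get_context callable.
    for context in raw_contexts:
        if isinstance(context, ToyGraph):
            yield context
        else:
            yield get_context(context)
```

Callers then receive graph objects regardless of which form the underlying store returns.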

@nicholascar nicholascar self-requested a review October 30, 2025 02:47
@nicholascar nicholascar merged commit 8685a85 into 7.x Oct 30, 2025
20 checks passed
@nicholascar nicholascar deleted the v7/fix/memstore-graphs-set-change branch October 30, 2025 02:47

3 participants