@urbandroid urbandroid commented Aug 3, 2025

Hello,

This pull request constitutes a major architectural refactoring of the datanucleus-neo4j plugin. It replaces the legacy, embedded-only persistence implementation with a modern, robust backend powered by the official Neo4j Bolt Driver.

This migration addresses critical bugs in core persistence operations (especially updates and relationship management) and provides a more stable, performant, and maintainable foundation for the future.

Summary of Changes

The changes can be categorized into three main efforts: a full backend replacement, a strategic refactoring of existing code, and targeted bug fixes.

  1. New Bolt Driver Persistence Backend

The core of this PR is a completely new set of classes designed to work with the neo4j-java-driver.

Connection Management (pom.xml, ConnectionFactoryImpl, BoltConnectionFactoryImpl):

    The pom.xml now includes the neo4j-java-driver dependency.

    ConnectionFactoryImpl has been refactored into an intelligent delegate. It inspects the connection URL and seamlessly switches between the legacy embedded handler and the new BoltConnectionFactoryImpl for Bolt connections.

    The new BoltConnectionFactoryImpl manages the Bolt Driver, Session, and Transaction lifecycle, wrapping them in a DataNucleus ManagedConnection and an emulated XAResource.
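
The URL-based delegation described above could look roughly like the sketch below. The `BackendSelector` and `Backend` names, the `neo4j:` prefix handling, and the exact list of schemes routed to Bolt are assumptions made for illustration; they are not taken from the PR code.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch: pick a backend by inspecting the connection URL. */
final class BackendSelector {
    enum Backend { BOLT, EMBEDDED }

    // Assumption: any URI scheme accepted by the Neo4j Java driver goes to Bolt.
    private static final Set<String> BOLT_SCHEMES =
            Set.of("bolt", "bolt+s", "bolt+ssc", "neo4j", "neo4j+s", "neo4j+ssc");

    // Assumption: the DataNucleus URL carries a "neo4j:" datastore prefix.
    private static final String PREFIX = "neo4j:";

    static Backend select(String connectionUrl) {
        if (connectionUrl == null
                || !connectionUrl.toLowerCase(Locale.ROOT).startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a Neo4j connection URL: " + connectionUrl);
        }
        String remainder = connectionUrl.substring(PREFIX.length());
        int schemeEnd = remainder.indexOf("://");
        if (schemeEnd > 0) {
            String scheme = remainder.substring(0, schemeEnd).toLowerCase(Locale.ROOT);
            if (BOLT_SCHEMES.contains(scheme)) {
                return Backend.BOLT;
            }
        }
        // Anything without a recognised scheme is treated as an embedded store path.
        return Backend.EMBEDDED;
    }
}
```

Under these assumptions, `neo4j:bolt://localhost:7687` would select the Bolt factory, while a plain store location such as `neo4j:/tmp/graph.db` would fall through to the embedded handler.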

Persistence Operations (BoltPersistenceHandler, BoltFieldManager, BoltRelationshipManager):

    The old Neo4jPersistenceHandler has been deleted and replaced by BoltPersistenceHandler. This new class orchestrates all CRUD operations (insert, fetch, update, delete) using the Bolt transaction.

    The updateObject method is now fully implemented, fixing a critical bug where property changes were not saved to the database.

    BoltFieldManager handles the serialization of object fields into node properties.

    BoltRelationshipManager encapsulates the logic for creating and deleting relationships using idempotent Cypher MERGE queries.
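
As an illustration of the idempotent MERGE approach, here is a minimal sketch of how such a statement might be built. The `RelationshipCypher` class, the parameter names, and the use of `id(...)` for node lookup are assumptions; the actual queries in BoltRelationshipManager may differ.

```java
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch: build an idempotent relationship statement. */
final class RelationshipCypher {
    // Relationship types cannot be passed as Cypher parameters, so the name
    // is validated before it is placed into the query text.
    private static final Pattern SAFE_TYPE = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    static String mergeStatement(String relationshipType) {
        if (relationshipType == null || !SAFE_TYPE.matcher(relationshipType).matches()) {
            throw new IllegalArgumentException("Unsafe relationship type: " + relationshipType);
        }
        // MERGE only creates the relationship when no matching one exists,
        // so running the statement twice still leaves a single relationship.
        return "MATCH (a), (b) WHERE id(a) = $fromId AND id(b) = $toId "
             + "MERGE (a)-[r:`" + relationshipType + "`]->(b) RETURN r";
    }

    /** Node identifiers travel as parameters, never as concatenated text. */
    static Map<String, Object> parameters(long fromId, long toId) {
        return Map.of("fromId", fromId, "toId", toId);
    }
}
```

With the Bolt driver, the result would presumably be passed to something like `tx.run(RelationshipCypher.mergeStatement("KNOWS"), RelationshipCypher.parameters(1L, 2L))` inside the managed transaction.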

The Race Condition Fix (Immediate Node Caching):
A fundamental bug that prevented the creation of relationships for newly persisted objects in the same transaction has been resolved with a two-part mechanism:

    In BoltFieldManager, after a node is created, its native Node object is immediately cached in the DNStateManager.

    In BoltRelationshipManager, the related Node is retrieved directly from the StateManager's cache, not the database. This guarantees that the node is found even before the transaction is committed, making relationship persistence reliable.
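
The two-part mechanism can be sketched in isolation as follows. This uses a plain identity map as a stand-in for the StateManager's storage; `NodeRef` and `TransactionNodeCache` are hypothetical names and do not appear in the PR.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for the node handle returned when a node is created. */
final class NodeRef {
    final long id;
    NodeRef(long id) { this.id = id; }
}

/** Hypothetical per-transaction cache, standing in for state-manager storage. */
final class TransactionNodeCache {
    // Identity-based so two equal-but-distinct objects never share a node.
    private final Map<Object, NodeRef> nodesByObject = new IdentityHashMap<>();

    /** Part 1: called immediately after the node has been created. */
    void remember(Object pojo, NodeRef node) {
        nodesByObject.put(pojo, node);
    }

    /** Part 2: called when a relationship needs the related node. */
    Optional<NodeRef> lookup(Object pojo) {
        return Optional.ofNullable(nodesByObject.get(pojo));
    }

    /** Cleared at transaction end so stale handles are never reused. */
    void clear() {
        nodesByObject.clear();
    }
}
```

Because the lookup never touches the database, it succeeds for an object persisted earlier in the same, still uncommitted, transaction.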

  2. Code Refactoring and Separation of Concerns

To support the new backend and improve maintainability, existing code was heavily refactored:

Utility Class Separation: The monolithic Neo4jUtils (now deleted) has been broken down into logical, purpose-built utility classes:

    Neo4jSchemaUtils: For handling labels, surrogate columns, and metadata.

    Neo4jPropertyManager: For type conversion between Java and Neo4j.

    BoltPersistenceUtils: For Bolt-specific lookups and object creation.

    EmbeddedPersistenceUtils: Isolates the logic used only by the embedded driver.
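
To give a feel for the kind of Java-to-Neo4j type conversion a class like Neo4jPropertyManager has to perform, here is a small sketch. The `PropertyValues` class and the specific mappings chosen (enums as names, dates as epoch milliseconds, and so on) are assumptions for illustration only, not the conversions implemented in the PR.

```java
import java.math.BigDecimal;
import java.util.Date;
import java.util.UUID;

/** Hypothetical sketch: map Java field values to values storable as node properties. */
final class PropertyValues {
    static Object toStored(Object value) {
        if (value == null) {
            return null; // Neo4j has no null properties; the caller would skip or remove the key.
        }
        if (value instanceof Enum<?>) {
            return ((Enum<?>) value).name();
        }
        if (value instanceof Date) {
            return ((Date) value).getTime(); // epoch milliseconds as a long
        }
        if (value instanceof BigDecimal) {
            return ((BigDecimal) value).toPlainString();
        }
        if (value instanceof UUID || value instanceof Character) {
            return value.toString();
        }
        if (value instanceof Byte || value instanceof Short || value instanceof Integer) {
            return ((Number) value).longValue(); // widen to the 64-bit integer type
        }
        if (value instanceof Float) {
            return ((Float) value).doubleValue();
        }
        return value; // String, Long, Double, Boolean pass through unchanged
    }
}
```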

Query Engine Isolation:

    The query logic is now split. A new BoltCypherQuery class handles executing Cypher against the Bolt driver.

    A placeholder BoltJDOQLQuery has been added to provide clear error messages, indicating that JDOQL/JPQL is not yet supported on the new backend.

    The existing JDOQLQuery and CypherQuery now delegate to a new EmbeddedQueryEngine, cleanly separating them from the Bolt implementation.
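
A fail-fast placeholder of the kind described for BoltJDOQLQuery might be as simple as the sketch below. `PlaceholderQuery` and the message wording are hypothetical; the real class would extend the DataNucleus query base classes.

```java
/** Hypothetical sketch: a placeholder query that fails fast with a clear message. */
final class PlaceholderQuery {
    private final String language;

    PlaceholderQuery(String language) {
        this.language = language;
    }

    Object execute() {
        // Failing here, rather than returning an empty result, makes the gap obvious.
        throw new UnsupportedOperationException(
                language + " is not yet supported on the Bolt backend; use a Cypher query instead.");
    }
}
```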

  3. Test Suite and Plugin Configuration

    The plugin.xml has been simplified to register only the single, intelligent ConnectionFactoryImpl.

    The tests you provided (and fixed) in the initial problem description are now fully supported and serve as validation for the new backend's correctness for create, read, update, delete, and relationship management operations.

Impact

This is a foundational PR that brings the datanucleus-neo4j plugin into the modern era. It resolves long-standing stability issues, improves performance by using the official binary protocol, and provides a clear, maintainable architecture for future enhancements.

There is a lot more to do. Some gaps are clear: JDOQL and JPQL are not yet supported, so the new backend currently handles only Cypher queries. Others are less clear, for example whether persistence by reachability works in all settings. I would appreciate your feedback and guidance on completing the datanucleus-neo4j module.

I am happy to discuss any of the architectural decisions or implementation details. Thank you for your review.

@urbandroid
Author

Cascade delete is not working

@andyjefferson
Member

Just to say thank you for doing this, and please continue with your efforts.
The current Neo4j implementation was based on a very early Neo4j driver, and much has changed in Neo4j driver-space since that time, and an update to this plugin is long overdue. I will hopefully get around to looking at it in more detail in the next couple of weeks.
