Releases: neo4j/graph-data-science
1.2.0
Release date: May 7, 2020
Compatibility: GDS 1.2 is compatible with Neo4j 4.0.0 and above. GDS 1.2 is not compatible with Neo4j 3.5.x.
New Features:
- Triangle Count has moved into the product tier. This means it is now called via `gds.triangleCount`, and all associated bugs have been fixed. We have also added an optional `maxDegree` parameter that users can specify to eliminate dense nodes and speed up calculations.
  - This adds the following new procedures: `gds.triangleCount.stream`, `gds.triangleCount.write`, `gds.triangleCount.mutate`, `gds.triangleCount.stats`
  - This removes the alpha procedures `gds.alpha.triangleCount.stream`, `gds.alpha.triangleCount.write`, and `gds.alpha.triangleCount.stats`
  - The global triangle count is available as an output of the `.stats` or `.write` mode
- Clustering Coefficient has moved into the product tier (and is now separate from triangleCount). It can now be called via `gds.localClusteringCoefficient`, and all bugs have been fixed.
  - This adds the following new procedures: `gds.localClusteringCoefficient.stream`, `gds.localClusteringCoefficient.write`, `gds.localClusteringCoefficient.mutate`, `gds.localClusteringCoefficient.stats`
  - The global clustering coefficient is available as an output of the `.stats` or `.write` mode
- Graph export has moved to the `beta` tier, and can now export an in-memory graph as a new database in Neo4j 4.0’s multidatabase environment.
- All of our product tier community detection algorithms now support assigning consecutive integers for community IDs by using the optional `consecutiveID` parameter.
- `gds.graph.list` now outputs `sizeInBytes` and `memoryUsage` to enable users to see the memory footprint of loaded graphs.
- We have added node label filters to `gds.graph.writeNodeProperties`, `gds.graph.removeNodeProperties`, and `gds.util.nodeProperty`
- We have added a `centralityDistribution` result field for PageRank’s write, stats, and mutate procedures
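To make the triangle count, `maxDegree`, and local clustering coefficient items above more concrete, here is a minimal Python sketch of what these quantities mean on a small undirected graph. It is not the library's implementation: the function names are hypothetical, and the handling of dense nodes (skipping them and reporting 0) is an assumption about how a `maxDegree`-style filter could behave, not a statement of what GDS returns.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops and duplicate edges."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def triangle_counts(edges, max_degree=None):
    """Count the triangles each node takes part in.

    Assumption for illustration: nodes with degree above max_degree are
    treated as dense, skipped, and reported with a count of 0.
    """
    adj = build_adjacency(edges)
    dense = set()
    if max_degree is not None:
        dense = {n for n, nbrs in adj.items() if len(nbrs) > max_degree}
    counts = {}
    for node, nbrs in adj.items():
        if node in dense:
            counts[node] = 0
            continue
        usable = nbrs - dense
        # A triangle exists for every pair of neighbours that are themselves connected.
        counts[node] = sum(1 for u, v in combinations(usable, 2) if v in adj[u])
    return counts


def local_clustering_coefficients(edges):
    """Triangles through a node divided by the number of possible neighbour pairs."""
    adj = build_adjacency(edges)
    counts = triangle_counts(edges)
    coefficients = {}
    for node, nbrs in adj.items():
        degree = len(nbrs)
        possible = degree * (degree - 1) / 2
        coefficients[node] = counts[node] / possible if possible else 0.0
    return coefficients


edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
counts = triangle_counts(edges)
# Each triangle is seen once from each of its three corners.
global_triangles = sum(counts.values()) // 3
coefficients = local_clustering_coefficients(edges)
```

In this example the only triangle is 0-1-2, so the global count is 1. Calling `triangle_counts(edges, max_degree=2)` treats node 2 (degree 3) as dense, which removes that triangle from every node's count and illustrates why such a filter trades completeness for speed.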
Breaking changes:
- GDS 1.2 has dropped support for Neo4j 3.5, and now only works in 4.0
- `gds.alpha.triangle.stream` has been renamed `gds.alpha.triangles`
- `storeDir` has been removed from `gds.graph.export`; instead it will create a new database in the `databases` directory of your current installation.
- `creationTime` and `modificationTime` have been updated to use ZonedDateTime
- We have removed explicit definition of property mappings and aggregations from cypher projections
Bug fixes:
- Fixed `gds.graph.writeNodeProperties` where it did not return the count of mutated properties, and incorrectly wrote 0 for nodes which were missing properties.
- Graphs created via Cypher projections no longer return inferred projections (they just return the query)
- Corrected a bug where mutated node properties had size 0
- Fixed a bug where community detection using a seed property from an in-memory node property failed to write results.
- Fixed a bug where similarity algorithms would throw on weight vectors containing null
GDS 1.1.1 Release (compatible with Neo4j 3.5)
Release date: 4 May, 2020
GDS 1.1.1 is compatible with Neo4j 3.5.9 and above, but not Neo4j 4.x. For a 4.0 compatible release, please see GDS 1.2.0.
Bug fixes:
- Fixed several bugs with mutated node properties:
- Mutated node properties no longer have size 0
- gds.graph.writeNodeProperties previously did not return counts for node properties; this has been fixed
- Fixed a bug where progress logging could slow down algorithm performance
- gds.graph.writeNodeProperties no longer writes 0 to nodes missing the specified property
- Fixed a bug where seeding from an in-memory node property caused no results to be written back.
1.2.0-alpha01
Release date: April 24, 2020
Important information: This is a preview release and not recommended for production. We plan on offering a GA version of our 1.2 library on May 7. If you have feedback on the preview, please open an issue on our github repo!
Compatibility: GDS 1.2 is compatible with Neo4j 4.0.0 and above.
New Features:
- Triangle Count has moved into the product tier. This means it is now called via `gds.triangleCount`, and all associated bugs have been fixed.
  - This adds the following new procedures: `gds.triangleCount.stream`, `gds.triangleCount.stream.estimate`, `gds.triangleCount.write`, `gds.triangleCount.write.estimate`, `gds.triangleCount.stats`, `gds.triangleCount.stats.estimate`, `gds.triangleCount.mutate`, and `gds.triangleCount.mutate.estimate`
  - This removes the alpha procedures `gds.alpha.triangleCount.stream`, `gds.alpha.triangleCount.write`, `gds.alpha.triangleCount.stats`
- Graph export has moved to the `beta` tier, and can now export an in-memory graph as a new database in Neo4j 4.0’s multidatabase environment.
- All of our product tier community detection algorithms now support assigning consecutive integers for community IDs by using the optional `consecutiveID` parameter.
- We have added a schema column to `graph.list()` to display a unified view of the schema of the in memory graph (node labels, node properties, relationship types, relationship properties). This includes any new properties or relationships introduced by using `mutate` mode.
- We have added node label filters to `gds.graph.writeNodeProperties`, `gds.graph.removeNodeProperties`, and `gds.util.nodeProperty`
- `gds.graph.list` now outputs `sizeInBytes` and `memoryUsage` to enable users to see the memory footprint of loaded graphs.
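As an illustration of what assigning consecutive integers for community IDs means, the following Python sketch remaps arbitrary community identifiers to `0, 1, 2, ...`. The function name is hypothetical, and the numbering order used here (first appearance when nodes are visited in sorted order) is an assumption; the notes above do not say which order the library uses.

```python
def to_consecutive_ids(community_by_node):
    """Remap arbitrary community IDs to consecutive integers starting at 0.

    Assumption for illustration: new IDs are handed out in order of first
    appearance while visiting nodes in sorted order.
    """
    mapping = {}
    result = {}
    for node in sorted(community_by_node):
        community = community_by_node[node]
        if community not in mapping:
            mapping[community] = len(mapping)
        result[node] = mapping[community]
    return result


# Community IDs as an algorithm might emit them: valid, but sparse.
raw = {0: 42, 1: 42, 2: 7, 3: 1900}
compact = to_consecutive_ids(raw)
```

Nodes that shared a community before the remapping still share one afterwards; only the labels change, which makes the result easier to use as an array index or a category value.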
Breaking changes:
- We have dropped support for Neo4j 3.5
- `storeDir` has been removed from `gds.graph.export`; instead it will create a new database in the `databases` directory of your current installation.
- `creationTime` and `modificationTime` have been updated to use ZonedDateTime
- We have removed the explicit definition of property mappings and aggregations in cypher projections.
Bug fixes:
- Fixed `gds.graph.writeNodeProperties` where it did not return the count of mutated properties, and incorrectly wrote 0 for nodes which were missing properties.
- Graphs created via Cypher projections no longer return inferred projections (they just return the query)
- Corrected a bug where mutated node properties had size 0
- Fixed a bug where community detection using a seed property from an in-memory node property failed to write results.
1.1.0
Release date: 9 April, 2020
New features:
- Multiple node label support: You can now load, and reference, multiple node labels in your in memory graph. This allows you to load all the data you need for multiple algorithms once and refer to specific node labels when you call each algorithm. This is specified by the new parameter `nodeLabels`.
- Graph mutability: We’ve introduced the ability to update your in-memory analytics graph when you execute product supported and beta algorithms. This allows you to chain together multiple algorithms, and only write back your final results to your Neo4j database. To support this we’ve added procedures to:
  - Update your in memory graph with the `.mutate` mode for product supported and beta procedures (not available for alpha)
  - Write mutated data back to your Neo4j database with `graph.writeNodeProperties` and `graph.writeRelationship`
  - Remove data from your in memory graph with `gds.graph.removeNodeProperties` and `gds.graph.deleteRelationships`
  - Inspect your in memory graph with `gds.util.nodeProperty`, which allows you to retrieve node properties from a named in memory graph.
- Heap control: All gds procedures execute on heap, and this may lead to OOMs, so we now block the execution of algorithms that will require more memory than currently available.
- Graph Export: We’ve added a catalog procedure, `gds.alpha.graph.export`, to allow you to create new Neo4j databases from named in-memory graphs.
- Other features to improve user experience:
  - `gds.graph.list` now returns a timestamp to indicate when a graph was created as well as a modificationDate to indicate if/when a graph was mutated.
  - Removed redundant `write` config parameter from several alpha procedures
  - Added `validateRelationships` parameter to Cypher projections, to control behaviour when a user accidentally specifies relationships between non-existent nodes (specify drop or fail)
  - New aggregation type, `COUNT`, to summarize parallel relationships based on their total count
  - When a user misconfigures a procedure call, the error message now suggests the possible misspelled key
  - Better error messaging when a user accidentally loads an empty graph
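The `COUNT` aggregation mentioned above collapses parallel relationships between the same pair of nodes into a single relationship whose property is the number of originals. A minimal Python sketch of that idea follows; the function name and the tuple-based representation are hypothetical and do not reflect how the library stores relationships.

```python
from collections import Counter


def aggregate_count(relationships):
    """Collapse parallel (source, target) pairs into one entry carrying their count."""
    return dict(Counter(relationships))


# Three parallel relationships from node 1 to node 2, and one from 2 to 3.
relationships = [(1, 2), (1, 2), (2, 3), (1, 2)]
aggregated = aggregate_count(relationships)
```

The sketch treats `(1, 2)` and `(2, 1)` as different pairs, that is, it assumes directed relationships; an undirected variant would normalise each pair before counting.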
Bug fixes:
- `gds.alpha.spanningTree` has been fixed so it now creates relationship properties
- Fixed `.estimate` function when run on an in memory graph created via `gds.graph.generate`
- Fixed out of order and incorrect progress logging for `gds.nodeSimilarity`
- Removed default value of `null` for graph names in `gds.graph.exists()`
Breaking changes:
- Parallel cypher node loading (via use of `SKIP` and `LIMIT`) has been disabled -- data are already loaded in parallel, and this method results in significantly worse performance than the default mode.
The GDS library is compatible with Neo4j 3.5 versions 3.5.9 and above.
The GDS library is not compatible with Neo4j 4.0. We plan on releasing a 4.0 compatible version in late April 2020.
Feedback? Please post feedback as issues on our github repo!
GDS 1.1 Preview
Release date: 26 March, 2020
New features:
- Multiple node label support: You can now load, and reference, multiple node labels in your in memory graph. This allows you to load all the data you need for multiple algorithms once and refer to specific node labels when you call each algorithm. This is specified by the new parameter `nodeLabels`.
- Graph mutability: We’ve introduced the ability to update your in-memory analytics graph when you execute product supported and beta algorithms. This allows you to chain together multiple algorithms, and only write back your final results to your Neo4j database. To support this we’ve added procedures to:
  - Update your in memory graph with the `.mutate` mode for product supported and beta procedures (not available for alpha)
  - Write mutated data back to your Neo4j database with `graph.writeNodeProperties` and `graph.writeRelationship`
  - Inspect your in memory graph with `gds.util.nodeProperty`, which allows you to retrieve node properties from a named in memory graph.
- Heap control: All gds procedures execute on heap, and this may lead to OOMs, so we now block the execution of algorithms that will require more memory than currently available.
- Other features to improve user experience:
  - `gds.graph.list` now returns a timestamp to indicate when a graph was created
  - Removed redundant `write` config parameter from several alpha procedures
  - Added `validateRelationships` parameter to Cypher projections, to control behaviour when a user accidentally specifies relationships between non-existent nodes (specify drop or fail)
  - New aggregation type, `COUNT`, to summarize parallel relationships based on their total count
  - When a user misconfigures a procedure call, the error message now suggests the possible misspelled key
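The misspelled-key suggestion in the list above can be pictured with a short Python sketch built on the standard library's `difflib.get_close_matches`. This is only one plausible way to produce such a hint: the function name, the message wording, and the example configuration keys are all hypothetical and are not taken from the library.

```python
from difflib import get_close_matches


def check_config(config, allowed_keys):
    """Return one error message per unknown key, with a suggestion when one is close."""
    errors = []
    for key in config:
        if key in allowed_keys:
            continue
        # n=1 keeps only the best candidate above difflib's default similarity cutoff.
        suggestions = get_close_matches(key, allowed_keys, n=1)
        if suggestions:
            errors.append(
                f"Unexpected configuration key '{key}' (Did you mean '{suggestions[0]}'?)"
            )
        else:
            errors.append(f"Unexpected configuration key '{key}'")
    return errors


# Hypothetical set of keys a procedure might accept.
allowed = ["maxIterations", "tolerance"]
errors = check_config({"maxIteration": 10}, allowed)
```

A key that resembles nothing in the allowed list still produces an error, just without a suggestion, so typos and genuinely unsupported options are both reported.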
Bug fixes:
- `gds.alpha.spanningTree` has been fixed so it now creates relationship properties
- Fixed `.estimate` function when run on an in memory graph created via `gds.graph.generate`
- Fixed out of order and incorrect progress logging for `gds.nodeSimilarity`
- Removed default value of `null` for graph names in `gds.graph.exists()`
Breaking changes:
- Parallel cypher node loading (via use of `SKIP` and `LIMIT`) has been disabled -- data are already loaded in parallel, and this method results in significantly worse performance than the default mode.
The GDS library is compatible with Neo4j 3.5 versions greater than 3.5.8
The GDS library is not compatible with Neo4j 4.0. We plan on releasing a 4.0 compatible version in late April 2020.
Feedback? Please post feedback as issues on our github repo!
1.0.0
Release Date: 5 March, 2020
Important information: The graph data science library is not compatible with Neo4j 4.0 or with the Neo4j graph algorithms plugin.
Highlights:
- Alpha, Beta, and Production Quality Algorithms: We’ve divided the algorithms library into three tiers to indicate the level of support and testing that has gone into each one. Production quality algorithms have been highly optimized and are guaranteed to scale and execute quickly; alpha algorithms are experimental labs implementations; and beta algorithms are candidates for product support.
- New API: In order to simplify the use of the graph data science library and create a more intuitive, expressive, and easy to use surface, the GDS has a new surface and API. Major changes include specifying nodes and relationships in the configuration parameters, specifying directionality in a single location (only when graphs are initially loaded), and the additions of `.write`, `.stats`, and `.estimate` modes.
- Better error messaging: To help users adjust to the new library, and to prevent common gotchas, we’ve invested in intuitive and easy to understand error messages. Trying to run an algorithm that requires a directed graph on an undirected projection? We’ll tell you!
- Named Graphs: Previously available in the graph algorithms library, but underutilized, the recommended workflow in the GDS is to use named graphs to load data into, and manage, an in-memory analytics graph. This approach is compatible with production deployment of algorithm or analytics workloads on graphs that contain billions of nodes and relationships.
- Graph Catalog Operations: We’ve introduced graph catalog operations to enable users to create, reshape, and manage multiple named graphs for GDS workflows.
- Multiple Relationship Types and Node Labels: We support multiple node labels and relationship types in loaded graphs, including the ability to load multiple different relationship projections. This was previewed in recent graph algorithms releases, but we’ve expanded the flexibility and expressiveness of these operations under the GDS.
- Bug fixes and feature improvements: We support all the most popular algorithms from the labs library, and continue to provide bug fixes and enhancements.
Compatibility:
- The GDS library is compatible with Neo4j 3.5 versions greater than 3.5.8
- The GDS library is not compatible with Neo4j 4.0. We plan on releasing a 4.0 compatible binary in mid Q2 2020.
- The GDS library is not compatible with the graph algorithms library -- installing both plugins in the same database will result in an error message.
Ready to get started? Download the jar from the Neo4j download center, or on our github repo, and install it in your plugins directory. We’ve created a browser guide, `:play graph-data-science`, to familiarize users with the new surface, features, and workflows.
Feedback? Please post feedback as issues on our github repo!
Graph Data Science Library Release 0.9.1
neo4j-graph-data-science-0.9.1
Breaking changes
- Remove syntactic sugar variant that allows defining multiple projections via a piped string.
- Change `projection` key within a relationship projection to `orientation`.
New features
Bug fixes
- Fixed a bug where the thread pool size was always limited to 4
Improvements
Other changes
- `gds.alpha.balancedTriads.[write,stream]` is no longer supported and has been removed.
neo4j-graph-data-science-0.9.1-standalone.jar
This jar contains the Neo4j Graph Data Science Library plugin.
To use it in your Neo4j database, copy the jar into the plugins
directory of your Neo4j installation and restart the server.
For more information visit https://neo4j.com/docs/graph-data-science/preview/
Neo4j Graph Data Science library preview release 0.9.0
neo4j-graph-data-science-0.9.0
Breaking changes
- Product-supported algorithms (WCC, Louvain, Label Propagation, Page Rank and Node Similarity) have been ported to the new procedure API.
- Removed `beta` versions of graph catalog procedures
- Weighted algorithms now run on an unweighted graph copy unless a relationship weight is specified in the procedure configuration.
New features
- Added new catalog procedures:
  - `gds.graph.create` -- adds a graph to the catalog
  - `gds.graph.create.cypher` -- adds a graph to the catalog using Cypher
  - `gds.graph.list` -- lists information about graphs in the catalog
  - `gds.graph.exists` -- checks if a graph exists in the catalog
  - `gds.graph.drop` -- drops a graph from the catalog
  - `gds.graph.create.estimate` -- estimate memory usage for a graph
  - `gds.graph.create.cypher.estimate` -- estimate memory usage for a graph using Cypher
- Added new catalog function:
  - `gds.graph.exists` -- checks if a graph exists in the catalog
- Added support for multiple relationship types and properties in Cypher projection.
- For Cypher projections, node properties are initialized from the `RETURN` statement if no `nodeProperties` are given.
Bug fixes
- Fixed potential assertion error when running memory estimation on pre-sized graph.
- Fixed bug in Node Similarity where it would return wrong node ids.
- Fixed failure when assigning integer values to parameters that expect doubles in procedures.
neo4j-graph-data-science-0.9.0-standalone.jar
This jar contains the Neo4j Graph Data Science Library plugin.
To use it in your Neo4j database, copy the jar into the plugins
directory of your Neo4j 3.5 installation and restart the server.
For more information visit https://neo4j.com/docs/graph-data-science/preview/