Releases: neo4j/graph-data-science
Graph Data Science 1.8.5
GDS 1.8.5 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.
Bug fixes
- Fixed a bug where the optimized compressed memory format for GDS graph projections was unavailable.
Graph Data Science 1.8.4
GDS 1.8.4 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.
Bug fixes
- Fixed a bug where Cypher on GDS would try to access node properties as relationship properties and vice versa.
- Fixed a bug where gds.beta.graphSage would produce incorrect results when specifying the nodeLabels filter.
- Fixed a bug where mutate used in conjunction with the nodeLabels filter on graphs with multiple node labels and relationship types would sometimes not work correctly.
- Fixed a bug where the function gds.alpha.similarity.cosine and the procedures gds.alpha.similarity.cosine.[stats,stream,write] returned the absolute value of the cosine computation instead of the cosine value itself.
- Fixed a bug where an invalid license would prevent the Neo4j database from starting.
- Fixed a bug where gds.alpha.closeness might produce wrong results for directed graphs.
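To illustrate why the cosine similarity fix matters, here is a minimal Python sketch of a signed cosine similarity. It is not the GDS implementation; the function name cosine_similarity and the sample vectors are made up for illustration. Returning the absolute value would make vectors pointing in opposite directions look identical to vectors pointing in the same direction.

```python
import math


def cosine_similarity(a, b):
    """Signed cosine similarity of two equal-length, non-zero vectors.

    Hypothetical helper for illustration only, not part of GDS.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Vectors pointing in exactly opposite directions: the signed value is negative.
opposite = cosine_similarity([3.0, 4.0], [-3.0, -4.0])

# Taking the absolute value (the behaviour described in the bug) hides the sign.
opposite_abs = abs(opposite)
```

With the signed value, callers can tell anti-correlated vectors apart from correlated ones; with the absolute value, both cases collapse to the same score.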
Other changes
- Corrected the definition of trainFraction in the documentation and removed overly strict validation that required trainFraction + testFraction < 1.
Graph Data Science 1.8.3
GDS 1.8.3 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.
Bug fixes
- Fixed a bug where gds.beta.model.drop could not drop a previously stored model.
- Fixed a bug where gds.beta.model.publish resulted in unusable models if the model was also stored before.
- Fixed a bug where gds.alpha.ml.splitRelationships created relationships with incorrect ids when node label filtering is used.
- Fixed a bug where gds.alpha.ml.linkPrediction.predict produced relationships with incorrect node ids if node labels are filtered.
- Fixed a bug where gds.alpha.ml.pipeline.linkPrediction.predict.mutate produced relationships with incorrect node ids if node labels are filtered.
- Fixed a bug where gds.beta.graph.create.subgraph could associate nodes with the wrong properties.
Graph Data Science 1.8.2
Release Date: 13 January, 2022
GDS 1.8.2 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.
Bug fixes
- Fixed a bug where gds.alpha.ml.pipeline.nodeClassification.train would train a model under the wrong username, making it inaccessible to the actual user.
- Fixed a bug where gds.triangleCount and gds.localClusteringCoefficient might produce wrong results when using a nodeLabels filter.
- Fixed a bug where gds.graph.create or gds.graph.create.subgraph did not release allocated memory, which can lead to an OutOfMemoryException, especially when applied in a loop.
Graph Data Science 1.8.1
GDS 1.8.1 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.
Bug fixes
- Fixed a bug where ForkJoin pools were not properly closed which could lead to OOMs using Pregel-based algorithms, e.g. Page Rank.
- Fixed a bug where gds.beta.graphSage could produce incorrect results for small graphs.
- Fixed a bug where gds.beta.graphSage could produce incorrect results for the pool aggregator.
- Fixed a bug where gds.graph.create.cypher would not accept list properties for nodes.
- Fixed a bug in gds.beta.graph.create.subgraph where long values greater than 2^53 were not properly handled during expression evaluation.
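The threshold in the last fix above coincides with 2^53, the largest range in which a 64-bit floating-point number can represent every integer exactly. The release note does not state the root cause, so treating it as a long-to-double conversion issue is an assumption. The following Python sketch only demonstrates the general precision limit; the helper name survives_float_round_trip is hypothetical.

```python
# 2^53: above this, not every integer has an exact 64-bit float representation.
LIMIT = 2 ** 53


def survives_float_round_trip(value: int) -> bool:
    """Return True if converting the integer to a float and back is lossless.

    Hypothetical helper for illustration; unrelated to the GDS code base.
    """
    return int(float(value)) == value


# Just below the limit the round trip is exact.
below = survives_float_round_trip(LIMIT - 1)

# Just above the limit the value is rounded to a neighbouring float.
above = survives_float_round_trip(LIMIT + 1)
```

Any expression evaluator that routes 64-bit integers through floating-point arithmetic would show this kind of silent rounding for large ids or property values.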
Graph Data Science 1.7.3
GDS 1.7.3 is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x, 4.0, or 4.4. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.4 compatible release, please see GDS 1.8.0.
Bug fixes
- Fixed a bug where Node2Vec would produce an ArrayIndexOutOfBoundsException on sufficiently large graphs.
- Fixed a bug where ForkJoin pools were not properly closed which could lead to OOMs using Pregel-based algorithms, e.g. Page Rank.
GDS 1.8.0
GDS 1.8 is compatible with Neo4j 4.1, 4.2, 4.3, and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.
Breaking changes
- GDS now throws error messages on identifiers with trailing whitespace to avoid input errors. This affects graphName, modelName, and several property parameters such as nodeWeightProperty or seedProperty.
- We have removed the separate concurrency parameter from the model parameter space in gds.alpha.ml.nodeClassification.train, gds.alpha.ml.linkPrediction.train and gds.alpha.ml.pipeline.linkPrediction.configureParams. The concurrency value in the configuration of the train procedure will be used.
- The procedure gds.alpha.randomWalk.stream has graduated to the beta tier, as gds.beta.randomWalk.stream.
  - Random Walk has been improved and aligned with the Node2Vec implementation. Please consult the documentation to find out about the new configuration options.
  - gds.alpha.randomWalk.stream has been removed.
  - A memory estimation procedure, gds.beta.randomWalk.estimate, has been added.
- The procedure gds.beta.fastRPExtended has been merged with gds.fastRP.
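The trailing-whitespace check in the first breaking change can be pictured with a small Python sketch. This is not GDS code: the function validate_identifier and its error message are hypothetical, and the sketch only covers trailing whitespace as described above.

```python
def validate_identifier(parameter: str, value: str) -> str:
    """Reject identifiers that end with whitespace.

    Hypothetical validator for illustration; GDS performs its own
    validation inside the library and its messages will differ.
    """
    if value != value.rstrip():
        raise ValueError(
            f"`{parameter}` must not end with whitespace, got {value!r}"
        )
    return value


# A clean identifier passes through unchanged.
graph_name = validate_identifier("graphName", "myGraph")

# An identifier with a trailing space is rejected instead of silently
# creating a second, almost identically named graph.
try:
    validate_identifier("graphName", "myGraph ")
    rejected = False
except ValueError:
    rejected = True
```

Failing early here avoids the situation where "myGraph" and "myGraph " refer to two different catalog entries that look the same in most output.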
New features
- Link Prediction
  - Added a new link prediction stream procedure gds.alpha.ml.pipeline.linkPrediction.predict.stream.
  - Added probabilityDistribution and samplingStats to the result of gds.alpha.ml.pipeline.linkPrediction.predict.mutate.
  - To improve prediction performance, we’ve added a kNN-based approximate search strategy option to the link prediction procedures gds.alpha.ml.pipeline.linkPrediction.predict.stream|mutate.
  - Node property steps in Link Prediction pipelines can use a relationship property.
- Node Classification pipelines: similar to link prediction pipelines, we’ve added a pipeline procedure for node classification, where users can define the features, splitting strategy, and model training options. We’ve added:
  - gds.alpha.ml.pipeline.nodeClassification.create
  - gds.alpha.ml.pipeline.nodeClassification.addNodeProperty
  - gds.alpha.ml.pipeline.nodeClassification.selectFeatures
  - gds.alpha.ml.pipeline.nodeClassification.configureParams
  - gds.alpha.ml.pipeline.nodeClassification.configureSplit
  - gds.alpha.ml.pipeline.nodeClassification.train
  - gds.alpha.ml.pipeline.nodeClassification.predict.mutate|stream|write
- New algorithm: Conductance, gds.alpha.conductance.stream, can be used to compute a metric to evaluate the quality of communities identified by community detection algorithms.
- Added support for preserving a relationship property in gds.alpha.ml.splitRelationships.mutate.
- The procedure gds.fastRP has received additional configuration parameters:
  - featureProperties: to configure using node properties as part of the embedding.
  - propertyRatio: to control how much of the embedding is computed from properties.
  - nodeSelfInfluence: allows using each node's initial random vector as a contribution to the node's embedding. Especially useful for graphs with disconnected nodes.
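As a rough picture of what a conductance metric measures, the Python sketch below computes, per community, the share of relationship weight that leaves the community. This uses one common simplified definition (outgoing external weight divided by all outgoing weight of the community's nodes); whether it matches gds.alpha.conductance.stream exactly is an assumption, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def conductance(edges, community_of):
    """Per-community conductance under a simplified definition.

    edges: iterable of (source, target, weight) directed relationships.
    community_of: mapping from node id to community id.
    Illustration only; not the GDS implementation.
    """
    internal = defaultdict(float)
    external = defaultdict(float)
    for source, target, weight in edges:
        community = community_of[source]
        if community_of[target] == community:
            internal[community] += weight
        else:
            external[community] += weight

    result = {}
    for community in set(internal) | set(external):
        total = internal[community] + external[community]
        if total > 0:
            result[community] = external[community] / total
    return result


# Two communities: "a" = {0, 1}, "b" = {2, 3}; one relationship crosses from a to b.
sample_edges = [(0, 1, 1.0), (1, 0, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 2, 1.0)]
sample_communities = {0: "a", 1: "a", 2: "b", 3: "b"}
scores = conductance(sample_edges, sample_communities)
```

Lower values indicate communities whose relationships mostly stay inside the community, which is the usual sign of a good partition.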
Bug fixes
- Added a check that concurrency meets the determinism constraints for K-Nearest Neighbors whenever randomSeed is overridden.
- Fixed an ArrayIndexOutOfBounds error that could happen in triangle count on some graphs with multiple relationship types.
- Fixed an issue where seeded algorithms (such as WCC) on graphs with multiple node labels could assign seeded communities to new nodes.
- Fixed an issue where KNN did not add candidates to the topK result.
- Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.
- Fixed an issue where running gds.alpha.ml.pipeline.linkPrediction.train could result in an error on graphs filtered with the configuration parameter nodeLabels.
- Fixed an issue with unmapped Neo4j node ids throwing ArrayIndexOutOfBoundsException.
- Fixed a bug where the in-memory storage engine would not find the correct graph store if the db name was not lowercase.
- Fixed a bug where the graph store would be released when storing the CypherGraphStore in the catalog.
- Fixed a bug where Node2Vec would produce an ArrayIndexOutOfBounds error on sufficiently large graphs.
Improvements
- Added context information to log entries in debug and warning.
- Training loss is now logged as part of general progress logging.
- Running transactions while projecting a graph now has less chance of breaking the projected graph.
- Improved runtime performance for FastRP.
- FastRP now uses the Neo4j node id instead of the internal GDS node id when seeding generation of initial random vectors.
- The in-memory Cypher db is now capable of querying relationship ids, types and properties.
- The procedure gds.alpha.randomWalk.stream has been improved and should now run faster and more stably.
Graph Data Science 1.8.0-Preview
GDS 1.8 is compatible with Neo4j 4.1, 4.2, 4.3, and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.
Breaking changes
- GDS now throws error messages on identifiers with trailing whitespace to avoid input errors. This affects graphName, modelName, and several property parameters such as nodeWeightProperty or seedProperty.
- We have removed the separate concurrency parameter from the model parameter space in gds.alpha.ml.nodeClassification.train, gds.alpha.ml.linkPrediction.train and gds.alpha.ml.pipeline.linkPrediction.configureParams. The concurrency value in the configuration of the train procedure will be used.
- The procedure gds.alpha.randomWalk.stream has been improved and aligned with the Node2Vec implementation. Please consult the documentation to find out about the new configuration options.
- The procedure gds.beta.fastRPExtended has been merged with gds.fastRP.
New features
- Link Prediction
  - Added a new link prediction stream procedure gds.alpha.ml.pipeline.linkPrediction.predict.stream.
  - Added probabilityDistribution and samplingStats to the result of gds.alpha.ml.pipeline.linkPrediction.predict.mutate.
  - To improve prediction performance, we’ve added a kNN-based approximate search strategy option to the link prediction procedures gds.alpha.ml.pipeline.linkPrediction.predict.stream|mutate.
  - Node property steps in Link Prediction pipelines can use a relationship property.
- Node Classification pipelines: similar to link prediction pipelines, we’ve added a pipeline procedure for node classification, where users can define the features, splitting strategy, and model training options. We’ve added:
  - gds.alpha.ml.pipeline.nodeClassification.create
  - gds.alpha.ml.pipeline.nodeClassification.addNodeProperty
  - gds.alpha.ml.pipeline.nodeClassification.addFeatures
  - gds.alpha.ml.pipeline.nodeClassification.configureParams
  - gds.alpha.ml.pipeline.nodeClassification.configureSplit
  - gds.alpha.ml.pipeline.nodeClassification.train
  - gds.alpha.ml.pipeline.nodeClassification.predict.mutate|stream|write
- New algorithm: Conductance, gds.alpha.conductance.stream, can be used to compute a metric to evaluate the quality of communities identified by community detection algorithms.
- Added support for preserving a relationship property in gds.alpha.ml.splitRelationships.mutate.
- The procedure gds.fastRP has received additional configuration parameters:
  - featureProperties: to configure using node properties as part of the embedding.
  - propertyRatio: to control how much of the embedding is computed from properties.
  - nodeSelfInfluence: allows using each node's initial random vector as a contribution to the node's embedding. Especially useful for graphs with disconnected nodes.
Bug fixes
- Added a check that concurrency meets the determinism constraints for K-Nearest Neighbors whenever randomSeed is overridden.
- Fixed an ArrayIndexOutOfBounds error that could happen in triangle count on some graphs with multiple relationship types.
- Fixed an issue where seeded algorithms (such as WCC) on graphs with multiple node labels could assign seeded communities to new nodes.
- Fixed an issue where KNN did not add candidates to the topK result.
- Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.
- Fixed an issue where running gds.alpha.ml.pipeline.linkPrediction.train could result in an error on graphs filtered with the configuration parameter nodeLabels.
- Fixed an issue with unmapped Neo4j node ids throwing ArrayIndexOutOfBoundsException.
- Fixed a bug where the in-memory storage engine would not find the correct graph store if the db name was not lowercase.
- Fixed a bug where the graph store would be released when storing the CypherGraphStore in the catalog.
- Fixed a bug where Node2Vec would produce an ArrayIndexOutOfBounds error on sufficiently large graphs.
Improvements
- Added context information to log entries in debug and warning.
- Training loss is now logged as part of general progress logging.
- Running transactions while projecting a graph now has less chance of breaking the projected graph.
- Improved runtime performance for FastRP.
- FastRP now uses the Neo4j node id instead of the internal GDS node id when seeding generation of initial random vectors.
- The in-memory Cypher db is now capable of querying relationship ids, types and properties.
- The procedure gds.alpha.randomWalk.stream has been improved and should now run faster and more stably.
Graph Data Science 1.7.2
GDS 1.7.2 is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.
Bug fixes
- Fixed an issue where seeded algorithms (such as WCC) on graphs with multiple node labels could assign seeded communities to new nodes.
- Fixed an issue where KNN did not add candidates to the topK result.
- Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.
- Fixed an issue where running gds.alpha.ml.pipeline.linkPrediction.train could result in an error on graphs filtered with the configuration parameter nodeLabels.
- Fixed an issue with unmapped Neo4j node ids throwing ArrayIndexOutOfBoundsException.
GDS 1.1.7
GDS 1.1.7 is compatible with Neo4j 3.5.x. For a 4.x compatible release, please see GDS 1.7.2.
Bug fixes
- Fixed a bug in Louvain where changes to maxIterations were ignored.
- Fixed a bug which caused gds.graph.list and gds.graph.drop to throw an error when specifying a graph with duplicate property keys, by failing early.
- Fixed a bug where gds.alpha.scc would sometimes fail with an ArrayIndexOutOfBoundsException.
- Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.