Releases: neo4j/graph-data-science

Graph Data Science 1.8.5

14 Mar 23:54

GDS 1.8.5 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.

Bug fixes

  • Fixed a bug where the optimized compressed memory format for GDS graph projections was unavailable

Graph Data Science 1.8.4

08 Mar 15:45

GDS 1.8.4 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.

Bug fixes

  • Fixed a bug where Cypher on GDS would try to access node properties as relationship properties and vice versa.
  • Fixed a bug where gds.beta.graphSage would produce incorrect results when specifying the nodeLabels filter.
  • Fixed a bug where mutate used in conjunction with nodeLabels filter on graphs with multiple node labels and relationship types would sometimes not work correctly.
  • Fixed a bug where function gds.alpha.similarity.cosine and procedures gds.alpha.similarity.cosine.[stats,stream,write] returned the absolute value of the cosine computation, instead of the cosine value itself.
  • Fixed a bug where an invalid license would prevent the Neo4j database from starting
  • Fixed a bug where gds.alpha.closeness might produce wrong results for directed graphs.
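
To see why the cosine fix above matters, here is a minimal Python sketch (not GDS code; the `cosine` helper and the sample vectors are made up for illustration). Taking the absolute value makes two opposed vectors look as similar as two identical ones.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; the result lies in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


v = [1.0, 2.0]
w = [-1.0, -2.0]  # points in the opposite direction from v

signed = cosine(v, w)   # close to -1: the vectors are opposed
unsigned = abs(signed)  # close to +1: the sign, and with it the direction, is lost
```

With the signed value, callers can tell opposed vectors from aligned ones; with the absolute value they cannot.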

Other changes

  • Corrected the definition of trainFraction in documentation and removed overly strict validation that required trainFraction + testFraction < 1.
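
One way to read the relaxed validation, sketched in Python under an assumption the note itself does not spell out: that trainFraction is applied to the relationships left over after the test set has been taken out, rather than to the whole graph. Under that reading the two fractions can sum to more than 1 and still describe a valid split. The function and key names are hypothetical.

```python
def split_sizes(relationship_count, test_fraction, train_fraction):
    """Split counts, assuming train_fraction applies to the test complement."""
    test = round(relationship_count * test_fraction)
    remaining = relationship_count - test
    train = round(remaining * train_fraction)
    rest = remaining - train  # whatever is left after both sets are taken
    return {"test": test, "train": train, "rest": rest}


# 0.3 + 0.8 exceeds 1, yet every set is non-empty and the sizes add up.
sizes = split_sizes(1000, test_fraction=0.3, train_fraction=0.8)
```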

Graph Data Science 1.8.3

24 Jan 15:46

GDS 1.8.3 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.

Bug fixes

  • Fixed a bug where gds.beta.model.drop could not drop a previously stored model.
  • Fixed a bug where gds.beta.model.publish resulted in unusable models if the model was also stored before.
  • Fixed a bug where gds.alpha.ml.splitRelationships created relationships with incorrect ids when NodeLabel filtering is used.
  • Fixed a bug where gds.alpha.ml.linkPrediction.predict produced relationships with incorrect nodeIds if node labels are filtered.
  • Fixed a bug where gds.alpha.ml.pipeline.linkPrediction.predict.mutate produced relationships with incorrect nodeIds if node labels are filtered.
  • Fixed a bug where gds.beta.graph.create.subgraph could associate nodes with the wrong properties.

Graph Data Science 1.8.2

13 Jan 14:02

Release Date: 13 January, 2022

GDS 1.8.2 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.

Bug fixes

  • Fixed a bug where gds.alpha.ml.pipeline.nodeClassification.train would train a model under the wrong username and not be accessible for the actual user.
  • Fixed a bug where gds.triangleCount and gds.localClusteringCoefficient might produce wrong results when using a nodeLabels filter.
  • Fixed a bug where gds.graph.create or gds.graph.create.subgraph did not release allocated memory which can lead to an OutOfMemoryException, especially when applied in a loop.

Graph Data Science 1.8.1

20 Dec 17:15

GDS 1.8.1 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.

Bug fixes

  • Fixed a bug where ForkJoin pools were not properly closed which could lead to OOMs using Pregel-based algorithms, e.g. Page Rank.
  • Fixed a bug where gds.beta.graphSage could produce incorrect results for small graphs.
  • Fixed a bug where gds.beta.graphSage could produce incorrect results for the pool aggregator.
  • Fixed a bug where gds.graph.create.cypher would not accept list properties for nodes
  • Fixed a bug in gds.beta.graph.create.subgraph where long values greater than 2^53 were not properly handled during expression evaluation.
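
Reading the threshold in the last item as 2^53, a plausible cause is a round trip through a 64-bit floating-point value. That is an assumption, not something the note states. The Python sketch below shows only the general effect: integers above 2^53 are not all exactly representable as doubles.

```python
# Every integer with magnitude up to 2**53 is exactly representable as a double.
LIMIT = 2 ** 53

exact = LIMIT + 1
as_double = float(exact)     # rounds to the nearest representable double
round_trip = int(as_double)  # comes back as 2**53, not 2**53 + 1

lost_precision = round_trip != exact
```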

Graph Data Science 1.7.3

03 Dec 13:41

GDS 1.7.3 is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x, 4.0, or 4.4. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.4 compatible release, please see GDS 1.8.0.

Bug fixes

  • Fixed a bug where Node2Vec would produce an AIOOBE on sufficiently large graphs.
  • Fixed a bug where ForkJoin pools were not properly closed which could lead to OOMs using Pregel-based algorithms, e.g. Page Rank.

GDS 1.8.0

01 Dec 19:23

GDS 1.8 is compatible with Neo4j 4.1, 4.2, 4.3, and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5

Breaking changes

  • GDS now throws an error for identifiers with trailing whitespace, to avoid input errors. This affects graphName, modelName, and several property parameters such as nodeWeightProperty or seedProperty.
  • We have removed the separate concurrency parameter from the model parameter space in gds.alpha.ml.nodeClassification.train, gds.alpha.ml.linkPrediction.train and gds.alpha.ml.pipeline.linkPrediction.configureParams. The concurrency value in the configuration of the train procedure will be used.
  • The procedure gds.alpha.randomWalk.stream has graduated to the beta tier, as gds.beta.randomWalk.stream.
    • Random Walk has been improved and aligned with the Node2Vec implementation. Please consult the documentation to find out about the new configuration options.
    • gds.alpha.randomWalk.stream has been removed.
    • A memory estimation procedure, gds.beta.randomWalk.estimate has been added
  • The procedure gds.beta.fastRPExtended has been merged with gds.fastRP.
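
The trailing-whitespace check in the first item above can be pictured with a small Python sketch. The helper name and the error wording are hypothetical and do not reproduce the GDS implementation. The point is only that `"myGraph "` and `"myGraph"` are different identifiers, so failing loudly is safer than silently creating or missing a graph.

```python
def validate_identifier(value: str, parameter: str) -> str:
    """Return value unchanged, or raise if it ends with whitespace (hypothetical helper)."""
    if value != value.rstrip():
        raise ValueError(
            f"`{parameter}` must not end with whitespace characters, but got `{value}`."
        )
    return value


graph_name = validate_identifier("myGraph", "graphName")  # passes through unchanged

try:
    validate_identifier("myGraph ", "graphName")  # trailing space is rejected
    rejected = False
except ValueError:
    rejected = True
```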

New features

  • Link Prediction
    • Added a new link prediction stream procedure gds.alpha.ml.pipeline.linkPrediction.predict.stream.
    • Added probabilityDistribution and samplingStats to the result of gds.alpha.ml.pipeline.linkPrediction.predict.mutate.
    • To improve prediction performance, we’ve added a kNN-based approximate search strategy option to link prediction procedures gds.alpha.ml.pipeline.linkPrediction.predict.stream|mutate.
    • Node property steps in Link Prediction pipelines can use a relationship property.
  • Node Classification pipelines: similar to link prediction pipelines, we’ve added a pipeline procedure for node classification, where users can define the features, splitting strategy, and model training options. We’ve added:
    • gds.alpha.ml.pipeline.nodeClassification.create
    • gds.alpha.ml.pipeline.nodeClassification.addNodeProperty
    • gds.alpha.ml.pipeline.nodeClassification.selectFeatures
    • gds.alpha.ml.pipeline.nodeClassification.configureParams
    • gds.alpha.ml.pipeline.nodeClassification.configureSplit
    • gds.alpha.ml.pipeline.nodeClassification.train
    • gds.alpha.ml.pipeline.nodeClassification.predict.mutate|stream|write
  • New algorithm: Conductance, gds.alpha.conductance.stream, can be used to compute a metric to evaluate the quality of communities identified by community detection algorithms.
  • Added support for preserving a relationship property in gds.alpha.ml.splitRelationships.mutate.
  • The procedure gds.fastRP has received additional configuration parameters:
    • featureProperties: to configure using node properties as part of the embedding.
    • propertyRatio: to control how much of the embedding is computed from properties.
    • nodeSelfInfluence: allows using each node's initial random vector as a contribution to the node's embedding. Especially useful for graphs with disconnected nodes.
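
As a rough illustration of the conductance metric mentioned above, the Python sketch below uses one common formulation: for each community, the fraction of relationships starting inside it that end outside it. This is an assumption for illustration and may differ in detail from what gds.alpha.conductance.stream computes. The function and the sample graph are made up.

```python
from collections import defaultdict


def conductance(edges, community_of):
    """Map each community to the share of its outgoing relationships that leave it.

    edges: iterable of (source, target) pairs.
    community_of: dict mapping node -> community id.
    """
    leaving = defaultdict(int)
    total = defaultdict(int)
    for source, target in edges:
        community = community_of[source]
        total[community] += 1
        if community_of[target] != community:
            leaving[community] += 1
    return {community: leaving[community] / total[community] for community in total}


# Two triangles joined by one bridge, stored as directed pairs in both directions.
undirected = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
edges = undirected + [(t, s) for s, t in undirected]
communities = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

scores = conductance(edges, communities)  # lower means a more tightly knit community
```

In this example each community has seven outgoing relationships, one of which crosses the bridge, so both score 1/7.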

Bug fixes

  • Added check that concurrency is meeting determinism constraints for K-Nearest Neighbors whenever randomSeed is overridden.
  • Fixed an ArrayIndexOutOfBounds error that could happen in triangle count on some graphs with multiple relationship types.
  • Fixed an issue where seeded algorithms (such as WCC) on graphs with multiple node labels could assign seeded communities to new nodes.
  • Fixed an issue where KNN did not add candidates to the topK result.
  • Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.
  • Fixed an issue where running gds.alpha.ml.pipeline.linkPrediction.train could result in an error on graphs filtered with the configuration parameter nodeLabels.
  • Fixed an issue with unmapped Neo4j node ids throwing ArrayIndexOutOfBoundsException.
  • Fixed a bug where the in-memory storage engine would not find the correct graph store if the db name was not lowercase
  • Fixed a bug where the graph store would be released when storing the CypherGraphStore in the catalog
  • Fixed a bug where Node2Vec would produce an ArrayIndexOutOfBounds error on sufficiently large graphs.
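
The K-Nearest Neighbors determinism check in the first item of this list might look like the Python sketch below. It assumes that the constraint is single-threaded execution whenever a seed is given. The release note does not state the exact rule, and the function name and message are hypothetical.

```python
from typing import Optional


def check_knn_determinism(concurrency: int, random_seed: Optional[int]) -> None:
    """Raise if a seed is supplied while running with more than one thread (assumed rule)."""
    if random_seed is not None and concurrency != 1:
        raise ValueError(
            "Configuration parameter `randomSeed` requires `concurrency` to be 1, "
            f"but got {concurrency}."
        )


check_knn_determinism(concurrency=4, random_seed=None)  # no seed: any concurrency is fine
check_knn_determinism(concurrency=1, random_seed=42)    # seeded and single-threaded: fine

try:
    check_knn_determinism(concurrency=4, random_seed=42)  # seeded and parallel: rejected
    seeded_parallel_rejected = False
except ValueError:
    seeded_parallel_rejected = True
```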

Improvements

  • Added context information to log entries in debug and warning.
  • Training loss is now logged as part of general progress logging.
  • Running transactions while projecting a graph now has less chance of breaking the projected graph.
  • Improved runtime performance for FastRP.
  • FastRP now uses the Neo4j node id instead of the internal GDS node id when seeding generation of initial random vectors.
  • The in-memory Cypher db is now capable of querying relationship ids, types and properties.
  • The procedure gds.alpha.randomWalk.stream has been improved and should now run faster and be more stable.

Graph Data Science 1.8.0-Preview

26 Nov 15:20

GDS 1.8 is compatible with Neo4j 4.1, 4.2, 4.3, and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5

Breaking changes

  • GDS now throws an error for identifiers with trailing whitespace, to avoid input errors. This affects graphName, modelName, and several property parameters such as nodeWeightProperty or seedProperty.
  • We have removed the separate concurrency parameter from the model parameter space in gds.alpha.ml.nodeClassification.train, gds.alpha.ml.linkPrediction.train and gds.alpha.ml.pipeline.linkPrediction.configureParams. The concurrency value in the configuration of the train procedure will be used.
  • The procedure gds.alpha.randomWalk.stream has been improved and aligned with the Node2Vec implementation. Please consult the documentation to find out about the new configuration options.
  • The procedure gds.beta.fastRPExtended has been merged with gds.fastRP.

New features

  • Link Prediction
    • Added a new link prediction stream procedure gds.alpha.ml.pipeline.linkPrediction.predict.stream.
    • Added probabilityDistribution and samplingStats to the result of gds.alpha.ml.pipeline.linkPrediction.predict.mutate.
    • To improve prediction performance, we’ve added a kNN-based approximate search strategy option to link prediction procedures gds.alpha.ml.pipeline.linkPrediction.predict.stream|mutate.
    • Node property steps in Link Prediction pipelines can use a relationship property.
  • Node Classification pipelines: similar to link prediction pipelines, we’ve added a pipeline procedure for node classification, where users can define the features, splitting strategy, and model training options. We’ve added:
    • gds.alpha.ml.pipeline.nodeClassification.create
    • gds.alpha.ml.pipeline.nodeClassification.addNodeProperty
    • gds.alpha.ml.pipeline.nodeClassification.addFeatures
    • gds.alpha.ml.pipeline.nodeClassification.configureParams
    • gds.alpha.ml.pipeline.nodeClassification.configureSplit
    • gds.alpha.ml.pipeline.nodeClassification.train
    • gds.alpha.ml.pipeline.nodeClassification.predict.mutate|stream|write
  • New algorithm: Conductance, gds.alpha.conductance.stream, can be used to compute a metric to evaluate the quality of communities identified by community detection algorithms.
  • Added support for preserving a relationship property in gds.alpha.ml.splitRelationships.mutate.
  • The procedure gds.fastRP has received additional configuration parameters:
    • featureProperties: to configure using node properties as part of the embedding.
    • propertyRatio: to control how much of the embedding is computed from properties.
    • nodeSelfInfluence: allows using each node's initial random vector as a contribution to the node's embedding. Especially useful for graphs with disconnected nodes.

Bug fixes

  • Added check that concurrency is meeting determinism constraints for K-Nearest Neighbors whenever randomSeed is overridden.
  • Fixed an ArrayIndexOutOfBounds error that could happen in triangle count on some graphs with multiple relationship types.
  • Fixed an issue where seeded algorithms (such as WCC) on graphs with multiple node labels could assign seeded communities to new nodes.
  • Fixed an issue where KNN did not add candidates to the topK result.
  • Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.
  • Fixed an issue where running gds.alpha.ml.pipeline.linkPrediction.train could result in an error on graphs filtered with the configuration parameter nodeLabels.
  • Fixed an issue with unmapped Neo4j node ids throwing ArrayIndexOutOfBoundsException.
  • Fixed a bug where the in-memory storage engine would not find the correct graph store if the db name was not lowercase
  • Fixed a bug where the graph store would be released when storing the CypherGraphStore in the catalog
  • Fixed a bug where Node2Vec would produce an ArrayIndexOutOfBounds error on sufficiently large graphs.

Improvements

  • Added context information to log entries in debug and warning.
  • Training loss is now logged as part of general progress logging.
  • Running transactions while projecting a graph now has less chance of breaking the projected graph.
  • Improved runtime performance for FastRP.
  • FastRP now uses the Neo4j node id instead of the internal GDS node id when seeding generation of initial random vectors.
  • The in-memory Cypher db is now capable of querying relationship ids, types and properties.
  • The procedure gds.alpha.randomWalk.stream has been improved and should now run faster and be more stable.

Graph Data Science 1.7.2

01 Nov 20:35

GDS 1.7.2 is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5

Bug fixes

  • Fixed an issue where seeded algorithms (such as WCC) on graphs with multiple node labels could assign seeded communities to new nodes.
  • Fixed an issue where KNN did not add candidates to the topK result.
  • Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.
  • Fixed an issue where running gds.alpha.ml.pipeline.linkPrediction.train could result in an error on graphs filtered with the configuration parameter nodeLabels.
  • Fixed an issue with unmapped Neo4j node ids throwing ArrayIndexOutOfBoundsException

GDS 1.1.7

01 Nov 20:31

GDS 1.1.7 is compatible with Neo4j 3.5.x. For a 4.x compatible release, please see GDS 1.7.2.

Bug fixes

  • Fixed a bug in Louvain where changes to maxIterations were ignored.
  • Fixed a bug which caused gds.graph.list and gds.graph.drop to throw an error when specifying a graph with duplicate property keys. GDS now fails early for such graphs.
  • Fixed a bug where gds.alpha.scc would sometimes fail with an ArrayIndexOutOfBoundsException.
  • Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.