Discussion : API for Graphs.jl 2.0 #146
Comments
Hi @etiennedeg, and thanks for this first draft!
Also, what should we use to implement traits? SimpleTraits.jl? |
I don't remember,
Of course, I just put some examples, but if
For some algorithms, like astar, we need a list indexed by vertices (here to store the heuristic distance at each vertex). If the vertices do not form a UnitRange, we can't use a list and have to require a Dict instead.
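To make the list-versus-Dict point concrete, here is a minimal sketch in plain Julia. The helper name `vertex_storage` is hypothetical and not part of Graphs.jl; it assumes that a range starting at 1 can be backed by a Vector and that anything else falls back to a Dict.

```julia
# Hypothetical helper (not part of Graphs.jl): build a container indexed by vertices.
# A range starting at 1 is backed by a Vector, everything else by a Dict.
function vertex_storage(vs::AbstractUnitRange{<:Integer}, default::K) where {K}
    first(vs) == 1 || return Dict{eltype(vs),K}(v => default for v in vs)
    return fill(default, length(vs))
end
vertex_storage(vs, default::K) where {K} = Dict{eltype(vs),K}(v => default for v in vs)

h_range = vertex_storage(1:4, Inf)            # Vector{Float64}
h_named = vertex_storage([:a, :b, :c], Inf)   # Dict{Symbol,Float64}

# Both containers are read and written with the vertex itself as the index.
h_range[2] = 1.5
h_named[:b] = 1.5
```

Algorithm code that only uses `container[v]` works unchanged with either backing store, which is the property astar would rely on.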
BiDirectional comes from the boost graph library. For some graph implementations it is much more costly to call
The point of allowing a more general vertex type is to avoid maintaining a range-based enumeration of vertices. If a graph is implemented in a way that provides a natural enumeration of vertices, then it should probably subtype
This is what is used for the moment, are there better alternatives? |
Makes me remember we should also define the API for the vertex / edge containers |
Thanks for considering another API. Weighted graphs especially need one. Should the functions vertices and edges be required?
Having vertices identified by the range 1:nv(g) simplifies a lot of algorithms. But why should providing an adjacency_matrix be necessary? Why can't the weight of an edge be zero? What should the function weight do? I would suggest RangeBasedGraph be a trait which only requires vertices to be 1:nv(g). I am not sure if |
Thank you for your feedback on the API.
I don't know why I didn't use the current
The adjacency matrix can easily be built by querying the neighbors of every vertex of the graph, so we can provide a default implementation. Graph types that want a customized implementation for better performance can provide one, but having it in the API does not add any burden on implementers; it just guarantees that such a function can be called.
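The default implementation described above could look roughly like the following sketch. `ToyGraph` and `default_adjacency_matrix` are hypothetical names, `nv` and `outneighbors` are defined locally to mimic the names used in this thread, and the sketch assumes vertices are 1:nv(g).

```julia
# Toy graph type for illustration only; nothing here comes from Graphs.jl.
struct ToyGraph
    adj::Vector{Vector{Int}}   # adj[v] lists the out-neighbors of vertex v
end
nv(g::ToyGraph) = length(g.adj)
outneighbors(g::ToyGraph, v::Integer) = g.adj[v]

# Generic fallback: works for any graph with vertices 1:nv(g) and `outneighbors`.
function default_adjacency_matrix(g)
    n = nv(g)
    A = falses(n, n)
    for u in 1:n, v in outneighbors(g, u)
        A[u, v] = true
    end
    return A
end

g = ToyGraph([[2, 3], [3], Int[]])
A = default_adjacency_matrix(g)
```

A graph type with a faster native representation would simply add its own, more specific method.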
I don't see where I said this. Currently,
Here, |
SimpleTraits can seriously degrade the usefulness of stack traces, and |
Hey @etiennedeg, Thanks a lot for your work! Here are a few more remarks. I really think this experiment belongs in a GraphsBase.jl package containing only the interface and a few basic implementations. It will make it easier for everyone to understand the progress, without it being cluttered by all the algorithms from Graphs.jl. See #135
Do we really need this concept of graphs indexed by a UnitRange anymore? If we rewrite all the algorithms to accept generic vertices, the only important thing is to have the right
The weight of an edge shouldn't even be constrained to Real, you can do shortest path computations on elements of an arbitrary ordered monoid
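As a sketch of what "weights that are not Real" could mean in practice, the following plain-Julia Dijkstra only asks the weight type for `+`, `zero`, and `isless`. Everything here is an assumption made for illustration: the names `LexWeight` and `generic_dijkstra`, the adjacency representation, and the requirement that adding a weight never decreases a distance. It uses a linear scan instead of a priority queue to stay dependency-free.

```julia
# A weight that is not a Real: compare by number of tolls first, then by distance.
struct LexWeight
    tolls::Int
    km::Float64
end
Base.:+(a::LexWeight, b::LexWeight) = LexWeight(a.tolls + b.tolls, a.km + b.km)
Base.zero(::Type{LexWeight}) = LexWeight(0, 0.0)
Base.isless(a::LexWeight, b::LexWeight) = isless((a.tolls, a.km), (b.tolls, b.km))

# Dijkstra over a generic weight type W (needs `+`, `zero(W)`, `isless`).
function generic_dijkstra(adj::Dict{V,Vector{Tuple{V,W}}}, source::V) where {V,W}
    dist = Dict{V,W}(source => zero(W))
    done = Set{V}()
    while true
        # pick the closest vertex that is not finalized yet (linear scan, no heap)
        best = nothing
        for (v, d) in dist
            v in done && continue
            if best === nothing || isless(d, dist[best])
                best = v
            end
        end
        best === nothing && break
        push!(done, best)
        for (nbr, w) in get(adj, best, Tuple{V,W}[])
            candidate = dist[best] + w
            if !haskey(dist, nbr) || isless(candidate, dist[nbr])
                dist[nbr] = candidate
            end
        end
    end
    return dist
end

adj = Dict(
    :a => [(:b, LexWeight(0, 5.0)), (:c, LexWeight(1, 1.0))],
    :b => [(:c, LexWeight(0, 5.0))],
)
dist = generic_dijkstra(adj, :a)
```

Here the toll-free detour through `:b` wins over the shorter direct road with a toll, because the ordering is lexicographic.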
Is this actually useful? I expect every user would implement their own DataEdge as a subtype of AbstractEdge anyway?
Why would vertices have a weight?
What is K here?
You put it for a subtype, why would this not be available for general AbstractGraphs?
I think we can safely exclude mixed graphs, IIRC they induce some pretty nasty things algorithmically.
What additional information does IsDirected give?
If the vertices are comparable there is always a canonical ordering, so we can always define an adjacency matrix (albeit with an expensive sort step)
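A minimal sketch of that canonical ordering, assuming only that vertices support `isless`. The helper name `canonical_index` and the sample data are hypothetical.

```julia
# Hypothetical helper: map arbitrary comparable vertices to 1:n via their sorted order.
function canonical_index(vertices)
    ordered = sort!(collect(vertices))   # the expensive step: O(n log n)
    return Dict(v => i for (i, v) in enumerate(ordered)), ordered
end

verts = ["paris", "berlin", "rome"]
edge_list = [("paris", "rome"), ("berlin", "paris")]

index, ordered = canonical_index(verts)
A = falses(length(ordered), length(ordered))
for (u, v) in edge_list
    A[index[u], index[v]] = true
    A[index[v], index[u]] = true   # undirected: keep the matrix symmetric
end
```

Row and column `i` of `A` then refer to `ordered[i]`, so the matrix is well defined for any vertex type with a total order.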
I think this is only useful if we want very strict tests that all trait interfaces are satisfied by a concrete implementation. AFAIK the way to perform such tests is not yet agreed upon, so maybe this is an unneeded layer of complexity
I would return the true or false that we have returned so far, or the graph object itself to be coherent with things like push!
I think deletion of adjacent edges is reasonable
In your proposal, there is no distinction between the vertex / the edge and its metadata. I kinda like it, but this raises the question: what happens when a user just wants to change the vertex or edge metadata (more generic than just the weight). Right now the only solution I see is to remove and reinsert
Can't we just make it a requirement everywhere? This is one of the most common requests anyway, people are pissed about vertex deletion shuffling everything |
It would be nice to also have a dict-like behavior, as in MetaGraphs(Next) |
Being able to handle infinite graphs out of the box would be a very cool asset |
While we're at it, we should probably be a bit more careful with the word |
Finally, a harder question: can we support hypergraphs, where "edges" are arbitrary subsets of the vertex set? I think not, and maybe that's okay |
This is a good point. The question is if the use of a vector indexed by vertices was the only use of the assumption of vertices forming a UnitRange. I'm inclined to think that we indeed do not require this assumption anymore, but that would need a check of the codebase to verify that there is indeed no other optimization coming from this. I see multiple occurrences of
I agree
This allows us to provide a common interface to access metadata, like what was discussed here. I have no strong opinion on how to design metadata for graphs, and don't know exactly what it would look like, so feel free to share how you would see an API for metadata.
Some algorithms may require vertex weights, but we currently do not support them, and they must be passed as an argument if an algorithm needs them. Maybe we can let users who want integer weights handle this with more general metadata, but we could also provide a common interface for it, I don't know. Again, I don't have a strong opinion on it.
This is the
These two points appear under
See #146 (comment) and Boost BidirectionalGraph. If we add
Ok, but I think there should be a way to add a vertex and also get a reference to the vertex just added, because it is not so trivial to retrieve it.
You can change the metadata of a vertex / edge without changing its reference. It will still be the same vertex / edge. You should not make the lexicographic order or the indexing depend on the metadata.
What do you mean ? By an API for vertex / edge containers, I mean clarifying the question on whether we can iterate these containers, if the iterations should satisfy the lexicographic ordering or so...
I don't feel like this should be in the interface, and I don't know of a wide use for it. Almost no algorithm in the codebase will satisfy this assumption, and I don't think I want to rewrite / maintain algorithms specialized on this assumption. I think that if someone wants to deal with infinite graphs, they can make a wrapper around a mutable graph that allocates during the exploration of the graph, and write their own algorithms for it.
I'm not sure renaming
Definitely not in Graphs.jl. We could maybe propose an API in the future GraphsBase.jl, but if we propose algorithms, that definitely needs a dedicated package. |
Hey @etiennedeg, I was just wondering how things are going? Want me to set up a GraphsBase.jl package to highlight your new interface and start getting feedback? |
Another reason why I want to put this in a GraphsBase.jl package first. As things stand, the Graphs.jl codebase as a whole will be violently incompatible with the new interface, so it doesn't make much sense to me to have both in the same place. Especially since the compatibility restoration effort will be large.
The main hurdle I see with metadata is that if vertices and edges carry their data with them (instead of just being identifiers), algorithms might end up allocating a lot more memory than needed when storing them (think the priority queue of Dijkstra).
I think weights should be handled as part of the metadata. Maybe we could have default edge and vertex metadata, with weights set to 0 and 1 respectively?
Not entirely sure I understand, but in any case this should probably be a
As long as we have a comparison operation, we can define a default integer indexing, even if it involves some allocations.
In this new interface, isn't the vertex itself the reference?
OK this is good.
So we're not changing the
I agree. I was mostly thinking of access to a single vertex, like defining what
Fair.
Is that why we're not changing SimpleGraph vertex stability, because of the wide use? We're tagging a breaking release anyway, so I don't think it's much of a stretch to make it safer if we don't lose performance in the process?
Agreed |
I just made a draft for Graphs 2.0, but I will keep the API discussion here. Here are some thoughts about this first shot:
I implemented the AbstractVertex Trait, but did not use it that much. Traits.jl is horrible when dealing with multiple Traits, or when Traits are not on the first argument. Maybe we can just allow Vertex to be anything? The only difference (I think) will be that instead of
I wonder how breaking it is to change
I defaulted the Traits
When calling
Also, we will probably need some more convenience optional interface methods for working with containers, for example a
I don't know how to design the
We need to settle the behavior for weights. As I currently implemented it:
The first one is not necessarily restrictive for WeightedGraphs, since the edges can be created on the fly, and the weights of a graph can still be stored globally as a matrix.
I used the code pattern:
for e in outedges(g, current)
    neighbor = src(e) == current ? dst(e) : src(e)
    # code
end
But it feels messy and suboptimal. Do we add an
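One way to hide the endpoint check from the loop body is a small helper, sketched here in plain Julia. `PlainEdge` and `other_endpoint` are hypothetical names and are not proposed anywhere in this thread; `src` and `dst` are defined locally.

```julia
# Minimal stand-in edge type for the sketch.
struct PlainEdge{T}
    src::T
    dst::T
end
src(e::PlainEdge) = e.src
dst(e::PlainEdge) = e.dst

# Return the endpoint of `e` that is not `v` (for a self-loop this is `v` itself).
other_endpoint(e, v) = src(e) == v ? dst(e) : src(e)

# Edges incident to vertex 1, stored in arbitrary orientation.
incident = [PlainEdge(1, 2), PlainEdge(3, 1), PlainEdge(1, 1)]
neighbors_of_1 = [other_endpoint(e, 1) for e in incident]
```

The comparison is still performed for every edge, so this only addresses the messiness, not the cost.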
For non-primitive datatypes, collections only hold references; the data is always stored in a single place, so there are no additional allocations.
I think so for vertex weights, as these are not of much use, and their default value is not very clear. In the current draft, I defined default weights for all edges (default weight of 1), but I'm not sure for the moment if it is the best solution.
We can probably use a type selector here
There is no problem in returning an object that is indexed by two vertices, or even by an edge. However, if we want to get an adjacency matrix to do linear algebra with it, an AbstractArray is definitely needed. We can indeed return an adjacency matrix indexed by a default integer indexing, but I don't think that should belong to the interface.
The 2.0 will definitely be breaking, but we should still limit the breakage as best we can. Furthermore, the lack of vertex stability is part of the performance of the library. The user is offered a trade-off: a fast graph with no vertex stability and slow mutation, or a slower graph with vertex stability and fast mutation.
That should be discussed, this looks a bit like:
of NetworkX. This is a totally different syntax than what we use now, I feel very mixed about it... |
Thank you so much for your work 🙏 Gonna take a look tonight or tomorrow! Any thoughts on putting this in a GraphsBase.jl? |
Besides I seem to remember John Lapeyre saying they mess up the stack traces. And they make it harder for beginners to implement their own graph types. My preference would be to use an abstract type or nothing at all, and document expectations thoroughly instead.
Besides the subtyping I don't see another issue.
Can we make it so that users don't have to define these traits themselves? Not sure which traits implementation you chose but there must be at least one where we can check the existence of methods. For instance,
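In plain Julia, one way to check for the existence of methods is `Base.hasmethod`. The sketch below derives a yes/no answer from it; the generic function `inneighbors` is declared locally, and `ToyBidirectional`, `ToyForwardOnly`, and `is_bidirectional` are hypothetical names. It assumes Int vertices.

```julia
function inneighbors end   # generic function with no methods yet

struct ToyBidirectional
    fwd::Vector{Vector{Int}}
    bwd::Vector{Vector{Int}}
end
struct ToyForwardOnly
    fwd::Vector{Vector{Int}}
end

# Only one of the two toy types implements `inneighbors`.
inneighbors(g::ToyBidirectional, v::Int) = g.bwd[v]

# Derived "trait": true when the graph type has an `inneighbors` method for Int vertices.
is_bidirectional(::Type{G}) where {G} = hasmethod(inneighbors, Tuple{G,Int})
```

Note that `hasmethod` only reports that a matching method exists; it says nothing about whether that method honors the documented assumptions.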
I still don't understand why we can't just put this stuff in a vector every time we need it? As long as vertices are comparable we can build a unit range as index.
What shape of array? If we allow multiple edges, then even a matrix is not enough.
Why not
Do you mean
Wait, if
There is a tension between storing edge weights as part of the vertex / edge object or separately. I think it is conceptually simpler if it is part of the object, and we have an interface
Why not? It wouldn't take much to order vertices on the fly, it's an
I thought you said this would probably not impact the performance? I'd be up to add some benchmarks once this is stabilized
OK, let's leave it aside for now and make everything explicit through function calls. That's a non-breaking functionality we can add later anyway |
I'll start reviewing the PR this week |
I'm not sure I would be as categorical as you, I find the Trait to be suitable for
Sounds like a great idea
This is an
That's why I talk about
I don't really like to rely on Julia's automatic inference, especially since it can change between releases. I also suspect there are some use cases where we really need to get the eltype.
Yes
My understanding is that for an undirected graph, the order of
Do we already have this as an interface ? I don't think so... As I said, I have no strong opinion, so I'm ok with
We can have a parametric type for edge weights for
If the user keeps a |
Sounds good to me. Traits to indicate properties of a graph, but not to constrain individual vertex or edge objects.
The easiest implementation of traits in my view is https://github.com/jolin-io/WhereTraits.jl, what do you think?
I understand the dilemma, although I would personally err in favor of simplicity.
That's fair. But in the spirit of symmetry one would be tempted to treat vertex weights and edge weights equally? Or do we just agree that edge weights are everywhere and vertex weights are not, so we put only the former in the type?
I mean for an undirected graph even the terminology
In that case we need to give a little thought to the constructor, so that the type parameter associated to edge weights is automatically defined somehow? |
This discussion is becoming hard to follow even for us two. I think instead of a single PR it would be nice to get GraphsBase.jl started, and then debate these topics in plenty of small issues based on an existing first draft. What do you say @etiennedeg? |
OK I'll create the repo with all the CI stuff and then let you copy your code :) |
Et voilà: https://github.com/JuliaGraphs/GraphsBase.jl I set it up with rather strict quality tests, if Aqua or JET cause you troubles let me know |
Thank you for starting the effort @etiennedeg ! From reading this thread I am a bit confused about how we want to treat edges. But I will probably start that discussion once we have a more structured place for that. (Thanks @gdalle for doing the tedious job of setting up a package.) Regarding traits, I want to add a few observations to the mix:
From what I understand, this feels like the easiest route, but it is very heavy in terms of dependencies... (In addition to that, it also depends on
How bad would it be to just not use any traits package, and just build them by hand? From what I understand, GeoInterface.jl seems to get quite far with this manual approach. Although I like the name |
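For reference, building traits by hand usually means the "Holy traits" pattern, which needs no package at all. The sketch below is illustrative only: none of the names (`Directedness`, `ToyDiGraph`, `ToyUGraph`, `directedness`, `count_arcs`) come from Graphs.jl, GraphsBase.jl, or GeoInterface.jl.

```julia
# Trait values are ordinary singleton types.
abstract type Directedness end
struct Directed <: Directedness end
struct Undirected <: Directedness end

struct ToyDiGraph
    arcs::Vector{Tuple{Int,Int}}
end
struct ToyUGraph
    edges::Vector{Tuple{Int,Int}}
end

# Trait function: a graph type opts in by adding one method.
directedness(::Type{ToyDiGraph}) = Directed()
directedness(::Type{ToyUGraph}) = Undirected()

# Algorithms dispatch on the trait value instead of on an abstract supertype.
count_arcs(g) = count_arcs(directedness(typeof(g)), g)
count_arcs(::Directed, g) = length(g.arcs)
count_arcs(::Undirected, g) = 2 * length(g.edges)   # each edge counts in both directions
```

A type that forgets to declare the trait fails with an ordinary MethodError on `directedness`, so the stack trace points at the missing declaration.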
Holy cow that is a good point. Let's stick with SimpleTraits for now ^^ |
Not just an interface, because it will also define the structs for SimpleGraph and probably a more general MetaGraph |
This is for discussing the future API of Graphs.jl
Here is a first shot to open the discussions:
In the light of Why I no longer recommend Julia, I tried to make the API as complete as possible, and to detail all the assumptions that go with it.
I took a lot of inspiration from the Boost Graph Library and tried to make it as non-breaking as possible; there may still be many breaking changes in my proposal that can easily be avoided.
I also left many question marks and gaps in it.
Vertices
Vertex (Trait)
required :
isless(v1::Vertex , v2::Vertex)::Bool
assumptions:
isless forms a total order
hash(v::Vertex)::UInt
IntegerVertex <: Vertex
IntegerVertex will be associated with containers indexed by a UnitRange
required :
getindex(V::IntegerVertex)::UInt (or vindex ?)
Edges
AbstractEdge{T<:Vertex} (Trait ? See #132)
required :
src(e::AbstractEdge)::T
dst(e::AbstractEdge)::T
assumptions:
src(e) <= dst(e) ?
isless(e1::AbstractEdge, e2::AbstractEdge)::Bool
assumptions:
isless forms a total order
hash(e::AbstractEdge)::UInt
WeightedEdge{T} <: AbstractEdge{T}
weight(e::WeightedEdge)::Float
DataEdge
todo
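As a sketch of what satisfying the edge requirements listed above could look like for a concrete type, the following plain-Julia example implements src, dst, a lexicographic isless, and a hash consistent with equality. `ToyEdge` is a hypothetical name and the functions `src` and `dst` are defined locally rather than imported from Graphs.jl.

```julia
struct ToyEdge{T}
    src::T
    dst::T
end
src(e::ToyEdge) = e.src
dst(e::ToyEdge) = e.dst

# Lexicographic order on (src, dst): a total order whenever the vertex type has one.
Base.isless(a::ToyEdge, b::ToyEdge) = isless((src(a), dst(a)), (src(b), dst(b)))
# Equality and hash must agree so that edges can be used as Dict keys.
Base.:(==)(a::ToyEdge, b::ToyEdge) = src(a) == src(b) && dst(a) == dst(b)
Base.hash(e::ToyEdge, h::UInt) = hash(dst(e), hash(src(e), hash(:ToyEdge, h)))

es = [ToyEdge(2, 3), ToyEdge(1, 4), ToyEdge(1, 2)]
sorted = sort(es)
labels = Dict(ToyEdge(1, 2) => "first")
```

With these four methods, edges can be sorted and used as keys of an edge-indexed container.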
Graphs
AbstractGraph{V<:Vertex, E<:AbstractEdge}:
required:
vertices(g::AbstractGraph)::{Iterator over Vertex}
get_edges(g::AbstractGraph, u::Vertex, v::Vertex)::{Iterator over AbstractEdge}
edges(g::AbstractGraph)::{Iterator over AbstractEdge}
must be coherent with get_edges
outedges(g::AbstractGraph, v::Vertex)::{Iterator over AbstractEdge}
assumptions for each iterator output:
get_weight(g::AbstractGraph, v::Vertex)
default to 1 ? Or only for WeightedGraphs ?
get_weight(g::AbstractGraph, e::AbstractEdge)
default to 1 if edge is present else 0 ? or only for WeightedGraphs ?
not required:
nv(g) = length(vertices(g))
ne(g) = length(edges(g))
outneighbors(g::AbstractGraph, v::Vertex) = union([src(e) == v ? dst(e) : src(e) for e in outedges(g, v)])
edges(g, v) = outedges(g, v)
outdegree(g, v) = length(outneighbors(g, v))
outdegree(g, u) = length(outedges(g, u))
has_edge(g, e)::Bool
has_vertex(g, v)::Bool
has_self_loops(g::AbstractGraph) = any(src(e) == dst(e) for e in edges(g))
get_vertex_container(g::AbstractGraph{V, E}, K::Type)::{Vertex container over type} = Dict{V, K}()
The container is indexed by vertices of g
get_edge_container(g::AbstractGraph{V, E}, K::Type)::{Vertex container} = Dict{E, K}()
The container is indexed by edges of g
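To show how the "not required" methods can be derived from the required ones, here is a small self-contained sketch. It is not the draft implementation: `ListGraph` is a hypothetical toy type treated as undirected, edges are plain tuples, `vertex_container` stands in for get_vertex_container, and all functions are defined locally.

```julia
# Toy graph storing an explicit edge list.
struct ListGraph
    vs::Vector{Symbol}
    es::Vector{Tuple{Symbol,Symbol}}
end
src(e::Tuple) = e[1]
dst(e::Tuple) = e[2]

# "Required" methods for this toy type.
vertices(g::ListGraph) = g.vs
edges(g::ListGraph) = g.es
outedges(g::ListGraph, v) = [e for e in g.es if src(e) == v || dst(e) == v]

# "Not required" methods, derived generically from the ones above.
nv(g) = length(vertices(g))
ne(g) = length(edges(g))
outneighbors(g, v) = unique(src(e) == v ? dst(e) : src(e) for e in outedges(g, v))
has_self_loops(g) = any(src(e) == dst(e) for e in edges(g))
vertex_container(g, ::Type{K}) where {K} = Dict{eltype(vertices(g)),K}()

g = ListGraph([:a, :b, :c], [(:a, :b), (:b, :c), (:c, :a)])
```

A new graph type would only have to supply the three required methods to get the rest for free, and could still override any fallback for speed.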
BidirectionalGraph <: Graph
required:
inneighbors, inedges
assumptions:
not required:
all_neighbors = union(inneighbors, outneighbors)
neighbors = all_neighbors
indegree = length(inneighbors)
degree = length(all_neighbors)
IsSimple <: AbstractGraph
If true, additional requirement that no two edges can have same src and dst
Self-loops allowed because of implementation details and history of Graphs.jl (but at most one self-loop per vertex)
weights(g::AbstractGraph)::AbstractMatrix ?
not required:
get_edge(g, u, v) = get_edges(g, u, v)
IsDirected <: AbstractGraph
(must be implemented ? allow mixed graphs ?)
If false:
dest and src.
edges output edges only for one direction (so length(edges) = ne(g)).
inedges = outedges
RangeBasedGraph <: AbstractGraph
it is assumed
vertices(g::RangeBasedGraph) = 1:nv(g)
adjacency_matrix(g::AbstractGraph)::AbstractMatrix ?
A[i,j] should have positive value (or true value?) only if there is an edge between i and j
weights(g::AbstractGraph) ?
Mutability
(distinguish vertex and edge mutability ?)
IsSimplyMutable
graph structure able to represent the structure of any unweighted simple graph
required:
GraphType()::{GraphType<:IsSimplyMutable}
returns an empty graph.
add_vertex!(g::AbstractGraph)::Vertex
return created vertex
should generally succeed (graph is sufficiently general to handle any number of vertices), can fail only due to overflow. (or allow more general failure, with clean exception thrown ?)
add_edge!(g::AbstractGraph, e::AbstractEdge)::Bool
allowed to return false if an edge with the same extremities already exists
If the edge does not already exist, it should generally succeed.
rem_vertex!(g, v)
should always succeed
adjacent edges are deleted ? (or undefined behavior as in boost, use in conjunction with clear vertex?)
vertex and edge identifiers may no longer be valid
rem_edge!(g, e)
should always succeed
edge identifiers may no longer be valid, but vertex identifiers will be
not required:
- add_vertices!(g, l) = add_vertex!(g, e) for e in edges(g)
- add_edges!, rem_vertices!, rem_edges!
- GraphType(g::AbstractGraph)::{GraphType<:IsSimplyMutable}
cast a graph g to an instance of GraphType
- zero(GraphType<:IsSimplyMutable) = GraphType() ?
- GraphType(n::Integer)::{GraphType<:IsSimplyMutable}
returns a graph with n vertices and no edges.
IsMutable <: IsSimplyMutable
graph structure able to represent the structure of any unweighted multigraph
add_edge should generally succeed
IsWeightMutable
graph structure able to modify all its weights (but not necessarily able to change its structure)
set_weight!(g::AbstractGraph, e::WeightedEdge{T, U}, w::U)
IsVertexStable
If mutated, vertex identifiers are guaranteed to still be valid (we can compare vertices gathered before and after a mutation, a vertex container gathered before a mutation will still be valid after)
IsEdgeStable
If mutated, edge identifiers are guaranteed to still be valid (we can compare edges gathered before and after a mutation, an edge container gathered before a mutation will still be valid after)
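The stability traits matter because a compact 1:n vertex numbering cannot survive deletions unchanged. The sketch below illustrates this with a bare adjacency list and a swap-with-last removal, which is my understanding of how removal in the current SimpleGraph behaves. The function name `swap_remove_vertex!` is hypothetical and no Graphs.jl code is involved.

```julia
# Remove vertex v by moving the last vertex into its slot. Vertices stay
# contiguous as 1:n, but the moved vertex changes its identifier.
function swap_remove_vertex!(adj::Vector{Vector{Int}}, v::Int)
    n = length(adj)
    # drop every edge touching v
    for nbrs in adj
        filter!(!=(v), nbrs)
    end
    # move the last vertex into slot v, then relabel references to it
    adj[v] = adj[n]
    pop!(adj)
    for nbrs in adj
        replace!(nbrs, n => v)
    end
    return adj
end

# 4-cycle 1-2-3-4-1; after removing vertex 1, the old vertex 4 is now called 1.
adj = [[2, 4], [1, 3], [2, 4], [1, 3]]
swap_remove_vertex!(adj, 1)
```

Any external array indexed by the old vertex numbers is silently wrong after this call, which is the situation IsVertexStable is meant to rule out.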