Toward merge networks #231
Conversation
I think I understand better now; the line "Concatenates a list of inputs." from the Keras documentation was what sent me in completely the wrong direction. It's not only about concatenating inputs; it's about merging layer parameters. I will first make a working example with building blocks that we already have, and then we can discuss whether, and which parts, would be best abstracted, and how.
Hi @jvdp1, please see the new example. Like before, no library code was changed. Merging two networks to feed into one, however, needs some manual operations that are currently not handled by the library code. Specifically, the backward pass from net3 (the downstream branch) to the upstream branches (net1 and net2) needs to bypass the calculation of the gradient in the branch output layers, which in the library is currently done with the loss function. In a nutshell, the flow in the example is this: forward passes through net1 and net2, concatenation of their outputs into the input of net3, a forward and backward pass through net3, and then backward passes through net1 and net2 seeded with the gradient coming out of net3's first layer rather than with a loss gradient.
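For concreteness, here is a rough sketch of that flow. It is not the code from this PR: the layer sizes, the learning rate, and especially the commented-out gradient-seeded backward calls are assumptions based on this discussion rather than existing library API.

```fortran
program merge_sketch
  ! Sketch only; assumes the public neural-fortran API (network, input, dense, sgd).
  use nf, only: dense, input, network, sgd
  implicit none
  type(network) :: net1, net2, net3
  real :: x1(3), x2(5), y1(4), y2(4), y(2)

  net1 = network([input(3), dense(4)])   ! upstream branch 1
  net2 = network([input(5), dense(4)])   ! upstream branch 2
  net3 = network([input(8), dense(2)])   ! downstream branch; 8 = 4 + 4

  x1 = 0.5; x2 = 0.2; y = [0., 1.]

  ! Forward passes through the upstream branches
  y1 = net1 % predict(x1)
  y2 = net2 % predict(x2)

  ! Concatenate the branch outputs and run the downstream branch
  call net3 % forward([y1, y2])
  call net3 % backward(y)

  ! Hypothetical: seed the upstream backward passes with slices of the gradient
  ! coming out of net3's first layer, bypassing the loss gradient. This is the
  ! part the example has to do manually; a gradient-accepting backward is not
  ! part of the library API.
  ! call net1 % backward(gradient = dx(1:4))
  ! call net2 % backward(gradient = dx(5:8))

  call net1 % update(optimizer=sgd(learning_rate=0.01))
  call net2 % update(optimizer=sgd(learning_rate=0.01))
  call net3 % update(optimizer=sgd(learning_rate=0.01))
end program merge_sketch
```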
The merged network converges on a minimal example; commenting out …

Now, about whether and how to abstract this. I'm not clear that this could be implemented as a layer type in the existing framework, where a network is assumed to have a 1-d array of layers. However, I can imagine a new network type, say:

```fortran
net1 = network(...)
net2 = network(...)
net3 = network(...)
net = merged_network( &
  upstream_networks = [net1, net2], &
  downstream_network = net3 &
)
```

and we would define the usual methods on it. Let me know what you think.

If the usual time works for you tomorrow (Friday, September 12), I could do Zoom.
Thank you @milancurcic for the new example. I will test it tomorrow on my case.

This also makes sense to me.

Tomorrow is fine. I will try to get some results for our meeting.
Hi @jvdp1, see the latest updates and the simplification of the example.

The next thing we need to decide, IMO, is whether it's sufficient to always return a 1-d array as the network output, or whether 2-d and 3-d array variants are necessary. 2-d or 3-d outputs may have uses in other applications, but here we need to decide how we want to do the concatenation. The 1-d case is trivial, so if 1-d concatenation is sufficient, we can call it good enough. However, if we want to be able to pass a 2-d or 3-d array as input to …
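For the 1-d case, the concatenation really is just Fortran's array constructor; a minimal sketch with made-up shapes:

```fortran
program concat_1d_sketch
  ! Sketch only: y1 and y2 stand in for the rank-1 outputs of two branches.
  implicit none
  real :: y1(4), y2(6)
  real, allocatable :: x3(:)
  y1 = 1.0
  y2 = 2.0
  x3 = [y1, y2]        ! rank-1 concatenation; x3 becomes the downstream input
  print *, size(x3)    ! prints 10
end program concat_1d_sketch
```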
Thank you! I tested the changes and it works fine. The simplification of the backward step (by providing the gradient instead of the output) was very useful, because I had initially made a mistake in all the …

For my application, a 1-d array is enough. However, as it returns pointers, could it return a 1-d array pointer pointing to 2-d/3-d arrays? So, a …
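If the library keeps handing out 1-d pointers, Fortran's bounds-remapping pointer assignment may be one way to view a 2-d/3-d array through a rank-1 pointer; a minimal sketch (names and shapes are illustrative, and the target must be contiguous):

```fortran
program remap_sketch
  ! Sketch: a rank-1 pointer associated with a rank-3 target via bounds
  ! remapping (Fortran 2008). Requires the target to be simply contiguous.
  implicit none
  real, target :: a(2, 3, 4)
  real, pointer :: p(:)
  a = 0.0
  p(1:size(a)) => a        ! 1-d view of the 3-d array
  p(1) = 42.0
  print *, a(1, 1, 1)      ! prints 42.0
end program remap_sketch
```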
Currently this PR only adds an example program that does input concatenation using basic Fortran. There is no change to the library code.
@jvdp1 this is almost exactly your example in #211. I am not sure that this is what you're looking for.
Specifically, in the case of 1-d outputs and inputs, it's so trivial that no separate wrapper like a concatenate layer is needed. We just concatenate the arrays.

More generally, following the Keras concatenate, it's also not clear to me that there needs to be a layer for this. A non-trivial case is concatenating two N-d arrays along some arbitrary axis (the two arrays need to have the same shape along all other axes). In this case, a function would be useful, but I'm not sure that a dedicated layer adds anything.
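As a rough illustration of what such a function could look like (the name `concat2d` and the fixed rank are assumptions, and a real version would check that the shapes agree along the non-concatenated axis), here is the 2-d case:

```fortran
! Sketch of concatenating two rank-2 arrays along a given dimension.
pure function concat2d(a, b, dim) result(c)
  real, intent(in) :: a(:,:), b(:,:)
  integer, intent(in) :: dim
  real, allocatable :: c(:,:)
  if (dim == 1) then
    ! Stack along the first dimension (rows)
    allocate(c(size(a,1) + size(b,1), size(a,2)))
    c(:size(a,1), :) = a
    c(size(a,1)+1:, :) = b
  else
    ! Stack along the second dimension (columns)
    allocate(c(size(a,1), size(a,2) + size(b,2)))
    c(:, :size(a,2)) = a
    c(:, size(a,2)+1:) = b
  end if
end function concat2d
```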
And, it's possible that I still don't understand the intent. :) Let me know what you think.
In support of #211