
Conversation

milancurcic
Member

Currently this PR only adds an example program that does input concatenation in basic Fortran. There is no change to the library code.

@jvdp1 this is almost exactly your example in #211. I am not sure that this is what you're looking for.

Specifically, in the case of 1-d outputs and inputs, it's so trivial that no separate wrapper such as a concatenate layer is needed; we just concatenate the arrays.

More generally, following the Keras concatenate, it's also not clear to me that there needs to be a layer for this. A non-trivial case is concatenating two N-d arrays along some arbitrary axis (the two arrays need to have the same shape along all other axes). In this case, a function would be useful, but I'm not sure that a dedicated layer adds anything.
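
For illustration only (a hedged sketch with made-up variable names, not code from the example program), both cases in plain Fortran:

  program concat_sketch
    implicit none
    real :: out1(3), out2(5), merged1d(8)
    real :: a(4, 3), b(4, 2), c(4, 5)

    out1 = 1.0
    out2 = 2.0
    ! 1-d case: an array constructor does the concatenation.
    merged1d = [out1, out2]

    a = 1.0
    b = 2.0
    ! 2-d case: concatenate along the second axis; the two arrays must
    ! have the same extent along all other axes (here, the first).
    c(:, 1:3) = a
    c(:, 4:5) = b

    print *, size(merged1d), shape(c)
  end program concat_sketch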

And, it's possible that I still don't understand the intent. :) Let me know what you think.

In support of #211

@milancurcic added the enhancement and question labels on Sep 10, 2025
@milancurcic
Member Author

I think I understand better now; the line "Concatenates a list of inputs." from the Keras documentation is what took me in the completely wrong direction. It's not only about concatenating inputs; it's about merging layer parameters.

I will first make a working example with building blocks that we already have, and then we can discuss if and what part should be best abstracted and how.

@milancurcic
Member Author

milancurcic commented Sep 11, 2025

Hi @jvdp1, please see the new example. Like before, no library code was changed.

Merging two networks to feed into one, however, requires some manual operations that are not currently handled by the library code. Specifically, the backward pass from net3 (the downstream branch) to the upstream branches (net1 and net2) needs to bypass the calculation of the gradient in the branch output layers, which in the library is currently done with the loss function.

In a nutshell, the flow in the example is this:

  1. Forward propagate net1 and net2;
  2. Concatenate their outputs, pass the result as input to net3, and forward propagate it;
  3. Backward propagate net3 to compute its gradients;
  4. Manually pass the gradients from net3's first hidden layer to compute the gradients in the output layers of net1 and net2;
  5. Backward propagate the hidden layers of net1 and net2 to compute their gradients;
  6. Run the optimizer (net % update()) on all 3 networks (a rough sketch of the whole flow follows this list).
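
A sketch of that flow in code, assuming the forward, backward, and update methods; the output and gradient extraction steps are only indicated by hypothetical variables (y1, y2, grad1, grad2), and the gradient keyword argument to backward is illustrative, so the actual example code in this PR remains the reference:

  ! Hedged sketch of the merge flow; variable names are illustrative only.
  call net1 % forward(x1)
  call net2 % forward(x2)

  ! y1 and y2 hold the outputs of net1 and net2 (extracted from their
  ! output layers in the example); concatenate them and feed net3.
  call net3 % forward([y1, y2])
  call net3 % backward(y_true)

  ! grad1 and grad2 are the slices of the gradient from net3's first
  ! hidden layer that correspond to net1 and net2; passing them directly
  ! bypasses the loss-based gradient calculation in the branch output layers.
  call net1 % backward(gradient=grad1)   ! hypothetical keyword argument
  call net2 % backward(gradient=grad2)   ! hypothetical keyword argument

  ! Run the optimizer on all three networks.
  call net1 % update()
  call net2 % update()
  call net3 % update()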

The merged network converges on a minimal example; commenting out call net1 % update(), call net2 % update(), or both, results in slower convergence because we are effectively disabling updates of parts of the merged network.

Now, about whether and how to abstract this. It's not clear to me that this could be implemented as a layer type in the existing framework, where a network is assumed to have a 1-d array of layers. However, I can imagine a new network type, say merged_network, that we could invoke like this:

net1 = network(...)
net2 = network(...)
net3 = network(...)

net = merged_network( &
  upstream_networks = [net1, net2], &
  downstream_network = net3 &
)

and we would define the usual forward, backward, and update methods on the merged_network type to encapsulate the logic that is hand-coded in the example.
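
For concreteness, a hedged sketch of what such a derived type might look like (names and components are illustrative, not a settled design):

  type :: merged_network
    type(network), allocatable :: upstream_networks(:)
    type(network) :: downstream_network
  contains
    procedure :: forward   ! forward the upstream nets, concatenate, forward the downstream net
    procedure :: backward  ! backward the downstream net, split the gradient, backward the upstream nets
    procedure :: update    ! update the parameters of all member networks
  end type merged_network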

Let me know what you think. If the usual time works for you tomorrow (Friday, September 12), I could do Zoom.

@jvdp1
Collaborator

jvdp1 commented Sep 11, 2025

Thank you @milancurcic for the new example. I will test it tomorrow on my case.

> Now, about whether and how to abstract this. It's not clear to me that this could be implemented as a layer type in the existing framework, where a network is assumed to have a 1-d array of layers. However, I can imagine a new network type, say merged_network, that we could invoke like this:
>
> net1 = network(...)
> net2 = network(...)
> net3 = network(...)
>
> net = merged_network( &
>   upstream_networks = [net1, net2], &
>   downstream_network = net3 &
> )
>
> and we would define the usual forward, backward, and update methods on the merged_network type to encapsulate the logic that is hand-coded in the example.

This also makes sense to me.

> Let me know what you think. If the usual time works for you tomorrow (Friday, September 12), I could do Zoom.

Tomorrow is fine. I will try to get some results for our meeting.

@milancurcic removed the question label on Sep 25, 2025
@milancurcic changed the title from "Concatenate" to "Toward merge networks" on Sep 25, 2025
@milancurcic
Member Author

Hi @jvdp1, see the latest updates and the simplification of the example.

network % get_output(output) is a new subroutine that returns a 1-d pointer to the output of the network. This removes the need for the explicit select type statements (for getting the output of net1 and net2) in the user code.

The next thing we need to decide, IMO, is whether it's sufficient to always return a 1-d array as network output, or whether 2-d and 3-d array variants are necessary. 2-d or 3-d may have uses in other applications, but here we should decide how we want to do concatenation. 1-d is trivial, so if 1-d concatenation is sufficient, we can call it good enough. However, if we want to be able to pass a 2-d or 3-d array as input to net3 without an explicit reshape() layer in the middle, we will need get_output_[23]d and a separate concatenation function or layer.
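
As a usage sketch of the 1-d case (hedged; y1 and y2 are made-up names), the user code then reduces to something like:

  real, pointer :: y1(:), y2(:)

  call net1 % get_output(y1)   ! 1-d pointer to net1's output
  call net2 % get_output(y2)   ! 1-d pointer to net2's output

  ! 1-d concatenation is just an array constructor:
  call net3 % forward([y1, y2])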

@jvdp1
Collaborator

jvdp1 commented Oct 6, 2025

> Hi @jvdp1, see the latest updates and the simplification of the example.

Thank you! I tested the changes and they work fine. The simplification of the backward step (providing the gradient instead of the output) was very useful, because I had initially made a mistake in all the select type constructs.

> network % get_output(output) is a new subroutine that returns a 1-d pointer to the output of the network. This removes the need for the explicit select type statements (for getting the output of net1 and net2) in the user code.

> The next thing we need to decide, IMO, is whether it's sufficient to always return a 1-d array as network output, or whether 2-d and 3-d array variants are necessary. 2-d or 3-d may have uses in other applications, but here we should decide how we want to do concatenation. 1-d is trivial, so if 1-d concatenation is sufficient, we can call it good enough. However, if we want to be able to pass a 2-d or 3-d array as input to net3 without an explicit reshape() layer in the middle, we will need get_output_[23]d and a separate concatenation function or layer.

For my application, a 1-d array is enough. However, since it returns pointers, could it return a 1-d pointer that points to a 2-d or 3-d array? That way, a reshape() wouldn't be needed.
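
For reference, Fortran 2008 pointer bounds remapping might allow exactly this; here is a minimal, self-contained sketch (made-up names, not library code):

  program remap_sketch
    implicit none
    real, allocatable, target :: output3d(:,:,:)
    real, pointer :: output1d(:)

    allocate(output3d(2, 3, 4))
    output3d = 1.0

    ! Bounds remapping: a 1-d pointer viewing the contiguous 3-d array,
    ! so neither a reshape() nor a temporary copy is needed.
    output1d(1:size(output3d)) => output3d

    print *, size(output1d)   ! prints 24
  end program remap_sketch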
