--load argument should be required when in evaluation mode #105

Open
observingClouds opened this issue Jan 26, 2025 · 3 comments
Labels
enhancement New feature or request

Comments

@observingClouds
Contributor

In evaluation mode a checkpoint needs to be loaded, so a check for the --load argument should be added; otherwise the code fails further down the line with an unhelpful error:

0: [rank0]:   File "neural-lam/neural_lam/models/ar_model.py", line 394, in <dictcomp>
0: [rank0]:     f"test_loss_unroll{step}": time_step_loss[step - 1]
0: [rank0]: IndexError: index 14 is out of bounds for dimension 0 with size 10
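
For example, an early check could look roughly like this (a minimal sketch assuming an argparse-based CLI; the --eval flag name and parser wiring are illustrative, only --load is taken from the actual issue):

```python
import argparse

parser = argparse.ArgumentParser()
# Illustrative argument definitions, not the real neural-lam parser setup
parser.add_argument("--eval", type=str, default=None, help="Eval split to run on")
parser.add_argument("--load", type=str, default=None, help="Checkpoint to load")
args = parser.parse_args()

# Fail early with a clear message instead of an obscure IndexError later on
if args.eval is not None and args.load is None:
    parser.error("--load must point to a checkpoint when running in evaluation mode")
```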
@joeloskarsson
Collaborator

This error is not due to a missing checkpoint, but to running with an --ar_steps_eval value lower than the maximum step in --val_steps_to_log, i.e. trying to log validation error for a lead time that is never forecasted. However, the defaults for these arguments do trigger this issue, which is probably not a great setup and something we should change. It would also be good to have a check that all steps given in --val_steps_to_log are <= --ar_steps_eval, instead of failing in this unhelpful way.
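
Such a consistency check could be a sketch along these lines (the defaults shown are illustrative placeholders, not the actual neural-lam defaults):

```python
import argparse

parser = argparse.ArgumentParser()
# Placeholder defaults for illustration only
parser.add_argument("--ar_steps_eval", type=int, default=10)
parser.add_argument("--val_steps_to_log", type=int, nargs="+", default=[1, 2, 3, 5, 10, 15])
args = parser.parse_args()

# Reject lead times that can never be forecasted with the chosen unroll length
too_large = [s for s in args.val_steps_to_log if s > args.ar_steps_eval]
if too_large:
    parser.error(
        f"--val_steps_to_log contains steps {too_large}, but only "
        f"{args.ar_steps_eval} steps are unrolled (--ar_steps_eval)"
    )
```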

@joeloskarsson
Collaborator

Is there still an interest in requiring --load in eval mode? (even though that did not cause the issue)

Otherwise maybe we can change this issue to be about the default --val_steps_to_log argument causing this problem, as that is something I think we should fix.

@sadamov
Collaborator

sadamov commented Feb 10, 2025

The error you are getting is now being discussed here: #120

I think we should still throw a warning when the model is in eval mode and no checkpoint was loaded, as that is probably not what most users want to do: evaluate freshly initialized weights.
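
Something along these lines, as a rough sketch (the args attributes mirror the hypothetical parser sketched above, not the actual CLI):

```python
import warnings

def check_eval_checkpoint(args):
    """Warn if evaluation is requested without loading a checkpoint."""
    if args.eval is not None and args.load is None:
        warnings.warn(
            "Running in evaluation mode without --load: this evaluates "
            "freshly initialized weights, which is probably not intended."
        )
```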
