Description
Commit 4251179, based on pull request #174, changed the behaviour of the default grader in certain corner cases from JE 0 to JE; see 4251179#commitcomment-46068430.
This leads to verifyproblem with the -d data flag crashing in normal use:
> verifyproblem . -d a
[...]
File "/usr/lib/python3/dist-packages/problemtools/verifyproblem.py", line 567, in aggregate_results
if not (min_score <= score <= max_score) and not self._seen_oob_scores:
TypeError: '<=' not supported between instances of 'float' and 'NoneType'
Indeed, min_score is typically 0.0, and score is now None (rather than 0).
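The failing comparison can be reproduced in isolation. This is a minimal sketch, not the verifyproblem code itself: the bound of 100.0 for max_score is an assumed value chosen for illustration.

```python
# Stand-ins for the values seen in aggregate_results.
min_score = 0.0
max_score = 100.0  # assumed upper bound, for illustration only
score = None       # what a bare JE verdict (without a score) now yields

try:
    in_range = min_score <= score <= max_score
except TypeError as exc:
    # Same error as in the traceback above.
    error_message = str(exc)
    print(error_message)
```

With score = 0 instead of None, the comparison evaluates without error.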
Why does this happen?
A scoring problem with the grader aggregation mode min wants to compute the total score of an empty test group. (The empty test group occurs during problem development because the -d flag to verifyproblem was used to run the submissions only on a certain test group, such as -d group3, or on certain input files not occurring in all test groups, such as -d huge. This flag correctly leads to empty test groups.)
The behaviour of min on aggregating the empty set is undefined in the specification, but the choice JE 0, though producing the unhelpful verdict JE, also produced the score 0, making the behaviour confusing but useful during problem development.
With the choice JE, verifyproblem’s verdict is still the unhelpful JE, but now the variable score is set to None, leading to a run-time error in the comparison in line 567 quoted above and aborting verification.
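For comparison, Python's built-in min also has no answer for an empty input unless the caller supplies one. The sketch below only illustrates that point; it is not taken from the default grader.

```python
scores = []  # an empty test group contributes no scores

try:
    min(scores)
    raised = False
except ValueError:
    # min() refuses to pick a value for an empty sequence.
    raised = True

# An explicit default makes the choice for the empty case visible.
fallback = min(scores, default=0)
```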
Fix (quick)
Go back to JE 0 instead of JE.
Fix (better)
Discuss what min([]) should mean. The most useful behaviour would be AC 0, or even verdict AC and score accept_score, but with a warning “there were empty test groups.”
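One way this proposal could look is sketched below. The function aggregate_min and its parameters are hypothetical names for illustration, not part of problemtools or the default grader, and the sketch ignores how non-AC verdicts inside a group would be combined.

```python
import warnings


def aggregate_min(scores, empty_verdict="AC", empty_score=0.0):
    """Hypothetical helper: aggregate a test group's scores in mode min.

    An empty group yields (empty_verdict, empty_score) plus a warning,
    instead of a judge error without a score.
    """
    if not scores:
        warnings.warn("there were empty test groups")
        return empty_verdict, empty_score
    return "AC", min(scores)


# Usage: an empty group no longer produces a missing score.
verdict, score = aggregate_min([])
```

Passing a different empty_score would correspond to the second variant of the proposal, where the empty group receives the accept score.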
Minimal example
Here is a baby problem about adding two integers with two secret input files, a.in and b.in, each in its own test group.
To crash:
verifyproblem . -d a