Allowing run_time_error solutions to either WA, or TLE #17
Comments
If we allow submissions that claim to be RTE to get either WA or TLE, then what are we really testing? Would a "rejected" directory, whose only requirement is that the submission is not accepted (i.e. it can RTE, WA, TLE, ...), solve this need? |
Yes, that's fine too. I guess the reason why I'm fine with letting RTE get TLE and WA is that:
In the end I don't care very much, either solves my problem, so pick whatever you prefer :) |
@RagnarGrootKoerkamp what does BAPC tools call this? |
I'm using the Domjudge way to handle this, using |
Are they placed in an arbitrary folder matching one of the expected verdicts? +1 for |
Yes indeed, any of the matching folders works. I think I have an assert somewhere that the folder it is in must be one of the listed verdicts.
|
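The folder check described above might look roughly like the following. This is a minimal sketch, not the actual BAPCtools code: the function name `check_folder` and the `FOLDER_VERDICT` table are hypothetical, and the list of verdicts is assumed to have been read from the submission already.

```python
from pathlib import PurePosixPath

# Hypothetical table: submission folder name -> verdict abbreviation.
FOLDER_VERDICT = {
    "accepted": "AC",
    "wrong_answer": "WA",
    "time_limit_exceeded": "TLE",
    "run_time_error": "RTE",
}


def check_folder(path, listed):
    """Assert that the folder holding the submission matches one of its listed verdicts."""
    folder = PurePosixPath(path).parts[-2]  # directory directly above the file
    implied = FOLDER_VERDICT[folder]
    assert implied in listed, f"{path}: folder implies {implied}, not one of {listed}"


# A crashy solution that lists both RTE and TLE may live in either folder.
check_folder("run_time_error/crash.py", ["RTE", "TLE"])
check_folder("time_limit_exceeded/crash.py", ["RTE", "TLE"])
```

With this shape, a submission listing `["RTE", "TLE"]` passes in `run_time_error/` or `time_limit_exceeded/`, but trips the assert if it is placed in `accepted/`.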
As discussed in Kattis#17, I added some normative guidance on the folder for users (since I think this spec is not used primarily by implementers, but by problem creators).
In case anybody is still here, I have a draft implementation that allows you to specify this in `/submissions/expected_grades.yaml` like this:

    time_limit_exceeded/recursive.py:
      verdict:
        - RTE
        - TLE

or, if you like it terse:

    time_limit_exceeded/recursive.py: ["RTE", "TLE"]

This can be specified down to individual testgroups, so you could also do, hypothetically:

    time_limit_exceeded/recursive.py:
      subgroups:
        sample: AC
        secret:
          group1: AC
          group2: AC
          group3: ["TLE", "RTE"]

I have a BAPCtools fork that does this here: https://github.com/thorehusfeldt/BAPCtools

In green, the expected verdicts are shown.

This is very preliminary, but it parses a YAML file with arbitrarily rich specifications per submission and per testgroup, and compares with the default grader (which I added to BAPCtools) for each internal node of the testdata tree. I think this is the right way of doing it (or close enough), in particular for test groups. It is much superior to my own @EXPECTED_GRADES@ approach. |
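To make the per-testgroup comparison concrete, here is a small sketch of how such expectations could be checked against a tree of verdicts. It is not the code from the fork: the function names, the shape of the `result` tree, and the rule that a mapping without `verdict`, `score` or `subgroups` keys lists subgroups directly are all assumptions made for illustration, and `score` is ignored.

```python
RESERVED = {"verdict", "score", "subgroups"}


def normalize(expectation):
    """Expand the terse forms into a dict with optional 'verdict' and 'subgroups'."""
    if isinstance(expectation, (str, list)):
        return {"verdict": expectation}
    if RESERVED.isdisjoint(expectation):
        # Assumption: a mapping without reserved keys (like `secret:`) lists subgroups.
        return {"subgroups": expectation}
    return expectation


def mismatches(expectation, result, path=()):
    """Yield (node path, allowed verdicts, actual verdict) for each violated expectation."""
    expectation = normalize(expectation)
    if "verdict" in expectation:
        verdict = expectation["verdict"]
        allowed = {verdict} if isinstance(verdict, str) else set(verdict)
        if result["verdict"] not in allowed:
            yield "/".join(path), allowed, result["verdict"]
    for name, sub in expectation.get("subgroups", {}).items():
        yield from mismatches(sub, result["subgroups"][name], path + (name,))


# Expectations as they would come out of a YAML parser.
expected = {
    "subgroups": {
        "sample": "AC",
        "secret": {"group1": "AC", "group2": "AC", "group3": ["TLE", "RTE"]},
    }
}
# Hypothetical grader output for one submission.
result = {
    "verdict": "WA",
    "subgroups": {
        "sample": {"verdict": "AC"},
        "secret": {
            "verdict": "WA",
            "subgroups": {
                "group1": {"verdict": "AC"},
                "group2": {"verdict": "WA"},
                "group3": {"verdict": "RTE"},
            },
        },
    },
}

for node, allowed, got in mismatches(expected, result):
    print(f"{node}: expected one of {sorted(allowed)}, got {got}")
```

Only nodes that carry an expectation are checked, so a submission can pin down `group3` without saying anything about the overall verdict.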
I prefer the |
I’ve played around with various ideas now. For editing and curation, I find it much more pleasant to have a single-file overview. (Use case: add another testgroup, or merge two existing testgroups. I can do this very quickly in a single file, with no errors. I can also check with a single glance that all submissions get ...)

On the other hand, when writing a new submission, or communicating the intent of a submission to others, the source-embedded approach makes more sense.

The semantics of “my” expected-grades proposal are orthogonal to this. An expectation could be defined (along with many others) in a common ...
    mixed/th.py: ["TLE", "RTE"]
    time_limit_exceeded/recursive.py:
      verdict: AC
      score: 100
      subgroups:
        sample: AC
        secret:
          group1: AC
          group2: AC
          group3: ["TLE", "RTE"]
... but it could just as well reside in the source code of the submission itself:

    #! /usr/bin/env python3
    """
    @EXPECTATIONS_BEGIN@
    verdict: AC
    score: 100
    subgroups:
      sample: AC
      secret:
        group1: AC
        group2: AC
        group3: ["TLE", "RTE"]
    @EXPECTATIONS_END@
    """
    def solve(instance):
        ...

(I have no opinion about the convention for source code embedding syntax.) Contests or traditions could allow both or either; a tool could warn if a submission supplies both (just as it currently warns about inconsistency with expectations implied by the placement of the source file). |
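As a rough illustration of the source-embedded variant, a tool could pull out the text between the `@EXPECTATIONS_BEGIN@` and `@EXPECTATIONS_END@` markers and warn when the same submission also has an entry in the common file. The sketch below uses only the standard library and returns the raw block, leaving YAML parsing to whatever parser the tool uses; the function names and the `common` mapping are hypothetical.

```python
import re

# Everything between the two markers, across lines, non-greedy.
BLOCK = re.compile(r"@EXPECTATIONS_BEGIN@\n(.*?)@EXPECTATIONS_END@", re.DOTALL)


def embedded_expectations(source):
    """Return the raw text between the markers, or None if there is no block."""
    match = BLOCK.search(source)
    return match.group(1) if match else None


def expectation_sources(name, source, common):
    """List where expectations for `name` are defined; more than one merits a warning."""
    found = []
    if name in common:
        found.append("common file")
    if embedded_expectations(source) is not None:
        found.append("source code")
    return found


SOURCE = '''#! /usr/bin/env python3
"""
@EXPECTATIONS_BEGIN@
verdict: AC
score: 100
@EXPECTATIONS_END@
"""
def solve(instance):
    ...
'''

common = {"time_limit_exceeded/recursive.py": ["TLE", "RTE"]}
sources = expectation_sources("time_limit_exceeded/recursive.py", SOURCE, common)
if len(sources) > 1:
    print("warning: expectations given in both", " and ".join(sources))
```

Because the markers are plain text, the same extraction works regardless of the comment or docstring syntax of the submission's language.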
Closing this issue because the actual issue is covered by both adding The discussions in the thread are still interesting, but the actual issue is now closed. |
I thought this had been the case at some point. It's something I was bitten by recently in Coding Cup, having a solution that was incorrect (in a crashy way), but which sometimes manifested as a WA and sometimes as a TLE.
TLE also allows WA since we only want to verify that /some/ test case times it out, but the above case doesn't seem to be clearly mappable in the spec right now?
@niemela @simonlindholm