@rnichi1 rnichi1 commented Feb 6, 2025

This PR adds an example exercise (04) with LLM-specific flags for automated scoring.

@@ -0,0 +1,7 @@
[[examples]]
answer = "Reversing word order and reversing characters both have O(n) complexity, but character reversal requires more operations per word, making it slightly less efficient in practice."
points = "{ \"R1\": 1, \"R2\": 1 }"
Contributor

Is there any way to simplify this for the task designer, e.g. by using TOML syntax instead of a string? Escaping things like this is a bit tedious.
Also: are these weights? The points below are 0.5 and 0.5 for R1 and R2.

Author

Sure, I will try to change it to TOML syntax. And yes, you are right: in the examples, each rubric should be worth 0.5. These are points, not weights. Thanks!

@@ -0,0 +1,9 @@
[[rubrics]]
id = "R1"
Contributor

Is it even necessary to have rubric IDs, or could the rubrics be matched by the order in which they appear?

Author

The IDs help the model understand the examples. Without them, it would be harder to tell which rubrics each example satisfies, especially when there are more than two rubrics.
IDs also let the examples reference a rubric without repeating its text: the model has the rubrics, but it might not reliably infer their order in an unstructured and fairly large prompt such as ours. For an educator, it is also easier to assign a clear ID like "asymptotically_equivalent" than an abstract "R2", which makes it easier to write examples for the LLM. I will change the R1 and R2 IDs to something along those lines.

@rnichi1 rnichi1 requested a review from sealexan February 14, 2025 17:01