Training smolagents to have reproducible results #510
Replies: 1 comment
-
I’m in favor; I had the same thoughts. I’ll add that if a later use ends in an error for a different set of input data, smolagents could correct the code and run regression tests (for that, you’d need to freeze the input for that step and set assertions on the output as the expected result). This method can be combined with reinforcement learning, e.g., by exposing a webhook that a developer could ping to indicate that they are currently satisfied with the results or to send back feedback on what they don’t like. I know this is starting to resemble other code agent solutions, but those are closed, expensive, and heavy. The above concepts aren’t difficult to achieve, especially since they can be implemented using coding agents. Unfortunately, I have no capacity left to dive deeper into the subject, since I already have “too many irons in the fire” (just like probably all of us: AI promised less work, but I see the opposite ;-)).
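To make the regression-test idea concrete, here is a minimal sketch in plain Python. It does not use any smolagents API; the `regression_check` helper, the `solve` entry-point convention, and the sample code strings are all hypothetical, chosen only for illustration. It replays frozen inputs against a candidate version of the generated code and compares the outputs to the recorded expected results.

```python
def regression_check(code, frozen_cases):
    """Run candidate code against frozen (inputs, expected) pairs.

    Assumption: the generated code defines a function named `solve`.
    Returns the list of cases that raised or produced a different output.
    """
    namespace = {}
    exec(code, namespace)  # only do this with code you trust or sandbox
    failures = []
    for inputs, expected in frozen_cases:
        try:
            actual = namespace["solve"](**inputs)
        except Exception as exc:
            failures.append((inputs, repr(exc)))
            continue
        if actual != expected:
            failures.append((inputs, actual))
    return failures


# Frozen input/output pair recorded from a run that was accepted earlier.
frozen_cases = [({"numbers": [1, 2, 3]}, 2.0)]
# A new input that makes the original code fail, plus the desired result.
new_case = ({"numbers": []}, 0.0)

original = "def solve(numbers):\n    return sum(numbers) / len(numbers)\n"
corrected = (
    "def solve(numbers):\n"
    "    return sum(numbers) / len(numbers) if numbers else 0.0\n"
)

original_failures = regression_check(original, frozen_cases + [new_case])
corrected_failures = regression_check(corrected, frozen_cases + [new_case])
```

A corrected version would only replace the stored one once it passes both the old frozen cases and the new one, so a fix for one input cannot silently break an earlier one.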
-
Hello,
I'm new to smolagents, and I was wondering if the framework allows us to "train" the agents by letting them generate code, run it, evaluate/verify results, and then store the generated code for subsequent runs with potentially different input. As long as the logic (code) that the agent generated can handle specific use case(s) input and produce desired results, why generate the code again?
Generating code is not very deterministic, but even if it were, generation still takes time, which is wasted if the agent's generated code is already considered "trained" (producing the correct results) and could be reused many times for reproducible results.
In other words, could smolagents generate and persist generated code (that the agent executes) with some unique run/execution ID and let us reuse the code by passing the ID, which will only execute it, without regenerating the code?
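To illustrate what I mean, here is a rough sketch in plain Python. None of this is an existing smolagents feature as far as I know; `CodeStore`, its methods, and the convention that generated code exposes a `solve` function are hypothetical names made up for the example. The ID is derived from a hash of the code, but any unique identifier would do.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class CodeStore:
    """Hypothetical on-disk store mapping a run ID to generated source code."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, code):
        # Derive a short, stable ID from the code itself.
        run_id = hashlib.sha256(code.encode("utf-8")).hexdigest()[:12]
        path = self.root / f"{run_id}.json"
        path.write_text(json.dumps({"code": code}), encoding="utf-8")
        return run_id

    def run(self, run_id, **inputs):
        # Load the stored code and execute it without any regeneration.
        path = self.root / f"{run_id}.json"
        record = json.loads(path.read_text(encoding="utf-8"))
        namespace = {}
        exec(record["code"], namespace)  # trusted or sandboxed code only
        return namespace["solve"](**inputs)


# Stand-in for code an agent generated and a human verified once.
generated = "def solve(numbers):\n    return sum(numbers) / len(numbers)\n"

store = CodeStore(tempfile.mkdtemp())
run_id = store.save(generated)

# Later runs pass only the ID and new input; no LLM call is involved.
result_a = store.run(run_id, numbers=[1, 2, 3])
result_b = store.run(run_id, numbers=[10, 20])
```

In a real setup the stored code would need to run in the same kind of sandbox the agent uses, since `exec` on persisted code is only safe when that code is trusted.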