-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Code Agent in CodeTrans #331
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: letonghan <[email protected]>
Signed-off-by: letonghan <[email protected]>
Signed-off-by: letonghan <[email protected]>
…o letong/rfc_codetrans
[Remind] @ftian1 please help to review the RFC, thank you! |
community/rfcs/25-03-14-GenAIExample-001-CodeTrans-with-Agents.md
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO this RFC skips too many significant details:
- If translation is to compiled language, suitable build tools need to available for all of them
- What languages this is targeted at; just Java?
- If resulting program relies on external components, those need to be identified & installed, before program can be successfully build or executed
- Some dependencies can be GB sized
- There's no mention of how the execution is sandboxed
- E.g. allowing network connectivity for malicious code would be rather serious (it could trivially do DOS attacks etc), but also innocent programs could be intended for network access
- Many programs do not run without input, but there's no mention of how input data provision is managed
- Code translation may be also asked for individual functions, not whole programs
PS. Even if agent would do only code linting instead of execution, that would also need code dependencies to be installed/present (at least their API files).
Hi @eero-t , thanks for your detailed comments! I will update this RFC later, here's some responses of your questions:
Please let me know if you have other suggestions. |
@letonghan RFC needs to answer also following questions:
I'm also wondering whether each supported language would need its own additional RFC, as that that's rather complex, language specific topic (language versions, their upgrades, access to their module repositories, security etc). |
Will same agents be used also for CodeGen? |
Then there's also the performance aspect. Users expect responses in seconds, but fetching code dependencies for building (or linting) the code could take minutes, maybe even tens of minutes. Building the code also takes extra time, especially if agents need to do several rounds of builds to get translated code into fully buildable state. Meaning that:
|
I think using lint/bandit is also a great option for code checking. |
Yes, building code sandbox, install dependencies, and execute it would take a lot of time. |
there is code execution tool https://github.com/QwenLM/Qwen-Agent/blob/main/qwen_agent/tools/code_interpreter.py |
Do we have customer requests for such code translation capabilities in OPEA? Is there a compelling need to invest engineering efforts in this? Shall we think about coding agent as a whole instead of just code translation or code generation? @letonghan @ftian1 @lkk12014402 |
Considering it for both makes more sense to me. At least I cannot quickly think of any difference between verifying / improving result for code translation, vs code generation. |
Signed-off-by: letonghan <[email protected]>
Hi @eero-t @minmin-intel , the RFC is updated. |
Signed-off-by: letonghan <[email protected]>
Signed-off-by: letonghan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much better now, but I still have few comments.
This RFC proposes the integration of two Agent mechanisms into the CodeTrans Example to enhance the reliability, user experience, and code quality. The goal is to minimize the propagation of erroneous code and improve the feasibility of automated code translation. | ||
|
||
- Pre-LLM Agent: Validates the correctness of the input code before it is processed by the LLM. If errors are detected, the agent attempts to automatically fix them to ensure the code is executable. If the correction is successful, the modified code proceeds to the LLM. | ||
- Post-LLM Agent: Executes the translated code after it has been generated by the LLM. If the execution fails, the agent captures the error and sends it back to the LLM for re-generation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to info later in the doc, linting is done also in post-LLM phase (which is good thing to do).
How the CodeTrans Helps: | ||
|
||
- Post-LLM Agent tracks retry attempts. | ||
- If the LLM fails to produce a correct version after three attempts (or a configurable limit), the system stops further retries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is likely to need both user-configurable limit, and agent-instance upper limit for user-selected one:
- If the LLM fails to produce a correct version after three attempts (or a configurable limit), the system stops further retries. | |
- If the LLM fails to produce a correct version after configurable number of attempts, the system stops further retries. |
|
||
UI Components: | ||
|
||
- Lint Check Button: Select to do lint check for input/output codes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linters can have a large number of plugins and rules that can be applied to the code, and often what needs to be used, are project specific. So eventually user will need to have control over those too.
But for now it would be enough for lint option be separately selectable both for input and output.
|
||
- Validates code correctness, structures input/output, executes the code, and evaluates the result. | ||
- Runs the user-provided code to check for syntax or logical errors. | ||
- If errors are detected, the agent attempts to automatically fix them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are lint errors supposed to fixed by linter or LLM, or both?
(While many linters can fix simple issues, they may fail with more serious ones.)
|
||
| Coding Language | Lint Tool | Introduction | Reference | | ||
| --------------- | ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------- | | ||
| Python | Pylint | A tool that checks for errors in Python code, tries to enforce a coding standard and looks for bad code smells. | [link](https://www.pylint.org/) | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are few additional Python linters which usage is suggested by PyLint:
https://pylint.readthedocs.io/en/latest/index.html#advised-linters-alongside-pylint
docker run --rm -m 512m --cpus="1" --pids-limit=128 \ | ||
--security-opt no-new-privileges -v $(pwd)/code:/code \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use only the necessary capabilities, and drop all others:
--cap-drop ALL --cap-add <what is needed>
``` | ||
2. execution: run codes using mounted dependencies | ||
```bash | ||
docker run --rm -v $(pwd):/code --network=none \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why running is not secured, like dep install is?
| Output Code Validation | Ensures reliable and accurate code conversion. | | ||
| Automated Debug Feedback | Reduces trial-and-error, improving LLM accuracy. | | ||
| Lint Static Code Check | Catch bugs early and enforce consistent code quality. | | ||
| Secure Execution Environment | Protects the system from malicious code. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is problem added by code execution (and potentially linting / dep install), not a benefit.
I think there should be a potential problem section. For example something like below.
Risks and their mitigations / user workarounds
- Node / cluster take over by execution of malicious code
- Mitigation: automated vetting of the executed code + its strict sandboxing
- Code execution exhausting node resources
- Mitigation: strict resource usage limits
- Application response taking too long due to dependency install / code execution
- Mitigation: dependency caching + enforced execution timeouts + error response to user
- Workaround: user disables linting / code execution
- Users can affect each others' results
- Mitigation: (dependency) caching is per-user session
- Code execution failing translation due to limits / sandboxing / dependency being offline
- Workaround: user disables code execution / linting
### Phase 1: Develop Code Execution Tool, target v1.3 | ||
|
||
- Develop the Code Execution Tool in Agent to provide a secure execution environment. | ||
- Ensure the code execution environment is securely isolated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO this is less important that the linting features, intended to cover just one of the languages, and more likely to be service security and performance issue, so it should be done later.
|
||
* To ensure that the code runs in a fully isolated environment, the tool needs to use container-based sandbox like `Docker`. | ||
* In the context of resource constraints, the resource limits (including memory, CPU, process numbers) and security policies are needed. | ||
* Since the dependencies need to be installed first, the network authority will be processed in two stages: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the intention that the installed dependencies would be be shared between invocations? And if yes, only for the same user, or also between different users?
Note: I'm asking because Python requirements may include conflicting versions of the dependencies, especially if they're pulled by different modules.
Ok. (I do not see an overlap between #272 and this RFC, except both being RAG, but "before translation" phase is indeed specific just to code translation.) |
This RFC proposes the integration of two Agent mechanisms into the CodeTrans Example to enhance the reliability, user experience, and code quality.
The goal is to minimize the propagation of erroneous code and improve the feasibility of automated code translation.