[RFC] Code Agent in CodeTrans #331

letonghan · 2025-03-14T08:47:01Z

This RFC proposes the integration of two Agent mechanisms into the CodeTrans Example to enhance the reliability, user experience, and code quality.
The goal is to minimize the propagation of erroneous code and improve the feasibility of automated code translation.

Signed-off-by: letonghan <[email protected]>

yinghu5 · 2025-03-24T05:16:34Z

[Remind] @ftian1 please help to review the RFC, thank you!

community/rfcs/25-03-14-GenAIExample-001-CodeTrans-with-Agents.md

eero-t

IMHO this RFC skips too many significant details:

- If the translation target is a compiled language, suitable build tools need to be available for every supported language
  - Which languages is this targeted at; just Java?
- If the resulting program relies on external components, those need to be identified and installed before the program can be successfully built or executed
  - Some dependencies can be GB-sized
- There's no mention of how the execution is sandboxed
  - E.g. allowing network connectivity for malicious code would be rather serious (it could trivially do DoS attacks etc.), but innocent programs could also be intended for network access
- Many programs do not run without input, but there's no mention of how input data provision is managed
  - Code translation may also be requested for individual functions, not whole programs

PS. Even if the agent did only code linting instead of execution, that would also need the code dependencies to be installed/present (at least their API files).

letonghan · 2025-03-25T02:50:49Z

Hi @eero-t, thanks for your detailed comments! I will update this RFC later; here are some responses to your questions:

For the build tool environment:
- We do need to prepare a separate environment (we are considering Docker) for each language, such as Python, Java, C#, Go and so on.
- The dependencies will be installed in a separate Docker container for each task.

For sandbox security:
- A security policy is needed for the Docker container to protect against malicious code; this part will be described in detail in the updated RFC.
- Network access needs to be limited to installing requirements, for example by using a whitelist.

For program input:
- We may ask the user to provide a basic input and output case.

Please let me know if you have other suggestions.
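To make the per-task container idea above more concrete, here is a minimal sketch of how a `docker run` command line could be assembled with networking disabled and resource limits applied. The `SandboxPolicy` class, the `docker_run_args` helper, and all default values are hypothetical illustrations, not part of the RFC; only the `docker run` flags themselves are standard Docker CLI options. A whitelist for dependency installation would need additional tooling (for example a proxy) that this sketch does not cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical per-task sandbox settings; names and defaults are assumptions."""
    image: str
    allow_network: bool = False  # keep the network off while translated code runs
    memory: str = "512m"
    cpus: str = "1.0"
    pids_limit: int = 128


def docker_run_args(policy: SandboxPolicy, command: list[str]) -> list[str]:
    """Build the argument list for one throwaway container per task."""
    args = [
        "docker", "run", "--rm",                 # remove the container afterwards
        "--memory", policy.memory,               # cap memory usage
        "--cpus", policy.cpus,                   # cap CPU usage
        "--pids-limit", str(policy.pids_limit),  # limit process count (fork bombs)
        "--cap-drop", "ALL",                     # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # block privilege escalation
    ]
    if not policy.allow_network:
        args += ["--network", "none"]            # no network interfaces except loopback
    return args + [policy.image, *command]
```

The returned list could be passed to `subprocess.run`; building it as data first keeps the policy easy to review and test without starting a container.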

eero-t · 2025-03-25T09:25:20Z

> Please let me know if you have other suggestions.

@letonghan The RFC also needs to answer the following questions:

- What is done when code translation is requested for a target language that the agent does not support (but the LLM does)?
- Why is the code not linted (which typically finds more problems than running it, and is easier)?
- What advantage does building+running provide over code linting?

I'm also wondering whether each supported language would need its own additional RFC, as that's a rather complex, language-specific topic (language versions, their upgrades, access to their module repositories, security etc).

eero-t · 2025-03-25T10:15:48Z

Will the same agents also be used for CodeGen?

eero-t · 2025-03-25T13:18:23Z

Then there's also the performance aspect.

Users expect responses in seconds, but fetching code dependencies for building (or linting) the code could take minutes, maybe even tens of minutes.

Building the code also takes extra time, especially if agents need to do several rounds of builds to get the translated code into a fully buildable state.

Meaning that:

- The user would need some feedback about the extra, time-consuming steps being performed
- When agent usage can induce large slowdowns, it would be nice if either:
  - The UI had an option to disable it (when the perceived improvement is low enough), or
  - The application could cache the query context (e.g. fetched dependencies)
    - This way it's only a one-time (time/BW/CPU) cost per context, instead of the user seeing large lags for successive queries too (and user feedback showing that the app is dumbly doing the same thing over and over again)
    - But it raises the need for context IDs / management, and the question of how long such (large) contexts should persist
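The context caching idea above could look roughly like the following sketch: a context ID is derived from the target language and its dependency set, and cached entries expire after a fixed lifetime. The names `context_id` and `ContextCache`, the hashing scheme, and the expiry policy are all assumptions made for illustration; how long contexts should really persist is exactly the open question raised above.

```python
import hashlib
import time
from typing import Callable, Optional


def context_id(language: str, dependencies: list[str]) -> str:
    """Derive a stable ID; dependency order and duplicates do not matter."""
    key = language + "\n" + "\n".join(sorted(set(dependencies)))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


class ContextCache:
    """Maps a context ID to a prepared environment path, with a time-to-live."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, str]] = {}  # id -> (expiry, path)

    def get(self, ctx_id: str) -> Optional[str]:
        entry = self._entries.get(ctx_id)
        if entry is None:
            return None
        expiry, path = entry
        if self._clock() >= expiry:
            # Expired: drop the entry so the (large) context can be cleaned up.
            del self._entries[ctx_id]
            return None
        return path

    def put(self, ctx_id: str, path: str) -> None:
        self._entries[ctx_id] = (self._clock() + self._ttl, path)
```

With something like this, only the first query for a given language and dependency set pays the fetch cost; later queries within the lifetime reuse the prepared environment.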

letonghan · 2025-03-26T02:21:52Z

I think using lint/bandit is also a great option for code checking.
This could be a two-step process. In the first step, the agent automatically checks the code with tools like lint and fixes simple typos/faults. In the second step, the updated code is executed to make sure it works.
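As a rough illustration of the two steps, the sketch below checks translated Python code statically first and runs it only if that check passes. It is a simplification under stated assumptions: the built-in `compile()` stands in for a real linter (it only catches syntax errors), the automatic fixing step is omitted, and a plain subprocess is not a sandbox. `CheckResult` and `check_translated_python` are hypothetical names.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    stage: str   # "lint" or "execute": the last stage that was reached
    ok: bool
    detail: str  # error message or captured output


def check_translated_python(source: str, timeout_s: float = 10.0) -> CheckResult:
    # Step 1: cheap static check. compile() is only a stand-in for a linter.
    try:
        compile(source, "<translated>", "exec")
    except SyntaxError as err:
        return CheckResult("lint", False, f"line {err.lineno}: {err.msg}")

    # Step 2: run the code in a separate interpreter process with a timeout.
    # Assumption: a real deployment would run this inside the sandbox instead.
    try:
        proc = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True, text=True, timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return CheckResult("execute", False, "timed out")
    detail = proc.stderr.strip() or proc.stdout.strip()
    return CheckResult("execute", proc.returncode == 0, detail)
```

Failing fast at the lint stage avoids paying the execution cost for code that cannot run at all.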

letonghan · 2025-03-26T02:28:09Z

Yes, building the code sandbox, installing dependencies, and executing the code would take a lot of time.
Following the two-step idea above, I prefer to make lint/execution optional, so that it can be enabled/disabled in the web UI.
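One way to picture the optional steps is a small options object that the UI toggles would populate. This is only a sketch: `AgentOptions`, `planned_steps`, and the default values are assumptions, not something the RFC defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentOptions:
    """Hypothetical toggles mirroring the proposed web UI switches."""
    enable_lint: bool = True        # assumed default: the cheap check stays on
    enable_execution: bool = False  # assumed default: the slow step is opt-in


def planned_steps(options: AgentOptions) -> list[str]:
    """Return the pipeline stages that would run for the given options."""
    steps = ["translate"]
    if options.enable_lint:
        steps.append("lint")
    if options.enable_execution:
        steps.append("execute")
    return steps
```

The planned step list could also drive the progress feedback shown to the user while the slower stages run.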

lkk12014402 · 2025-03-26T02:31:35Z

There is a code execution tool: https://github.com/QwenLM/Qwen-Agent/blob/main/qwen_agent/tools/code_interpreter.py

minmin-intel · 2025-03-31T16:08:14Z

Do we have customer requests for such code translation capabilities in OPEA? Is there a compelling need to invest engineering efforts in this? Shall we think about coding agent as a whole instead of just code translation or code generation? @letonghan @ftian1 @lkk12014402

eero-t · 2025-03-31T16:15:44Z

> Shall we think about coding agent as a whole instead of just code translation or code generation?

Considering it for both makes more sense to me. At least I cannot quickly think of any difference between verifying / improving result for code translation, vs code generation.

letonghan · 2025-04-07T10:12:46Z

Hi @eero-t @minmin-intel, the RFC is updated.
The lint check tool and the code execution tool could be reused in both the CodeTrans and CodeGen examples, but I think we don't need to combine the RFCs here, since the CodeGen RFC was already merged.
We can make sure it satisfies the needs of the code agent, then develop and refine it in release v1.4.
Let's make sure this RFC is merged for v1.3 before mid-April, thanks!

eero-t

This is much better now, but I still have a few comments.

eero-t · 2025-04-07T15:41:00Z

> The lint check tool and the code execution tool could be reused in both CodeTrans and CodeGen example, but I think we don't need to combine the RFCs here, since CodeGen RFC was already merged.

Ok.

(I do not see an overlap between #272 and this RFC, except both being RAG, but "before translation" phase is indeed specific just to code translation.)

eero-t

Still not a fan of code execution, but the RFC itself looks OK now. There are just a few inconsistencies that would be good to fix before merging.

(Doc updates have resulted in some things being repeated in multiple sections. It would help if the details of each step were described only once, and removed from the more generic sections.)

Add CodeTrans with Agents RFC

cbd2b7e

Signed-off-by: letonghan <[email protected]>

letonghan requested review from chensuyue, ftian1, mkbhanda, preethivenkatesh, chickenrae and tomlenth as code owners March 14, 2025 08:47

Merge branch 'main' into rfc_codetrans

c2c1605

letonghan mentioned this pull request Mar 14, 2025

[Feature] Code agent (CodeGen/CodeTrans) - Phase 1: RFC opea-project/GenAIExamples#1534

Closed

2 tasks

letonghan changed the title ~~Add CodeTrans with Agents RFC~~ [RFC] Code Agent in CodeTrans Mar 14, 2025

joshuayao added this to the v1.3 milestone Mar 17, 2025

joshuayao requested review from ashahba, lvliang-intel and lkk12014402 March 17, 2025 01:28

letonghan and others added 4 commits March 17, 2025 13:52

update diagram

aaf26c3

Signed-off-by: letonghan <[email protected]>

Merge branch 'main' into rfc_codetrans

e367d30

refine pre-llm agent design

89e0cc9

Signed-off-by: letonghan <[email protected]>

Merge branch 'rfc_codetrans' of https://github.com/letonghan/docs int…

0ccf150

…o letong/rfc_codetrans

ftian1 mentioned this pull request Mar 19, 2025

[RFC] Hybrid Rag Mega service framework. Create a new application in GenAI examples #333

Closed

lkk12014402 reviewed Mar 24, 2025

eero-t suggested changes Mar 24, 2025

yinghu5 added the A0 need to scrub label Mar 26, 2025

eero-t mentioned this pull request Mar 31, 2025

Added python code execution tool support opea-project/GenAIComps#1470

Closed

1 task

letonghan and others added 2 commits April 2, 2025 11:45

Merge branch 'opea-project:main' into rfc_codetrans

815a4a9

refine rfc according to comments

b99142b

Signed-off-by: letonghan <[email protected]>

letonghan and others added 3 commits April 7, 2025 18:13

Merge branch 'main' into rfc_codetrans

f5c4554

revert file name change

ca69693

Signed-off-by: letonghan <[email protected]>

fix typo

2c8bc30

Signed-off-by: letonghan <[email protected]>

lkk12014402 approved these changes Apr 7, 2025

lvliang-intel approved these changes Apr 7, 2025

eero-t reviewed Apr 7, 2025

letonghan and others added 3 commits April 8, 2025 10:10

refine descriptions of retry limits in use case

9683af4

Co-authored-by: Eero Tamminen <[email protected]>

refine rfc according to comments

e4a6f24

Signed-off-by: letonghan <[email protected]>

Merge branch 'rfc_codetrans' of https://github.com/letonghan/docs int…

9911c30

…o letong/rfc_codetrans

eero-t reviewed Apr 8, 2025

joshuayao linked an issue Apr 8, 2025 that may be closed by this pull request

[Feature] Code agent (CodeGen/CodeTrans) - Phase 1: RFC opea-project/GenAIExamples#1534

Closed

2 tasks

letonghan and others added 4 commits April 9, 2025 10:04

refine descriptions

5887a78

Co-authored-by: Eero Tamminen <[email protected]>

Update community/rfcs/25-03-14-GenAIExample-001-CodeTrans-with-Agents.md

3c5b0b1

Co-authored-by: Eero Tamminen <[email protected]>

Update community/rfcs/25-03-14-GenAIExample-001-CodeTrans-with-Agents.md

f7cf667

Co-authored-by: Eero Tamminen <[email protected]>

Merge branch 'main' into rfc_codetrans

01836e8

joshuayao added this to OPEA Apr 9, 2025

joshuayao moved this to In review in OPEA Apr 9, 2025

joshuayao added the documentation Improvements or additions to documentation label Apr 9, 2025

joshuayao merged commit c5bb594 into opea-project:main Apr 10, 2025
4 checks passed

github-project-automation bot moved this from In review to Done in OPEA Apr 10, 2025
