
Unexpected high memory usage in AppWorld Env leading to Ray OOM #30

@jingyi-xioa


Hi,
I have been studying your paper AgentEvolver and find it a very impressive and valuable piece of work.
However, I encountered an issue while reproducing the code with the AppWorld environment. Memory usage on my node spiked to about 1.4 TB, which crossed Ray's 0.95 memory-usage threshold and caused the worker to be killed.
It appears to be a memory leak. The logs show that two individual RemoteEnv processes each consumed over 500 GB of memory, which is abnormal.
Logs are shown below:
Memory on the node ... was 1406.36GB / 1480.00GB (0.950242)
Top 10 memory users:
PID MEM(GB) COMMAND
70473 536.45 ray::RemoteEnv.step <-- !!!
70355 533.25 ray::RemoteEnv.step <-- !!!
Have you encountered this excessive memory consumption during your experiments? Could you please advise on which part of the code I should inspect to fix this leak?
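
To put the numbers in the log excerpt in perspective, here is a minimal sketch that parses those lines and computes how much of the node's used memory the two RemoteEnv processes account for. It relies only on the lines quoted above; the regular expressions and the helper names (`parse_node_usage`, `parse_top_users`) are assumptions made for illustration and may not match the full format of Ray's real OOM report.

```python
import re

# Log excerpt copied from the report above (the "<-- !!!" markers are omitted).
LOG = """\
Memory on the node ... was 1406.36GB / 1480.00GB (0.950242)
Top 10 memory users:
PID MEM(GB) COMMAND
70473 536.45 ray::RemoteEnv.step
70355 533.25 ray::RemoteEnv.step
"""


def parse_node_usage(log):
    """Return (used_gb, total_gb) from the 'Memory on the node' line."""
    m = re.search(r"was ([\d.]+)GB / ([\d.]+)GB", log)
    if m is None:
        raise ValueError("no node memory line found")
    return float(m.group(1)), float(m.group(2))


def parse_top_users(log):
    """Return (pid, mem_gb, command) tuples from the per-process rows."""
    rows = []
    for line in log.splitlines():
        # A process row starts with a numeric PID; header lines do not match.
        m = re.match(r"\s*(\d+)\s+([\d.]+)\s+(\S+)", line)
        if m:
            rows.append((int(m.group(1)), float(m.group(2)), m.group(3)))
    return rows


used_gb, total_gb = parse_node_usage(LOG)
rows = parse_top_users(LOG)

# Memory held by the RemoteEnv workers listed in the excerpt.
env_gb = sum(mem for _, mem, cmd in rows if cmd.startswith("ray::RemoteEnv"))
share_of_used = env_gb / used_gb

print(f"node usage ratio: {used_gb / total_gb:.4f}")
print(f"RemoteEnv total:  {env_gb:.2f} GB")
print(f"share of used:    {share_of_used:.1%}")
```

On the excerpt above, the two RemoteEnv processes together hold roughly 1070 GB, which is about three quarters of the memory in use on the node.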


Labels: bug (Something isn't working)
