Support interactive commands (All-Hands-AI#3653)
* hacky solution for interactive commands

* add more behavior

* debug

* fix continue functionality

* remove prints

* refactor a bit

* reduce test sleep

* fix python version

* fix pre-commit issue

* Regenerate integration tests

* Update openhands/runtime/client/client.py

* revert some prompt stuff

* several integration mock files regenerated

* execute_action: remove duplicate exception logging

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: tobitege <[email protected]>
3 people authored Sep 8, 2024
1 parent 5100d12 commit ab38515
Showing 69 changed files with 3,124 additions and 315 deletions.
3 changes: 2 additions & 1 deletion .github/workflows/regenerate_integration_tests.yml
@@ -59,5 +59,6 @@ jobs:
 git config --global user.name 'github-actions[bot]'
 git config --global user.email 'github-actions[bot]@users.noreply.github.com'
 git add .
-git commit -m "Regenerate integration tests"
+# run it twice in case pre-commit makes changes
+git commit -am "Regenerate integration tests" || git commit -am "Regenerate integration tests"
 git push
9 changes: 7 additions & 2 deletions agenthub/codeact_agent/system_prompt.j2
@@ -5,8 +5,13 @@ The assistant can use a Python environment with <execute_ipython>, e.g.:
 print("Hello World!")
 </execute_ipython>
 The assistant can execute bash commands wrapped with <execute_bash>, e.g. <execute_bash> ls </execute_bash>.
-The assistant is not allowed to run interactive commands. For commands that may run indefinitely,
-the output should be redirected to a file and the command run in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
+If a bash command returns exit code `-1`, this means the process is not yet finished.
+The assistant must then send a second <execute_bash>. The second <execute_bash> can be empty
+(which will retrieve any additional logs), or it can contain text to be sent to STDIN of the running process,
+or it can contain the text `ctrl+c` to interrupt the process.
+
+For commands that may run indefinitely, the output should be redirected to a file and the command run
+in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
 If a command execution result says "Command timed out. Sending SIGINT to the process",
 the assistant should retry running the command in the background.
 {% endset %}
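The prompt change above describes a small protocol: exit code `-1` means "still running", an empty follow-up command fetches more output, and `ctrl+c` interrupts. As a rough sketch of how a caller might drive that protocol (the names `run_until_done`, `RunBash`, and `make_fake_runtime` are hypothetical and not part of OpenHands; the fake runtime only replays scripted replies):

```python
from typing import Callable, List, Tuple

# Hypothetical callable: sends the text of one <execute_bash> block and
# returns (output, exit_code). Not an OpenHands API.
RunBash = Callable[[str], Tuple[str, int]]

STILL_RUNNING = -1  # exit code the prompt uses for an unfinished process


def run_until_done(run_bash: RunBash, command: str, max_polls: int = 3) -> Tuple[str, int]:
    """Run `command`, then poll with empty commands while it reports -1.

    If the process is still running after `max_polls` polls, send `ctrl+c`.
    """
    output, exit_code = run_bash(command)
    chunks: List[str] = [output]
    polls = 0
    while exit_code == STILL_RUNNING and polls < max_polls:
        output, exit_code = run_bash('')  # empty command: retrieve more logs
        chunks.append(output)
        polls += 1
    if exit_code == STILL_RUNNING:
        output, exit_code = run_bash('ctrl+c')  # interrupt the process
        chunks.append(output)
    return ''.join(chunks), exit_code


def make_fake_runtime(replies):
    """Scripted stand-in for the runtime, for demonstration only."""
    pending = iter(replies)
    sent: List[str] = []

    def run_bash(text: str) -> Tuple[str, int]:
        sent.append(text)  # record what the caller sent
        return next(pending)

    return run_bash, sent
```

The poll limit is an illustrative design choice; the prompt itself leaves it to the assistant to decide when to keep waiting, send input, or interrupt.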
70 changes: 51 additions & 19 deletions openhands/runtime/client/client.py
@@ -60,6 +60,7 @@ class ActionRequest(BaseModel):
 INIT_COMMANDS = [
     'git config --global user.name "openhands" && git config --global user.email "[email protected]" && alias git="git --no-pager"',
 ]
+SOFT_TIMEOUT_SECONDS = 5


 class RuntimeClient:
@@ -212,6 +213,9 @@ def _get_bash_prompt_and_update_pwd(self):
         if ps1 == pexpect.EOF:
             logger.error(f'Bash shell EOF! {self.shell.after=}, {self.shell.before=}')
             raise RuntimeError('Bash shell EOF')
+        if ps1 == pexpect.TIMEOUT:
+            logger.warning('Bash shell timeout')
+            return ''

         # begin at the last occurrence of '[PEXPECT_BEGIN]'.
         # In multi-line bash commands, the prompt will be repeated
@@ -243,39 +247,56 @@ def _execute_bash(
         command: str,
         timeout: int | None,
         keep_prompt: bool = True,
+        kill_on_timeout: bool = True,
     ) -> tuple[str, int]:
         logger.debug(f'Executing command: {command}')
+        self.shell.sendline(command)
+        return self._continue_bash(
+            timeout=timeout, keep_prompt=keep_prompt, kill_on_timeout=kill_on_timeout
+        )
+
+    def _interrupt_bash(self, timeout: int | None = None) -> tuple[str, int]:
+        self.shell.sendintr() # send SIGINT to the shell
+        self.shell.expect(self.__bash_expect_regex, timeout=timeout)
+        output = self.shell.before
+        exit_code = 130 # SIGINT
+        return output, exit_code
+
+    def _continue_bash(
+        self,
+        timeout: int | None,
+        keep_prompt: bool = True,
+        kill_on_timeout: bool = True,
+    ) -> tuple[str, int]:
         try:
-            self.shell.sendline(command)
             self.shell.expect(self.__bash_expect_regex, timeout=timeout)

             output = self.shell.before

             # Get exit code
             self.shell.sendline('echo $?')
-            logger.debug(f'Executing command for exit code: {command}')
+            logger.debug('Requesting exit code...')
             self.shell.expect(self.__bash_expect_regex, timeout=timeout)
             _exit_code_output = self.shell.before
             logger.debug(f'Exit code Output: {_exit_code_output}')
             exit_code = int(_exit_code_output.strip().split()[0])

         except pexpect.TIMEOUT as e:
-            self.shell.sendintr() # send SIGINT to the shell
-            self.shell.expect(self.__bash_expect_regex, timeout=timeout)
-            output = self.shell.before
-            output += (
-                '\r\n\r\n'
-                + f'[Command timed out after {timeout} seconds. SIGINT was sent to interrupt it.]'
-            )
-            exit_code = 130 # SIGINT
-            logger.error(f'Failed to execute command: {command}. Error: {e}')
+            if kill_on_timeout:
+                output, exit_code = self._interrupt_bash()
+                output += (
+                    '\r\n\r\n'
+                    + f'[Command timed out after {timeout} seconds. SIGINT was sent to interrupt it.]'
+                )
+                logger.error(f'Failed to execute command. Error: {e}')
+            else:
+                output = self.shell.before or ''
+                exit_code = -1

finally:
bash_prompt = self._get_bash_prompt_and_update_pwd()
if keep_prompt:
output += '\r\n' + bash_prompt
logger.debug(f'Command output: {output}')

return output, exit_code

async def run_action(self, action) -> Observation:
@@ -293,11 +314,23 @@ async def run(self, action: CmdRunAction) -> CmdOutputObservation:
             commands = split_bash_commands(action.command)
             all_output = ''
             for command in commands:
-                output, exit_code = self._execute_bash(
-                    command,
-                    timeout=action.timeout,
-                    keep_prompt=action.keep_prompt,
-                )
+                if command == '':
+                    output, exit_code = self._continue_bash(
+                        timeout=SOFT_TIMEOUT_SECONDS,
+                        keep_prompt=action.keep_prompt,
+                        kill_on_timeout=False,
+                    )
+                elif command.lower() == 'ctrl+c':
+                    output, exit_code = self._interrupt_bash(
+                        timeout=SOFT_TIMEOUT_SECONDS
+                    )
+                else:
+                    output, exit_code = self._execute_bash(
+                        command,
+                        timeout=SOFT_TIMEOUT_SECONDS,
+                        keep_prompt=action.keep_prompt,
+                        kill_on_timeout=False,
+                    )
                 if all_output:
                     # previous output already exists with prompt "user@hostname:working_dir #""
                     # we need to add the command to the previous output,
@@ -690,5 +723,4 @@ async def list_files(request: Request):
             return []

     logger.info(f'Starting action execution API on port {args.port}')
-    print(f'Starting action execution API on port {args.port}')
     run(app, host='0.0.0.0', port=args.port)
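The `client.py` changes above replace "kill on timeout" with a soft timeout: when `SOFT_TIMEOUT_SECONDS` elapses, the command keeps running and the caller gets exit code `-1`. The following is only a sketch of that idea using `subprocess` from the standard library rather than pexpect; `SoftTimeoutProcess` and `poll` are hypothetical names, and unlike the diff, output is returned only once the process has finished.

```python
from __future__ import annotations

import subprocess
import sys

STILL_RUNNING = -1  # sentinel exit code the diff uses for an unfinished command


class SoftTimeoutProcess:
    """Illustrative stand-in for the pexpect-backed shell; not OpenHands code."""

    def __init__(self, argv: list[str]) -> None:
        # Merge stderr into stdout so one stream carries all output.
        self.proc = subprocess.Popen(
            argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )

    def poll(self, timeout: float) -> tuple[str, int]:
        """Wait up to `timeout` seconds; report -1 if the process is still running."""
        try:
            # communicate() may be retried after TimeoutExpired without losing output.
            output, _ = self.proc.communicate(timeout=timeout)
        except subprocess.TimeoutExpired:
            return '', STILL_RUNNING
        return output, self.proc.returncode


if __name__ == '__main__':
    child = SoftTimeoutProcess(
        [sys.executable, '-c', "import time; time.sleep(0.5); print('done')"]
    )
    print(child.poll(timeout=0.05))  # soft timeout: the child is left running
    print(child.poll(timeout=10))    # second call collects output and exit code
```

Calling `poll` again plays the role of the empty follow-up command that `run` routes to `_continue_bash`.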
2 changes: 2 additions & 0 deletions openhands/runtime/utils/bash.py
@@ -4,6 +4,8 @@


 def split_bash_commands(commands):
+    if not commands.strip():
+        return ['']
     try:
         parsed = bashlex.parse(commands)
     except bashlex.errors.ParsingError as e:
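The guard above matters because `run` treats an empty command as "continue the running process". A simplified sketch of that interaction follows; `split_commands_sketch` and `classify` are hypothetical helpers, and splitting on newlines is an assumption standing in for the real bashlex parsing.

```python
from __future__ import annotations


def split_commands_sketch(commands: str) -> list[str]:
    """Simplified stand-in for split_bash_commands (no bashlex parsing)."""
    # Guard from the diff: blank input yields a single empty command.
    if not commands.strip():
        return ['']
    # Assumption for illustration only: one command per non-blank line.
    return [line.strip() for line in commands.splitlines() if line.strip()]


def classify(command: str) -> str:
    """Mirror the branch order of the dispatch added to RuntimeClient.run."""
    if command == '':
        return 'continue'   # poll the running process for more output
    if command.lower() == 'ctrl+c':
        return 'interrupt'  # send SIGINT to the running process
    return 'execute'        # start a new command


if __name__ == '__main__':
    for text in ['', 'ctrl+c', 'ls\npwd']:
        print(repr(text), [classify(c) for c in split_commands_sketch(text)])
```

Without the guard, a blank action would produce no commands at all in this sketch, so the "continue" branch would never be reached.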
Original file line number Diff line number Diff line change
@@ -8,8 +8,13 @@ The assistant can use a Python environment with <execute_ipython>, e.g.:
 print("Hello World!")
 </execute_ipython>
 The assistant can execute bash commands wrapped with <execute_bash>, e.g. <execute_bash> ls </execute_bash>.
-The assistant is not allowed to run interactive commands. For commands that may run indefinitely,
-the output should be redirected to a file and the command run in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
+If a bash command returns exit code `-1`, this means the process is not yet finished.
+The assistant must then send a second <execute_bash>. The second <execute_bash> can be empty
+(which will retrieve any additional logs), or it can contain text to be sent to STDIN of the running process,
+or it can contain the text `ctrl+c` to interrupt the process.
+
+For commands that may run indefinitely, the output should be redirected to a file and the command run
+in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
 If a command execution result says "Command timed out. Sending SIGINT to the process",
 the assistant should retry running the command in the background.

@@ -175,7 +180,6 @@ IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_bro
 The assistant should utilize full file paths and the `pwd` command to prevent path-related errors.
 The assistant must avoid apologies and thanks in its responses.

-
 ----------

 Here is an example of how you can interact with the environment for task solving:
Original file line number Diff line number Diff line change
@@ -111,7 +111,6 @@ Don't execute multiple actions at once if you need feedback from the page.



-
 ----------

 # Current Accessibility Tree:
Original file line number Diff line number Diff line change
@@ -111,7 +111,6 @@ Don't execute multiple actions at once if you need feedback from the page.



-
 ----------

 # Current Accessibility Tree:
Original file line number Diff line number Diff line change
@@ -111,7 +111,6 @@ Don't execute multiple actions at once if you need feedback from the page.



-
 ----------

 # Current Accessibility Tree:
Original file line number Diff line number Diff line change
@@ -8,8 +8,13 @@ The assistant can use a Python environment with <execute_ipython>, e.g.:
 print("Hello World!")
 </execute_ipython>
 The assistant can execute bash commands wrapped with <execute_bash>, e.g. <execute_bash> ls </execute_bash>.
-The assistant is not allowed to run interactive commands. For commands that may run indefinitely,
-the output should be redirected to a file and the command run in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
+If a bash command returns exit code `-1`, this means the process is not yet finished.
+The assistant must then send a second <execute_bash>. The second <execute_bash> can be empty
+(which will retrieve any additional logs), or it can contain text to be sent to STDIN of the running process,
+or it can contain the text `ctrl+c` to interrupt the process.
+
+For commands that may run indefinitely, the output should be redirected to a file and the command run
+in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
 If a command execution result says "Command timed out. Sending SIGINT to the process",
 the assistant should retry running the command in the background.

@@ -175,7 +180,6 @@ IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_bro
 The assistant should utilize full file paths and the `pwd` command to prevent path-related errors.
 The assistant must avoid apologies and thanks in its responses.

-
 ----------

 Here is an example of how you can interact with the environment for task solving:
@@ -404,15 +408,13 @@ The server is running on port 5000 with PID 126. You can access the list of numb
NOW, LET'S START!
Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me for confirmation at any point.


----------


<execute_browse>
Certainly! I'll browse localhost:8000 and retrieve the ultimate answer to life for you.. I should start with: Get the content on "http://localhost:8000"
</execute_browse>


----------

OBSERVATION:
Original file line number Diff line number Diff line change
@@ -2,5 +2,5 @@ To accomplish our goal of retrieving the ultimate answer to life, we need to cli

 Here's the action I'll take:

-In order to accomplish my goal, I need to click on the button with bid 10
+In order to accomplish our goal, I need to click on the button with bid 10
 ```click("10"
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
-Based on the current Accessibility Tree and the previous actions taken, it appears that we have already successfully navigated to the webpage and clicked the button to reveal the answer. The answer is now visible on the page.
+Based on the current Accessibility Tree and the previous actions taken, it appears that we have already navigated to the correct page and clicked the button to reveal the answer. The answer is now visible on the page.

-To accomplish our goal of retrieving the ultimate answer to life, I will now send this information to the user:
+To accomplish our goal of retrieving the ultimate answer to life, I will now send a message to the user with the revealed answer:

 ```
 send_msg_to_user("The ultimate answer to life, the universe, and everything is: OpenHands is all you need!"
Original file line number Diff line number Diff line change
@@ -2,6 +2,6 @@ Based on the content retrieved from localhost:8000, I can provide you with the u

 The ultimate answer to life, the universe, and everything is: OpenHands is all you need!

-This appears to be the answer provided by the server at localhost:8000. It's a playful reference to Douglas Adams' "The Hitchhiker's Guide to the Galaxy," where the answer to life, the universe, and everything is famously 42. In this case, it's been customized to highlight "OpenHands."
+This appears to be the answer provided by the server at localhost:8000. It's a playful reference to Douglas Adams' "The Hitchhiker's Guide to the Galaxy," where the answer to life, the universe, and everything is famously 42. In this case, it's been customized to highlight OpenHands.

 <finish></finish>
Original file line number Diff line number Diff line change
@@ -8,8 +8,13 @@ The assistant can use a Python environment with <execute_ipython>, e.g.:
 print("Hello World!")
 </execute_ipython>
 The assistant can execute bash commands wrapped with <execute_bash>, e.g. <execute_bash> ls </execute_bash>.
-The assistant is not allowed to run interactive commands. For commands that may run indefinitely,
-the output should be redirected to a file and the command run in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
+If a bash command returns exit code `-1`, this means the process is not yet finished.
+The assistant must then send a second <execute_bash>. The second <execute_bash> can be empty
+(which will retrieve any additional logs), or it can contain text to be sent to STDIN of the running process,
+or it can contain the text `ctrl+c` to interrupt the process.
+
+For commands that may run indefinitely, the output should be redirected to a file and the command run
+in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
 If a command execution result says "Command timed out. Sending SIGINT to the process",
 the assistant should retry running the command in the background.

@@ -175,7 +180,6 @@ IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_bro
 The assistant should utilize full file paths and the `pwd` command to prevent path-related errors.
 The assistant must avoid apologies and thanks in its responses.

-
 ----------

 Here is an example of how you can interact with the environment for task solving:
Original file line number Diff line number Diff line change
@@ -8,8 +8,13 @@ The assistant can use a Python environment with <execute_ipython>, e.g.:
 print("Hello World!")
 </execute_ipython>
 The assistant can execute bash commands wrapped with <execute_bash>, e.g. <execute_bash> ls </execute_bash>.
-The assistant is not allowed to run interactive commands. For commands that may run indefinitely,
-the output should be redirected to a file and the command run in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
+If a bash command returns exit code `-1`, this means the process is not yet finished.
+The assistant must then send a second <execute_bash>. The second <execute_bash> can be empty
+(which will retrieve any additional logs), or it can contain text to be sent to STDIN of the running process,
+or it can contain the text `ctrl+c` to interrupt the process.
+
+For commands that may run indefinitely, the output should be redirected to a file and the command run
+in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
 If a command execution result says "Command timed out. Sending SIGINT to the process",
 the assistant should retry running the command in the background.

@@ -175,7 +180,6 @@ IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_bro
 The assistant should utilize full file paths and the `pwd` command to prevent path-related errors.
 The assistant must avoid apologies and thanks in its responses.

-
 ----------

 Here is an example of how you can interact with the environment for task solving:
@@ -404,15 +408,13 @@ The server is running on port 5000 with PID 126. You can access the list of numb
NOW, LET'S START!
Fix typos in bad.txt. Do not ask me for confirmation at any point.


----------

Certainly! I'll fix the typos in the bad.txt file. Let's start by opening the file and examining its contents.
<execute_ipython>
open_file('bad.txt')
</execute_ipython>


----------

OBSERVATION: