This project aims to provide the open-source community with an easy-to-use system for building, self-hosting, and evaluating web agent computer-use models. Our goal is to offer an alternative to the $200/month ChatGPT Pro and cloud-based, uncontrolled execution environments.
With open-operator, you can:
- Annotate your web trajectory data.
- Export the data for further processing.
- Prepare the data for supervised fine-tuning (SFT).
- Host and deploy the model to interact with live websites.
- Automatically evaluate the model’s performance.
We believe in empowering developers to have complete control over their web agents, from training to deployment and evaluation.
A brief roadmap of the project. The parts shown in green will be included in this repo.
conda create -n open-operator python=3.11
conda activate open-operator
pip install -r requirements.txt
For the browser environment, you can use Browserbase. Set the following environment variable:
export BROWSERBASE_API_KEY=your_api_key
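If you want to confirm the key is visible to Python before launching the app, a small check like the one below can help. This is only a sketch: the helper name require_env is hypothetical and is not part of this repo.

```python
import os


def require_env(name, env=None):
    """Return the value of environment variable `name`, or raise if it is unset or empty.

    `require_env` is a hypothetical helper for illustration, not part of open-operator.
    """
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return value


# Example usage (raises RuntimeError if the variable is missing):
# api_key = require_env("BROWSERBASE_API_KEY")
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the browser session.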
python inference/app.py
You can select the base model you want to use from the dropdown menu (from Anthropic, Google, OpenAI, etc.).
Then start your first experience with Open-Operator!
Follow the step-by-step instructions below:
- Download the latest iMean Builder extension here: iMean Builder
- Install the extension in your browser.
- Record the web trajectory data you want to train your model on by interacting with the website as you naturally would. Then edit the title of each recorded trajectory.
- Create a private channel on the iMean Builder Platform and move all the data into that channel. -> How to: Docs
- Create a private challenge on the WebCanvas website and connect it with the channel from the previous step. -> How to: Docs
- Get the challenge ID and use it to download all the data from the iMean Builder Platform.
Set the challenge ID, iMean Builder username, and password in configs/config.yaml. Then run python main.py to download the data. With the default challenge ID, you can already download some sample data.
If you log in to iMean Builder with a Google account, you can set a password on the profile page.
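Before running the download, it can be useful to check that the three values are filled in. The sketch below works on an already-loaded config dictionary. The key names challenge_id, username, and password are assumptions for illustration; check configs/config.yaml for the actual names.

```python
# Hypothetical key names; the real ones are defined in configs/config.yaml.
REQUIRED_KEYS = ("challenge_id", "username", "password")


def missing_config_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in `config`."""
    return [key for key in required if not str(config.get(key) or "").strip()]


# Example usage with a config that still lacks a password:
# missing_config_keys({"challenge_id": "abc", "username": "me", "password": ""})
```

An empty result means every required field has a value; otherwise the returned list names the fields that still need to be set.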
For DOM Tree mode, just run python main.py.
For Vision mode, code coming soon.
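To give a rough idea of what "SFT-ready" data can look like in DOM Tree mode, the sketch below turns one recorded trajectory into prompt/completion pairs, one per step. It is not the repo's actual pre-processing: the function name, the step keys "dom" and "action", and the prompt layout are all assumptions made for illustration.

```python
import json


def trajectory_to_sft_records(title, steps):
    """Build one prompt/completion pair per step of a recorded trajectory.

    `steps` is a list of dicts with hypothetical keys "dom" (a text
    serialization of the page's DOM tree) and "action" (the annotated action).
    """
    records = []
    history = []  # actions taken before the current step
    for step in steps:
        prompt = (
            f"Task: {title}\n"
            f"Previous actions: {json.dumps(history)}\n"
            f"DOM:\n{step['dom']}\n"
            "Next action:"
        )
        records.append({"prompt": prompt, "completion": json.dumps(step["action"])})
        history.append(step["action"])
    return records


# Example usage with a two-step toy trajectory:
# trajectory_to_sft_records("Find shoes", [
#     {"dom": "<button id='1'>Search</button>", "action": {"type": "click", "target": "1"}},
#     {"dom": "<input id='2'>", "action": {"type": "fill", "target": "2", "value": "shoes"}},
# ])
```

Each record pairs the task title, the action history, and the current DOM observation with the action the annotator took, which is a common shape for supervised fine-tuning data.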
coming soon
coming soon
- Instructions on how to annotate your web trajectory data
- Data downloading
- Pre-process the data to be SFT-ready - DOM Tree
- Pre-process the data to be SFT-ready - Vision
- Host the local model and inference on live websites
- Automatic evaluation using the WebCanvas framework
For reference on web agent evaluation, you can check out the WebCanvas repo: WebCanvas
For more information on open-source GUI agent research projects and collaborations, check out WebAgentLab (WebAgentLab Homepage).
Stay tuned!