| 1 | +# Internet Search |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +There are a few different Internet search tools for GPTScript: |
| 6 | + |
| 7 | +- `bing` |
| 8 | +- `bing-image` |
| 9 | +- `brave` |
| 10 | +- `brave-image` |
| 11 | +- `duckduckgo` |
| 12 | +- `google` |
| 13 | +- `google-image` |
| 14 | + |
| 15 | +Each of these tools queries the corresponding search engine and returns results that |
| 16 | +the LLM can process. The number of results returned depends on the |
| 17 | +search engine. Results are returned in the following formats: |
| 18 | + |
| 19 | +Web search format: |
| 20 | +``` |
| 21 | +Title: the title of the web page |
| 22 | +URL: the link to the web page |
| 23 | +Description: a short snippet from the web page |
| 24 | + |
| 25 | +Title: ... |
| 26 | +URL: ... |
| 27 | +Description: ... |
| 28 | + |
| 29 | +<etc.> |
| 30 | +``` |
| 31 | + |
| 32 | +Image search format: |
| 33 | +``` |
| 34 | +Title: the title of the image |
| 35 | +Source: the link to the web page where the image came from |
| 36 | +Image URL: the link to the image |
| 37 | + |
| 38 | +Title: ... |
| 39 | +Source: ... |
| 40 | +Image URL: ... |
| 41 | + |
| 42 | +<etc.> |
| 43 | +``` |
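If you ever need to post-process results outside the LLM, the web search format above is simple enough to handle line by line. The following is only a sketch: the sample data is made up, and `parse_results` is a hypothetical helper, not part of the search repo.

```bash
# Made-up sample output in the web search format described above.
results='Title: Example Domain
URL: https://example.com
Description: An illustrative page

Title: Another Page
URL: https://example.org
Description: A second illustrative page'

# parse_results (hypothetical helper): for each URL line, print the most
# recently seen title and the URL, separated by a tab.
parse_results() {
  awk '
    /^Title: / { title = substr($0, 8) }
    /^URL: /   { url = substr($0, 6); print title "\t" url }
  '
}

parsed=$(printf '%s\n' "$results" | parse_results)
printf '%s\n' "$parsed"
```

The image search format could be handled the same way by matching `Source:` and `Image URL:` instead.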
| 44 | + |
| 45 | +The tools are developed in the [gptscript-ai/search repo](https://github.com/gptscript-ai/search). |
| 46 | + |
| 47 | +## Setup |
| 48 | + |
| 49 | +:::note |
| 50 | +This will become easier in the future, once packaging for remote GPTScript tools has been implemented. |
| 51 | +::: |
| 52 | + |
| 53 | +To set up your local environment to use these tools, do the following: |
| 54 | + |
| 55 | +```bash |
| 56 | +# Clone the repo |
| 57 | +git clone https://github.com/gptscript-ai/search.git |
| 58 | + |
| 59 | +# Enter the cloned repo and build the Go binary |
| 60 | +cd search && make build |
| 61 | +``` |
| 62 | + |
| 63 | +Now you can reference the search tools by the relative path to the `tool.gpt` file in the cloned repo, like this: |
| 64 | + |
| 65 | +```yaml |
| 66 | +tools: google-image from ./search/tool.gpt, sys.download |
| 67 | + |
| 68 | +Search for images of pandas and download one. |
| 69 | +``` |
| 70 | + |
| 71 | +:::tip |
| 72 | +Always specify which tool you are importing from the `tool.gpt` file, as shown |
| 73 | +in the example above. If no tool is specified, GPTScript defaults to the first tool defined in the file, |
| 74 | +which is currently `bing`. |
| 75 | +::: |
| 76 | + |
| 77 | +Specific instructions for each search engine follow. |
| 78 | + |
| 79 | +## Bing |
| 80 | + |
| 81 | +The `bing` and `bing-image` tools return search results from the [Bing Web Search API](https://www.microsoft.com/en-us/bing/apis/bing-web-search-api). |
| 82 | + |
| 83 | +The environment variable `GPTSCRIPT_BING_SEARCH_TOKEN` must be set to your API key for these tools to work. |
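As a minimal sketch, the key can be exported in your shell session before running a script. The key value and the script name `search.gpt` are placeholders for illustration, not names taken from the search repo.

```bash
# Placeholder value; substitute your own Bing API key.
export GPTSCRIPT_BING_SEARCH_TOKEN="your-bing-api-key"

# Fail early with a clear message if the variable is empty or unset.
: "${GPTSCRIPT_BING_SEARCH_TOKEN:?set GPTSCRIPT_BING_SEARCH_TOKEN to your Bing API key}"

# Then run a script that imports the bing tool
# (search.gpt is a hypothetical file name):
# gptscript search.gpt
```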
| 84 | + |
| 85 | +## Brave |
| 86 | + |
| 87 | +The `brave` and `brave-image` tools return search results from the [Brave Search API](https://brave.com/search/api/). |
| 88 | + |
| 89 | +The environment variable `GPTSCRIPT_BRAVE_SEARCH_TOKEN` must be set to your API key for these tools to work. |
| 90 | + |
| 91 | +## DuckDuckGo |
| 92 | + |
| 93 | +The `duckduckgo` tool returns search results from the [DuckDuckGo HTML-only Site](https://html.duckduckgo.com). |
| 94 | + |
| 95 | +No API key is required to use this tool. |
| 96 | + |
| 97 | +By default, this tool makes an HTTP request to DuckDuckGo and parses the results. |
| 98 | +After enough requests, DuckDuckGo will start rate limiting it. |
| 99 | +Rate limits are easier to avoid when the tool uses Google Chrome in headless mode, |
| 100 | +which it does if the `GPTSCRIPT_USE_CHROME` environment variable is set to `true`. |
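A short sketch of opting in to headless Chrome follows. The check for a Chrome binary is only a convenience: the binary names it tries are common ones and are an assumption, since the tool may locate Chrome differently.

```bash
# Opt in to headless Chrome for the duckduckgo tool.
export GPTSCRIPT_USE_CHROME=true

# Optional sanity check: look for a Chrome or Chromium binary on PATH.
# These binary names are assumptions, not taken from the search repo.
chrome_bin=""
for candidate in google-chrome google-chrome-stable chromium chromium-browser; do
  if command -v "$candidate" >/dev/null 2>&1; then
    chrome_bin="$candidate"
    break
  fi
done

if [ -n "$chrome_bin" ]; then
  echo "Found $chrome_bin"
else
  echo "No Chrome binary found on PATH"
fi
```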
| 101 | + |
| 102 | +## Google |
| 103 | + |
| 104 | +The `google` and `google-image` tools return search results from the [Google Custom Search JSON API](https://developers.google.com/custom-search/v1/overview). |
| 105 | + |
| 106 | +The environment variable `GPTSCRIPT_GOOGLE_SEARCH_TOKEN` must be set to your API key for these tools to work, |
| 107 | +and `GPTSCRIPT_GOOGLE_SEARCH_ENGINE_ID` must be set to your Programmable Search Engine's engine ID. |
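Because the Google tools need two variables, it can help to verify both before running a script. This is a sketch with placeholder values; the check loop is illustrative and not part of the search repo.

```bash
# Placeholder values; substitute your own API key and engine ID.
export GPTSCRIPT_GOOGLE_SEARCH_TOKEN="your-google-api-key"
export GPTSCRIPT_GOOGLE_SEARCH_ENGINE_ID="your-engine-id"

# Collect the names of any required variables that are empty or unset.
missing=""
for name in GPTSCRIPT_GOOGLE_SEARCH_TOKEN GPTSCRIPT_GOOGLE_SEARCH_ENGINE_ID; do
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    missing="$missing $name"
  fi
done

if [ -n "$missing" ]; then
  echo "Missing:$missing" >&2
else
  echo "Google search variables are set"
fi
```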