feat(search_toolkit): Add Alibaba Tongxiao Search API Support #2127
Thanks @RonaldJEN for the contribution! There are some conflicts, could you resolve them?
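Add a new search_ali method that calls the Alibaba Tongxiao Search API, with the following features:
- Supports time range and industry filtering
- Supports paginated queries
- Automatically extracts summary information (prefers summary, falls back to mainText)
- Optionally returns the webpage main text and Markdown-formatted content
- Supports reranking of search results
- Keeps the return structure consistent with the other search methods
fix(search_toolkit): resolve linting issues in search_ali method
Add a new search_ali method that calls the Alibaba Tongxiao Search API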
Add example for search_ali method usage
Replace the full-width colon (：) with a half-width colon (:) in the search_ali_response example output string
Fix line length (E501)
Adds a trailing comma to the parameters of the search_ali function
@Wendong-Fan Thanks for pointing out the conflicts! I've resolved them now.
Thanks for the contribution, @RonaldJEN. I left some comments below; could we also add unit tests?
effective for Chinese language queries.

Args:
    query (str): The search query string (length >= 1 and <= 100).
add validation for length limit?
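A minimal sketch of the kind of length check being asked for here, assuming the 1 to 100 character limit stated in the docstring. `validate_query` is a hypothetical helper name, not something from the PR; it returns an error dictionary so it could fit the "results or error" return convention described in the docstring.

```python
from typing import Any, Dict, Optional

# Limits taken from the docstring: length >= 1 and <= 100.
MIN_QUERY_LEN = 1
MAX_QUERY_LEN = 100


def validate_query(query: str) -> Optional[Dict[str, Any]]:
    """Return an error dict if the query length is out of range, else None.

    Hypothetical helper for illustration only.
    """
    if not MIN_QUERY_LEN <= len(query) <= MAX_QUERY_LEN:
        return {
            "error": (
                f"query length must be between {MIN_QUERY_LEN} and "
                f"{MAX_QUERY_LEN} characters, got {len(query)}"
            )
        }
    return None
```

Calling this at the top of `search_ali` would let the method return early without spending an API request on an invalid query.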
    timeRange (str): Time frame filter for search results. Default
        is "NoLimit". Options include:
        - 'OneDay': Past day.
        - 'OneWeek': Past week.
        - 'OneMonth': Past month.
        - 'OneYear': Past year.
        - 'NoLimit': No time limit (default).
Use `Literal` instead of `str` here? For the variable naming, following the current CAMEL code style, use `time_range` instead of `timeRange`.
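One way this suggestion could look, as a sketch only: the `TimeRange` alias and `check_time_range` helper are hypothetical names, and the allowed values are copied from the docstring above. `typing.Literal` gives static checkers the allowed set, and `get_args` lets the same definition drive a runtime check.

```python
from typing import Literal, get_args

# Allowed values copied from the docstring under review.
TimeRange = Literal["OneDay", "OneWeek", "OneMonth", "OneYear", "NoLimit"]


def check_time_range(time_range: TimeRange = "NoLimit") -> str:
    """Hypothetical helper: reject values outside the documented options."""
    allowed = get_args(TimeRange)
    if time_range not in allowed:
        raise ValueError(
            f"time_range must be one of {allowed}, got {time_range!r}"
        )
    return time_range
```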
    industry (Optional[str]): Industry-specific search filter. When
        specified, only returns results from sites in the specified
        industries. Multiple industries can be comma-separated.
        Options include:
        - 'finance': Financial industry.
        - 'law': Legal industry.
        - 'medical': Medical industry.
        - 'internet': Internet (curated).
        - 'tax': Tax industry.
        - 'news_province': Provincial news.
        - 'news_center': Central news.
Use `Literal` instead of `str` here?
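Because the docstring allows several comma-separated industries in one string, a plain `Literal` annotation cannot describe every valid value. A possible compromise, sketched below under that assumption, is to define a `Literal` for a single industry and validate each comma-separated part against it. `Industry` and `parse_industries` are hypothetical names, not part of the PR.

```python
from typing import List, Literal, Optional, get_args

# Single-industry values copied from the docstring under review.
Industry = Literal[
    "finance",
    "law",
    "medical",
    "internet",
    "tax",
    "news_province",
    "news_center",
]


def parse_industries(industry: Optional[str]) -> List[str]:
    """Hypothetical helper: split a comma-separated filter and validate it."""
    if not industry:
        return []
    allowed = get_args(Industry)
    # Tolerate spaces around commas and skip empty parts.
    names = [name.strip() for name in industry.split(",") if name.strip()]
    unknown = [name for name in names if name not in allowed]
    if unknown:
        raise ValueError(f"unknown industry value(s): {unknown}")
    return names
```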
        - 'tax': Tax industry.
        - 'news_province': Provincial news.
        - 'news_center': Central news.
    page (int): Page number for results pagination. Default is 1.
For the default value, use the format (default: :obj:`1`), same as the others.
Dict[str, Any]: A dictionary containing search results or an error
    message. The structure includes:
    - 'requestId': A unique identifier for the request.
    - 'results': A list of dictionaries, each representing a
        search result with the following keys:
        - 'result_id': The index of the result.
        - 'title': The title of the webpage.
        - 'snippet': A dynamic summary of relevant content matching
            the query keywords.
        - 'mainText': The main content of the webpage (if
            returnMainText is True).
        - 'markdownText': Markdown formatted content (if
            returnMarkdownText is True).
        - 'hostname': The name of the website.
        - 'url': The URL of the webpage.
        - 'publishTime': Publication timestamp in milliseconds.
        - 'score': Relevance score.
    - 'searchInformation': Additional metadata about the search
        operation.
    - or 'error': An error message if something went wrong.
Could we simplify the return value to make it tidier? Since the toolkit will be used by an agent, a simpler structure would be more LLM friendly.
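A rough sketch of what a slimmer return could look like. It assumes each raw result item is a dictionary that may carry `title`, `url`, `summary` and `mainText` keys; those raw field names, the 1-based `result_id`, and the `simplify_results` helper itself are assumptions for illustration, not the PR's actual code. The snippet falls back from `summary` to `mainText`, as the PR description states.

```python
from typing import Any, Dict, List


def simplify_results(raw_items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep only fields an agent is likely to need."""
    simplified = []
    # Assumption: result ids are 1-based.
    for index, item in enumerate(raw_items, start=1):
        # Prefer the summary; fall back to the main text when it is missing.
        snippet = item.get("summary") or item.get("mainText") or ""
        simplified.append(
            {
                "result_id": index,
                "title": item.get("title", ""),
                "snippet": snippet,
                "url": item.get("url", ""),
            }
        )
    return simplified
```

Dropping fields such as the request id, relevance score and search metadata keeps the tool output short, which leaves more of the agent's context for the actual content.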
@@ -394,3 +394,20 @@ class PersonInfo(BaseModel):
on improving efficiency, fault tolerance, and minimizing resource overheads.
===============================================================================
"""

search_ali_response = SearchToolkit().search_ali(
    query="阿里巴巴2025年的芯片投入",
Using an English query as the example would be better.
@RonaldJEN you can fix everything mentioned above and then it's good to go.
Add Alibaba Tongxiao Search API Support
Feature Description
This PR introduces the `search_ali` method to the `SearchToolkit` class, integrating the Alibaba Tongxiao Search API. Tongxiao Search is a powerful real-time search API providing structured data from various search engines and knowledge bases, with specific optimizations for Chinese content search.
API Reference: Standard Search API - GenericSearch
Implementation Details
- Retrieves API credentials via the `TONGXIAO_API_KEY` environment variable.
- Supports the following search parameters:
  - `timeRange`: Time frame filter (OneDay/OneWeek/OneMonth/OneYear/NoLimit)
  - `industry`: Industry filter (finance, law, medical, etc.)
  - `page`: Result pagination
  - `returnMainText`: Whether to return webpage main text
  - `returnMarkdownText`: Whether to return Markdown formatted content
  - `enableRerank`: Whether to enable result reranking (can reduce response time)
- Return Structure Formatting:
  - Prefers the `summary` field; uses `mainText` as a fallback summary if `summary` is unavailable.
  - Keeps the return structure consistent with the other search methods (e.g. `search_bing`).
Documentation
The method includes a detailed docstring covering:
This PR enhances CAMEL's search capabilities, particularly for Chinese content, offering users more diverse search options.
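For readers who want a concrete picture of the credential lookup and parameter handling described under Implementation Details, here is a small sketch. The environment variable name and the camelCase parameter names come from the description above; the helper names, the snake_case argument names, and every default value other than `timeRange="NoLimit"` and `page=1` are assumptions made for illustration. The HTTP call itself is left out because the endpoint is not given here.

```python
import os
from typing import Any, Dict, Optional


def get_api_key() -> Optional[str]:
    """Read the credential from the TONGXIAO_API_KEY environment variable."""
    return os.environ.get("TONGXIAO_API_KEY")


def build_search_payload(
    query: str,
    time_range: str = "NoLimit",
    industry: Optional[str] = None,
    page: int = 1,
    return_main_text: bool = False,  # assumed default
    return_markdown_text: bool = False,  # assumed default
    enable_rerank: bool = True,  # assumed default
) -> Dict[str, Any]:
    """Hypothetical helper: map snake_case arguments to the camelCase
    parameter names listed in the PR description."""
    payload: Dict[str, Any] = {
        "query": query,
        "timeRange": time_range,
        "page": page,
        "returnMainText": return_main_text,
        "returnMarkdownText": return_markdown_text,
        "enableRerank": enable_rerank,
    }
    # Only send the industry filter when the caller actually set one.
    if industry is not None:
        payload["industry"] = industry
    return payload
```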