Commit cbfca3c

Authored by vichayturen, simon824, imbajin, HJ-Young
feat(llm): added the process of text2gql in graphrag V1.0 (#105)
address #10 1. added the process of intelligent generated gremlin retrivecal 2. added text2gremlin block in rag app 3. add text2gremlin prompt & config 4. fix log bug in py-client 5. .... We also add a `flag` value under the interface of graph query rag/graph: - `1` represents text2gql accurate matching success - `0` represents (k-neighbor) generalization matching success - `-1` represents no relevant graph info --------- Co-authored-by: Simon Cheung <[email protected]> Co-authored-by: imbajin <[email protected]> Co-authored-by: HaoJin Yang <[email protected]>
1 parent 71b6261 commit cbfca3c
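A caller of the `rag/graph` interface can branch on the new `flag` value as sketched below. This is an illustrative sketch only: the response is assumed to be a dict carrying a top-level `flag` key, which is not confirmed beyond the flag semantics stated in the commit message.

```python
# Illustrative handling of the `flag` value returned by the rag/graph
# interface. The dict-shaped response is an assumption for demonstration;
# only the flag semantics (1 / 0 / -1) come from the commit message.
FLAG_MEANINGS = {
    1: "text2gql accurate matching success",
    0: "(k-neighbor) generalization matching success",
    -1: "no relevant graph info",
}

def describe_graph_result(response: dict) -> str:
    """Map the response's `flag` to a human-readable status."""
    flag = response.get("flag", -1)  # treat a missing flag as "no graph info"
    return FLAG_MEANINGS.get(flag, "unknown flag")
```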

File tree

23 files changed: +644 −405 lines

hugegraph-llm/README.md

Lines changed: 4 additions & 7 deletions

````diff
@@ -45,21 +45,18 @@ graph systems and large language models.
    ```bash
    python3 -m hugegraph_llm.demo.rag_demo.app --host 127.0.0.1 --port 18001
    ```
-6. Or start the gradio interactive demo of **Text2Gremlin**, you can run with the following command, and open http://127.0.0.1:8002 after starting. You can also change the default host `0.0.0.0` and port `8002` as above. (🚧ing)
-   ```bash
-   python3 -m hugegraph_llm.demo.gremlin_generate_web_demo
-   ```
-7. After running the web demo, the config file `.env` will be automatically generated at the path `hugegraph-llm/.env`. Additionally, a prompt-related configuration file `config_prompt.yaml` will also be generated at the path `hugegraph-llm/src/hugegraph_llm/resources/demo/config_prompt.yaml`.
+
+6. After running the web demo, the config file `.env` will be automatically generated at the path `hugegraph-llm/.env`. Additionally, a prompt-related configuration file `config_prompt.yaml` will also be generated at the path `hugegraph-llm/src/hugegraph_llm/resources/demo/config_prompt.yaml`.

    You can modify the content on the web page, and it will be automatically saved to the configuration file after the corresponding feature is triggered. You can also modify the file directly without restarting the web application; simply refresh the page to load your latest changes.

    (Optional)To regenerate the config file, you can use `config.generate` with `-u` or `--update`.
    ```bash
    python3 -m hugegraph_llm.config.generate --update
    ```
-8. (__Optional__) You could use
+7. (__Optional__) You could use
    [hugegraph-hubble](https://hugegraph.apache.org/docs/quickstart/hugegraph-hubble/#21-use-docker-convenient-for-testdev)
    to visit the graph data, could run it via [Docker/Docker-Compose](https://hub.docker.com/r/hugegraph/hubble)
    for guidance. (Hubble is a graph-analysis dashboard include data loading/schema management/graph traverser/display).
-9. (__Optional__) offline download NLTK stopwords
+8. (__Optional__) offline download NLTK stopwords
    ```bash
    python ./hugegraph_llm/operators/common_op/nltk_helper.py
    ```
````

hugegraph-llm/src/hugegraph_llm/api/rag_api.py

Lines changed: 2 additions & 1 deletion

```diff
@@ -39,7 +39,8 @@ def graph_rag_recall(
 ) -> dict:
     from hugegraph_llm.operators.graph_rag_task import RAGPipeline
     rag = RAGPipeline()
-    rag.extract_keywords().keywords_to_vid().query_graphdb().merge_dedup_rerank(
+
+    rag.extract_keywords().keywords_to_vid().import_schema(settings.graph_name).query_graphdb().merge_dedup_rerank(
         rerank_method=rerank_method,
         near_neighbor_first=near_neighbor_first,
         custom_related_information=custom_related_information,
```

hugegraph-llm/src/hugegraph_llm/config/__init__.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -16,7 +16,7 @@
 # under the License.


-__all__ = ["settings", "resource_path"]
+__all__ = ["settings", "prompt", "resource_path"]

 import os
 from .config import Config, PromptConfig
```

hugegraph-llm/src/hugegraph_llm/config/config.py

Lines changed: 8 additions & 0 deletions

```diff
@@ -118,6 +118,8 @@ def ensure_yaml_file_exists(self):

     def save_to_yaml(self):
         indented_schema = "\n".join([f" {line}" for line in self.graph_schema.splitlines()])
+        indented_text2gql_schema = "\n".join([f" {line}" for line in self.text2gql_graph_schema.splitlines()])
+        indented_gremlin_prompt = "\n".join([f" {line}" for line in self.gremlin_generate_prompt.splitlines()])
         indented_example_prompt = "\n".join([f" {line}" for line in self.extract_graph_prompt.splitlines()])
         indented_question = "\n".join([f" {line}" for line in self.default_question.splitlines()])
         indented_custom_related_information = (
@@ -132,6 +134,9 @@ def save_to_yaml(self):
         yaml_content = f"""graph_schema: |
 {indented_schema}

+text2gql_graph_schema: |
+{indented_text2gql_schema}
+
 extract_graph_prompt: |
 {indented_example_prompt}

@@ -147,6 +152,9 @@ def save_to_yaml(self):
 keywords_extract_prompt: |
 {indented_keywords_extract_template}

+gremlin_generate_prompt: |
+{indented_gremlin_prompt}
+
 """
         with open(yaml_file_path, "w", encoding="utf-8") as file:
             file.write(yaml_content)
```
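The new fields reuse the same technique as the existing ones: every line of a multi-line value is prefixed with whitespace so it nests under a YAML literal block scalar (`|`). A standalone sketch of that idea, with an illustrative helper name and a two-space indent chosen for the example:

```python
def to_literal_block(key: str, value: str, indent: str = "  ") -> str:
    """Render a multi-line string as a YAML literal block scalar.

    Each line of `value` is indented so it belongs to the `key: |` block.
    """
    indented = "\n".join(f"{indent}{line}" for line in value.splitlines())
    return f"{key}: |\n{indented}\n"

block = to_literal_block("gremlin_generate_prompt", "line one\nline two")
```

Any positive indent works for a literal block; the important part is that all lines of the value are indented consistently, which is exactly what the `"\n".join(...)` expressions in the diff guarantee.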

hugegraph-llm/src/hugegraph_llm/config/config_data.py

Lines changed: 24 additions & 0 deletions

````diff
@@ -219,6 +219,9 @@ class PromptData:
 }
 """

+    # TODO: we should provide a better example to reduce the useless information
+    text2gql_graph_schema = ConfigData.graph_name
+
     # Extracted from llm_op/keyword_extract.py
     keywords_extract_prompt = """指令:
 请对以下文本执行以下任务:
@@ -266,3 +269,24 @@ class PromptData:
 # Text:
 # {question}
 # """
+
+    gremlin_generate_prompt = """\
+Given the example query-gremlin pairs:
+{example}
+
+Given the graph schema:
+```json
+{schema}
+```
+
+Given the extracted vertex vid:
+{vertices}
+
+Generate gremlin from the following user query.
+{query}
+The output format must be like:
+```gremlin
+g.V().limit(10)
+```
+The generated gremlin is:
+"""
````

hugegraph-llm/src/hugegraph_llm/demo/gremlin_generate_web_demo.py

Lines changed: 0 additions & 208 deletions
This file was deleted.

hugegraph-llm/src/hugegraph_llm/demo/rag_demo/app.py

Lines changed: 13 additions & 5 deletions

```diff
@@ -36,6 +36,7 @@
     apply_graph_config,
 )
 from hugegraph_llm.demo.rag_demo.other_block import create_other_block
+from hugegraph_llm.demo.rag_demo.text2gremlin_block import create_text2gremlin_block
 from hugegraph_llm.demo.rag_demo.rag_block import create_rag_block, rag_answer
 from hugegraph_llm.demo.rag_demo.vector_graph_block import create_vector_graph_block
 from hugegraph_llm.resources.demo.css import CSS
@@ -55,6 +56,7 @@ def authenticate(credentials: HTTPAuthorizationCredentials = Depends(sec)):
         headers={"WWW-Authenticate": "Bearer"},
     )

+
 # pylint: disable=C0301
 def init_rag_ui() -> gr.Interface:
     with gr.Blocks(
@@ -93,9 +95,11 @@ def init_rag_ui() -> gr.Interface:
             textbox_input_schema, textbox_info_extract_template = create_vector_graph_block()
         with gr.Tab(label="2. (Graph)RAG & User Functions 📖"):
             textbox_inp, textbox_answer_prompt_input, textbox_keywords_extract_prompt_input = create_rag_block()
-        with gr.Tab(label="3. Graph Tools 🚧"):
+        with gr.Tab(label="3. Text2gremlin ⚙️"):
+            textbox_gremlin_inp, textbox_gremlin_schema, textbox_gremlin_prompt = create_text2gremlin_block()
+        with gr.Tab(label="4. Graph Tools 🚧"):
             create_other_block()
-        with gr.Tab(label="4. Admin Tools ⚙️"):
+        with gr.Tab(label="5. Admin Tools 🛠"):
             create_admin_block()

     def refresh_ui_config_prompt() -> tuple:
@@ -104,10 +108,11 @@ def refresh_ui_config_prompt() -> tuple:
         return (
             settings.graph_ip, settings.graph_port, settings.graph_name, settings.graph_user,
             settings.graph_pwd, settings.graph_space, prompt.graph_schema, prompt.extract_graph_prompt,
-            prompt.default_question, prompt.answer_prompt, prompt.keywords_extract_prompt
+            prompt.default_question, prompt.answer_prompt, prompt.keywords_extract_prompt,
+            prompt.default_question, settings.graph_name, prompt.gremlin_generate_prompt
         )

-    hugegraph_llm_ui.load(fn=refresh_ui_config_prompt, outputs=[ #pylint: disable=E1101
+    hugegraph_llm_ui.load(fn=refresh_ui_config_prompt, outputs=[ # pylint: disable=E1101
         textbox_array_graph_config[0],
         textbox_array_graph_config[1],
         textbox_array_graph_config[2],
@@ -118,7 +123,10 @@ def refresh_ui_config_prompt() -> tuple:
         textbox_info_extract_template,
         textbox_inp,
         textbox_answer_prompt_input,
-        textbox_keywords_extract_prompt_input
+        textbox_keywords_extract_prompt_input,
+        textbox_gremlin_inp,
+        textbox_gremlin_schema,
+        textbox_gremlin_prompt
     ])

     return hugegraph_llm_ui
```
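Correctness of this wiring rests on positional alignment: the i-th value returned by `refresh_ui_config_prompt` populates the i-th component in the `outputs` list, so the three new return values and the three new textboxes must be appended in the same order. A gradio-free sketch of that contract (component names and return values here are illustrative, not the real ones):

```python
def refresh():
    # Order matters: this tuple must mirror the outputs list below,
    # just as refresh_ui_config_prompt must mirror load(outputs=[...]).
    return ("127.0.0.1", "8080", "hugegraph")

outputs = ["graph_ip_box", "graph_port_box", "graph_name_box"]

# Gradio pairs return values with output components positionally,
# conceptually equivalent to zipping the two sequences:
wiring = dict(zip(outputs, refresh()))
```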
