Why is "add_question_sql" storing vectors as question+sql? SQL isn't natural language, so it wouldn't affect the semantic matching of input queries? #452

qingwu11 · 2024-05-17T01:50:31Z

Why is "add_question_sql" storing vectors as question+sql? SQL isn't natural language, so it wouldn't affect the semantic matching of input queries?

erickrribeiro · 2024-05-17T14:38:27Z

Yes, @qingwu11. I wondered the same thing when I saw how this ID is created. I think that only the question field should be used to generate the unique ID, not question+sql.

zainhoda · 2024-05-20T14:36:52Z

Maintainer

In our testing, we actually found that embedding the SQL query gives better search results, particularly when the SQL contains specific terminology that may not appear in the question.

However, to @erickrribeiro's point, I think that the ID should be based only on the question.
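
To make the two ideas concrete, here is a minimal sketch, not Vanna's actual code: the text that gets embedded keeps both the question and the SQL, while the ID is derived from the question alone. The helper names `question_id` and `build_document`, the lowercase/whitespace normalization, and the sample question and SQL are all assumptions for illustration.

```python
import json
import uuid

# Hypothetical helpers for illustration only; not Vanna's implementation.


def question_id(question: str) -> str:
    """Derive a deterministic ID from the question text only."""
    # Assumed normalization: lowercase and collapse whitespace.
    normalized = " ".join(question.lower().split())
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, normalized))


def build_document(question: str, sql: str) -> str:
    """Text to embed: question and SQL together, so SQL-only terms stay searchable."""
    return json.dumps({"question": question, "sql": sql}, ensure_ascii=False)


q = "How many customers signed up last month?"
sql_v1 = "SELECT COUNT(*) FROM customers WHERE signup_date >= '2024-04-01'"
sql_v2 = "SELECT COUNT(id) FROM customers WHERE signup_date >= '2024-04-01'"

# The same question always maps to the same ID, whichever SQL is attached.
assert question_id(q) == question_id(q.upper())
# The embedded text still differs, because it includes the SQL.
assert build_document(q, sql_v1) != build_document(q, sql_v2)
```

With this split, table or column names that only appear in the SQL can still influence retrieval, while retraining the same question cannot create a second entry.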

erickrribeiro · 2024-05-23T15:11:44Z

Thanks @zainhoda for considering the suggestion to base the ID on the question. Continuing on the subject, I see three advantages of generating the ID this way:

- Vanna will not have two SQL statements pointing to the same question. When two SQL statements point to the same question, it is possible that one of them is incorrect, which creates ambiguity in the system.
- An ID based on the question makes it easier to update the SQL of an existing document. For example, in ChromaDB you can use upsert.
- It will be easy to create a feature to update a document.
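
A small sketch of the first two points, using a plain dict as a stand-in for the vector store (ChromaDB collections have an `upsert` method that behaves comparably for matching IDs). `ToyStore`, `question_id`, and the sample question and SQL are hypothetical names and data for illustration, and no embeddings are computed here.

```python
import uuid


def question_id(question: str) -> str:
    # Hypothetical: the ID depends on the question text only.
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, question.strip().lower()))


class ToyStore:
    """In-memory stand-in for a vector store collection (no embeddings)."""

    def __init__(self):
        self.docs = {}  # maps document ID -> {"question": ..., "sql": ...}

    def upsert(self, question: str, sql: str) -> str:
        doc_id = question_id(question)
        # Insert if the ID is new, overwrite if it already exists.
        self.docs[doc_id] = {"question": question, "sql": sql}
        return doc_id


store = ToyStore()
q = "Total revenue by region?"
first = store.upsert(q, "SELECT region, SUM(amount) FROM sale GROUP BY region")
# Retraining the same question with corrected SQL replaces the old entry.
second = store.upsert(q, "SELECT region, SUM(amount) FROM sales GROUP BY region")

assert first == second
assert len(store.docs) == 1
assert "FROM sales" in store.docs[first]["sql"]
```

If the ID were derived from question+sql instead, the second call would produce a different ID and both SQL statements would remain stored for the same question.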

watertianyi · 2025-02-15T15:01:24Z

@erickrribeiro Is training on these DDL, SQL, and documentation files mainly an offline step that populates the vector store? Do these files need to be defined for each user's own database? At query time, is each input question converted into a vector, compared by similarity against the vectors of those three kinds of training data in the vector store, and then the matches combined into a prompt that is sent to the LLM? Is this the workflow?
