[Milvus] Store array and JSON metadata fields directly #7429

Open
5 tasks done
rakuzen25 opened this issue Dec 25, 2024 · 3 comments
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

@rakuzen25

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

// Based on https://js.langchain.com/docs/integrations/vectorstores/milvus/
import { Milvus } from "@langchain/community/vectorstores/milvus";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Document } from "langchain/document";

const docs: Document[] = [
    new Document({
        pageContent: "This is a test document.",
        metadata: {
            source: "test.txt",
            foo: {
                bar: "baz",
            },
            qux: [1, 2, 3],
        },
    })
];

const vectorStore = await Milvus.fromDocuments(docs, new OpenAIEmbeddings(), {
    collectionName: "foobar",
});

const response = await vectorStore.similaritySearch(
    "test",
    2,
    // This won't work, because qux is stored as a VarChar string:
    // "array_contains(qux, 1)",
    // Only a string match like this will:
    "qux like '%1%'",
);
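
To illustrate why the `like` workaround is only an approximation, here is a minimal sketch in plain TypeScript. It assumes array metadata ends up in the VarChar field as a `JSON.stringify`-ed string. The `likeContains` and `arrayContains` helpers are hypothetical stand-ins for the two filter expressions, not Milvus or LangChain APIs.

```typescript
// Assumption: each array is serialized with JSON.stringify before being stored as VarChar.
const stored = [
    { id: "docA", qux: JSON.stringify([1, 2, 3]) },
    { id: "docB", qux: JSON.stringify([10, 20]) },
    { id: "docC", qux: JSON.stringify([4, 5]) },
];

// Rough stand-in for `qux like '%1%'`: a substring test on the stored string.
function likeContains(value: string, needle: string): boolean {
    return value.indexOf(needle) !== -1;
}

// Rough stand-in for `array_contains(qux, 1)`: an element-wise test on the parsed array.
function arrayContains(value: string, needle: number): boolean {
    const parsed: unknown = JSON.parse(value);
    return Array.isArray(parsed) && parsed.indexOf(needle) !== -1;
}

const likeMatches = stored.filter((d) => likeContains(d.qux, "1")).map((d) => d.id);
const arrayMatches = stored.filter((d) => arrayContains(d.qux, 1)).map((d) => d.id);

console.log(likeMatches); // docA and docB; docB matches only because "10" contains "1"
console.log(arrayMatches); // docA only
```

The substring match returns a false positive for `[10, 20]`, which an element-wise operator would not.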

Error Message and Stack Trace (if applicable)

No response

Description

Milvus 2.2.9 and 2.3.2, released in June 2023 and October 2023, added support for JSON and array data types respectively. This enables access to more efficient operators such as json_contains and array_contains. However, LangChain's current implementation uses VarChar for all metadata fields:

// use json for other types
try {
    fields.push({
        name: key,
        description: `Metadata JSON field`,
        data_type: DataType.VarChar,
        type_params: {
            max_length: jsonFieldMaxLength.toString(),
        },
    });
} catch (e) {
    throw new Error("Failed to parse metadata field as JSON");
}

Is it possible to offer it as an option to the user, or do some magic version detection through MilvusClient.getVersion?
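
A rough sketch of what the version-detection route could look like, assuming the server version is available as a string such as `"v2.3.2"` (the exact shape of what `MilvusClient.getVersion` returns is not checked here). All names below (`MetadataFieldType`, `parseVersion`, `isAtLeast`, `chooseMetadataFieldType`) are hypothetical and not part of LangChain.js or the Milvus SDK. The string union stands in for the SDK's `DataType` enum so the snippet has no dependencies.

```typescript
// Hypothetical stand-in for the SDK's DataType enum.
type MetadataFieldType = "VarChar" | "JSON" | "Array";

// Minimum Milvus versions, taken from the description above.
const JSON_MIN_VERSION = [2, 2, 9];
const ARRAY_MIN_VERSION = [2, 3, 2];

// Parse "v2.3.2" or "2.3.2" into [2, 3, 2]. The string format is an assumption.
function parseVersion(version: string): number[] {
    const match = /(\d+)\.(\d+)\.(\d+)/.exec(version);
    if (!match) {
        throw new Error(`Unrecognized Milvus version: ${version}`);
    }
    return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Compare two versions component by component.
function isAtLeast(version: number[], minimum: number[]): boolean {
    for (let i = 0; i < minimum.length; i += 1) {
        const v = version[i] ?? 0;
        const m = minimum[i] ?? 0;
        if (v !== m) {
            return v > m;
        }
    }
    return true;
}

// Pick the richest type the server supports, falling back to VarChar as today.
function chooseMetadataFieldType(value: unknown, serverVersion: string): MetadataFieldType {
    const version = parseVersion(serverVersion);
    if (Array.isArray(value) && isAtLeast(version, ARRAY_MIN_VERSION)) {
        return "Array";
    }
    // Arrays on servers that only support JSON also land here.
    if (typeof value === "object" && value !== null && isAtLeast(version, JSON_MIN_VERSION)) {
        return "JSON";
    }
    return "VarChar";
}

console.log(chooseMetadataFieldType([1, 2, 3], "v2.3.2"));
console.log(chooseMetadataFieldType({ bar: "baz" }, "2.2.9"));
console.log(chooseMetadataFieldType({ bar: "baz" }, "2.2.8"));
```

A real implementation would also need to pick an element type and capacity for array fields, which this sketch leaves out. An explicit user option could simply bypass the version check.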

System Info

Not sure how pnpm info langchain would be useful since it always shows the latest version, but my installed versions are:

@langchain/community 0.2.33
langchain 0.2.20

Windows, node v23.4.0, pnpm v9.15.1

@dosubot dosubot bot added the auto:bug Related to a bug, vulnerability, unexpected error with an existing feature label Dec 25, 2024

dosubot bot commented Mar 26, 2025

Hi, @rakuzen25. I'm Dosu, and I'm helping the LangChain JS team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • LangChain.js currently uses VarChar for all metadata fields.
  • Milvus versions 2.2.9 and 2.3.2 support JSON and array data types.
  • You suggested updating LangChain to utilize these data types for more efficient operations like json_contains and array_contains.
  • There have been no comments or activity on this issue yet.

Next Steps:

  • Is this issue still relevant to the latest version of the LangChain JS repository? If so, please comment to keep the discussion open.
  • If there is no further activity, this issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Mar 26, 2025
@rakuzen25
Author

Still relevant

@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Mar 27, 2025

dosubot bot commented Mar 27, 2025

@jacoblee93, the user @rakuzen25 has indicated that this issue is still relevant. Could you please assist them with the update to utilize JSON and array data types in LangChain.js?
