Weaviate does not work #53

@heinsenberg82

Describe the bug

I opened a similar issue in the Semantic Kernel repository (microsoft/semantic-kernel#8934); it was one of the reasons I came to this repository.

I can't use Weaviate Vector Store with Google Vertex AI (and, I suspect, other integrations with Weaviate may not be working either).

This is my code:

        var provider = new VertexAIProvider(new VertexAIConfiguration
        {
            GoogleCredential = GoogleCredential.FromFile("D:\\code\\my-google-cloud-project.json"),
            Location = "us-central1",
        });
        var embeddingModel = new VertexAIEmbeddingModel(provider, id: "text-multilingual-embedding-002");
        var llm = new VertexAIChatModel(provider, id: "gemini-1.5-pro-001");

        
        var weaviateApiKey = "weaviate-api-key";
        var collection = "Test_Collection";
        WeaviateMemoryStore memoryStore = new("https://my-weaviate-endpoint.c0.us-east1.gcp.weaviate.cloud", weaviateApiKey);
        var vectorDatabase = new WeaviateVectorDatabase(memoryStore);

        // Exception is thrown here
        var vectorCollection = await vectorDatabase.AddDocumentsFromAsync<PdfPigPdfLoader>(
            embeddingModel, // Used to convert text to embeddings
            dimensions: 384, // Should be 384 for all-minilm
            dataSource: DataSource.FromUrl("https://canonburyprimaryschool.co.uk/wp-content/uploads/2016/01/Joanne-K.-Rowling-Harry-Potter-Book-1-Harry-Potter-and-the-Philosophers-Stone-EnglishOnlineClub.com_.pdf"),
            collectionName: "harrypotter", // Can be omitted, use if you want to have multiple collections
            textSplitter: null,
            behavior: AddDocumentsToDatabaseBehavior.JustReturnCollectionIfCollectionIsAlreadyExists);


        const string question = "What is Harry's Address?";
        var similarDocuments = await vectorCollection.GetSimilarDocuments(embeddingModel, question, amount: 5);
        // Use similar documents and LLM to answer the question
        var answers = llm.GenerateAsync(
            $"""
             Use the following pieces of context to answer the question at the end.
             If the answer is not in context then just say that you don't know, don't try to make up an answer.
             Keep the answer as short as possible.

             {similarDocuments.AsString()}

             Question: {question}
             Helpful Answer:
             """);
        
        await foreach (var answer in answers)
        {
            Console.WriteLine($"LLM answer: {answer}");
        }

I keep getting the same error:

Microsoft.SemanticKernel.HttpOperationException: Response status code does not indicate success: 401 (Unauthorized).
 ---> System.Net.Http.HttpRequestException: Response status code does not indicate success: 401 (Unauthorized).
   at System.Net.Http.HttpResponseMessage.EnsureSuccessStatusCode()
   at Microsoft.SemanticKernel.Http.HttpClientExtensions.SendWithSuccessCheckAsync(HttpClient client, HttpRequestMessage request, HttpCompletionOption completionOption, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at Microsoft.SemanticKernel.Http.HttpClientExtensions.SendWithSuccessCheckAsync(HttpClient client, HttpRequestMessage request, HttpCompletionOption completionOption, CancellationToken cancellationToken)
   at Microsoft.SemanticKernel.Http.HttpClientExtensions.SendWithSuccessCheckAsync(HttpClient client, HttpRequestMessage request, CancellationToken cancellationToken)
   at Microsoft.SemanticKernel.Connectors.Weaviate.WeaviateMemoryStore.ExecuteHttpRequestAsync(HttpRequestMessage request, CancellationToken cancel)
   at Microsoft.SemanticKernel.Connectors.Weaviate.WeaviateMemoryStore.DoesCollectionExistAsync(String collectionName, CancellationToken cancellationToken)
   at LangChain.Databases.SemanticKernel.SemanticKernelMemoryDatabase.IsCollectionExistsAsync(String collectionName, CancellationToken cancellationToken) in /_/src/SemanticKernel/src/SemanticKernelMemoryDatabase.cs:line 34
   at LangChain.Extensions.VectorDatabaseExtensions.AddDocumentsFromAsync[TLoader](IVectorDatabase vectorDatabase, IEmbeddingModel embeddingModel, Int32 dimensions, DataSource dataSource, String collectionName, ITextSplitter textSplitter, DocumentLoaderSettings loaderSettings, EmbeddingSettings embeddingSettings, AddDocumentsToDatabaseBehavior behavior, CancellationToken cancellationToken) in /_/src/Core/src/Extensions/VectorDatabaseExtensions.cs:line 42
   at Api.LangchainTest.Execute() in D:\code\Meu-Aluguel\Api\LangchainTest.cs:line 35
   at Program.<Main>$(String[] args) in D:\code\Meu-Aluguel\Api\Program.cs:line 13
   at Program.<Main>(String[] args)

I suspect that the Semantic Kernel library (which provides the WeaviateMemoryStore class that this library depends on) is not adding the necessary headers to the requests issued by the vector database classes. For instance, the Weaviate documentation (https://weaviate.io/developers/weaviate/model-providers/google/embeddings) says that, for the Vertex AI integration to work, the Vertex AI API key must be passed in the X-Google-Vertex-Api-Key request header. For OpenAI, it would be the X-OpenAI-Api-Key header.
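To illustrate the header requirement described above, here is a minimal sketch of an HttpClient that attaches the Vertex AI key to every request it sends. The header name is the one from the Weaviate documentation; the `vertexApiKey` value is a hypothetical placeholder. Handing this client to WeaviateMemoryStore assumes a constructor overload that accepts an HttpClient, which I have not verified, so that line is left as a comment.

```csharp
using System;
using System.Net.Http;

// Hypothetical placeholder: replace with a real Vertex AI API key.
const string vertexApiKey = "vertex-ai-api-key";

// Every request sent through this client carries the provider key header
// that the Weaviate documentation describes for the Vertex AI integration.
using var httpClient = new HttpClient();
httpClient.DefaultRequestHeaders.Add("X-Google-Vertex-Api-Key", vertexApiKey);

// Assumption (not verified): WeaviateMemoryStore has an overload that accepts
// a preconfigured HttpClient, so the store would reuse these default headers.
// WeaviateMemoryStore memoryStore = new(httpClient, apiKey, endpoint);

// Confirms that the header is registered on the client.
Console.WriteLine(httpClient.DefaultRequestHeaders.Contains("X-Google-Vertex-Api-Key"));
```

This only covers the provider key header. It does not show how the Weaviate API key itself is sent, which may be the actual cause of the 401.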

Alternatively, is there any way to use Weaviate with this LangChain library without going through Semantic Kernel?

Steps to reproduce the bug

Run the code above.

Expected behavior

No response

Screenshots

No response

NuGet package version

No response

Additional context

No response

Labels: bug, dependencies, documentation, help wanted
