[Bug]: The collections/flush api does not take effect after execution #39546

Open
1 task done
zhuwenxing opened this issue Jan 23, 2025 · 3 comments
Assignees
Labels
feature/restful v2 kind/bug Issues or changes related a bug triage/accepted Indicates an issue or PR is ready to be actively worked on.
Milestone

Comments

@zhuwenxing
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:master-20250123-f070af67-amd64
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Test code:

    def test_collection_flush(self):
        """
        target: test collection flush
        method: create collection, insert data multiple times and flush
        expected: flush successfully
        """
        # Create collection
        name = gen_collection_name()
        client = self.collection_client
        vector_client = self.vector_client
        payload = {
            "collectionName": name,
            "schema": {
                "fields": [
                    {"fieldName": "book_id", "dataType": "Int64", "isPrimary": True, "elementTypeParams": {}},
                    {"fieldName": "my_vector", "dataType": "FloatVector", "elementTypeParams": {"dim": 128}}
                ]
            }
        }
        client.collection_create(payload)

        # Insert three batches of 100 rows each (10 vectors x 10 rows per vector)
        for batch in range(3):
            vectors = [gen_vector(dim=128) for _ in range(10)]
            insert_data = {
                "collectionName": name,
                "data": [
                    {
                        # Offset by batch so primary keys stay unique across batches
                        "book_id": batch * 100 + i * 10 + j,
                        "my_vector": vector
                    }
                    for i, vector in enumerate(vectors)
                    for j in range(10)
                ]
            }
            response = vector_client.vector_insert(insert_data)
            assert response["code"] == 0
        c = Collection(name)
        num_entities_before_flush = c.num_entities
        # Flush collection
        response = client.flush(name)
        assert response["code"] == 0
        # check segments
        num_entities_after_flush = c.num_entities
        logger.info(f"num_entities_before_flush: {num_entities_before_flush}, num_entities_after_flush: {num_entities_after_flush}")
        # assert num_entities_after_flush > num_entities_before_flush
        c.flush()
        num_entities_after_flush = c.num_entities
        logger.info(f" num_entities_after_flush by pymilvus: {num_entities_after_flush}")

Log output:

[2025-01-23 15:56:07 - DEBUG - urllib3.connectionpool]: http://10.104.21.108:19530 "POST /v2/vectordb/collections/flush HTTP/1.1" 200 20 (connectionpool.py:475)
[2025-01-23 15:56:07 - DEBUG - ci_test]: 
method: post, 
url: http://10.104.21.108:19530/v2/vectordb/collections/flush, 
cost time: 0.15013384819030762, 
header: {'Content-Type': 'application/json', 'Authorization': 'Bearer None', 'RequestId': '7ee4a1ee-d95f-11ef-9c14-acde48001122'}, 
payload: {
    "collectionName": "test_collection_2025_01_23_15_56_04_811858KhgpRpFk"
}, 
response: {"code":0,"data":{}} (milvus.py:80)
[2025-01-23 15:56:07 - INFO - ci_test]: num_entities_before_flush: 0, num_entities_after_flush: 0 (test_collection_operations.py:1609)
[2025-01-23 15:56:18 - INFO - ci_test]:  num_entities_after_flush by pymilvus: 300 (test_collection_operations.py:1613)
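
To reproduce the request outside the test harness, the flush call in the log above can be sketched with the standard library alone. This is a minimal sketch, not the test framework's own client: `build_flush_request` is a hypothetical helper name, the URL path and payload are copied from the log, and authentication is omitted on the assumption that the instance does not require it (the log shows `Bearer None`).

```python
import json
import urllib.request

# Values copied from the log in this issue; adjust for your own deployment.
BASE_URL = "http://10.104.21.108:19530"
COLLECTION = "test_collection_2025_01_23_15_56_04_811858KhgpRpFk"


def build_flush_request(base_url: str, collection_name: str) -> urllib.request.Request:
    """Build (but do not send) the RESTful v2 flush request seen in the log."""
    body = json.dumps({"collectionName": collection_name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v2/vectordb/collections/flush",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_flush_request(BASE_URL, COLLECTION)
# To actually send it against a running Milvus instance:
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(json.loads(resp.read()))
```

Building the request separately from sending it makes it easy to inspect the exact URL and body before pointing it at a live server.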

Two observations indicate that the RESTful flush interface is not actually executing a flush:

- The call returns too quickly. Even a small flush usually takes about 3 seconds, but the RESTful flush request returned in about 0.15 seconds.
- num_entities does not change across the flush: it is 0 both before and after, even though 300 rows were inserted.

The subsequent log lines show that flushing through pymilvus (c.flush()) updates num_entities to 300, the actual number of inserted rows.
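
The two checks described above (flush duration and entity count before and after) could be wrapped in a small helper so that the RESTful and pymilvus flush paths are compared in the same way. This is only a sketch: `measure_flush`, `flush_took_effect`, and `FakeCollection` are hypothetical names introduced for illustration, and the fake collection merely imitates the behaviour reported in this issue so the example runs without a Milvus instance.

```python
import time
from typing import Callable, Dict


def measure_flush(flush: Callable[[], object],
                  count_entities: Callable[[], int]) -> Dict[str, float]:
    """Run a flush callable and record its duration and the entity count around it."""
    before = count_entities()
    start = time.perf_counter()
    flush()
    elapsed = time.perf_counter() - start
    after = count_entities()
    return {"before": before, "after": after, "elapsed_s": elapsed}


def flush_took_effect(result: Dict[str, float], expected_rows: int) -> bool:
    """True if the post-flush count matches the number of inserted rows."""
    return result["after"] == expected_rows


class FakeCollection:
    """Stand-in whose count only updates when a real flush happens."""

    def __init__(self, pending_rows: int):
        self.pending_rows = pending_rows
        self.num_entities = 0

    def noop_flush(self) -> None:
        # Imitates the reported RESTful behaviour: returns without flushing.
        pass

    def real_flush(self) -> None:
        # Imitates c.flush() through pymilvus: pending rows become visible.
        self.num_entities += self.pending_rows
        self.pending_rows = 0


fake = FakeCollection(pending_rows=300)
rest_result = measure_flush(fake.noop_flush, lambda: fake.num_entities)
sdk_result = measure_flush(fake.real_flush, lambda: fake.num_entities)
```

In the test from this issue, `flush` would be something like `lambda: client.flush(name)` or `c.flush`, and `count_entities` would be `lambda: c.num_entities`.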

Expected Behavior

No response

Steps To Reproduce

Milvus Log

No response

Anything else?

No response

@zhuwenxing zhuwenxing added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 23, 2025
@zhuwenxing
Contributor Author

/assign @smellthemoon

@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 24, 2025
@yanliang567 yanliang567 added this to the 2.5.5 milestone Jan 24, 2025
@yanliang567 yanliang567 removed their assignment Jan 24, 2025

stale bot commented Feb 23, 2025

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Feb 23, 2025
@yanliang567 yanliang567 modified the milestones: 2.5.5, 2.5.6 Feb 28, 2025
@stale stale bot removed stale indicates no udpates for 30 days labels Feb 28, 2025
@yanliang567 yanliang567 modified the milestones: 2.5.6, 2.5.7 Mar 11, 2025
@yanliang567 yanliang567 modified the milestones: 2.5.7, 2.5.8 Mar 23, 2025
@zhuwenxing
Contributor Author

/assign @MrPresent-Han
