Skip to content

Conversation

@zhangxu19830126
Copy link
Contributor

@zhangxu19830126 zhangxu19830126 commented Oct 22, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #16438

What this PR does / why we need it:

add txn based cache for partition metadata read


PR Type

Enhancement


Description

  • Add transaction-based cache for partition metadata reads

  • Implement Set/Get/Delete cache methods in TxnOperator interface

  • Cache partition metadata using tableID as key during transactions

  • Invalidate cache on partition metadata modifications

  • Update all test mocks and implementations with new cache methods


Diagram Walkthrough

flowchart LR
  A["TxnOperator Interface"] -->|"adds cache methods"| B["Set/Get/Delete"]
  B -->|"implemented in"| C["txnOperator"]
  C -->|"stores metadata"| D["sync.Map cache"]
  E["GetMetadata"] -->|"checks cache"| D
  E -->|"populates cache"| D
  F["Create/Redefine/Rename"] -->|"invalidates cache"| D
Loading

File Walkthrough

Relevant files
Enhancement
5 files
types.go
Add cache methods to TxnOperator interface                             
+5/-0     
operator.go
Add sync.Map cache field to txnOperator                                   
+2/-0     
operator_cache.go
Implement Set/Get/Delete cache operations                               
+27/-0   
storage.go
Integrate transaction cache for metadata                                 
+36/-15 
storage_txn_client.go
Add cache method stubs to StorageTxnOperator                         
+12/-0   
Tests
6 files
operator_cache_test.go
Add unit tests for cache operations                                           
+24/-0   
service_test.go
Implement cache methods in test mock                                         
+15/-0   
txn_mock.go
Add cache method mocks to MockTxnOperator                               
+39/-0   
txn_test.go
Implement cache methods in test operator                                 
+15/-0   
store_sql_test.go
Implement cache methods in test operator                                 
+15/-0   
entire_engine_test.go
Implement cache methods in test operator                                 
+15/-0   

@qodo-merge-pro
Copy link

qodo-merge-pro bot commented Oct 22, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Type assertion panic

Description: Possible type assertion panic when retrieving cached value from TxnOperator using
v.(partition.PartitionMetadata) without checking the dynamic type.
storage.go [90-98]

Referred Code
opts := executor.Options{}.
	WithTxn(txnOp).
	WithDatabase(catalog.MO_CATALOG).
	WithAccountID(accountID)
if txnOp != nil {
	key = fmt.Sprintf(txnBasedMetadataCacheKey, tableID)
	if v, ok := txnOp.Get(key); ok {
		return v.(partition.PartitionMetadata), true, nil
	}
Unbounded memory use

Description: Transaction-scoped cache lacks size limits and eviction which may enable memory
amplification if untrusted keys/values are inserted repeatedly during long transactions.
operator_cache.go [17-26]

Referred Code
func (tc *txnOperator) Set(key string, value any) {
	tc.reset.cache.Store(key, value)
}

func (tc *txnOperator) Get(key string) (any, bool) {
	return tc.reset.cache.Load(key)
}

func (tc *txnOperator) Delete(key string) {
	tc.reset.cache.Delete(key)
Ticket Compliance
🎫 No ticket provided
- [ ] Create ticket/issue <!-- /create_ticket --create_ticket=true -->

</details></td></tr>
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
No custom compliance provided

Follow the guide to enable custom compliance check.

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-merge-pro
Copy link

qodo-merge-pro bot commented Oct 22, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Avoid generic cache on TxnOperator interface

Replace the generic (string, any) cache on the TxnOperator interface with a
type-safe, purpose-built cache for partition metadata. This avoids API
pollution, key collisions, and improves maintainability.

Examples:

pkg/txn/client/types.go [214-216]
	Set(key string, value any)
	Get(key string) (any, bool)
	Delete(key string)
pkg/partitionservice/storage.go [96-98]
		if v, ok := txnOp.Get(key); ok {
			return v.(partition.PartitionMetadata), true, nil
		}

Solution Walkthrough:

Before:

// In pkg/txn/client/types.go
type TxnOperator interface {
    // ... other methods
    Set(key string, value any)
    Get(key string) (any, bool)
    Delete(key string)
}

// In pkg/partitionservice/storage.go
func (s *storage) GetMetadata(ctx context.Context, tableID uint64, txnOp client.TxnOperator) (partition.PartitionMetadata, bool, error) {
    key := fmt.Sprintf("partitionservice.metadata.%d", tableID)
    if v, ok := txnOp.Get(key); ok {
        return v.(partition.PartitionMetadata), true, nil
    }
    // ... fetch metadata from storage ...
    txnOp.Set(key, metadata)
    return metadata, found, err
}

After:

// In pkg/txn/client/types.go
type TxnOperator interface {
    // ... other methods
    // NO generic Set/Get/Delete methods
    GetPartitionMetadata(tableID uint64) (partition.PartitionMetadata, bool)
    SetPartitionMetadata(tableID uint64, meta partition.PartitionMetadata)
    DeletePartitionMetadata(tableID uint64)
}

// In pkg/partitionservice/storage.go
func (s *storage) GetMetadata(ctx context.Context, tableID uint64, txnOp client.TxnOperator) (partition.PartitionMetadata, bool, error) {
    if meta, ok := txnOp.GetPartitionMetadata(tableID); ok {
        return meta, true, nil
    }
    // ... fetch metadata from storage ...
    txnOp.SetPartitionMetadata(tableID, metadata)
    return metadata, found, err
}
Suggestion importance[1-10]: 9

__

Why: This is a critical design suggestion that correctly points out the risks of adding a generic, type-unsafe cache to the core TxnOperator interface, which impacts long-term maintainability and API clarity.

High
General
Use constructor for object initialization

In TestOperatorCache, initialize txnOperator using its constructor
newTxnOperator instead of a struct literal to ensure proper initialization and
improve test robustness.

pkg/txn/client/operator_cache_test.go [7-24]

 func TestOperatorCache(t *testing.T) {
-	tc := &txnOperator{}
+	tc := newTxnOperator("", nil, nil, txn.TxnMeta{})
 
 	tc.Set("key1", "value1")
 	val, ok := tc.Get("key1")
 	if !ok {
 		t.Fatalf("expected key1 to be present")
 	}
 	if val.(string) != "value1" {
 		t.Fatalf("expected value1, got %v", val)
 	}
 
 	tc.Delete("key1")
 	_, ok = tc.Get("key1")
 	if ok {
 		t.Fatalf("expected key1 to be deleted")
 	}
 }
  • Apply / Chat
Suggestion importance[1-10]: 4

__

Why: The suggestion correctly points out that using the constructor newTxnOperator is better practice for initializing the txnOperator in tests, improving robustness and consistency, even though direct instantiation works in this specific case.

Low
  • Update

@mergify mergify bot added the queued label Oct 28, 2025
@mergify mergify bot merged commit 779844e into matrixorigin:main Oct 28, 2025
19 checks passed
@mergify mergify bot removed the queued label Oct 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/enhancement Review effort 3/5 size/M Denotes a PR that changes [100,499] lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants