[Access] Refactor storage collections for access node #7093
base: master
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #7093 +/- ##
==========================================
- Coverage 41.24% 41.16% -0.08%
==========================================
Files 2171 2176 +5
Lines 190061 190382 +321
==========================================
- Hits 78391 78377 -14
- Misses 105127 105451 +324
- Partials 6543 6554 +11
d72e501 to a5f43fb
cmd/execution_builder.go
Outdated
@@ -218,6 +217,9 @@ func (builder *ExecutionNodeBuilder) LoadComponentsAndModules() {
		Module("blobservice peer manager dependencies", exeNode.LoadBlobservicePeerManagerDependencies).
		Module("bootstrap", exeNode.LoadBootstrapper).
		Module("register store", exeNode.LoadRegisterStore).
		AdminCommand("get-transactions", func(conf *NodeConfig) commands.AdminCommand {
why move this?
Because exeNode.collections is not initialized until exeNode.LoadCollections is called.
That said, this change should be in a different PR. Let me check.
storage/operation/collections.go
Outdated
// IndexCollectionPayload indexes the transactions within the collection payload
// of a cluster block.
Is this specific to collection cluster logic, or is this just indexing by blockID?
I'm wondering if we're overloading codeIndexCollection to mean different things on ANs/ENs vs LNs.
t.Run("Retrieve nonexistant", func(t *testing.T) {
	var actual flow.LightCollection
	err := operation.RetrieveCollection(db.Reader(), expected.ID(), &actual)
	assert.Error(t, err)
Suggested change:
- assert.Error(t, err)
+ assert.ErrorIs(t, err, storage.ErrNotFound)
+ assert.Nil(t, actual)
var actual flow.LightCollection
err = operation.RetrieveCollection(db.Reader(), expected.ID(), &actual)
assert.Error(t, err)
Can you assert the specific error here, and wherever else we return sentinel errors?
_ = db.WithReaderBatchWriter(func(rw storage.ReaderBatchWriter) error {
	err := operation.InsertCollection(rw.Writer(), &expected)
	assert.Nil(t, err)
I think assert.NoError() communicates your intent more clearly.
Suggested change:
- assert.Nil(t, err)
+ assert.NoError(t, err)
func TestTransactions(t *testing.T) {

	dbtest.RunWithDB(t, func(t *testing.T, db storage.DB) {
Can you add a not-found test as well?
}

func NewCollections(db storage.DB, transactions *Transactions) *Collections {
	c := &Collections{
what do you think about adding a cache? collections are commonly looked up on access nodes. totally fine to do later
func (c *Collections) Remove(colID flow.Identifier) error {
	err := c.db.WithReaderBatchWriter(func(rw storage.ReaderBatchWriter) error {
		return operation.RemoveCollection(rw.Writer(), colID)
should this also remove the index?
storage/store/collections.go
Outdated
// transaction is already indexed by a different collection, we should not index it again
// so that the access node will always return the same collection for a given transaction
// and return a consistent transaction result status.
continue
I think we should return an error here since LNs are supposed to prevent a tx from
- appearing multiple times in the same collection
- appearing in multiple collections
notNil(builder.events),
notNil(builder.collections),
notNil(builder.transactions),
notNil(builder.lightTransactionResults),
What's the motivation for using notNil on only some inputs?
Shouldn't all the inputs be non-nil? And if an input is nil, won't the component using it quickly panic anyway?
storage/operation/collections.go
Outdated
// RemoveCollectionTransactionIndices removes a collection id indexed by a transaction id
// any error returned are exceptions
func RemoveCollectionTransactionIndices(w storage.Writer, txID flow.Identifier) error {
	return RemoveByKey(w, MakePrefix(codeIndexCollectionByTransaction, txID))
}
Suggested change:
- // RemoveCollectionTransactionIndices removes a collection id indexed by a transaction id
- // any error returned are exceptions
- func RemoveCollectionTransactionIndices(w storage.Writer, txID flow.Identifier) error {
- 	return RemoveByKey(w, MakePrefix(codeIndexCollectionByTransaction, txID))
- }
+ // RemoveCollectionByTransactionIndex removes a collection id indexed by a transaction id,
+ // created by [UnsafeIndexCollectionByTransaction].
+ // Any error returned is an exception.
+ func RemoveCollectionByTransactionIndex(w storage.Writer, txID flow.Identifier) error {
+ 	return RemoveByKey(w, MakePrefix(codeIndexCollectionByTransaction, txID))
+ }
Naming to match the insert method for the same index.
storage/store/collections.go
Outdated
if err != nil {
	return nil, err
}
Suggested change:
- if err != nil {
- 	return nil, err
- }
The error is already checked above.
storage/operation/collections.go
Outdated
@@ -52,3 +50,15 @@ func UnsafeIndexCollectionByTransaction(w storage.Writer, txID flow.Identifier,
func RetrieveCollectionID(r storage.Reader, txID flow.Identifier, collectionID *flow.Identifier) error {
Suggested change:
- func RetrieveCollectionID(r storage.Reader, txID flow.Identifier, collectionID *flow.Identifier) error {
+ // LookupCollectionByTransaction looks up the collection indexed by the given transaction ID,
+ // which is the collection in which the given transaction was included.
+ // No errors are expected during normal operation.
+ func LookupCollectionByTransaction(r storage.Reader, txID flow.Identifier, collectionID *flow.Identifier) error {
To match the naming of other methods operating on the same index.
err = c.db.WithReaderBatchWriter(func(rw storage.ReaderBatchWriter) error {
	// remove transaction indices
	for _, txID := range col.Transactions {
		err = operation.RemoveCollectionTransactionIndices(rw.Writer(), txID)
Collections.Store above inserts transactions. Should Remove also remove the transactions?
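The symmetry question (what Store writes versus what Remove deletes) can be made concrete with a small in-memory model. This is a sketch under stated assumptions: the memStore type and its string IDs are hypothetical, it models only the collection record and the transaction-to-collection index, and it deliberately leaves transaction bodies out because whether they should be removed is the open question here.

```go
package main

import "fmt"

// memStore is a hypothetical in-memory model of the two records discussed:
// the collection itself and the transaction -> collection index.
type memStore struct {
	collections map[string][]string // collection ID -> transaction IDs
	byTx        map[string]string   // transaction ID -> collection ID
}

func newMemStore() *memStore {
	return &memStore{
		collections: make(map[string][]string),
		byTx:        make(map[string]string),
	}
}

// Store writes the collection and one index entry per transaction.
func (s *memStore) Store(colID string, txIDs []string) {
	s.collections[colID] = txIDs
	for _, txID := range txIDs {
		s.byTx[txID] = colID
	}
}

// Remove deletes everything Store wrote for this collection. It reads the
// collection first, because the transaction IDs are needed to find the index entries.
func (s *memStore) Remove(colID string) {
	for _, txID := range s.collections[colID] {
		// Only drop index entries that still point at this collection
		// (an assumption of this sketch, to avoid deleting another collection's entry).
		if s.byTx[txID] == colID {
			delete(s.byTx, txID)
		}
	}
	delete(s.collections, colID)
}

func main() {
	s := newMemStore()
	s.Store("col1", []string{"tx1", "tx2"})
	s.Remove("col1")
	fmt.Println(len(s.collections), len(s.byTx))
}
```

If Remove deleted only the collection record, the index entries would keep pointing at a collection that can no longer be retrieved, which is the dangling state the review comments are probing for.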
storage/store/collections.go
Outdated
}
continue
// the indexingByTx lock has ensured we are the only process indexing collection by transaction
err = operation.UnsafeIndexCollectionByTransaction(rw.Writer(), txID, collection.ID())
Suggested change:
- err = operation.UnsafeIndexCollectionByTransaction(rw.Writer(), txID, collection.ID())
+ err = operation.UnsafeIndexCollectionByTransaction(rw.Writer(), txID, cid)
Avoid re-computing the hash on every loop iteration.
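A minimal sketch of the hoisting pattern, assuming an ID that is derived by hashing the collection's contents. The collection type and its SHA-256 based ID method are hypothetical simplifications, not Flow's actual hashing scheme.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// collection is a hypothetical simplification: just a list of transaction IDs.
type collection struct {
	txIDs []string
}

// ID hashes the whole collection, so its cost grows with the number of transactions.
func (c collection) ID() [32]byte {
	h := sha256.New()
	for _, txID := range c.txIDs {
		h.Write([]byte(txID))
	}
	var id [32]byte
	copy(id[:], h.Sum(nil))
	return id
}

// indexByTx maps every transaction ID to the collection ID.
func indexByTx(c collection) map[string][32]byte {
	cid := c.ID() // computed once, outside the loop
	index := make(map[string][32]byte, len(c.txIDs))
	for _, txID := range c.txIDs {
		index[txID] = cid // calling c.ID() here would re-hash on every iteration
	}
	return index
}

func main() {
	c := collection{txIDs: []string{"tx1", "tx2", "tx3"}}
	for txID, cid := range indexByTx(c) {
		fmt.Printf("%s -> %x\n", txID, cid[:4])
	}
}
```

With the call inside the loop, indexing n transactions hashes the collection n times; hoisting it makes the hashing cost independent of the loop.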
storage/store/collections.go
Outdated
if err == nil {
	// collection nodes have ensured that a transaction can only belong to one collection
	// so if transaction is already indexed by a collection, check if it's the same collection.
	// if not, return an error
	if cid != differentColTxIsIn {
		return fmt.Errorf("transaction %v is already indexed by a different collection %v", txID, differentColTxIsIn)
	}
I think this is substantially changing the behaviour.
Previously, we would skip re-indexing TXID->COLLECTIONID, if any index entry for TXID already existed. Now we are throwing an exception.
The reason we specifically check for the case of the index already existing is to make sure that we don't overwrite the index with a different collection ID, so that the information served by the Access API is consistent (if not correct). Now this scenario will cause an exception and likely the node will enter a crash-loop. To match the previous behaviour, the case of err == nil on line 151 should be a no-op.
It is true that we don't currently expect this scenario to happen, absent a cluster consensus bug, but we have had such bugs in the past, and in the mature system we need to tolerate Byzantine clusters. So I don't think this should throw an exception.
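The "first write wins" behaviour described above can be sketched as follows. The indexTx helper and the map-based index are hypothetical and stand in for the real storage operations; the point is only that an existing entry turns the call into a no-op instead of an error.

```go
package main

import "fmt"

// indexTx records txID -> colID unless an entry for txID already exists.
// It reports whether a new entry was written. An existing entry is left
// untouched, even if it points at a different collection, so lookups stay
// consistent over time.
func indexTx(index map[string]string, txID, colID string) bool {
	if _, exists := index[txID]; exists {
		return false // no-op: never overwrite, never fail
	}
	index[txID] = colID
	return true
}

func main() {
	index := make(map[string]string)

	fmt.Println(indexTx(index, "tx1", "colA")) // first write succeeds
	fmt.Println(indexTx(index, "tx1", "colB")) // conflicting write is skipped
	fmt.Println(index["tx1"])                  // still the first collection
}
```

Under this policy a misbehaving cluster that includes the same transaction in two collections degrades the index to "consistent but possibly not correct", rather than stopping the node. Logging the skipped conflict would be a natural addition, though that goes beyond what the comment asks for.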
8bdd882 to 35692d7
@@ -575,6 +577,15 @@ func (builder *FlowAccessNodeBuilder) BuildExecutionSyncComponents() *FlowAccess
		AdminCommand("read-execution-data", func(config *cmd.NodeConfig) commands.AdminCommand {
			return stateSyncCommands.NewReadExecutionDataCommand(builder.ExecutionDataStore)
		}).
		Module("transactions and collections storage", func(node *cmd.NodeConfig) error {
			// TODO: needs to be wrapped with ChainedCollections module, otherwise once we switch
Link the issue in this TODO: #6523 (comment).
Will be addressed separately. We can review and approve this PR, but not merge until the TODO is completed.
cc @fxamacker
Working towards #6515
Review #7059 first.
This PR refactors the transaction and collection storage in the access node to use the generic storage module.