Access lists indexing #1361
Conversation
/run-security-scan
alexcos20 left a comment
AI automated code review (Gemini 3).
Overall risk: medium
Summary:
This PR successfully implements indexing and database querying for Access Lists, introducing event processors, new HTTP endpoints, and comprehensive DB schemas for both Typesense and Elasticsearch. The implementation integrates well with the existing architecture and is backed by robust tests. However, there are significant performance bottlenecks related to universally emitted events (OwnershipTransferred) and synchronous RPC calls per event. There is also a potential bug regarding precision loss for large tokenIds.
Comments:
• [ERROR][performance] The OwnershipTransferred event is emitted by almost every Ownable contract on the blockchain. Because this processor is hooked to the topic hash, it will trigger a database lookup for every ownership transfer across the entire network. On active networks, this will cause severe database spam and degrade indexer performance. Consider keeping an in-memory Set of known AccessList addresses to filter relevant events before querying the database.
+ // Suggestion: Filter using an in-memory cache first
+ if (!knownAccessLists.has(contractAddress)) return null;
const existing = await accessList.retrieve(chainId, contractAddress)
• [WARNING][performance] Fetching the block via provider.getBlock(event.blockNumber) inside the event processor means an asynchronous RPC call is made for every AddressAdded event. If a factory deploys a list with hundreds of initial users, this will result in hundreds of duplicate RPC calls for the same block, potentially hitting RPC rate limits. Consider a simple block cache mechanism, or pass the block timestamp directly from the indexer context.
- const block = await provider.getBlock(event.blockNumber)
- const timestamp = block?.timestamp ?? Math.floor(Date.now() / 1000)
+ // Suggestion: implement an LRU cache for block fetches
+ const timestamp = (await getCachedBlockTimestamp(provider, event.blockNumber)) ?? Math.floor(Date.now() / 1000)
• [WARNING][bug] tokenId in ERC721 contracts is typically a uint256. Converting it directly to a JavaScript Number will cause precision loss for any value exceeding Number.MAX_SAFE_INTEGER (2^53 - 1). This could corrupt index data and make elements un-removable. It is highly recommended to store large integer identifiers as string values (and update the DB schema accordingly) unless you enforce that token IDs will remain under the safety limit.
- const tokenId = Number(decoded.args[1])
+ // Safer approach if database schema is updated to handle strings
+ const tokenId = BigInt(decoded.args[1]).toString()
• [INFO][bug] Typesense lacks atomic operations for nested arrays (unlike Elasticsearch's Painless scripts). Fetching the document, modifying the users array in memory, and writing it back creates a potential race condition if multiple indexing events update the same AccessList concurrently. Since node indexers typically operate sequentially per chain, this may not be an issue in practice, but it is worth documenting for future horizontal scaling.
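The block-cache idea suggested above can be sketched as follows. This is a minimal illustration, not code from this PR: `getCachedBlockTimestamp`, `BlockProvider`, and the unbounded Map cache are all assumptions (a bounded LRU would be safer for a long-running indexer).

```typescript
// Illustrative provider shape; real code would use an ethers.js Provider.
type BlockProvider = {
  getBlock(blockNumber: number): Promise<{ timestamp: number } | null>
}

// Simple per-process cache keyed by block number (hypothetical helper).
const blockTimestampCache = new Map<number, number>()

async function getCachedBlockTimestamp(
  provider: BlockProvider,
  blockNumber: number
): Promise<number | undefined> {
  const cached = blockTimestampCache.get(blockNumber)
  if (cached !== undefined) return cached
  // Only one RPC call per distinct block, even if hundreds of
  // AddressAdded events land in the same block.
  const block = await provider.getBlock(blockNumber)
  if (block == null) return undefined
  blockTimestampCache.set(blockNumber, block.timestamp)
  return block.timestamp
}
```

With this in place, repeated AddressAdded events from one deployment hit the cache instead of the RPC endpoint.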
/run-security-scan
alexcos20 left a comment
AI automated code review (Gemini 3).
Overall risk: low
Summary:
Excellent implementation of Access List indexing. The overall design, HTTP/P2P handlers, and integration tests are well-structured and comprehensive. I've left a few suggestions regarding updating array elements (for token transfers), normalizing API responses between Elasticsearch and Typesense, and hardening some edge cases around error handling and validation.
Comments:
• [WARNING][bug] If an AddressAdded event fires for an existing tokenId (e.g., the NFT was transferred to a new wallet), the current painless script simply ignores it because exists is true. It should overwrite the existing user with the new wallet so the index stays up to date.
- boolean exists = false;
- for (int i = 0; i < ctx._source.users.length; i++) {
- if (ctx._source.users[i].tokenId == params.user.tokenId) { exists = true; break; }
- }
- if (!exists) { ctx._source.users.add(params.user); }
+ boolean exists = false;
+ for (int i = 0; i < ctx._source.users.length; i++) {
+ if (ctx._source.users[i].tokenId == params.user.tokenId) {
+ ctx._source.users[i] = params.user;
+ exists = true;
+ break;
+ }
+ }
+ if (!exists) { ctx._source.users.add(params.user); }
• [WARNING][bug] Similar to the Elasticsearch implementation, if the tokenId already exists in the users array, we should overwrite it to reflect the new wallet in case of an NFT transfer.
- const nextUsers = exists ? users : [...users, normalized]
+ const nextUsers = exists
+ ? users.map((u: AccessListUser) => u.tokenId === normalized.tokenId ? normalized : u)
+   : [...users, normalized]
• [INFO][style] In ElasticsearchAccessListDatabase, searchByWallet maps and returns an innerHits array for each document containing the explicitly matched users. Typesense does not do this natively here, leading to inconsistent API responses depending on the database backend. You can manually construct innerHits to match the Elasticsearch schema.
- return (result.hits ?? []).map((h: any) => h.document)
+ return (result.hits ?? []).map((h: any) => {
+ const doc = h.document
+ const innerHits = (doc.users ?? []).filter((u: AccessListUser) => u.wallet === lowerWallet)
+ return { ...doc, innerHits }
+ })
• [WARNING][bug] If the RPC call to accessListContract.transferable() fails due to a transient network error, the catch block logs a warning and proceeds with transferable = false. This permanently locks in potentially incorrect data. It's safer to return null and log an error so the indexer can retry the event later.
- } catch (err) {
- INDEXER_LOGGER.log(
- LOG_LEVELS_STR.LEVEL_WARN,
- `Failed to read transferable() on ${contractAddress}: ${err.message}`
- )
- }
+ } catch (err) {
+ INDEXER_LOGGER.log(
+ LOG_LEVELS_STR.LEVEL_ERROR,
+ `Failed to read transferable() on ${contractAddress}: ${err.message}`
+ )
+ return null
+ }
• [INFO][bug] If task.chainId is present but is an invalid string, it gets cast to NaN. Sending NaN to the database query could cause unexpected behavior. It's safer to validate it identically to the HTTP route.
- const chainId = task.chainId !== undefined ? Number(task.chainId) : undefined
+ let chainId: number | undefined
+ if (task.chainId !== undefined) {
+ chainId = Number(task.chainId)
+ if (Number.isNaN(chainId)) {
+ return {
+ stream: null,
+ status: { httpStatus: 400, error: 'Invalid chainId' }
+ }
+ }
+ }
• [INFO][other] Just a heads up: casting decoded.args[1] (which is a BigInt representing a uint256) using Number() can result in precision loss if the tokenId exceeds 2^53 - 1 (the max safe integer in JS). Since Ocean Protocol's AccessList tokens are likely strictly auto-incrementing counters this is perfectly safe, but if they are ever hashes or UUIDs, consider changing the database schema and this parser to handle tokenId as a string.
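Since the same NaN check recurs across several handlers in this review, one option is to factor it into a small shared helper. A minimal sketch; the name `parseOptionalChainId`, the result shape, and the positive-integer restriction are assumptions, not part of this PR:

```typescript
// Hypothetical helper consolidating the repeated chainId validation.
type ChainIdResult = { ok: true; chainId?: number } | { ok: false }

function parseOptionalChainId(raw: unknown): ChainIdResult {
  // Absent chainId is allowed: handlers treat it as "search all chains".
  if (raw === undefined) return { ok: true }
  const chainId = Number(raw)
  // Rejects NaN, fractional values, and non-positive values in one check.
  if (!Number.isInteger(chainId) || chainId <= 0) return { ok: false }
  return { ok: true, chainId }
}
```

A handler can then map `{ ok: false }` to the same 400 response suggested in the diffs, keeping the HTTP and P2P paths consistent.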
/run-security-scan
alexcos20 left a comment
AI automated code review (Gemini 3).
Overall risk: low
Summary:
This pull request implements comprehensive indexing and API handling for AccessList contracts. It correctly captures NewAccessList, AddressAdded, and AddressRemoved events, indexing them efficiently into Elasticsearch and Typesense. The architecture gracefully handles out-of-order log indexing using atomic scripts and upserts. However, a few improvements are required regarding P2P handler input validation (to prevent NoSQL filter injections and invalid DB queries) and stream manipulation in HTTP routes to optimize memory usage.
Comments:
• [WARNING][bug] When handlers are invoked via P2P commands, task.chainId is unchecked and might evaluate to NaN if a peer sends an invalid format. Passing NaN directly to the database layer can cause query errors. Validate chainId carefully.
- const doc = await db.accessList.retrieve(
- Number(task.chainId),
- String(task.contractAddress)
- )
+ const chainId = Number(task.chainId)
+ if (Number.isNaN(chainId)) {
+ return {
+ stream: null,
+ status: { httpStatus: 400, error: 'chainId must be a valid number' }
+ }
+ }
+ const doc = await db.accessList.retrieve(
+ chainId,
+ String(task.contractAddress)
+ )
• [WARNING][bug] Similar to GetAccessListHandler, validate chainId here to ensure a P2P caller hasn't provided a non-numeric string.
- const chainId = task.chainId !== undefined ? Number(task.chainId) : undefined
+ let chainId: number | undefined
+ if (task.chainId !== undefined) {
+ chainId = Number(task.chainId)
+ if (Number.isNaN(chainId)) {
+ return {
+ stream: null,
+ status: { httpStatus: 400, error: 'chainId must be a valid number' }
+ }
+ }
+ }
• [WARNING][security] Unlike the HTTP route, the P2P handler does not validate task.wallet. A peer could pass crafted strings containing Typesense logical operators (&&, ||) to manipulate the backend query (NoSQL filter injection). Validate the wallet address before processing.
- const docs = await db.accessList.searchByWallet(String(task.wallet), chainId)
+ const walletString = String(task.wallet)
+ if (!/^0x[a-fA-F0-9]{40}$/i.test(walletString)) {
+ return {
+ stream: null,
+ status: { httpStatus: 400, error: 'Invalid wallet address' }
+ }
+ }
+ const docs = await db.accessList.searchByWallet(walletString, chainId)
• [WARNING][performance] Reading the entire stream into a string, parsing it as JSON, and then serializing it again with res.json() adds unnecessary memory overhead and GC pressure. Piping the stream directly to the response object avoids these multiple transformations.
- if (result.stream) {
- const data = JSON.parse(await streamToString(result.stream as Readable))
- res.json(data)
- } else {
+ if (result.stream) {
+ res.set('Content-Type', 'application/json')
+ ;(result.stream as Readable).pipe(res)
+ } else {
• [WARNING][performance] Apply the same streaming optimization here to improve endpoint performance and reduce payload buffering.
- if (result.stream) {
- const data = JSON.parse(await streamToString(result.stream as Readable))
- res.json(data)
- } else {
+ if (result.stream) {
+ res.set('Content-Type', 'application/json')
+ ;(result.stream as Readable).pipe(res)
+ } else {
• [INFO][style] Excellent use of upsert configurations and Painless scripts! This design flawlessly handles the asynchronous and occasionally unpredictable nature of blockchain log indexing, gracefully processing AddressAdded events that might be indexed slightly before their parent NewAccessList events.
• [INFO][bug] Using Number() on uint256 arguments works fine for small, sequential AccessList identifiers. However, if a malicious contract emits AddressAdded with type(uint256).max, Number() will lose precision because it exceeds Number.MAX_SAFE_INTEGER. While the risk here is low and only impacts visual display, standard practice is to handle unbounded EVM uint256 properties as BigInt or string.
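The precision loss described in this comment is easy to reproduce in isolation. A self-contained demonstration; the specific values are illustrative and not taken from the PR:

```typescript
// 2^53 - 1 = 9007199254740991, the largest integer Number stores exactly.
const MAX_SAFE = BigInt(Number.MAX_SAFE_INTEGER)

// Two distinct token ids just above the safe range...
const tokenIdA = MAX_SAFE + BigInt(1) // 2^53
const tokenIdB = MAX_SAFE + BigInt(2) // 2^53 + 1

// ...collapse to the same Number after conversion, so one index entry
// could silently shadow or overwrite the other.
const lossyA = Number(tokenIdA)
const lossyB = Number(tokenIdB)

// Decimal strings round-trip exactly and stay distinct, which is why the
// review suggests storing uint256 identifiers as strings.
const exactA = tokenIdA.toString()
const exactB = tokenIdB.toString()
```

Here `lossyA === lossyB` even though the underlying uint256 values differ, while `exactA` and `exactB` remain distinct.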
Fixes #1353
Changes proposed in this PR: