Skip to content

Conversation

buttercookie42
Copy link

@buttercookie42 buttercookie42 commented Nov 5, 2024

While trying to speed up file system accesses in directories containing a four-digit amount of files, I've noticed that (at least with SMB1) Windows Explorer seems to issue up to 5 or 6 FIND_FIRST2 requests for launching a file, each of which needs to map the path.

In large directories, this seems to incur a noticeable overhead because in the current implementation, lastDir.list() is unconditionally always called for mapping the filename portion of the path (for the directory portion of the path at least this only happens when we need to fall back to an explicitly case-insensitive search). (I guess SMB2 traffic isn't much affected by this, because there Windows seems to repeatedly open and close the file instead, and the openFile method has a shortcut to not map the path if Files.exists() returns true.)

To fix this, I propose that
a) mapping the file name should take the "if the file exists, simply declare success" approach, too (a number of callers of mapPath already do that as well, too) and only fall back to manually traversing the whole directory for the fallback case-insensitive search, and
b) to rewrite the case-insensitive search to a NIO API-based approach, so we don't always have to preemptively traverse the whole directory, but instead only as far as until we (hopefully) find a match.

With the above fixed, I've found another follow-up issue: A wildcard search issued for listing a directory's whole contents always ends up in mapPath with the * as the "filename". So even with the above fixes in place, we always end up futilely searching the whole directory for the non-existant "*" file. Adding a shortcut for that case speeds up directory listings a bit, too.

With all these changes in place (including the search optimisation from #25), a jfileserver implementation running on my phone is now twice as fast at recursively listing my complete SD card compared to natively running Samba via Termux, even with jfileserver only using SMB1 and Samba using SMB2.

Opening a file from Windows Explorer results in multiple (possibly five or six)
FIND_FIRST2 requests for the selected file. Each FIND_FIRST2 request results in
a call to DiskInterface.mapPath().

In a large directory with thousands of files, each mapPath() has a considerable
overhead (hundreds of ms), mainly caused by the calls to lastDir.list(), which
have to list the whole directory before being able to proceed.

For the directories at least that overhead is only incurred when actually
needing to explicitly conduct a case-insensitive search, but for the file name
currently lastDir.list() is always called unconditionally, which incurs the
aforementioned large overhead.

To speed up mapPath, I propose doing two things:
- Switch the file names to same logic as the directory names, i.e. first we
  simply check whether the file exists, and only if that fails do we do the
  potentially more expensive iteration through the whole directory to check for
  a case-insensitive match.
- Seeing how this is the Java*NIO*DiskDriver anyway, rewrite the directory
  search to actually use the new NIO APIs.
No need to expensively query the file system for a file we know won't exist.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant