yuzawa-san
Contributor

@yuzawa-san yuzawa-san commented Aug 26, 2025

Requirements

  • I have added test coverage for new or changed functionality
  • I have followed the repository's pull request submission guidelines
  • I have validated my changes against all supported platform versions

Related issues

n/a

Describe the solution you've provided

This change eliminates several intermediate allocations in a piece of code that is hot in my codebase. A single StringBuilder now accumulates the key to be hashed; previously, each intermediate concatenation used to build the key allocated. Where possible, I use StringBuilder.append(int) to avoid converting the int to an intermediate String. I removed the hex-encoding and substring steps, both of which allocated strings and their backing bytes, and instead perform the equivalent calculation using bit operations on the raw hash output bytes. I added a test to ensure the old and new logic produce the same result.
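To illustrate why the bit operations are equivalent to the hex-and-substring path, here is a minimal sketch. It assumes the JDK's `MessageDigest` in place of commons-codec `DigestUtils` so that it is self-contained, and the class and method names (`BucketHashSketch`, `viaHex`, `viaBits`) are hypothetical, not the SDK's. Fifteen hex digits are 60 bits: the first seven bytes of the digest plus the high nibble of the eighth.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch class; not part of the SDK.
final class BucketHashSketch {
  // SHA-1 of the UTF-8 bytes of the key, using only the JDK.
  static byte[] sha1(String key) {
    try {
      return MessageDigest.getInstance("SHA-1").digest(key.getBytes(StandardCharsets.UTF_8));
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // Old approach: hex-encode the digest, keep the first 15 hex digits, parse as base 16.
  static long viaHex(String key) {
    StringBuilder hex = new StringBuilder(40);
    for (byte b : sha1(key)) {
      hex.append(Character.forDigit((b >> 4) & 0xf, 16));
      hex.append(Character.forDigit(b & 0xf, 16));
    }
    return Long.parseLong(hex.substring(0, 15), 16);
  }

  // New approach: 7 whole bytes (14 hex digits) plus the high nibble of byte 7 (the 15th digit).
  static long viaBits(String key) {
    byte[] hash = sha1(key);
    long value = 0;
    for (int i = 0; i < 7; i++) {
      value = (value << 8) | (hash[i] & 0xff);
    }
    return (value << 4) | ((hash[7] >> 4) & 0xf);
  }

  public static void main(String[] args) {
    String key = "flag-key.salt.user-id"; // illustrative input only
    System.out.println(viaHex(key) == viaBits(key));
  }
}
```

Both methods return a non-negative value below 2^60, so neither overflows a signed long.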

Describe alternatives you've considered

Ideally, the appended contents of the StringBuilder would go straight to UTF-8 bytes and into the SHA-1 digest, but sadly StringBuilder.toString() creates an intermediate String that then gets converted into bytes for the digest. I do not believe I can make that better, but will sleep on it. Never mind, I found https://stackoverflow.com/questions/19472011/java-stringbuffer-to-byte-without-tostring

Never mind that either: the copy appears to be cheaper than the character conversion and wrapping.
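For reference, the two ways of feeding the builder's contents to the digest can be sketched side by side. This is an assumption-laden illustration using the JDK's `MessageDigest` directly; `DigestInputSketch` and its method names are hypothetical. It only shows that both paths hash the same bytes, so the choice between them is purely a performance question; it does not measure anything.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch class; not part of the SDK.
final class DigestInputSketch {
  static MessageDigest newSha1() {
    try {
      return MessageDigest.getInstance("SHA-1");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // Copy path: materialize a String, then encode it to UTF-8 bytes.
  static byte[] viaToString(StringBuilder key) {
    return newSha1().digest(key.toString().getBytes(StandardCharsets.UTF_8));
  }

  // Wrapping path: view the builder as a CharBuffer and encode without calling toString().
  static byte[] viaCharBuffer(StringBuilder key) {
    MessageDigest digest = newSha1();
    ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(key));
    digest.update(encoded);
    return digest.digest();
  }

  public static void main(String[] args) {
    StringBuilder key = new StringBuilder().append(42).append('.').append("user-id");
    System.out.println(Arrays.equals(viaToString(key), viaCharBuffer(key)));
  }
}
```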

Additional context

This was the hotspot in our allocations flamegraph which I wish to improve.

[Image: allocations flamegraph]

@yuzawa-san yuzawa-san requested a review from a team as a code owner August 26, 2025 21:34
@yuzawa-san
Contributor Author

yuzawa-san commented Aug 27, 2025

I have conducted some JMH microbenchmarking and found that 59247ee is not worthwhile. I believe the character-by-character conversion is relatively inefficient, hence I have reverted it. There is still a memory and throughput improvement in my original commit (newVersionExtra vs oldVersion: roughly 11.4M vs 8.8M ops/s, and about 538 vs 592 B/op).

Benchmark                                        Mode  Cnt         Score        Error   Units
MyBenchmark.newVersion                          thrpt    5   5875653.568 ± 164107.054   ops/s
MyBenchmark.newVersion:gc.alloc.rate            thrpt    5      3271.490 ±     91.353  MB/sec
MyBenchmark.newVersion:gc.alloc.rate.norm       thrpt    5       583.842 ±      0.006    B/op
MyBenchmark.newVersion:gc.count                 thrpt    5       387.000               counts
MyBenchmark.newVersion:gc.time                  thrpt    5       180.000                   ms
MyBenchmark.newVersionExtra                     thrpt    5  11366176.062 ± 561447.905   ops/s
MyBenchmark.newVersionExtra:gc.alloc.rate       thrpt    5      5824.835 ±    343.102  MB/sec
MyBenchmark.newVersionExtra:gc.alloc.rate.norm  thrpt    5       537.512 ±     54.915    B/op
MyBenchmark.newVersionExtra:gc.count            thrpt    5       598.000               counts
MyBenchmark.newVersionExtra:gc.time             thrpt    5       239.000                   ms
MyBenchmark.oldVersion                          thrpt    5   8780421.640 ± 584040.649   ops/s
MyBenchmark.oldVersion:gc.alloc.rate            thrpt    5      4960.950 ±    329.893  MB/sec
MyBenchmark.oldVersion:gc.alloc.rate.norm       thrpt    5       592.455 ±      0.040    B/op
MyBenchmark.oldVersion:gc.count                 thrpt    5       522.000               counts
MyBenchmark.oldVersion:gc.time                  thrpt    5       225.000                   ms
@Benchmark // 59247ee
  public void newVersion(Blackhole blackhole) {
    StringBuilder keyBuilder = new StringBuilder();
    if (seed != null) {
      keyBuilder.append(seed.intValue());
    } else {
      keyBuilder.append(flagOrSegmentKey).append('.').append(salt);
    }
    keyBuilder.append('.');
    keyBuilder.append(USER_ID);

    // turn the first 15 hex digits of this into a long
    MessageDigest digest = DigestUtils.getSha1Digest();
    digest.update(StandardCharsets.UTF_8.encode(CharBuffer.wrap(keyBuilder)));
    byte[] hash = digest.digest();
    long longVal = 0;
    for (int i = 0; i < 7; i++) {
      longVal <<= 8;
      longVal |= (hash[i] & 0xff);
    }
    longVal <<= 4;
    longVal |= ((hash[7] >> 4) & 0xf);
    blackhole.consume(longVal);
  }
  
  @Benchmark // acd352d
  public void newVersionExtra(Blackhole blackhole) {
    StringBuilder keyBuilder = new StringBuilder();
    if (seed != null) {
      keyBuilder.append(seed.intValue());
    } else {
      keyBuilder.append(flagOrSegmentKey).append('.').append(salt);
    }
    keyBuilder.append('.');
    keyBuilder.append(USER_ID);

    // turn the first 15 hex digits of this into a long
    byte[] hash = DigestUtils.sha1(keyBuilder.toString());
    long longVal = 0;
    for (int i = 0; i < 7; i++) {
      longVal <<= 8;
      longVal |= (hash[i] & 0xff);
    }
    longVal <<= 4;
    longVal |= ((hash[7] >> 4) & 0xf);
    blackhole.consume(longVal);
  }
  
  @Benchmark
  public void oldVersion(Blackhole blackhole) {
    String idHash = USER_ID;
    String prefix;
    if (seed != null) {
      prefix = seed.toString();
    } else {
      prefix = flagOrSegmentKey + "." + salt;
    }
    String hash = DigestUtils.sha1Hex(prefix + "." + idHash).substring(0, 15);
    long longVal = Long.parseLong(hash, 16);
    blackhole.consume(longVal);
  }

@tanderson-ld
Contributor

Thank you for the contribution @yuzawa-san , I will try to take a look in the next few days.
