allow to control Suggester rebuilds #3933

Open
@aerofeev2k

I've noticed that pulling the webapp configuration and immediately pushing it back triggers a Suggester rebuild, which in turn drives CPU utilization up to 2000% (I have rebuildThreadPoolSizeInNcpuPercent set to 10, and there are 40 CPU cores).
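For reference, the arithmetic behind that setting can be sketched as follows. This is only an illustration: it assumes rebuildThreadPoolSizeInNcpuPercent is applied as a plain percentage of the available cores, and the class and method names (SuggesterPoolSize, threadsFor) and the clamp to a minimum of one thread are hypothetical, not taken from the OpenGrok sources.

```java
// Hypothetical helper: how many rebuild threads a percentage-of-cores setting yields.
final class SuggesterPoolSize {
    private SuggesterPoolSize() {
    }

    /**
     * @param cores   number of CPU cores (e.g. Runtime.getRuntime().availableProcessors())
     * @param percent value of rebuildThreadPoolSizeInNcpuPercent
     * @return thread count, clamped to at least 1 (the clamp is an assumption)
     */
    static int threadsFor(int cores, int percent) {
        // Integer arithmetic: 40 cores at 10 percent gives 40 * 10 / 100 = 4.
        return Math.max(1, cores * percent / 100);
    }
}
```

Under that assumption, 40 cores at 10 percent gives 4 rebuild threads.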

The strange thing is that the main indices have not been updated since the Suggester last ran. I thought the Suggester kept version.txt files holding the last seen index commit generation number, so couldn't it use that as a hint that no rescanning is necessary?
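A minimal sketch of such a check, under stated assumptions: version.txt is taken to hold a single decimal number, and the current index version is passed in by the caller (with Lucene it could come from something like DirectoryReader.getVersion() or IndexCommit.getGeneration()). The class RebuildGuard and its methods are hypothetical and only illustrate the idea. They are not how the Suggester is actually implemented.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical guard: skip the expensive rebuild when the index has not changed.
final class RebuildGuard {
    private RebuildGuard() {
    }

    /**
     * Returns true when the suggester data should be rebuilt, i.e. when the
     * stored version is missing, unreadable, or differs from the current one.
     */
    static boolean needsRebuild(Path versionFile, long currentIndexVersion) {
        if (!Files.isRegularFile(versionFile)) {
            return true; // nothing recorded yet, so a rebuild is required
        }
        try {
            // Assumed file format: one decimal number, optional whitespace.
            String stored = new String(Files.readAllBytes(versionFile),
                    StandardCharsets.UTF_8).trim();
            return Long.parseLong(stored) != currentIndexVersion;
        } catch (IOException | NumberFormatException e) {
            return true; // when in doubt, rebuild rather than serve stale data
        }
    }

    /** Records the index version the suggester data was built from. */
    static void recordVersion(Path versionFile, long indexVersion) throws IOException {
        Files.write(versionFile,
                Long.toString(indexVersion).getBytes(StandardCharsets.UTF_8));
    }
}
```

With a guard like this, re-applying an unchanged configuration would find the recorded version equal to the current one and return without rebuilding.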

While the Suggester is spinning, I see four threads sitting in roughly this stack:

"ForkJoinPool-1-worker-3" #552 daemon prio=5 os_prio=0 cpu=66057.03ms elapsed=136.58s tid=0x00007fdbe8004000 nid=0xa961 runnable  [0x00007fdbe3ffd000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:254)
        at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:246)
        at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:540)
        at org.apache.lucene.index.LeafReader.docFreq(LeafReader.java:82)
        at org.apache.lucene.index.BaseCompositeReader.docFreq(BaseCompositeReader.java:149)
        at org.opengrok.suggest.SuggesterUtils.computeNormalizedDocumentFrequency(SuggesterUtils.java:108)
        at org.opengrok.suggest.SuggesterUtils.computeScore(SuggesterUtils.java:97)
        at org.opengrok.suggest.SuggesterProjectData$WFSTInputIterator.weight(SuggesterProjectData.java:606)
        at org.apache.lucene.search.suggest.SortedInputIterator.sort(SortedInputIterator.java:184)
        at org.apache.lucene.search.suggest.SortedInputIterator.<init>(SortedInputIterator.java:76)
        at org.apache.lucene.search.suggest.SortedInputIterator.<init>(SortedInputIterator.java:62)
        at org.apache.lucene.search.suggest.fst.WFSTCompletionLookup$WFSTInputIterator.<init>(WFSTCompletionLookup.java:273)
        at org.apache.lucene.search.suggest.fst.WFSTCompletionLookup.build(WFSTCompletionLookup.java:115)
        at org.opengrok.suggest.SuggesterProjectData.build(SuggesterProjectData.java:266)
        at org.opengrok.suggest.SuggesterProjectData.build(SuggesterProjectData.java:253)
        at org.opengrok.suggest.SuggesterProjectData.init(SuggesterProjectData.java:157)
        at org.opengrok.suggest.Suggester.lambda$getInitRunnable$1(Suggester.java:231)
        at org.opengrok.suggest.Suggester$$Lambda$458/0x0000000800422040.run(Unknown Source)
        at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec([email protected]/ForkJoinTask.java:1407)
        at java.util.concurrent.ForkJoinTask.doExec([email protected]/ForkJoinTask.java:290)
        at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec([email protected]/ForkJoinPool.java:1020)
        at java.util.concurrent.ForkJoinPool.scan([email protected]/ForkJoinPool.java:1656)
        at java.util.concurrent.ForkJoinPool.runWorker([email protected]/ForkJoinPool.java:1594)
        at java.util.concurrent.ForkJoinWorkerThread.run([email protected]/ForkJoinWorkerThread.java:183)

I noticed this because after one such configuration update Tomcat became completely unresponsive and even ran out of 8 GB of memory:

10-Apr-2022 10:48:13.479 INFO [configuration-3-thread-1] org.opengrok.indexer.configuration.RuntimeEnvironment.applyConfig Done applying configuration
10-Apr-2022 10:48:13.587 INFO [Thread-3677] org.opengrok.suggest.Suggester.init Initializing suggester
10-Apr-2022 10:48:13.858 WARNING [ForkJoinPool-102-worker-63] org.opengrok.suggest.SuggesterProjectData.initFields Fields [hist] will be ignored because they were not found in index directory MMapDirectory@/opengrok/data/index/solaris lockFactory=org.apache.lucene.store.NativeFSLockFactory@903a800
10-Apr-2022 10:48:13.539 SEVERE [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-5] org.apache.coyote.AbstractProtocol$ConnectionHandler.process Failed to complete processing of a request
        java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "ajp-nio-0:0:0:0:0:0:0:1-8009-exec-10" java.lang.OutOfMemoryError: GC overhead limit exceeded
10-Apr-2022 11:06:06.037 SEVERE [http-nio-8080-exec-3] org.apache.coyote.AbstractProtocol$ConnectionHandler.process Failed to complete processing of a request
        java.lang.OutOfMemoryError: GC overhead limit exceeded
10-Apr-2022 11:06:45.230 SEVERE [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-4] org.apache.coyote.AbstractProtocol$ConnectionHandler.process Failed to complete processing of a request
        java.lang.OutOfMemoryError: GC overhead limit exceeded
