Poor query parsing performance with the REST client #39890

Closed
amirhadadi opened this issue Mar 10, 2019 · 5 comments
Labels
Team:Data Management Meta label for data/management team

amirhadadi commented Mar 10, 2019

Elasticsearch version: 6.3.2

Plugins installed: []

JVM version: 1.8.144

OS version: Linux 3.13.0-88-generic #135-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
When migrating from the transport client to the high level rest client, we saw our cluster throughput decrease by ~15%.
Profiling with JVisualVM, we determined the culprit to be calls to String.intern from Jackson InternCache:
[Image: JVisualVM profiler screenshot]

These calls come from ScriptScoreFunctionBuilder::parse, as we have a fairly large parameter section (a few hundred parameters) in our custom scripts.
Using -XX:+PrintStringTableStatistics to measure the string table, we found an average bucket size of 4 with ~240K distinct strings. An average of 4 strings per bucket is quite high, but I would not expect that alone to cause such a serious performance impact. Interning itself is simply a bad idea here.
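For readers unfamiliar with interning, here is a minimal sketch of what String.intern does, using only the JDK. It is not taken from Jackson or Elasticsearch; the class name InternDemo and the field name "boost_factor" are made up for illustration. Each intern call is a lookup in the JVM-wide string table, which is the table the statistics above describe.

```java
final class InternDemo {
    /** Returns true when both references point to the same String object. */
    public static boolean sameInstance(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        // Two equal field names created at runtime, roughly as a parser
        // would produce them. "boost_factor" is a hypothetical name.
        String first = new String("boost_factor");
        String second = new String("boost_factor");

        // Without interning: equal content, but two distinct objects.
        System.out.println(first.equals(second));        // true
        System.out.println(sameInstance(first, second)); // false

        // With interning: every call looks the string up in the shared
        // string table and returns the single canonical instance.
        System.out.println(sameInstance(first.intern(), second.intern())); // true
    }
}
```

With a few hundred script parameters per request, a parser that interns field names performs one such lookup per field name, which is where the profiler shows the time going.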

Interning in Jackson was discussed here with benchmarks showing that disabling interning using JsonFactory.Feature.INTERN_FIELD_NAMES helps performance. Following that observation, in Jackson 3 interning is disabled by default.

I suggest disabling field name interning in JsonXContent's jsonFactory.

@elasticmachine
Collaborator

Pinging @elastic/es-core-features

@amirhadadi
Author

We tested disabling field interning using reflection on JsonXContent.jsonFactory, effectively doing:

jsonFactory.disable(JsonFactory.Feature.INTERN_FIELD_NAMES)

We then profiled again with JVisualVM: SearchSourceBuilder::parseXContent, which previously consumed 10% of total CPU time, is now down to 3%.
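The reflection workaround described above can be sketched roughly as follows. To keep the example runnable with only the JDK, StubFactory and StubXContent are hypothetical stand-ins for Jackson's JsonFactory and Elasticsearch's JsonXContent; the real code would instead fetch the jsonFactory field from JsonXContent and call disable(JsonFactory.Feature.INTERN_FIELD_NAMES) on it. Treat this as an illustration of the technique, not the exact code we ran.

```java
import java.lang.reflect.Field;

final class ReflectiveToggleDemo {
    /** Hypothetical stand-in for a factory with a switchable feature flag. */
    static final class StubFactory {
        private boolean internFieldNames = true; // on by default, as in Jackson 2
        void disableInterning() { internFieldNames = false; }
        boolean isInterning() { return internFieldNames; }
    }

    /** Hypothetical stand-in for a class hiding its factory in a private static field. */
    static final class StubXContent {
        private static final StubFactory jsonFactory = new StubFactory();
        static boolean interning() { return jsonFactory.isInterning(); }
    }

    /** Reads the private static field reflectively and flips the flag on the shared instance. */
    static void disableInterningViaReflection() {
        try {
            Field field = StubXContent.class.getDeclaredField("jsonFactory");
            field.setAccessible(true); // bypass the private modifier
            // A null target is used because the field is static.
            StubFactory factory = (StubFactory) field.get(null);
            factory.disableInterning();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not reach jsonFactory", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(StubXContent.interning()); // true
        disableInterningViaReflection();
        System.out.println(StubXContent.interning()); // false
    }
}
```

Because the factory is a shared static instance, the change applies to every parser created from it afterwards. That also makes this a fragile workaround, since it depends on a private field name that can change between releases.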

@hub-cap
Contributor

hub-cap commented Mar 27, 2019

Thanks for bringing this up. I need to do some research, but I'm looking at it now.

hub-cap self-assigned this Mar 27, 2019
@hub-cap
Contributor

hub-cap commented Apr 3, 2019

Hey @amirhadadi, we discussed this and totally agree with you. We have historically removed most of the interning from our codebase as well. If you would like, you can submit a PR, or I will eventually get to it. I've got a few other higher-priority items on my plate currently, though!

@dakrone
Member

dakrone commented Mar 8, 2024

Closing this as we've removed the high level rest client in favor of the Java client.

dakrone closed this as completed Mar 8, 2024