MDEV-39603 Convert HASH interface from uchar* to void* #5098

Draft
nam-m wants to merge 1 commit into MariaDB:10.11 from nam-m:mdev-39603

Conversation

@nam-m nam-m commented May 19, 2026

Summary

  • Change my_hash_get_key typedef return type from const uchar * to const void *
  • Change HASH_LINK.data from uchar * to void *
  • Change record pointer parameters and return types in my_hash_insert, my_hash_delete, my_hash_update, my_hash_replace, my_hash_element,
    my_hash_search, my_hash_first, my_hash_first_from_hash_value, and my_hash_next to use void *
  • Update all ~50 get_key callback implementations across 67 files to return const void *
  • Remove unnecessary reinterpret_cast<const uchar *> casts at callback return sites and caller insert/delete sites

Motivation

The HASH interface used uchar * as a generic pointer type for storing and retrieving arbitrary record types. This forced every caller to cast through
(uchar *) or reinterpret_cast<const uchar *> when inserting, deleting, or implementing get_key callbacks — adding noise with no type safety benefit.
Converting to void * leverages implicit pointer conversion in C and well-defined static_cast in C++, making the intent clearer at every call site.
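
As a rough illustration of the cast noise described above, here is a self-contained sketch. Everything in it is a hypothetical stand-in (the Account record, the one-slot OldSlot/NewSlot holders, the toy_insert_* helpers, and the simplified two-argument callback shape); it is not the real mysys declarations or code from this patch. It only contrasts a callback and insert call written against uchar * with the same pair written against void *.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

typedef unsigned char uchar;

// Hypothetical record type used only for this illustration.
struct Account
{
  char name[16];
  int id;
};

// Toy one-slot holders standing in for the record pointer a hash link stores.
struct OldSlot { uchar *data; };   // record held as uchar*
struct NewSlot { void *data; };    // record held as void*

// Before: the callback returns const uchar*, so the return needs a cast.
static const uchar *account_key_old(const void *record, size_t *length)
{
  const Account *a= static_cast<const Account *>(record);
  *length= std::strlen(a->name);
  return reinterpret_cast<const uchar *>(a->name);
}

// After: the callback returns const void*; const char* converts implicitly.
static const void *account_key_new(const void *record, size_t *length)
{
  const Account *a= static_cast<const Account *>(record);
  *length= std::strlen(a->name);
  return a->name;
}

static void toy_insert_old(OldSlot *slot, uchar *record) { slot->data= record; }
static void toy_insert_new(NewSlot *slot, void *record) { slot->data= record; }

// Returns true when both styles store the same record and yield the same key.
static bool styles_agree()
{
  Account acc= {"alice", 7};
  OldSlot old_slot;
  NewSlot new_slot;

  toy_insert_old(&old_slot, reinterpret_cast<uchar *>(&acc)); // cast at call site
  toy_insert_new(&new_slot, &acc);                            // no cast needed

  size_t old_len= 0, new_len= 0;
  const uchar *old_key= account_key_old(old_slot.data, &old_len);
  const void *new_key= account_key_new(new_slot.data, &new_len);

  return old_len == new_len &&
         std::memcmp(old_key, new_key, old_len) == 0 &&
         static_cast<Account *>(new_slot.data)->id == 7;
}
```

Calling styles_agree() exercises both variants on the same record; the only difference between them is where casts are required.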

Key byte-buffer parameters (const uchar *key in my_hash_search, my_hash_first, my_hash_next, and the my_hash_function typedef) are left
unchanged since they represent actual byte data used for hashing and comparison.
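
To make the split between key buffers and record pointers concrete, here is a minimal sketch under stated assumptions: toy_search, find_user, and the Session record are hypothetical names invented for this example, and the linear scan merely stands in for a real hash lookup. It shows a lookup whose key stays a const uchar * byte buffer while the matching record comes back as void *.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

typedef unsigned char uchar;

// Hypothetical record type for illustration only.
struct Session
{
  uchar token[4];   // raw bytes used as the lookup key
  int user_id;
};

// Toy linear lookup: the key is a const uchar* byte buffer plus length,
// while the matching record is returned as an untyped void*.
static void *toy_search(Session *records, size_t count,
                        const uchar *key, size_t key_length)
{
  for (size_t i= 0; i < count; i++)
  {
    if (key_length == sizeof(records[i].token) &&
        std::memcmp(records[i].token, key, key_length) == 0)
      return &records[i];
  }
  return nullptr;
}

// Looks up a token and returns the owning user id, or -1 when absent.
static int find_user(Session *records, size_t count,
                     const uchar *key, size_t key_length)
{
  // In C++ the void* result needs one explicit static_cast to the record type.
  Session *s= static_cast<Session *>(toy_search(records, count, key, key_length));
  return s ? s->user_id : -1;
}
```

The key argument is compared byte by byte, which is why a byte-oriented type still fits it, whereas the record is opaque to the lookup and is only given a type by the caller.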

How can this PR be tested?

No new MTR tests are added: this is a pure refactor, so the existing tests only need to confirm that behavior is unchanged.

The MTR suites were run with the commands below. They need to run as a non-root user, because the test roles.set_default_role_invalid_skip_name_resolve fails when run as root:

useradd -m mtruser
chown -R mtruser:mtruser /quick-rebuilds/build/mysql-test/var
su - mtruser -c "cd /quick-rebuilds && ./build/mysql-test/mysql-test-run.pl --force --parallel=auto \
    --suite=main,rpl,roles,plugins,json,handler,csv,federated,maria,perfschema,client,spider,mroonga"

Before rebasing to 10.11

The servers were restarted 908 times
Spent 2775.394 of 486 seconds executing testcases

Completed: All 2952 tests were successful.

524 tests were skipped, 133 by the test itself.

After rebasing to 10.11

Some tests in the main suite failed because mysql_old_password was not found or because of TLS/SSL errors. This is probably due to my Docker environment using OpenSSL 3.6.2 (7 Apr 2026), which disables legacy TLS; these failures should not be caused by my hash changes.

The servers were restarted 243 times
Spent 730.695 of 176 seconds executing testcases

Completed: Failed 9/1106 tests, 99.19% were successful.

Failing test(s): main.connect main.ssl_7937 main.ssl_system_ca main.tls_version1 main.tls_version main.change_user main.set_password main.socket_conflict

Basing the PR against the correct MariaDB version

  • This is a refactoring change, and the PR is based against the main MariaDB development branch.

Copyright

All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.

CLAassistant commented May 19, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
16 out of 23 committers have signed the CLA.

✅ ParadoxV5
✅ GabrieleBocchi
✅ aquilamacedo
✅ vaintroub
✅ ayush-jha123
✅ varundeepsaini
✅ grooverdan
✅ DaveGosselin-MariaDB
✅ rophy
✅ gkodinov
✅ kou
✅ janlindstrom
✅ sjaakola
✅ dbart
✅ kjarir
✅ andr-sokolov
❌ Alexey Botchkov
❌ Thirunarayanan
❌ vuvova
❌ sanja-byelkin
❌ mariadb-RuchaDeodhar
❌ dr-m
❌ mariadb-YuchenPei


Alexey Botchkov seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors various hash-related functions across the codebase to use void* instead of uchar* for keys and data, improving type consistency. The review feedback suggests using static_cast instead of C-style casts for pointer conversions and removing redundant reinterpret_cast operations to improve code readability and safety.

Comment thread mysys/lf_hash.cc Outdated
Comment thread sql/sql_cache.cc Outdated
Comment thread sql/sql_cache.cc Outdated
Comment thread sql/sql_cache.cc Outdated
Comment thread sql/sql_cache.cc Outdated
Comment thread storage/innobase/handler/innodb_binlog.cc Outdated
@grooverdan
Member

Nice!

roles.set_default_role_invalid_skip_name_resolve would fail when running as root user.

You could, as a separate PR, prefix this test with: -- source include/not_as_root.inc

Overall the gemini code assist has some reasonable suggestions.

I think it would avoid trapping future coders, show off the good example, and simplify the merge if the majority of this can be backported to 10.11. Some features that you've patched exist only in later versions (json_schema, json_array_intersect, hnsw); those can be draft PRs against the versions where they were added (rounded up to currently maintained versions like 11.4, 11.8, 12.3, main), and we'll wait for the main HASH interface change to merge up to those branches before merging them.

@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label May 20, 2026
@gkodinov gkodinov self-assigned this May 20, 2026
Member

@gkodinov gkodinov left a comment

Thank you for your contribution! This is a preliminary review.

Please merge all of your 3 commits into a single one.

And there are some further cleanups below to consider.

Also, if you are using any AI tool that requires attribution, please add a "Co-Authored-by:" mention for it in the commit message.

Comment thread mysys/lf_hash.cc Outdated
Comment thread sql/sql_cache.cc Outdated
Comment thread sql/sql_cache.cc Outdated
Comment thread sql/sql_cache.cc Outdated
Comment thread sql/sql_cache.cc Outdated
Comment thread sql/sql_cache.cc Outdated
Comment thread sql/sql_cache.cc Outdated
Comment thread storage/innobase/handler/innodb_binlog.cc Outdated
Comment thread storage/innobase/handler/innodb_binlog.cc Outdated
Comment thread storage/innobase/handler/innodb_binlog.cc Outdated
@ParadoxV5 ParadoxV5 requested review from ParadoxV5 and grooverdan May 21, 2026 18:44
@ParadoxV5
Contributor

(requested reviewers for after the initial drafting and 10.11-rebasing)

@ParadoxV5 ParadoxV5 marked this pull request as draft May 21, 2026 18:47
@nam-m
Author

nam-m commented May 23, 2026

Nice!

roles.set_default_role_invalid_skip_name_resolve would fail when running as root user.

You could, as a separate PR, prefix this test with: -- source include/not_as_root.inc

Overall the gemini code assist has some reasonable suggestions.

I think it would avoid trapping future coders, show off the good example, and simplify the merge if the majority of this can be backported to 10.11. Some features that you've patched exist only in later versions (json_schema, json_array_intersect, hnsw); those can be draft PRs against the versions where they were added (rounded up to currently maintained versions like 11.4, 11.8, 12.3, main), and we'll wait for the main HASH interface change to merge up to those branches before merging them.

Thanks for your feedback! I will review my changes to make sure they can be backported to 10.11

@nam-m
Author

nam-m commented May 23, 2026

Thank you for your contribution! This is a preliminary review.

Please merge all of your 3 commits into a single one.

And there are some further cleanups below to consider.

Also, if you are using any AI tool that requires attribution, please add a "Co-Authored-by:" mention for it in the commit message.

Thank you for your feedback! I will merge my commits into one and attribute AI in my next commit

@mariadb-JimmyHu

mariadb-JimmyHu commented May 23, 2026

I will merge my commits into one and attribute AI in my next commit

FYI: We prefer single squashed commits over multiple commits for various iterations.
It’s alright leaving them separate until the review phase is near conclusion.
Of course, each maintained version (read: branch) gets its own separate commit.

I will review my changes to make sure they can be backported to 10.11

It’s probably easier to open separate PRs split from this commit for each base branch.


P.S. Whoops, I forgot to switch accounts.

@nam-m nam-m changed the title from "MDEV-36326 Convert HASH interface from uchar* to void*" to "MDEV-39603 Convert HASH interface from uchar* to void*" May 25, 2026
The HASH interface used uchar* as a generic pointer type for storing
and retrieving arbitrary record types. This forced every caller to cast
through (uchar*) or reinterpret_cast<const uchar*> when inserting,
deleting, or implementing get_key callbacks.

Change my_hash_get_key typedef return from const uchar* to const void*,
HASH_LINK.data from uchar* to void*, and all function signatures for
insert/delete/update/replace/element/search/first/next to use void* for
record pointers. Update all ~50 get_key callback implementations to
return const void* and remove unnecessary casts.

Key byte-buffer parameters (const uchar *key in my_hash_search,
my_hash_first, my_hash_next) are left unchanged since they represent
actual byte data used for hashing and comparison.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
@nam-m nam-m changed the base branch from main to 10.11 May 25, 2026 05:29
@nam-m
Author

nam-m commented May 25, 2026

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request executes a comprehensive refactoring of the HASH utility and its consumers by transitioning key and record pointers from uchar* to void*. This architectural cleanup spans the entire codebase, including the server core, client libraries, and multiple storage engines like Aria, Spider, and Performance Schema. By standardizing on void*, the PR successfully eliminates a significant amount of legacy casting boilerplate, resulting in cleaner and more maintainable code. I have no feedback to provide.

@nam-m
Author

nam-m commented May 25, 2026

@ParadoxV5 I changed the target branch from main to 10.11. On main, the hash changes also require modifying features that do not exist in 10.11 yet (e.g. json_schema, vector_mhnsw, innodb_binlog), so this PR now covers only what applies to 10.11.

@grooverdan
I have checked the changes in all files with git cat-file against each LTS branch (10.11, 11.4, 11.8, 12.3) to determine which files (or hash usages within them) exist only on newer versions. Here are the changes that I propose porting to new PRs targeting 11.4, 11.8, and 12.3:

| Target | Files | Reason |
|--------|-------|--------|
| 11.4 | sql/json_schema.cc, sql/json_schema.h, sql/json_schema_helper.cc, sql/json_schema_helper.h, sql/item_jsonfunc.cc | json_schema added in 11.4; json_array_intersect hash usage added in 11.4 |
| 11.8 | sql/vector_mhnsw.cc | HNSW vector index added in 11.8 |
| 12.3 | storage/innobase/handler/innodb_binlog.cc, sql/handler.h (handler_binlog_xid_info::get_key), sql/opt_trace_ddl_info.cc | These files/callbacks added in 12.3 |

These will be submitted as PRs against their respective branches.

@gkodinov I have modified all C-style casts per Gemini's review. I re-requested its review and it looks ok now.

Note: I renamed this PR title and my commit to target MDEV-39603 instead of MDEV-36326 as I only dealt with the hash interface. Please let me know if this is ok.

Thank you all!

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

6 participants