-
-
Notifications
You must be signed in to change notification settings - Fork 844
ICU-23254 Remove C++ static initialization #3766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| - name: ICU4C with clang on MacOS | ||
| env: | ||
| CPPFLAGS: '-Wall -Wextra -Wextra-semi -Wundef -Wnon-virtual-dtor -Wctad-maybe-unsupported -Werror' | ||
| CPPFLAGS: '-Wall -Wextra -Wextra-semi -Wundef -Wnon-virtual-dtor -Wctad-maybe-unsupported -Wglobal-constructors -Wexit-time-destructors -Werror' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This prevents the C++ static initialization from coming back.
| FIELD_INIT(parseCaseSensitive, &gIntOps), | ||
| FIELD_INIT(outputCurrency, &gStrOps) | ||
| }; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All moved to the object constructor.
| expect(UnicodeString("Test case #") + i, | ||
| UnicodeString(BEGIN_END_TEST_CASES[i][0], -1, US_INV), | ||
| UnicodeString(BEGIN_END_TEST_CASES[i][1], -1, US_INV), | ||
| UnicodeString(BEGIN_END_TEST_CASES[i][2], -1, US_INV)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restructuring this actually makes it much easier to read.
| {"process(uca_rules)", u"process(uca_rules)", parseUCARules}, | ||
| {"process(collation)", u"process(collation)", nullptr /* not implemented yet */}, | ||
| {"process(transliterator)", u"process(transliterator)", parseTransliterator}, | ||
| {"process(dependency)", u"process(dependency)", parseDependency}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that C++11 is required, the extra string initialization step is no longer required.
| static UBool write_xliff = false; | ||
| static const char* outputEnc =""; | ||
|
|
||
| static ResFile poolBundle; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was moved to the main function in genrb.cpp.
| # define cp1047_to_8859(c) cp1047_8859_1[c] | ||
|
|
||
| // Our app's name | ||
| std::string prog; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was moved to the main function.
60ffca0 to
7252b8c
Compare
|
Getting static initialization out of the ICU library code is good/needed. But is it really necessary to remove it all from the tests and tools? Use there isn't going to affect the ICU libraries themselves. And we should maintain some level of testing that application use of static ICU objects works and doesn't cause anything weird to happen with loading and cleanup. We had it, somewhat accidentally, through the incidental use of statics in the test code. |
|
@aheninger You make a fair point. I'm mostly concerned about the common and i18n libraries themselves. I didn't quickly see an obvious way to target only the CPP flags for only the library code. I'm sure if I looked deeper that I'd find one. With that being said, here are some additional observations.
If we have non-accidental (intentional) C++ static initialization in the tests only, I can try to work with that. |
|
@aheninger Today's ICU-TC discussed this pull request. @eggrobin mentioned his support for enforcing this for the test suite too based on his experience with previous debugging issues in this area. So we will move forward with that TC choice. |
| const int32_t ScriptRun::pairedCharExtra = pairedCharCount - pairedCharPower; | ||
|
|
||
| int8_t ScriptRun::highBit(int32_t value) | ||
| static constexpr int8_t highBit(int32_t value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markusicu I tried to minimize the changes to this file, but this one time use function really should be switched to a constexpr function (C++11 syntax) to avoid the C++ static initialization. The constexpr functions are hard to separate between definition and declaration.
richgillam
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOKTM
8925c93 to
3cf043b
Compare
|
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
|
@roubert Please take look at the latest changes. |
|
Hi everyone, I would really like to review this PR before it goes in, but I am swamped. I promise to get to it soon after UTW... |
markusicu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- this is a lot of work...
- thank you!!
- sorry for the late review
- i do have some comments...
icu4c/source/i18n/smpdtfmt.cpp
Outdated
| case UDAT_YEAR_FIELD: | ||
| case UDAT_YEAR_WOY_FIELD: | ||
| if (fDateOverride.compare(hebr)==0 && value>HEBREW_CAL_CUR_MILLENIUM_START_YEAR && value<HEBREW_CAL_CUR_MILLENIUM_END_YEAR) { | ||
| if (fDateOverride.compare(HEBREW_CALENDAR_VALUE, HEBREW_CALENDAR_VALUE_LEN)==0 && value>HEBREW_CAL_CUR_MILLENIUM_START_YEAR && value<HEBREW_CAL_CUR_MILLENIUM_END_YEAR) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if (fDateOverride.compare(HEBREW_CALENDAR_VALUE, HEBREW_CALENDAR_VALUE_LEN)==0 && value>HEBREW_CAL_CUR_MILLENIUM_START_YEAR && value<HEBREW_CAL_CUR_MILLENIUM_END_YEAR) { | |
| if (fDateOverride == u"hebr" && value>HEBREW_CAL_CUR_MILLENIUM_START_YEAR && value<HEBREW_CAL_CUR_MILLENIUM_END_YEAR) { |
and don't add the new constants above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this was meant to be time sensitive, I was trying to avoid doing a strlen. I'd rather go with Roubert's suggestion on these lines. I was trying to avoid the construction of u16string_view and a string length calculation from that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can expect the compiler to do all of this at compile time. No runtime overhead from UnicodeString::operator==(string literal).
I also had to get used to the fact that we are no longer dealing with the C++ of our college days...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have switched it to be operator== instead. I have left it as a named constexpr, since I consider that to be best practice, and it's used in multiple locations.
icu4c/source/i18n/smpdtfmt.cpp
Outdated
| // century, for parsed strings from "00" to "99". Any other string | ||
| // is treated literally: "2250", "-1", "1", "002". | ||
| if (fDateOverride.compare(hebr)==0 && value < 1000) { | ||
| if (fDateOverride.compare(HEBREW_CALENDAR_VALUE, HEBREW_CALENDAR_VALUE_LEN)==0 && value < 1000) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
icu4c/source/i18n/smpdtfmt.cpp
Outdated
| case UDAT_YEAR_WOY_FIELD: | ||
| // Comment is the same as for UDAT_Year_FIELDs - look above | ||
| if (fDateOverride.compare(hebr)==0 && value < 1000) { | ||
| if (fDateOverride.compare(HEBREW_CALENDAR_VALUE, HEBREW_CALENDAR_VALUE_LEN)==0 && value < 1000) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
| // Static list of errors found | ||
| static UnicodeString errorList; | ||
| static UnicodeString& getErrorList() { | ||
| static UnicodeString* errorList = new UnicodeString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am surprised that we don't get a memory leak complaint...
I guess that if the tests pass we don't call this. If they fail, we should get both the test failure and a leak report.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is no error, there is no error list to show. If there was a failure, I'm sure that this UnicodeString would have been allocated some memory, just as before.
Since the CI checks aren't complaining, I don't consider this to be an issue.
| constexpr RandomNumberGenerator::result_type defaultSeed = std::ranlux48_base::default_seed; | ||
| static RandomNumberGenerator randomNumberGenerator; | ||
| static RandomNumberGenerator& getDefaultRandomNumberGenerator() { | ||
| static RandomNumberGenerator *randomNumberGenerator = new RandomNumberGenerator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
memory leak?
What I mean with these questions:
I was convinced that we had instrumented our test suites so that they call u_cleanup() in the end, and when we run them in valgrind (or similar), and some memory hasn't been released, we get a memory leak failure. So while this kind of code is fine in a regular program (all goes "poof" when it exits), I would expect this to fail what I thought we had as a test setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, what do you know!? You're right, and the test isn't failing as expected. I'll rethink this one.
==15900==
==15900== HEAP SUMMARY:
==15900== in use at exit: 120 bytes in 1 blocks
==15900== total heap usage: 9,985,177 allocs, 9,985,176 frees, 2,062,279,131 bytes allocated
==15900==
==15900== 120 bytes in 1 blocks are still reachable in loss record 1 of 1
==15900== at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==15900== by 0x55533F: getDefaultRandomNumberGenerator (rbbitst.cpp:1991)
==15900== by 0x55533F: RBBITest::RunMonkey(icu::BreakIterator*, RBBIMonkeyKind&, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, signed char, _IO_FILE*, signed char) (rbbitst.cpp:3068)
==15900== by 0x52EF0F: RBBITest::TestMonkey() (rbbitst.cpp:2958)
==15900== by 0x304CC6: IntlTest::runTestLoop(char*, char*, char*) (intltest.cpp:809)
==15900== by 0x3048D3: IntlTest::runTest(char*, char*, char*) (intltest.cpp:730)
==15900== by 0x30477D: IntlTest::callTest(IntlTest&, char*) (intltest.cpp:631)
==15900== by 0x51FBB0: IntlTestRBBI::runIndexedTest(int, signed char, char const*&, char*) (itrbbi.cpp:36)
==15900== by 0x304CC6: IntlTest::runTestLoop(char*, char*, char*) (intltest.cpp:809)
==15900== by 0x3048D3: IntlTest::runTest(char*, char*, char*) (intltest.cpp:730)
==15900== by 0x30477D: IntlTest::callTest(IntlTest&, char*) (intltest.cpp:631)
==15900== by 0x316CEC: MajorTestLevel::runIndexedTest(int, signed char, char const*&, char*) (itmajor.cpp:118)
==15900== by 0x304CC6: IntlTest::runTestLoop(char*, char*, char*) (intltest.cpp:809)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI you need to use something like --errors-for-leak-kinds=definite or a variation of that to make valgrind fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a related note, the udbg_knownIssue_open is causing memory to still be reachable according to one of the CI checks. My changes don't touch that code. Is that working as expected, or did you want an issue submitted for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a related note, the udbg_knownIssue_open is causing memory to still be reachable according to one of the CI checks. My changes don't touch that code. Is that working as expected, or did you want an issue submitted for that?
Please create a ticket.
We should probably have an xxx_cleanup() function for test-only libraries that intltest & cintltst then need to call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Submitted as ICU-23275
|
|
||
| U_STRING_DECL(k_type_string, "string", 6); | ||
| U_STRING_DECL(k_type_binary, "binary", 6); | ||
| U_STRING_DECL(k_type_bin, "bin", 3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why keep a few of these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because they're not used in gResourceTypes, whose variable references caused the C++ static initialization. These are aliases that are used elsewhere. I was trying to minimize the changes to make it easier to review.
27539ed to
890e3e3
Compare
|
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
|
@markusicu Please take another look at these changes. |
|
@grhoten I tried to look for further changes in context, but I see that you squashed prematurely... Please remember not to squash until after approval! |
markusicu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm except for the string literals in smpdtfmt.cpp
Sorry about that. I thought I had addressed your comments. Ideally, this extra manual squash step should be removed, and the GitHub settings of this repository should be switched from "Rebase and merge" to "Squash and merge". It really simplifies the process, and the commit check could be removed. |
markusicu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, this extra manual squash step should be removed, and the GitHub settings of this repository should be switched from "Rebase and merge" to "Squash and merge". It really simplifies the process, and the commit check could be removed.
Except it doesn't work for a project like ICU. We want every change to come with a Jira ticket, and we are enforcing it with a CI check that looks for an open ticket.
Squash-and-merge would go around that, because it lets you edit the commit message, without another CI check.
This is much less annoying than it was at first: PR approval is no longer dismissed when you truly squash and not change any file contents. So we can build up lots of commits and comments, get a PR approved, then squash -- doable via GitHub UI -- and let it rattle through the CI checks once more, some of which we need anyway for checking the PR title and commit message.
| case UDAT_YEAR_FIELD: | ||
| case UDAT_YEAR_WOY_FIELD: | ||
| if (fDateOverride.compare(hebr)==0 && value>HEBREW_CAL_CUR_MILLENIUM_START_YEAR && value<HEBREW_CAL_CUR_MILLENIUM_END_YEAR) { | ||
| if (fDateOverride == HEBREW_CALENDAR_VALUE && value>HEBREW_CAL_CUR_MILLENIUM_START_YEAR && value<HEBREW_CAL_CUR_MILLENIUM_END_YEAR) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI -- Thanks, this looks ok.
Note that for a named string constant like this you could also have a constexpr std::u16string_view. Then there would be no doubt that the compiler computes the length.
35dcf23 to
bf705f3
Compare
|
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
We should not be using C++ static initialization. It can lengthen the time to load the ICU library. This can be an issue when frequently launching a program on some platforms, and only a small part of the ICU library is being used.
The
-Wglobal-constructors -Wexit-time-destructorsclang options can be used to prevent C++ static initialization from coming back.Some of the C++ initialization was moved to a local context, since the static context added no value.
Checklist