-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[libcxx] Provide locale conversions to tests through lit substitution #86649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[libcxx] Provide locale conversions to tests through lit substitution #86649
Conversation
ea03aea
to
e400580
Compare
✅ With the latest revision this PR passed the Python code formatter. |
e400580
to
d5b6e61
Compare
d5b6e61
to
452e395
Compare
@llvm/pr-subscribers-testing-tools @llvm/pr-subscribers-libcxx Author: Rodrigo Salazar (4-rodrigo-salazar) ChangesThere are 2 problems today that this PR resolves:
This PR includes the following changes:
This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself. Full diff: https://github.com/llvm/llvm-project/pull/86649.diff 10 Files Affected:
diff --git a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp
index 3effa80e7d6f79..51dc25b8171c2d 100644
--- a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp
@@ -15,6 +15,8 @@
// REQUIRES: locale.fr_FR.UTF-8
+// ADDITIONAL_COMPILE_FLAGS: -DFR_MON_THOU_SEP=%{LOCALE_CONV_FR_FR_UTF_8_MON_THOUSANDS_SEP}
+
// <locale>
// class money_get<charT, InputIterator>
@@ -54,7 +56,8 @@ class my_facetw
};
static std::wstring convert_thousands_sep(std::wstring const& in) {
- return LocaleHelpers::convert_thousands_sep_fr_FR(in);
+ const wchar_t fr_sep = LocaleHelpers::mon_thousands_sep_or_default(FR_MON_THOU_SEP);
+ return LocaleHelpers::convert_thousands_sep(in, fr_sep);
}
#endif // TEST_HAS_NO_WIDE_CHARACTERS
diff --git a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_ru_RU.pass.cpp b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_ru_RU.pass.cpp
index e680f2ea8816ac..371cf0e90c8d3c 100644
--- a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_ru_RU.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_ru_RU.pass.cpp
@@ -11,6 +11,8 @@
// REQUIRES: locale.ru_RU.UTF-8
+// ADDITIONAL_COMPILE_FLAGS: -DRU_MON_THOU_SEP=%{LOCALE_CONV_RU_RU_UTF_8_MON_THOUSANDS_SEP}
+
// XFAIL: glibc-old-ru_RU-decimal-point
// <locale>
@@ -52,7 +54,8 @@ class my_facetw
};
static std::wstring convert_thousands_sep(std::wstring const& in) {
- return LocaleHelpers::convert_thousands_sep_ru_RU(in);
+ const wchar_t ru_sep = LocaleHelpers::mon_thousands_sep_or_default(RU_MON_THOU_SEP);
+ return LocaleHelpers::convert_thousands_sep(in, ru_sep);
}
#endif // TEST_HAS_NO_WIDE_CHARACTERS
diff --git a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp
index 05b4ee474944af..646ab651cf3973 100644
--- a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp
@@ -15,6 +15,8 @@
// REQUIRES: locale.fr_FR.UTF-8
+// ADDITIONAL_COMPILE_FLAGS: -DFR_MON_THOU_SEP=%{LOCALE_CONV_FR_FR_UTF_8_MON_THOUSANDS_SEP}
+
// <locale>
// class money_put<charT, OutputIterator>
@@ -54,7 +56,8 @@ class my_facetw
};
static std::wstring convert_thousands_sep(std::wstring const& in) {
- return LocaleHelpers::convert_thousands_sep_fr_FR(in);
+ const wchar_t fr_sep = LocaleHelpers::mon_thousands_sep_or_default(FR_MON_THOU_SEP);
+ return LocaleHelpers::convert_thousands_sep(in, fr_sep);
}
#endif // TEST_HAS_NO_WIDE_CHARACTERS
diff --git a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_ru_RU.pass.cpp b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_ru_RU.pass.cpp
index 4aea1016e735b9..be1e3974884682 100644
--- a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_ru_RU.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_ru_RU.pass.cpp
@@ -11,6 +11,8 @@
// REQUIRES: locale.ru_RU.UTF-8
+// ADDITIONAL_COMPILE_FLAGS: -DRU_MON_THOU_SEP=%{LOCALE_CONV_RU_RU_UTF_8_MON_THOUSANDS_SEP}
+
// XFAIL: glibc-old-ru_RU-decimal-point
// <locale>
@@ -52,7 +54,8 @@ class my_facetw
};
static std::wstring convert_thousands_sep(std::wstring const& in) {
- return LocaleHelpers::convert_thousands_sep_ru_RU(in);
+ const wchar_t ru_sep = LocaleHelpers::mon_thousands_sep_or_default(RU_MON_THOU_SEP);
+ return LocaleHelpers::convert_thousands_sep(in, ru_sep);
}
#endif // TEST_HAS_NO_WIDE_CHARACTERS
diff --git a/libcxx/test/std/localization/locale.categories/category.monetary/locale.moneypunct.byname/thousands_sep.pass.cpp b/libcxx/test/std/localization/locale.categories/category.monetary/locale.moneypunct.byname/thousands_sep.pass.cpp
index 2a70741d2a0fa6..1fc222a5ee8b10 100644
--- a/libcxx/test/std/localization/locale.categories/category.monetary/locale.moneypunct.byname/thousands_sep.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.monetary/locale.moneypunct.byname/thousands_sep.pass.cpp
@@ -16,6 +16,9 @@
// REQUIRES: locale.ru_RU.UTF-8
// REQUIRES: locale.zh_CN.UTF-8
+// ADDITIONAL_COMPILE_FLAGS: -DFR_MON_THOU_SEP=%{LOCALE_CONV_FR_FR_UTF_8_MON_THOUSANDS_SEP}
+// ADDITIONAL_COMPILE_FLAGS: -DRU_MON_THOU_SEP=%{LOCALE_CONV_RU_RU_UTF_8_MON_THOUSANDS_SEP}
+
// <locale>
// class moneypunct_byname<charT, International>
@@ -27,6 +30,7 @@
#include <cassert>
#include "test_macros.h"
+#include "locale_helpers.h"
#include "platform_support.h" // locale name macros
class Fnf
@@ -110,17 +114,10 @@ int main(int, char**)
Fnt f(LOCALE_fr_FR_UTF_8, 1);
assert(f.thousands_sep() == ' ');
}
- // The below tests work around GLIBC's use of U202F as mon_thousands_sep.
+
#ifndef TEST_HAS_NO_WIDE_CHARACTERS
-#if defined(_CS_GNU_LIBC_VERSION)
- const wchar_t fr_sep = glibc_version_less_than("2.27") ? L' ' : L'\u202F';
-#elif defined(_WIN32)
- const wchar_t fr_sep = L'\u00A0';
-#elif defined(_AIX)
- const wchar_t fr_sep = L'\u202F';
-#else
- const wchar_t fr_sep = L' ';
-#endif
+ const wchar_t fr_sep = LocaleHelpers::mon_thousands_sep_or_default(FR_MON_THOU_SEP);
+
{
Fwf f(LOCALE_fr_FR_UTF_8, 1);
assert(f.thousands_sep() == fr_sep);
@@ -140,19 +137,8 @@ int main(int, char**)
assert(f.thousands_sep() == sep);
}
#ifndef TEST_HAS_NO_WIDE_CHARACTERS
- // The below tests work around GLIBC's use of U00A0 as mon_thousands_sep
- // and U002E as mon_decimal_point.
- // TODO: Fix thousands_sep for 'char'.
- // related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006
-# if defined(_CS_GNU_LIBC_VERSION)
- // FIXME libc++ specifically works around \u00A0 by translating it into
- // a regular space.
- const wchar_t wsep = glibc_version_less_than("2.27") ? L'\u00A0' : L'\u202F';
-# elif defined(_WIN32) || defined(_AIX)
- const wchar_t wsep = L'\u00A0';
-# else
- const wchar_t wsep = L' ';
-# endif
+ const wchar_t wsep = LocaleHelpers::mon_thousands_sep_or_default(RU_MON_THOU_SEP);
+
{
Fwf f(LOCALE_ru_RU_UTF_8, 1);
assert(f.thousands_sep() == wsep);
diff --git a/libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp b/libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
index d7e1178c92e041..80c42ff48d3403 100644
--- a/libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
@@ -15,6 +15,8 @@
// REQUIRES: locale.en_US.UTF-8
// REQUIRES: locale.fr_FR.UTF-8
+// ADDITIONAL_COMPILE_FLAGS: -DFR_THOU_SEP=%{LOCALE_CONV_FR_FR_UTF_8_THOUSANDS_SEP}
+
// <locale>
// template <class charT> class numpunct_byname;
@@ -26,6 +28,7 @@
#include <cassert>
#include "test_macros.h"
+#include "locale_helpers.h"
#include "platform_support.h" // locale name macros
int main(int, char**)
@@ -75,16 +78,11 @@ int main(int, char**)
}
#ifndef TEST_HAS_NO_WIDE_CHARACTERS
{
-#if defined(_CS_GNU_LIBC_VERSION)
- const wchar_t wsep = glibc_version_less_than("2.27") ? L' ' : L'\u202f';
-#elif defined(_WIN32)
- const wchar_t wsep = L'\u00A0';
-#else
- const wchar_t wsep = L',';
-#endif
- typedef wchar_t C;
- const std::numpunct<C>& np = std::use_facet<std::numpunct<C> >(l);
- assert(np.thousands_sep() == wsep);
+ const wchar_t wsep = LocaleHelpers::thousands_sep_or_default(FR_THOU_SEP);
+
+ typedef wchar_t C;
+ const std::numpunct<C>& np = std::use_facet<std::numpunct<C> >(l);
+ assert(np.thousands_sep() == wsep);
}
#endif // TEST_HAS_NO_WIDE_CHARACTERS
}
diff --git a/libcxx/test/std/time/time.duration/time.duration.nonmember/ostream.pass.cpp b/libcxx/test/std/time/time.duration/time.duration.nonmember/ostream.pass.cpp
index e5d11ab4672bdf..287d5ed846b2cc 100644
--- a/libcxx/test/std/time/time.duration/time.duration.nonmember/ostream.pass.cpp
+++ b/libcxx/test/std/time/time.duration/time.duration.nonmember/ostream.pass.cpp
@@ -16,6 +16,9 @@
// REQUIRES: locale.fr_FR.UTF-8
// REQUIRES: locale.ja_JP.UTF-8
+// ADDITIONAL_COMPILE_FLAGS: -DFR_THOU_SEP=%{LOCALE_CONV_FR_FR_UTF_8_THOUSANDS_SEP}
+// ADDITIONAL_COMPILE_FLAGS: -DFR_DEC_POINT=%{LOCALE_CONV_FR_FR_UTF_8_DECIMAL_POINT}
+
// <chrono>
// template<class Rep, class Period = ratio<1>> class duration;
@@ -33,6 +36,7 @@
#include <sstream>
#include "make_string.h"
+#include "locale_helpers.h"
#include "platform_support.h" // locale name macros
#include "test_macros.h"
@@ -88,22 +92,10 @@ static void test_values() {
assert(stream_fr_FR_locale<CharT>(1'000.123456s) == SV("1 000,1235s"));
#endif
} else {
-#ifdef _WIN32
- assert(stream_fr_FR_locale<CharT>(-1'000'000s) == SV("-1\u00A0000\u00A0000s"));
- assert(stream_fr_FR_locale<CharT>(1'000'000s) == SV("1\u00A0000\u00A0000s"));
- assert(stream_fr_FR_locale<CharT>(-1'000.123456s) == SV("-1\u00A0000,1235s"));
- assert(stream_fr_FR_locale<CharT>(1'000.123456s) == SV("1\u00A0000,1235s"));
-#elif defined(__APPLE__)
- assert(stream_fr_FR_locale<CharT>(-1'000'000s) == SV("-1000000s"));
- assert(stream_fr_FR_locale<CharT>(1'000'000s) == SV("1000000s"));
- assert(stream_fr_FR_locale<CharT>(-1'000.123456s) == SV("-1000,1235s"));
- assert(stream_fr_FR_locale<CharT>(1'000.123456s) == SV("1000,1235s"));
-#else
- assert(stream_fr_FR_locale<CharT>(-1'000'000s) == SV("-1\u202f000\u202f000s"));
- assert(stream_fr_FR_locale<CharT>(1'000'000s) == SV("1\u202f000\u202f000s"));
- assert(stream_fr_FR_locale<CharT>(-1'000.123456s) == SV("-1\u202f000,1235s"));
- assert(stream_fr_FR_locale<CharT>(1'000.123456s) == SV("1\u202f000,1235s"));
-#endif
+ assert(stream_fr_FR_locale<CharT>(-1'000'000s) == L"-1" FR_THOU_SEP "000" FR_THOU_SEP "000s");
+ assert(stream_fr_FR_locale<CharT>(1'000'000s) == L"1" FR_THOU_SEP "000" FR_THOU_SEP "000s");
+ assert(stream_fr_FR_locale<CharT>(-1'000.123456s) == L"-1" FR_THOU_SEP "000" FR_DEC_POINT "1235s");
+ assert(stream_fr_FR_locale<CharT>(1'000.123456s) == L"1" FR_THOU_SEP "000" FR_DEC_POINT "1235s");
}
assert(stream_ja_JP_locale<CharT>(-1'000'000s) == SV("-1,000,000s"));
diff --git a/libcxx/test/support/locale_helpers.h b/libcxx/test/support/locale_helpers.h
index 3eb24ebf28f524..946c2fed0f3a53 100644
--- a/libcxx/test/support/locale_helpers.h
+++ b/libcxx/test/support/locale_helpers.h
@@ -41,37 +41,6 @@ std::wstring convert_thousands_sep(std::wstring const& in, wchar_t sep) {
return out;
}
-// GLIBC 2.27 and newer use U+202F NARROW NO-BREAK SPACE as a thousands separator.
-// This function converts the spaces in string inputs to U+202F if need
-// be. FreeBSD's locale data also uses U+202F, since 2018.
-// Windows uses U+00A0 NO-BREAK SPACE.
-std::wstring convert_thousands_sep_fr_FR(std::wstring const& in) {
-#if defined(_CS_GNU_LIBC_VERSION)
- if (glibc_version_less_than("2.27"))
- return in;
- else
- return convert_thousands_sep(in, L'\u202F');
-#elif defined(__FreeBSD__)
- return convert_thousands_sep(in, L'\u202F');
-#elif defined(_WIN32)
- return convert_thousands_sep(in, L'\u00A0');
-#else
- return in;
-#endif
-}
-
-// GLIBC 2.27 uses U+202F NARROW NO-BREAK SPACE as a thousands separator.
-// FreeBSD, AIX and Windows use U+00A0 NO-BREAK SPACE.
-std::wstring convert_thousands_sep_ru_RU(std::wstring const& in) {
-#if defined(TEST_HAS_GLIBC)
- return convert_thousands_sep(in, L'\u202F');
-# elif defined(__FreeBSD__) || defined(_WIN32) || defined(_AIX)
- return convert_thousands_sep(in, L'\u00A0');
-# else
- return in;
-# endif
-}
-
std::wstring negate_en_US(std::wstring s) {
#if defined(_WIN32)
return L"(" + s + L")";
@@ -80,6 +49,12 @@ std::wstring negate_en_US(std::wstring s) {
#endif
}
+wchar_t thousands_sep_or_default(std::wstring s) { return !s.empty() ? s[0] : L','; }
+
+wchar_t mon_thousands_sep_or_default(std::wstring s) { return thousands_sep_or_default(s); }
+
+wchar_t decimal_point_or_default(std::wstring s) { return !s.empty() ? s[0] : L'.'; }
+
#endif // TEST_HAS_NO_WIDE_CHARACTERS
std::string negate_en_US(std::string s) {
diff --git a/libcxx/utils/libcxx/test/features.py b/libcxx/utils/libcxx/test/features.py
index 0793c34fd7f0bf..f0230508feba06 100644
--- a/libcxx/utils/libcxx/test/features.py
+++ b/libcxx/utils/libcxx/test/features.py
@@ -8,6 +8,7 @@
from libcxx.test.dsl import *
from lit.BooleanExpression import BooleanExpression
+from lit.TestRunner import substitutionsWillEscapeBackslashes
import re
import shutil
import subprocess
@@ -335,6 +336,10 @@ def _getAndroidDeviceApi(cfg):
"fr_CA.ISO8859-1": ["fr_CA.ISO8859-1", "French_Canada.1252"],
"cs_CZ.ISO8859-2": ["cs_CZ.ISO8859-2", "Czech_Czech Republic.1250"],
}
+provide_locale_conversions = {
+ "fr_FR.UTF-8": ["decimal_point", "mon_thousands_sep", "thousands_sep"],
+ "ru_RU.UTF-8": ["mon_thousands_sep"],
+}
for locale, alts in locales.items():
# Note: Using alts directly in the lambda body here will bind it to the value at the
# end of the loop. Assigning it to a default argument works around this issue.
@@ -342,10 +347,90 @@ def _getAndroidDeviceApi(cfg):
Feature(
name="locale.{}".format(locale),
when=lambda cfg, alts=alts: hasAnyLocale(cfg, alts),
- )
+ actions=lambda cfg, locale=locale, alts=alts: _getLocaleFlagsAction(
+ cfg, locale, alts, provide_locale_conversions[locale]
+ )
+ if locale in provide_locale_conversions
+ and "_LIBCPP_HAS_NO_WIDE_CHARACTERS" not in compilerMacros(cfg)
+ else [],
+ ),
)
+# Provide environment locale conversions through substitutions to avoid platform specific
+# maintenance.
+def _getLocaleFlagsAction(cfg, locale, alts, members):
+ alts_list = ",".join([f'"{l}"' for l in alts])
+ get_member_list = ",".join([f"lc->{m}" for m in members])
+
+ localeconv_info = programOutput(
+ cfg,
+ r"""
+ #if defined(_WIN32) && !defined(_CRT_SECURE_NO_WARNINGS)
+ #define _CRT_SECURE_NO_WARNINGS
+ #endif
+ #include <stdio.h>
+ #include <locale.h>
+ #include <stdlib.h>
+ #include <wchar.h>
+
+ // Print each requested locale conversion member on separate lines.
+ int main() {
+ const char* locales[] = { %s };
+ for (int loc_i = 0; loc_i < %d; ++loc_i) {
+ if (!setlocale(LC_ALL, locales[loc_i])) {
+ continue; // Choose first locale name that is recognized.
+ }
+
+ lconv* lc = localeconv();
+ const char* members[] = { %s };
+ for (size_t m_i = 0; m_i < %d; ++m_i) {
+ if (!members[m_i]) {
+ printf("\n"); // member value is an empty string
+ continue;
+ }
+
+ size_t len = mbstowcs(nullptr, members[m_i], 0);
+ if (len == static_cast<size_t>(-1)) {
+ fprintf(stderr, "mbstowcs failed unexpectedly");
+ return 1;
+ }
+ wchar_t* dst = new wchar_t[len + 1]; // Include room for null terminator
+ size_t ret = mbstowcs(dst, members[m_i], len);
+ if (ret == static_cast<size_t>(-1)) {
+ fprintf(stderr, "mbstowcs failed unexpectedly");
+ return 1;
+ }
+
+ for (size_t i = 0; i < len; ++i) {
+ printf("\\u%%04x", dst[i]);
+ }
+ printf("\n");
+ delete[] dst;
+ }
+ return 0;
+ }
+
+ return 1;
+ }
+ """
+ % (alts_list, len(alts), get_member_list, len(members)),
+ )
+ if not substitutionsWillEscapeBackslashes():
+ # Account for different escaping depending on platform running lit
+ localeconv_info = localeconv_info.replace("\\", "\\\\")
+ valid_define_name = re.sub(r"[.-]", "_", locale).upper()
+ return [
+ # Provide locale conversion through a substitution.
+ # Example: %{LOCALE_CONV_FR_FR_UTF_8_THOUSANDS_SEP} = L"\u202f"
+ AddSubstitution(
+ f"%{{LOCALE_CONV_{valid_define_name}_{member.upper()}}}",
+ lambda cfg, value=value: f"'L\"{value}\"'",
+ )
+ for member, value in zip(members, localeconv_info.split("\n"))
+ ]
+
+
# Add features representing the target platform name: darwin, linux, windows, etc...
DEFAULT_FEATURES += [
Feature(name="darwin", when=lambda cfg: "__APPLE__" in compilerMacros(cfg)),
diff --git a/llvm/utils/lit/lit/TestRunner.py b/llvm/utils/lit/lit/TestRunner.py
index da7fa86fd39173..e1d019a741cae0 100644
--- a/llvm/utils/lit/lit/TestRunner.py
+++ b/llvm/utils/lit/lit/TestRunner.py
@@ -1625,6 +1625,10 @@ def adjust_substitutions(self, substitutions):
substitutions[existing[0]] = (self.name, value_repl)
+def substitutionsWillEscapeBackslashes():
+ return kIsWindows
+
+
def applySubstitutions(script, substitutions, conditions={}, recursion_limit=None):
"""
Apply substitutions to the script. Allow full regular expression syntax.
@@ -1738,7 +1742,7 @@ def processLine(ln):
# Apply substitutions
ln = substituteIfElse(escapePercents(ln))
for a, b in substitutions:
- if kIsWindows:
+ if substitutionsWillEscapeBackslashes():
b = b.replace("\\", "\\\\")
# re.compile() has a built-in LRU cache with 512 entries. In some
# test suites lit ends up thrashing that cache, which made e.g.
|
You can test this locally with the following command:darker --check --diff -r 6a6f9bf38e65ec45b32da4b578e2830341a9b364...ea03aea1148e581983aa667a2728fdac9a9a12b5 libcxx/utils/libcxx/test/features.py llvm/utils/lit/lit/TestRunner.py View the diff from darker here.--- libcxx/utils/libcxx/test/features.py 2024-03-26 10:08:18.000000 +0000
+++ libcxx/utils/libcxx/test/features.py 2024-03-26 10:23:41.146995 +0000
@@ -345,22 +345,25 @@
# end of the loop. Assigning it to a default argument works around this issue.
DEFAULT_FEATURES.append(
Feature(
name="locale.{}".format(locale),
when=lambda cfg, alts=alts: hasAnyLocale(cfg, alts),
- actions=lambda cfg, locale=locale, alts=alts:
- _getLocaleFlagsAction(cfg, locale, alts, provide_locale_conversions[locale])
- if locale in provide_locale_conversions
- and "_LIBCPP_HAS_NO_WIDE_CHARACTERS" not in compilerMacros(cfg) else []
+ actions=lambda cfg, locale=locale, alts=alts: _getLocaleFlagsAction(
+ cfg, locale, alts, provide_locale_conversions[locale]
+ )
+ if locale in provide_locale_conversions
+ and "_LIBCPP_HAS_NO_WIDE_CHARACTERS" not in compilerMacros(cfg)
+ else [],
),
)
+
# Provide environment locale conversions through substitutions to avoid platform specific
# maintenance.
def _getLocaleFlagsAction(cfg, locale, alts, members):
alts_list = ",".join([f'"{l}"' for l in alts])
- get_member_list = ",".join([f'lc->{m}' for m in members])
+ get_member_list = ",".join([f"lc->{m}" for m in members])
localeconv_info = programOutput(
cfg,
r"""
#if defined(_WIN32) && !defined(_CRT_SECURE_NO_WARNINGS)
@@ -408,23 +411,25 @@
return 0;
}
return 1;
}
- """ % (alts_list, len(alts), get_member_list, len(members)),
+ """
+ % (alts_list, len(alts), get_member_list, len(members)),
)
if not substitutionsWillEscapeBackslashes():
# Account for different escaping depending on platform running lit
localeconv_info = localeconv_info.replace("\\", "\\\\")
valid_define_name = re.sub(r"[.-]", "_", locale).upper()
return [
# Provide locale conversion through a substitution.
# Example: %{LOCALE_CONV_FR_FR_UTF_8_THOUSANDS_SEP} = L"\u202f"
AddSubstitution(
- f'%{{LOCALE_CONV_{valid_define_name}_{member.upper()}}}',
- lambda cfg, value=value: f'\'L"{value}"\''
- ) for member, value in zip(members, localeconv_info.split('\n'))
+ f"%{{LOCALE_CONV_{valid_define_name}_{member.upper()}}}",
+ lambda cfg, value=value: f"'L\"{value}\"'",
+ )
+ for member, value in zip(members, localeconv_info.split("\n"))
]
# Add features representing the target platform name: darwin, linux, windows, etc...
DEFAULT_FEATURES += [
|
Substitutions can be added in a couple different ways; they can be added via the calling python scripts by adding entries to the config.substitutions dictionary, or via DEFINE lines in the scripts themselves. The substitution strings passed to Python's re classes are interpreted so that backslashes expand to escape sequences, and literal backslashes need to be escaped. On Unix, the script defined substitutions don't (usually, so far) contain backslashes - but on Windows, they often do, due to paths containing backslashes. This lead to a Windows specific escaping of backslashes before doing Python re substitutions - since 7c9eab8. There's nothing inherently Windows specific about this though - any intended literal backslashes in the substitution strings need to be escaped; this is how the Python re API works. The DEFINE lines were added later, and in order to cope with backslashes, escaping of backslashes was added in the SubstDirective class in TestRunner, applying to DEFINE lines in the tests only. The fact that the escaping right before passing to the Python re API was done conditionally on Windows led to two inconsistencies: - DEFINE lines in the tests that contain backslashes got double backslashes on Windows. (This was visible as a FIXME in llvm/utils/lit/tests/Inputs/shtest-define/value-escaped.txt.) - Script provided substitutions containing backslashes did not work on Unix, but they did work on Windows. By removing the escaping from SubstDirective and escaping it unconditionally in the processLine function, before feeding the substitutions to Python's re classes, we should have consistent behaviour across platforms, and get rid of the FIXME in the lit test. This fixes issues with substitutions containing backslashes on Unix platforms, as encountered in PR llvm#86649.
Hey, @4-rodrigo-salazar - I see you closed this PR... I presume this was due to the lack of reviews? I'd be interested in picking this up and pushing it forward, if you don't have time/interest to do it. Were there any other issues with it that you would like to share? I looked at the issue with escaping of backslashes, which seemed to require changes to the core of |
FWIW I agree -- there are some things to change in this patch but I think the direction makes sense. It cleans up a lot of complexity in the test suite. |
Substitutions can be added in a couple different ways; they can be added via the calling python scripts by adding entries to the config.substitutions dictionary, or via DEFINE lines in the scripts themselves. The substitution strings passed to Python's re classes are interpreted so that backslashes expand to escape sequences, and literal backslashes need to be escaped. On Unix, the script defined substitutions don't (usually, so far) contain backslashes - but on Windows, they often do, due to paths containing backslashes. This lead to a Windows specific escaping of backslashes before doing Python re substitutions - since 7c9eab8. There's nothing inherently Windows specific about this though - any intended literal backslashes in the substitution strings need to be escaped; this is how the Python re API works. The DEFINE lines were added later, and in order to cope with backslashes, escaping of backslashes was added in the SubstDirective class in TestRunner, applying to DEFINE lines in the tests only. The fact that the escaping right before passing to the Python re API was done conditionally on Windows led to two inconsistencies: - DEFINE lines in the tests that contain backslashes got double backslashes on Windows. (This was visible as a FIXME in llvm/utils/lit/tests/Inputs/shtest-define/value-escaped.txt.) - Script provided substitutions containing backslashes did not work on Unix, but they did work on Windows. By removing the escaping from SubstDirective and escaping it unconditionally in the processLine function, before feeding the substitutions to Python's re classes, we should have consistent behaviour across platforms, and get rid of the FIXME in the lit test. This fixes issues with substitutions containing backslashes on Unix platforms, as encountered in PR #86649.
There are 2 problems today that this PR resolves: libcxx tests assume the thousands separator for fr_FR locale is x00A0 on Windows. This currently fails when run on newer versions of Windows (it seems to have been updated to the new correct value of 0x202F around windows 11. The exact windows version where it changed doesn't seem to be documented anywhere). Depending the OS version, you need different values. There are several ifdefs to determine the environment/platform-specific locale conversion values and it leads to maintenance as things change over time. This PR includes the following changes: - Provide the environment's locale conversion values through a substitution. The test can opt in by placing the substitution value in a define flag. - Remove the platform ifdefs (the swapping of values between Windows, Linux, Apple, AIX). This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself. In addition, as a result of the above, this PR: - Fixes a handful of locale tests which unexpectedly fail on newer Windows versions. - Resolves 3 XFAIL FIX-MEs. Originally submitted in llvm#86649. Co-authored-by: Rodrigo Salazar <[email protected]>
I reopened a new PR for pushing this further, in #105651. |
There are 2 problems today that this PR resolves: libcxx tests assume the thousands separator for fr_FR locale is x00A0 on Windows. This currently fails when run on newer versions of Windows (it seems to have been updated to the new correct value of 0x202F around windows 11. The exact windows version where it changed doesn't seem to be documented anywhere). Depending the OS version, you need different values. There are several ifdefs to determine the environment/platform-specific locale conversion values and it leads to maintenance as things change over time. This PR includes the following changes: - Provide the environment's locale conversion values through a substitution. The test can opt in by placing the substitution value in a define flag. - Remove the platform ifdefs (the swapping of values between Windows, Linux, Apple, AIX). This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself. In addition, as a result of the above, this PR: - Fixes a handful of locale tests which unexpectedly fail on newer Windows versions. - Resolves 3 XFAIL FIX-MEs. Originally submitted in llvm#86649. Co-authored-by: Rodrigo Salazar <[email protected]>
Substitutions can be added in a couple different ways; they can be added via the calling python scripts by adding entries to the config.substitutions dictionary, or via DEFINE lines in the scripts themselves. The substitution strings passed to Python's re classes are interpreted so that backslashes expand to escape sequences, and literal backslashes need to be escaped. On Unix, the script defined substitutions don't (usually, so far) contain backslashes - but on Windows, they often do, due to paths containing backslashes. This lead to a Windows specific escaping of backslashes before doing Python re substitutions - since 7c9eab8. There's nothing inherently Windows specific about this though - any intended literal backslashes in the substitution strings need to be escaped; this is how the Python re API works. The DEFINE lines were added later, and in order to cope with backslashes, escaping of backslashes was added in the SubstDirective class in TestRunner, applying to DEFINE lines in the tests only. The fact that the escaping right before passing to the Python re API was done conditionally on Windows led to two inconsistencies: - DEFINE lines in the tests that contain backslashes got double backslashes on Windows. (This was visible as a FIXME in llvm/utils/lit/tests/Inputs/shtest-define/value-escaped.txt.) - Script provided substitutions containing backslashes did not work on Unix, but they did work on Windows. By removing the escaping from SubstDirective and escaping it unconditionally in the processLine function, before feeding the substitutions to Python's re classes, we should have consistent behaviour across platforms, and get rid of the FIXME in the lit test. This fixes issues with substitutions containing backslashes on Unix platforms, as encountered in PR llvm#86649.
There are 2 problems today that this PR resolves: libcxx tests assume the thousands separator for fr_FR locale is x00A0 on Windows. This currently fails when run on newer versions of Windows (it seems to have been updated to the new correct value of 0x202F around windows 11. The exact windows version where it changed doesn't seem to be documented anywhere). Depending the OS version, you need different values. There are several ifdefs to determine the environment/platform-specific locale conversion values and it leads to maintenance as things change over time. This PR includes the following changes: - Provide the environment's locale conversion values through a substitution. The test can opt in by placing the substitution value in a define flag. - Remove the platform ifdefs (the swapping of values between Windows, Linux, Apple, AIX). This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself. In addition, as a result of the above, this PR: - Fixes a handful of locale tests which unexpectedly fail on newer Windows versions. - Resolves 3 XFAIL FIX-MEs. Originally submitted in llvm#86649. Co-authored-by: Rodrigo Salazar <[email protected]>
There are 2 problems today that this PR resolves: libcxx tests assume the thousands separator for fr_FR locale is x00A0 on Windows. This currently fails when run on newer versions of Windows (it seems to have been updated to the new correct value of 0x202F around windows 11. The exact windows version where it changed doesn't seem to be documented anywhere). Depending the OS version, you need different values. There are several ifdefs to determine the environment/platform-specific locale conversion values and it leads to maintenance as things change over time. This PR includes the following changes: - Provide the environment's locale conversion values through a substitution. The test can opt in by placing the substitution value in a define flag. - Remove the platform ifdefs (the swapping of values between Windows, Linux, Apple, AIX). This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself. In addition, as a result of the above, this PR: - Fixes a handful of locale tests which unexpectedly fail on newer Windows versions. - Resolves 3 XFAIL FIX-MEs. Originally submitted in llvm#86649. Co-authored-by: Rodrigo Salazar <[email protected]>
…#105651) There are 2 problems today that this PR resolves: libcxx tests assume the thousands separator for fr_FR locale is x00A0 on Windows. This currently fails when run on newer versions of Windows (it seems to have been updated to the new correct value of 0x202F around windows 11. The exact windows version where it changed doesn't seem to be documented anywhere). Depending the OS version, you need different values. There are several ifdefs to determine the environment/platform-specific locale conversion values and it leads to maintenance as things change over time. This PR includes the following changes: - Provide the environment's locale conversion values through a substitution. The test can opt in by placing the substitution value in a define flag. - Remove the platform ifdefs (the swapping of values between Windows, Linux, Apple, AIX). This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself. In addition, as a result of the above, this PR: - Fixes a handful of locale tests which unexpectedly fail on newer Windows versions. - Resolves 3 XFAIL FIX-MEs. Originally submitted in #86649. Co-authored-by: Rodrigo Salazar <[email protected]>
…ubstitution (#105651) There are 2 problems today that this PR resolves: libcxx tests assume the thousands separator for fr_FR locale is x00A0 on Windows. This currently fails when run on newer versions of Windows (it seems to have been updated to the new correct value of 0x202F around windows 11. The exact windows version where it changed doesn't seem to be documented anywhere). Depending the OS version, you need different values. There are several ifdefs to determine the environment/platform-specific locale conversion values and it leads to maintenance as things change over time. This PR includes the following changes: - Provide the environment's locale conversion values through a substitution. The test can opt in by placing the substitution value in a define flag. - Remove the platform ifdefs (the swapping of values between Windows, Linux, Apple, AIX). This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself. In addition, as a result of the above, this PR: - Fixes a handful of locale tests which unexpectedly fail on newer Windows versions. - Resolves 3 XFAIL FIX-MEs. Originally submitted in llvm/llvm-project#86649. Co-authored-by: Rodrigo Salazar <[email protected]>
…llvm#105651) There are 2 problems today that this PR resolves: libcxx tests assume the thousands separator for fr_FR locale is x00A0 on Windows. This currently fails when run on newer versions of Windows (it seems to have been updated to the new correct value of 0x202F around windows 11. The exact windows version where it changed doesn't seem to be documented anywhere). Depending the OS version, you need different values. There are several ifdefs to determine the environment/platform-specific locale conversion values and it leads to maintenance as things change over time. This PR includes the following changes: - Provide the environment's locale conversion values through a substitution. The test can opt in by placing the substitution value in a define flag. - Remove the platform ifdefs (the swapping of values between Windows, Linux, Apple, AIX). This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself. In addition, as a result of the above, this PR: - Fixes a handful of locale tests which unexpectedly fail on newer Windows versions. - Resolves 3 XFAIL FIX-MEs. Originally submitted in llvm#86649. Co-authored-by: Rodrigo Salazar <[email protected]> (cherry picked from commit f909b22)
…llvm#105651) There are 2 problems today that this PR resolves: libcxx tests assume the thousands separator for fr_FR locale is x00A0 on Windows. This currently fails when run on newer versions of Windows (it seems to have been updated to the new correct value of 0x202F around windows 11. The exact windows version where it changed doesn't seem to be documented anywhere). Depending the OS version, you need different values. There are several ifdefs to determine the environment/platform-specific locale conversion values and it leads to maintenance as things change over time. This PR includes the following changes: - Provide the environment's locale conversion values through a substitution. The test can opt in by placing the substitution value in a define flag. - Remove the platform ifdefs (the swapping of values between Windows, Linux, Apple, AIX). This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself. In addition, as a result of the above, this PR: - Fixes a handful of locale tests which unexpectedly fail on newer Windows versions. - Resolves 3 XFAIL FIX-MEs. Originally submitted in llvm#86649. Co-authored-by: Rodrigo Salazar <[email protected]> (cherry picked from commit f909b22)
…llvm#105651) There are 2 problems today that this PR resolves: libcxx tests assume the thousands separator for fr_FR locale is x00A0 on Windows. This currently fails when run on newer versions of Windows (it seems to have been updated to the new correct value of 0x202F around windows 11. The exact windows version where it changed doesn't seem to be documented anywhere). Depending the OS version, you need different values. There are several ifdefs to determine the environment/platform-specific locale conversion values and it leads to maintenance as things change over time. This PR includes the following changes: - Provide the environment's locale conversion values through a substitution. The test can opt in by placing the substitution value in a define flag. - Remove the platform ifdefs (the swapping of values between Windows, Linux, Apple, AIX). This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself. In addition, as a result of the above, this PR: - Fixes a handful of locale tests which unexpectedly fail on newer Windows versions. - Resolves 3 XFAIL FIX-MEs. Originally submitted in llvm#86649. Co-authored-by: Rodrigo Salazar <[email protected]> (cherry picked from commit f909b22)
…ubstitution (#105651) There are 2 problems today that this PR resolves: libcxx tests assume the thousands separator for fr_FR locale is x00A0 on Windows. This currently fails when run on newer versions of Windows (it seems to have been updated to the new correct value of 0x202F around windows 11. The exact windows version where it changed doesn't seem to be documented anywhere). Depending the OS version, you need different values. There are several ifdefs to determine the environment/platform-specific locale conversion values and it leads to maintenance as things change over time. This PR includes the following changes: - Provide the environment's locale conversion values through a substitution. The test can opt in by placing the substitution value in a define flag. - Remove the platform ifdefs (the swapping of values between Windows, Linux, Apple, AIX). This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself. In addition, as a result of the above, this PR: - Fixes a handful of locale tests which unexpectedly fail on newer Windows versions. - Resolves 3 XFAIL FIX-MEs. Originally submitted in llvm/llvm-project#86649. Co-authored-by: Rodrigo Salazar <[email protected]> (cherry picked from commit f909b22)
There are 2 problems today that this PR resolves:
This PR includes the following changes:
This is accomplished through a lit feature action that fetches the environment's locale conversions (lconv) for members like 'thousands_sep' that we need to provide. This should ensure that we don't lose the effectiveness of the test itself.
In addition, as a result of the above, this PR: