Skip to content

Make simplecpp friendlier toward 64-bit builds. #288

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

zlaski
Copy link

@zlaski zlaski commented Jan 16, 2023

Use std::size_t to make build bit-width agnostic. On Windows, suppress warning about assignment in a conditional expression.

@danmar
Copy link
Owner

danmar commented Jan 17, 2023

some CI warning

@@ -779,7 +779,7 @@ static bool isFloatSuffix(const simplecpp::Token *tok)
{
if (!tok || tok->str().size() != 1U)
return false;
const char c = std::tolower(tok->str()[0]);
const char c = (char)std::tolower(tok->str()[0]);
Copy link
Contributor

@diamante0018 diamante0018 Feb 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, shouldn't this also be static_cast<char>( std::tolower( static_cast<unsigned char>( tok->str()[0] ) ) ); ?
If I remember correctly, it should be like that with this kind of function since tok->string() returns a std::string so the type of char[0] should be char

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion but static_cast is pretty verbose in this case.

If we know that the first char ascii value is 0-127 then the cast is not needed. well I guess for garbage input the value could be out of range so the cast makes sense.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I added the casts to please the compiler. Like you said, your solution is verbose, and I'm not sure what it buys us (perhaps I just don't understand the intricacies of static_cast<...>). Also, since std::string is made up of elements of type char, it is intended to handle only ASCII values 0-127. If you want to handle values 0-255, I think it would be better to do something like

typedef std::basic_string<unsigned char> TokenString;

and then to change char to unsigned char throughout.

I guess a broader question is whether modern preprocessors should also handle UTF8/Unicode. I have absolutely no opinion on this.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, since std::string is made up of elements of type char, it is intended to handle only ASCII values 0-127.

If utf-8 is used then on my machine since char is signed there will sometimes be negative values.

I don't know if there is anything unintended about that. I.e. as far as I know there is not undefined behavior. But maybe something in the conversion is undefined.

typedef std::basic_string TokenString;

ouch. I believe that would complicate the interface. For instance, Cppcheck reads the tokens from simplecpp and expects that TokenString can be converted directly to plain simple std::string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, yeah when this is looked at from a broader perspective the first cast to unsigned char does seem unnecessary, but it would not harm. The second cast (casting return value of tolower to char) would appease the MSVC compiler though and perhaps it could be added.

@@ -457,7 +457,7 @@ static bool isStringLiteralPrefix(const std::string &str)
str == "R" || str == "uR" || str == "UR" || str == "LR" || str == "u8R";
}

void simplecpp::TokenList::lineDirective(unsigned int fileIndex, unsigned int line, Location *location)
void simplecpp::TokenList::lineDirective(std::size_t fileIndex, unsigned int line, Location *location)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's rather excessive to use a 64-bit value for fileIndex. you might see 10-20 files in a translation unit but hardly millions of files..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. For some reason, I thought that fileIndex was a file offset.

@@ -36,6 +36,7 @@
# else
# define SIMPLECPP_LIB
# endif
# pragma warning(disable: 4706)
Copy link
Contributor

@diamante0018 diamante0018 Mar 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this could be put in its own block?
Similar to how nullptr is defined as NULL in this file.

@danmar
Copy link
Owner

danmar commented Mar 2, 2023

I am not a fan about using std::size_t for various variables. Maybe we can shut off the MSVC warnings globally instead.

@diamante0018
Copy link
Contributor

I am not a fan about using std::size_t for various variables. Maybe we can shut off the MSVC warnings globally instead.

This should have been now completely resolved. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants