Initial draft defining syntax, semantics of controlling expressions #65

gklimowicz · 2025-05-08T03:28:41Z

We describe a subset of the C constant-expression syntax for use in controlling expressions. Expression evaluation itself follows Fortran arithmetic expression semantics.

Note that the tables are a bit terse as we try to keep the line length less than the 75-character limit for J3 papers.

gklimowicz · 2025-05-08T03:30:08Z

This is pretty rough, but I wanted to produce something earlier rather than later. I have to go off for a day or so and work on other assignments for classes.

bonachea

Thanks @gklimowicz for making a start on this tricky area!

Initial set of feedback:

bonachea · 2025-05-12T03:09:04Z

drafts/25-xxx-specifications.txt

+Since expression evaluation occurs *after* token expansion, there will
+be no object-like macros or function-like macros left to evaluate. All


"token expansion" is not a defined concept.

In CPP the correct term is "macro expansion".

Also, CPP macros are never "evaluated": they are either "expanded" or "replaced".

Suggested change

-Since expression evaluation occurs *after* token expansion, there will
-be no object-like macros or function-like macros left to evaluate. All
+Since expression evaluation occurs *after* macro expansion, there will
+be no object-like macro or function-like macro invocations left to expand. All

bonachea · 2025-05-12T03:24:41Z

drafts/25-xxx-specifications.txt

+instances of ID or ID (args) will all have been replaced with their
+expansions.


This last sentence falsely implies there will be no instances of ID after expansion. That is misleading: leftover identifiers are actually quite common, with code like:

#if __GNUC__

which is shorthand to test whether __GNUC__ is defined to a non-zero value.

This works because of 6.10.2-13 (emphasis added):

Prior to evaluation, macro invocations in the list of preprocessing tokens that will become the controlling constant expression are replaced (except for those macro names modified by the defined unary operator), just as in normal text. If the token defined is generated as a result of this replacement process or use of the defined unary operator does not match one of the two specified forms prior to macro replacement, the behavior is undefined. After all replacements due to macro expansion and evaluations of defined macro expressions, has_include expressions, has_embed expressions, and has_c_attribute expressions have been performed, all remaining identifiers other than true (including those lexically identical to keywords such as false) are replaced with the pp-number 0, true is replaced with pp-number 1, and then each preprocessing token is converted into a token.

We'll need similar rules (ignoring the C23 features we are not keeping) to explain the replacement of any ID with 0 after expansion.
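As a rough illustration of the rule being asked for, here is a minimal Python sketch of the final "remaining identifiers become 0" step. The function name and the token-list representation are hypothetical, not taken from the draft, and the sketch assumes the C23 special case for `true` is one of the features not being kept.

```python
import re

# Hypothetical helper, not part of the draft: models only the final
# "remaining identifiers become 0" step, which runs after macro expansion
# and after any `defined` operators have been resolved.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def replace_remaining_identifiers(tokens):
    """Return a copy of tokens with every identifier replaced by "0"."""
    return ["0" if IDENTIFIER.fullmatch(tok) else tok for tok in tokens]


# `#if __GNUC__` when __GNUC__ is not defined: the identifier survives
# macro expansion and is then replaced with 0, so the test is false.
print(replace_remaining_identifiers(["__GNUC__"]))  # ['0']

# If __GNUC__ were defined as, say, 13, expansion would already have
# produced a number, which this step leaves alone.
print(replace_remaining_identifiers(["13"]))  # ['13']
```

This is why `#if __GNUC__` is well-formed whether or not the macro is defined: the controlling expression always reduces to numbers and operators before it is evaluated.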

bonachea · 2025-05-12T03:29:25Z

drafts/25-xxx-specifications.txt

+| ID           | The expansion of the object-like macro ID   |
+| ID (args)    | The expansion of the function-like macro ID |


I don't understand why ID and ID(args) are listed as primary expressions here. The paragraph immediately above has just explained that macros have already been expanded away by the time conditional expressions are evaluated, so macro invocations are NOT primary expressions in the post-expansion expression grammar.

Listing them here "for completeness" is not helpful; it's just plain wrong. No conditional expression evaluation whatsoever is performed until after macros are completely expanded, and the pre-expansion text may look nothing like a valid conditional expression.

Here is a valid input example demonstrating what I mean:

#define LPAREN (
#define RPAREN )
#define ONE_PLUS 1 +
#if ONE_PLUS ZERO * LPAREN ONE_PLUS 4 RPAREN
integer :: tada
#endif

Pre-expansion, the list of tokens in the expression above looks like:

#if ID ID * ID ID WHOLE_NUMBER ID

post-expansion it looks like this:

#if WHOLE_NUMBER + WHOLE_NUMBER * ( WHOLE_NUMBER + WHOLE_NUMBER )

So wildly different that it's not useful to talk about grammar of the conditional expressions prior to expansion (aside from the bare minimum required to delineate arguments in FLM invocations).

Let's only describe the post-expansion grammar, and not the pre-expansion grammar. There is no such pre-expansion grammar. Dan points out that undefined ID replacement has to be done after processing of ## tokens.
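To make the before/after token shapes concrete, here is a small Python sketch that expands the object-like macros from the example and abstracts each token to ID or WHOLE_NUMBER. The macro table, function names, and whitespace-based tokenization are all assumptions made for illustration; a real preprocessor also rescans replacement lists, which this example does not need.

```python
# Hypothetical macro table mirroring the #define lines in the example.
MACROS = {
    "LPAREN": ["("],
    "RPAREN": [")"],
    "ONE_PLUS": ["1", "+"],
}


def expand_object_like(tokens, macros):
    """Replace each defined object-like macro name with its replacement list.

    Simplification: replacement lists are not rescanned, which is enough
    here because none of them contains another macro name.
    """
    out = []
    for tok in tokens:
        out.extend(macros.get(tok, [tok]))
    return out


def shape(tokens):
    """Abstract each token to ID, WHOLE_NUMBER, or itself (operators)."""
    kinds = []
    for tok in tokens:
        if tok.isdigit():
            kinds.append("WHOLE_NUMBER")
        elif tok[0].isalpha() or tok[0] == "_":
            kinds.append("ID")
        else:
            kinds.append(tok)
    return " ".join(kinds)


pre = "ONE_PLUS ZERO * LPAREN ONE_PLUS 4 RPAREN".split()
post = expand_object_like(pre, MACROS)
print(shape(pre))   # ID ID * ID ID WHOLE_NUMBER ID
print(shape(post))  # WHOLE_NUMBER + ID * ( WHOLE_NUMBER + WHOLE_NUMBER )
```

The one ID left in the second line is the undefined ZERO; it only becomes a WHOLE_NUMBER once leftover identifiers are replaced with 0, which gives the post-expansion form quoted above.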

bonachea · 2025-05-12T20:16:36Z

drafts/25-xxx-specifications.txt

+|      | defined |  defined ID  | nonassoc |    1 if the identifier     |
+|      |         |              |          |   has a #defined value,    |
+|      |         |              |          |     0 otherwise            |


On further thought, this is just plain wrong.
defined cannot appear in this table, because it needs to be applied AFTER macro expansion and BEFORE ID replacement with zero. Hence the defined operator (as in CPP) must be resolved and replaced before this post-expansion grammar is applied.
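A minimal Python sketch of a separate `defined` pass, under stated assumptions: the input is a token list in which macro expansion has already happened but operands of `defined` were left unexpanded (as in the C rule quoted earlier), and `macros` is the set of currently defined names. The function names are hypothetical.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(tok):
    return IDENTIFIER.fullmatch(tok) is not None


def resolve_defined(tokens, macros):
    """Replace `defined ID` and `defined ( ID )` with "1" or "0".

    Hypothetical sketch: tokens is a list of strings and macros is the set
    of currently defined macro names. Malformed uses raise ValueError.
    """
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] != "defined":
            out.append(tokens[i])
            i += 1
        elif (tokens[i + 1:i + 2] == ["("]
              and tokens[i + 3:i + 4] == [")"]
              and is_identifier(tokens[i + 2])):
            out.append("1" if tokens[i + 2] in macros else "0")
            i += 4
        elif i + 1 < len(tokens) and is_identifier(tokens[i + 1]):
            out.append("1" if tokens[i + 1] in macros else "0")
            i += 2
        else:
            raise ValueError("malformed use of 'defined'")
    return out


# With only FOO defined: `defined FOO && BAR` keeps BAR for the later
# identifier-replacement step, but the defined operator is already gone.
print(resolve_defined(["defined", "FOO", "&&", "BAR"], {"FOO"}))
```

Running this pass before identifiers are replaced with 0 matters: if the replacement ran first, the operand of `defined` would already have been destroyed.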

bonachea · 2025-05-12T03:59:28Z

drafts/25-xxx-specifications.txt

+| ID           | The expansion of the object-like macro ID   |
+| ID (args)    | The expansion of the function-like macro ID |
+| WHOLE_NUMBER | Decimal value of WHOLE_NUMBER               |
+| ( expr )     | Parenthesized expressions                   |


Parenthesized expressions are listed in the operator table below, so listing them here is redundant.

After macro expansion and ID-replacement, I believe the only "primaries" left in valid conditional expressions should be WHOLE_NUMBER, and the operators in the table below combining them (which includes defined as an operator).

In short, I suggest we delete this "primary table" entirely and replace it with a statement to that effect.

bonachea · 2025-05-12T04:01:46Z

drafts/25-xxx-specifications.txt

+| ID (args)    | The expansion of the function-like macro ID |
+| WHOLE_NUMBER | Decimal value of WHOLE_NUMBER               |
+| ( expr )     | Parenthesized expressions                   |
+|--------------+---------------------------------------------|


Minor aside: C23 6.10.2-13 perversely also allows single-character character constants as "primaries" in conditional expressions, example from C23:

#if 'z' - 'a' == 25

however their exact CPP evaluation semantics are implementation-defined, which means their use is not guaranteed to be portable. I don't believe I've ever seen this bizarre "feature" used in practice.

I suspect this is some weird legacy holdover in CPP and unless someone provides a strong rationale for their inclusion I think FPP should prohibit character constants in conditional expressions.

Agreed, character constants in conditional expressions should be a (pathological) processor-dependent extension.

bonachea · 2025-05-12T04:42:47Z

drafts/25-xxx-specifications.txt

+|      |    >    |   e1 > e2    | nonassoc | 1 if e1 > e2, 0 otherwise  |
+|      |   >=    |   e1 >= e2   | nonassoc | 1 if e1 >= e2, 0 otherwise |
+|      |    <    |   e1 < e2    | nonassoc | 1 if e1 < e2, 0 otherwise  |
+|      |   <=    |   e1 <= e2   | nonassoc | 1 if e1 <= e2, 0 otherwise |


In C (and hence CPP) relational-expression operators are left associative.

Example:

#if 0 > -4 > 0
left-assoc
#else
right-assoc
#endif

Expands to "left-assoc" in CPP.

However it appears some FPPs may (unintentionally?) diverge on this detail. Another great example why we need standardization!
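The difference between the two groupings can be checked with a short Python sketch. The helper names and the operand/operator list representation are illustrative assumptions, not anything from the draft.

```python
import operator

REL_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def eval_left(operands, ops):
    """Evaluate e1 op1 e2 op2 e3 grouped as ((e1 op1 e2) op2 e3)."""
    acc = operands[0]
    for op, rhs in zip(ops, operands[1:]):
        acc = int(REL_OPS[op](acc, rhs))
    return acc


def eval_right(operands, ops):
    """Evaluate e1 op1 e2 op2 e3 grouped as (e1 op1 (e2 op2 e3))."""
    acc = operands[-1]
    for op, lhs in zip(reversed(ops), reversed(operands[:-1])):
        acc = int(REL_OPS[op](lhs, acc))
    return acc


print(eval_left([0, -4, 0], [">", ">"]))   # 1: (0 > -4) > 0 is 1 > 0
print(eval_right([0, -4, 0], [">", ">"]))  # 0: 0 > (-4 > 0) is 0 > 0
```

A non-associative rule would instead reject the chained form outright, so the choice among left, right, and nonassoc is observable for this input.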

Suggested change

-|      |    >    |   e1 > e2    | nonassoc | 1 if e1 > e2, 0 otherwise  |
-|      |   >=    |   e1 >= e2   | nonassoc | 1 if e1 >= e2, 0 otherwise |
-|      |    <    |   e1 < e2    | nonassoc | 1 if e1 < e2, 0 otherwise  |
-|      |   <=    |   e1 <= e2   | nonassoc | 1 if e1 <= e2, 0 otherwise |
+|      |    >    |   e1 > e2    |   left   | 1 if e1 > e2, 0 otherwise  |
+|      |   >=    |   e1 >= e2   |   left   | 1 if e1 >= e2, 0 otherwise |
+|      |    <    |   e1 < e2    |   left   | 1 if e1 < e2, 0 otherwise  |
+|      |   <=    |   e1 <= e2   |   left   | 1 if e1 <= e2, 0 otherwise |

I put them as non-associative in the Fortran preprocessor, for potential future compatibility if we ever add Fortran operators. The Fortran operators are non-associative, so we might want to disallow expressions like 0 > -4 > 0 now. (And I think chaining relational operators is terrible, but that's just my opinion. If I saw that in C code myself, I would probably replace it with something people could immediately understand.)

bonachea · 2025-05-12T05:37:40Z

drafts/25-xxx-specifications.txt

+
+| Prec |   Op    |    Syntax    |  Assoc'y |  Evaluation Semantics      |
+|------+---------+--------------+----------+----------------------------|
+| low  |   ? :   | e1 ? e2 : e3 |  right   |      conditional-expr      |


I'll note in passing that CPP also allows the comma operator in conditional expressions (at lower priority than conditional-expression), although it's pretty pointless in preprocessor expressions and I'm not aware of any compelling use cases.

For this reason we (implicitly) omitted it from the requirements doc in 25-114r2. I'm only raising it now in case someone has a compelling argument to include it (something other than strict compatibility with CPP), otherwise I'm fine dropping it.

I left them out on purpose, but didn't have a strong reason to do so.

I may be missing a subtlety here.

In general, conditional expression evaluation is side-effect free. So, elaborating

#if (my_complicated_expression, my_other_expression)

results in only the value of my_other_expression affecting the #if. my_complicated_expression may be evaluated, but its result is thrown away.

There's also the slippery slope in the C grammar, where the CPP conditional expressions start at conditional-expression, which I think can unfold down to primary-expression, which then includes the ( expression ) which brings in the whole barnyard of comma-expressions and assignment-expression.

So I just chopped those rules out of the grammar.

gak · 2025-05-12T06:09:29Z

Thanks @gak for making a start on this tricky area!

I'd like to take credit but you pinged the wrong person :)

bonachea · 2025-05-12T19:55:40Z

drafts/25-xxx-specifications.txt

+|      |   ||    |   e1 || e2   |   left   |       Fortran .OR.         |
+|------+---------+--------------+----------+----------------------------|
+|      |   &&    |   e1 && e2   |   left   |       Fortran .AND.        |


In the 5/12 call we resolved that these should use short-circuit evaluation, as in CPP, to allow things like:

#if x && 1/x
#endif

which means they are NOT simply Fortran .OR. / .AND.
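A minimal Python sketch of why short-circuiting matters for `x && 1/x`. The right operand is modeled as a zero-argument callable so that it is evaluated only on demand; the function names are hypothetical, and `//` stands in for integer division.

```python
def logical_and(lhs, rhs):
    """Short-circuit &&: rhs is a zero-argument callable that is evaluated
    only when lhs is nonzero. Returns 1 or 0."""
    if lhs == 0:
        return 0
    return int(rhs() != 0)


def logical_or(lhs, rhs):
    """Short-circuit ||: rhs is evaluated only when lhs is zero."""
    if lhs != 0:
        return 1
    return int(rhs() != 0)


def eager_and(lhs, rhs_value):
    """Non-short-circuit variant: both operands are already evaluated."""
    return int(lhs != 0 and rhs_value != 0)


x = 0
print(logical_and(x, lambda: 1 // x))  # 0; the division is never attempted

try:
    eager_and(x, 1 // x)  # the operand is computed before the call
except ZeroDivisionError:
    print("eager evaluation divides by zero")
```

With eager evaluation of both operands, the x == 0 case would hit the computational-error path, which is the situation the short-circuit rule is meant to avoid.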

gklimowicz requested review from bonachea, kc9jud and aury6623 May 8, 2025 03:28

kc9jud reviewed May 8, 2025

gklimowicz and others added 2 commits May 8, 2025 05:56

Fix typo: period should be space

a37133e

Co-authored-by: Patrick Fasano

Typo: Fix table formatting

5363fc6

Co-authored-by: Patrick Fasano

bonachea requested changes May 12, 2025

bonachea reviewed May 12, 2025

bonachea reviewed May 12, 2025

bonachea reviewed May 12, 2025

gklimowicz and others added 4 commits May 12, 2025 11:29

Better terminology: macro "calls" should be "invocations"

e7a8b53

Co-authored-by: Dan Bonachea

Renumber second ex20

172487d

Co-authored-by: Dan Bonachea

Agreed to use "reject" wording on computational error

d7e1696

Co-authored-by: Patrick Fasano

Fix wording in ex17

1af7563

Co-authored-by: Dan Bonachea

bonachea reviewed May 12, 2025

gklimowicz and others added 2 commits May 12, 2025 13:06

Separate the precedence of logical infix operators

5472a56

Co-authored-by: Dan Bonachea

Add missing unary operators and fix typo

a2163ef

Co-authored-by: Dan Bonachea

bonachea reviewed May 12, 2025
