[Feature request] Raw #symbols would not go through EQUS expansion #1605

Closed
Rangi42 opened this issue Jan 9, 2025 · 8 comments · Fixed by #1648
Labels
enhancement Typically new features; lesser priority than bugs rgbasm This affects RGBASM
Comments

@Rangi42
Contributor

Rangi42 commented Jan 9, 2025

It's too early to even consider deprecating or removing EQUS expansion (#905) -- we don't yet have user-defined functions or other replacements for EQUS use cases. However, it can still be inconvenient -- you have to write "{foo}" to get the value of your string symbol foo.

ASMotor has a shorter way of doing that, |foo|. However, I wouldn't want to add new syntax, whether as delimiters, a prefix, or a suffix, since we may eventually remove the feature anyway, and since it's not significantly simpler than typing "{ }".

However, we already support the # prefix for raw symbols. That was introduced to allow symbols that share a name with keywords, which is a very niche application (most of the time you can just use non-clashing names).

I'd like to make raw symbols not undergo EQUS expansion. Note that the syntax can be used on any symbol, not just ones that clash with keywords. So you could write db #foo instead of db "{foo}", which saves three characters and is IMO a worthwhile savings to read and to write.
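
For illustration, here is a minimal sketch of the two spellings side by side, assuming the proposal is implemented as described. The symbol foo, its value, and the section name are made up for the example.

```asm
; Hypothetical string symbol, used only for illustration.
def foo equs "hello"

SECTION "Raw symbol example", ROM0
    ; Existing approach: interpolate foo into a new string literal.
    db "{foo}"
    ; Proposed approach: the raw symbol is not EQUS-expanded,
    ; so it stands for the string value of foo.
    db #foo
```

A bare db foo would not be equivalent: foo would be EQUS-expanded first, turning the line into db hello, which refers to a symbol named hello rather than to a string.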

In theory this would break someone's code if they were currently doing something like def #purge equs "rst PurgeMyCache" and then #purge as an instruction macro. But I think it's clear that nobody is doing such a thing. :P

@Rangi42 Rangi42 added enhancement Typically new features; lesser priority than bugs rgbasm This affects RGBASM labels Jan 9, 2025
@Rangi42 Rangi42 added this to the 0.9.1 milestone Jan 9, 2025
@aaaaaa123456789
Member

This is overloading an operator to have a completely different meaning; it's both confusing (why does this behave differently in this particular corner case?) and detrimental (no way to use raw symbols as raw symbols). All this to save three characters.

Adding 20 different meanings to the same operator depending on finicky details about the context is unwarranted complexity — too much JavaScript in your life isn't good for you :)

@Rangi42
Contributor Author

Rangi42 commented Jan 10, 2025

I thought the above already addressed those points, but:

  • People aren't using raw symbols (unless they're generating SM83 for LLVM, which was the use case that got raw symbol syntax added). People definitely aren't using raw symbol syntax for non-clashing symbols that don't need it. It's okay to overload the syntax for a fairly common use case; it won't be "confusing".
  • Did you read my last paragraph? "In theory this would break someone's code if they were currently doing something like def #purge equs "rst PurgeMyCache" and then #purge as an instruction macro." So your concern about this change being "detrimental" is not based in reality -- and if for some reason you do insist on having a string symbol named #purge, you can expand it explicitly by doing {#purge}.
  • You miscounted; this isn't 20 different meanings, it's two. (I know you're exaggerating. It's a habit of yours and I really don't like it, because I have to either deflate it or just address the actual reality or waste verbiage on both.)

@Rangi42 Rangi42 modified the milestones: 0.9.1, 0.9.2 Jan 17, 2025
@Rangi42
Contributor Author

Rangi42 commented Feb 2, 2025

ISSOtm approves <3

@Rangi42
Contributor Author

Rangi42 commented Feb 4, 2025

git grep -P '"\{[A-Za-z_][A-Za-z_0-9@#$]*\}"' turns up a lot that can be simplified with this.

@aaaaaa123456789
Member

I'm still concerned about the slippery slope (hence my "20 different meanings" remark from above: I'm not expecting this to be the last weird case someone proposes). But if you strongly feel like this one in particular has that much value to add, and you have the data to show it, it might be worth the mental complexity.

@Rangi42
Contributor Author

Rangi42 commented Feb 4, 2025

This is a step towards decreasing complexity.

Previously:

"How do I just use a string symbol directly, without it being EQUS-expanded?"
"You don't, you have to interpolate-expand it inside a brand-new string literal."

With this PR:

"How do I just use a string symbol directly, without it being EQUS-expanded?"
"Just put a hash sign in front." [1]

In future:

"How do I just use a string symbol directly, without it being EQUS-expanded?"
"Just type its name! We removed EQUS expansion."

Footnotes

  1. "If your string symbol has the same name as a language keyword, you're already putting a hash sign in front! Which means you can't EQUS-expand a symbol with the same name as a keyword any more. But I know and you know that nobody wanted to or should be doing that."

@Rangi42
Contributor Author

Rangi42 commented Feb 5, 2025

Some non-exhaustive examples of how non-expanded raw string symbols should behave:

  • def bar equs #foo should work like def bar equs "{foo}"
  • strsub(#foo, 4, 1) should work like strsub("{foo}", 4, 1)
  • #foo:: (when def foo equs "42" already) should work like foo:: when def foo equ 42 already: it should give an "already defined" error
  • bank(#foo) (when def foo equs "42") should work like bank(foo) (when def foo equ 42): it should give a "BANK argument must be a label" error
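
The cases in the list above can be sketched as one small file. This is only a sketch under stated assumptions: the names bar1, bar2, sub1, sub2 and num, and the value "hello", are invented for the example (num plays the role of foo in the last two bullets), and the expected behaviour is taken from the list rather than from a tested build.

```asm
; Hypothetical string symbol, long enough for strsub(..., 4, 1).
def foo equs "hello"

; These two definitions should be equivalent.
def bar1 equs "{foo}"
def bar2 equs #foo

; These two calls should return the same substring.
def sub1 equs strsub("{foo}", 4, 1)
def sub2 equs strsub(#foo, 4, 1)

; Print all four values so the pairs can be compared.
println "{bar1} {bar2} {sub1} {sub2}"

; Hypothetical string symbol standing in for foo in the error cases.
def num equs "42"

; The next two lines are left commented out because each should be rejected:
; #num::                ; expected: an "already defined" error
; println bank(#num)    ; expected: a "BANK argument must be a label" error
```

Uncommenting either of the last two lines should make assembly fail, which is the point of those bullets: a raw string symbol is treated as the symbol it names, not as the text it contains.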

@ISSOtm
Member

ISSOtm commented Feb 12, 2025

> This is overloading an operator to have a completely different meaning; it's both confusing (why does this behave differently in this particular corner case?) and detrimental (no way to use raw symbols as raw symbols). All this to save three characters.

I would argue this change is in the spirit of the intended use case for raw symbols: suppressing implicit behaviour.

Raw symbols were introduced because the same syntax is used for both keywords and identifiers, so a codegen backend has to check every identifier against a list of keywords to ensure that it isn't actually one.

The way EQUS expansion currently works, there is a similar problem with an identifier potentially not expanding to itself like normal, but actually to something else. As long as all symbols are controlled directly by the codegen backend, this is fine; but I think this is too fragile/too strong an assumption to force users to make.

This was my rationale for approving this change when Sylvie asked. Maybe we will come to regret this later, but I think it's less likely that we'll regret this change than keeping the current behaviour.

3 participants