Fix bracket parsing logic for objdump labels #50

Open
connorsimms wants to merge 6 commits into compiler-explorer:main from connorsimms:fix/label-truncation
@connorsimms

Resolves compiler-explorer/compiler-explorer#8809

This PR fixes an issue where the asm parser fails to correctly highlight or filter functions whose labels contain nested or unbalanced brackets, such as
call 410145 <void a<int, int>()>
and the example at https://godbolt.org/z/3367rMndr.

The original logic used a forward pass that overwrote the label start position at every < and terminated at the first > or +. With nested or unbalanced brackets (e.g. operator<<), it captured only part of the label.
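To make the failure mode concrete, here is a minimal sketch of a forward pass with the behaviour described above. It is reconstructed from this description only; the function name `forward_pass_label` is hypothetical and the real parser code may differ in its details.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical reconstruction of the old behaviour, based only on the
// description in this PR. Not the project's actual code.
std::optional<std::string> forward_pass_label(std::string_view line) {
    std::optional<std::size_t> start;
    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (c == '<') {
            // Every '<' overwrites the previously recorded start.
            start = i + 1;
        } else if ((c == '>' || c == '+') && start) {
            // The first terminator after a '<' ends the label.
            return std::string(line.substr(*start, i - *start));
        }
    }
    return std::nullopt;
}
```

On `<std::ctype<char>::_M_widen_init() const@plt>` this sketch yields only the innermost bracketed text, `char`, which matches the truncation described in the note on tests.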

Instead of tracking label state character by character, this version waits until the end of the instruction, finds the rightmost > character, and then scans backwards for a < that is either preceded by a space (" <") or is the first character of the line.

This is safe because it matches how objdump formats these annotations: when addresses are printed, it emits a space after the address and then the opening <. From GNU Binutils objdump.c:

objdump_print_addr_with_sym (// ...
{
  if (!no_addresses)
    {
      objdump_print_value (vma, inf, skip_zeroes);
      (*inf->fprintf_styled_func) (inf->stream, dis_style_text, " ");
    }
  // ...
  (*inf->fprintf_styled_func) (inf->stream, dis_style_text, "<");
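The backward scan described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the function name `backward_scan_label` is hypothetical, and it returns the text between the matched < and the rightmost >.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical sketch of the first version of the fix: find the rightmost
// '>' and walk backwards to the nearest '<' that is preceded by a space
// (or that sits at index 0).
std::optional<std::string> backward_scan_label(std::string_view line) {
    const std::size_t close = line.rfind('>');
    if (close == std::string_view::npos) {
        return std::nullopt;
    }
    // i starts at close - 1 and counts down to 0.
    for (std::size_t i = close; i-- > 0;) {
        if (line[i] == '<' && (i == 0 || line[i - 1] == ' ')) {
            return std::string(line.substr(i + 1, close - i - 1));
        }
    }
    return std::nullopt;
}
```

For `call 410145 <void a<int, int>()>` the inner `<` after `a` is skipped because it is not preceded by a space, so the sketch returns `void a<int, int>()`.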

Note on tests:
I added ce-bug-8809.approved.txt. I also updated ce-bug-3963.approved.txt. The original parser was prematurely capturing char in <std::ctype<char>::_M_widen_init() const@plt> and returning char as the label. The new logic captures the entire name and allows the @plt filter to work properly.

…r-explorer/compiler-explorer#8809)

Replaced character-by-character bracket tracking with a backward scan from end of line. Objdump formats label annotations with a leading space, so scanning backwards for ' <' safely handles template arguments, operator overloads, and inline assembly noise.

Updated ce-bug-3963 snapshot since the previous parsing logic was incorrectly capturing '<char>' inside '<std::ctype<char>::_M_widen_init() const@plt>'. The new parser captures the full label and allows the shouldIgnoreFunction() filter to work.
@connorsimms

connorsimms commented Jun 27, 2026

Update: I pushed an additional commit to harden the backwards-scan condition.

I realized that scanning for " <" is necessary but not sufficient. For example, a templated operator< looks like this in the objdump output:
call 401128 <bool operator< <Dummy>(Dummy const&, Dummy const&)>
If we only look for a preceding space, the backward scan stops at " <Dummy" and truncates the label there. I updated the logic to also require a hex digit (the end of the address) before the " <".
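A rough sketch of the hardened condition, assuming the check is "hex digit, space, <" and that a < at the very start of the line is still accepted. The name `backward_scan_label_hex_guard` is hypothetical and the actual commit may be structured differently.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical sketch: like the plain backward scan, but a " <" only counts
// when the character before the space is a hex digit (the end of the address).
std::optional<std::string> backward_scan_label_hex_guard(std::string_view line) {
    const std::size_t close = line.rfind('>');
    if (close == std::string_view::npos) {
        return std::nullopt;
    }
    for (std::size_t i = close; i-- > 0;) {
        if (line[i] != '<') {
            continue;
        }
        const bool at_start = (i == 0);  // assumption: still allowed
        const bool after_address =
            i >= 2 && line[i - 1] == ' ' &&
            std::isxdigit(static_cast<unsigned char>(line[i - 2])) != 0;
        if (at_start || after_address) {
            return std::string(line.substr(i + 1, close - i - 1));
        }
    }
    return std::nullopt;
}
```

With this guard the " <Dummy" candidate is rejected because the character before the space is `<`, not a hex digit, so the scan continues to the real label start. A line that carries no address before the label finds no match at all.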

@connorsimms

connorsimms commented Jun 27, 2026

Actually, since objdump --no-addresses removes the hex values altogether, the hex check can no longer be relied on. It is better to take the leftmost " <" instead of breaking at the first match. I removed the hex check and the break statement, which should handle the edge cases mentioned above.
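As a sketch of this final rule (again hypothetical, with the made-up name `leftmost_label`), the loop keeps walking backwards and remembers the last match it sees, which is the leftmost " <" before the rightmost >:

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical sketch of the final rule: no hex check and no early break,
// so the leftmost qualifying '<' wins.
std::optional<std::string> leftmost_label(std::string_view line) {
    const std::size_t close = line.rfind('>');
    if (close == std::string_view::npos) {
        return std::nullopt;
    }
    std::optional<std::size_t> open;
    for (std::size_t i = close; i-- > 0;) {
        if (line[i] == '<' && (i == 0 || line[i - 1] == ' ')) {
            open = i;  // keep going; a match further left overrides this one
        }
    }
    if (!open) {
        return std::nullopt;
    }
    return std::string(line.substr(*open + 1, close - *open - 1));
}
```

Under these assumptions the templated operator< example resolves to the full label whether or not an address precedes it.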

Comment thread src/objdump/parser.cpp Outdated
@connorsimms connorsimms force-pushed the fix/label-truncation branch from c52ab29 to 821e4d4 Compare June 27, 2026 19:25
Development

Successfully merging this pull request may close these issues.

[BUG]: Wrong label highlighted if linking template to machine code

2 participants