Add dump symbols options #538

Open
Dragorn421 wants to merge 3 commits into ethteck:main from Dragorn421:add_dump_symbols_options_w_rework_bss_size

@Dragorn421
Collaborator

This PR adds two options, dump_symbols_segments and dump_symbols_references, which take effect only when the existing dump_symbols option is true:

  dump_symbols: true
  dump_symbols_segments: true
  dump_symbols_references: true

As a reminder, the existing dump_symbols option dumps symbols to the following files in .splat/: spim_context.csv, spim_context_unksegment.csv, and splat_symbols.csv.

The two new options add columns to the splat_symbols.csv file.

dump_symbols_segments adds the following columns: segment,subsegment,subsegment_type, where

  • segment is the splat name of the symbol's segment (as defined in the yaml), or None if the symbol is not tied to a segment
  • subsegment is the splat name of the subsegment the symbol is in, if any, or None
  • subsegment_type is the type of that subsegment (such as asm, c, data, ...) or None if there's no tied subsegment
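As a rough illustration of how these three columns might be consumed, the sketch below counts symbols per subsegment type. The sample rows and subsegment names are hypothetical, only a subset of the real columns is shown, and it assumes that a symbol not tied to a segment is written as the literal text None in the csv.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real file is .splat/splat_symbols.csv
# and contains more columns than the four shown here.
SAMPLE = """\
name,segment,subsegment,subsegment_type
func_801C77C8,n64dd,n64dd_code,asm
D_801D95F0,n64dd,n64dd_data,data
some_untied_sym,None,None,None
"""


def count_by_subsegment_type(csv_text: str) -> Counter:
    """Count symbols per subsegment type, skipping symbols with no segment."""
    counts: Counter = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: an untied symbol is written as the literal text "None".
        if row["segment"] == "None":
            continue
        counts[row["subsegment_type"]] += 1
    return counts


counts = count_by_subsegment_type(SAMPLE)
```

With a real dump, the same function could be fed the contents of .splat/splat_symbols.csv instead of the in-memory sample.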

dump_symbols_references adds the following column: referenced_by

  • referenced_by is a |-separated list of the names of the symbols that reference the symbol of the current csv line. For example:
    • `` (empty string) for an unreferenced symbol
    • func_801C77C8 for a symbol referenced by a single function, func_801C77C8
    • leoCommand|D_801D95F0|leomain|LeoReset for a symbol referenced by three functions and a data symbol
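A minimal sketch of parsing this column, using the three example values above. The helper name parse_referenced_by is hypothetical and not part of splat; the empty string needs its own branch because splitting it would otherwise yield one empty name.

```python
def parse_referenced_by(field: str) -> tuple[str, ...]:
    """Split a referenced_by cell into the names of the referencing symbols."""
    # "".split("|") returns [""], so the unreferenced case is handled separately.
    if field == "":
        return ()
    return tuple(field.split("|"))


unreferenced = parse_referenced_by("")
single = parse_referenced_by("func_801C77C8")
several = parse_referenced_by("leoCommand|D_801D95F0|leomain|LeoReset")
```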

Use case

These options make it possible to visualize the relationships between symbols, which helps identify how to split asm and rodata sections and how to associate them.

As an example, take the following Python script:

#!/usr/bin/env python3

# SPDX-FileCopyrightText: 2026 Dragorn421
# SPDX-License-Identifier: CC0-1.0

import argparse
import csv
import dataclasses


@dataclasses.dataclass(frozen=True)
class Sym:
    vram_start: int
    name: str
    type: str
    segment: str
    subsegment: str
    subsegment_type: str
    referenced_by: tuple[str, ...]


syms = list[Sym]()

with open(".splat/splat_symbols.csv") as f:
    for row in csv.DictReader(f):
        if row["referenced_by"] == "":
            referenced_by = []
        else:
            referenced_by = row["referenced_by"].split("|")
        syms.append(
            Sym(
                int(row["vram_start"], 16),
                row["name"],
                row["type"],
                row["segment"],
                row["subsegment"],
                row["subsegment_type"],
                tuple(referenced_by),
            )
        )

sym_by_name = {_sym.name: _sym for _sym in syms}

parser = argparse.ArgumentParser()
parser.add_argument("segment")
parser.add_argument(
    "--section",
    nargs="+",
    help=(
        "only show this section besides text,"
        " eg --section rodata will only show text and rodata"
    ),
)
args = parser.parse_args()

section_by_subsegment_type = {
    "asm": "text",
    "c": "text",
    "textbin": "text",
    "hasm": "text",
    "data": "data",
    "rodata": "rodata",
    ".rodata": "rodata",
    "bss": "bss",
}

syms_by_section: dict[str, list[Sym]] = {}
for sym in syms:
    if sym.segment != args.segment:
        continue
    section = section_by_subsegment_type.get(sym.subsegment_type)
    assert section is not None, sym
    syms_by_section.setdefault(section, []).append(sym)

text_subsegments = sorted({_sym.subsegment for _sym in syms_by_section["text"]})
color_by_subsegment: dict[str, str] = {}
for subsegment in text_subsegments:
    h = (len(color_by_subsegment) * 0.7) % 1
    color_by_subsegment[subsegment] = f"{h} 1 1"

if args.section:
    for section in list(syms_by_section.keys()):
        if section != "text" and section not in args.section:
            del syms_by_section[section]

section_by_sym_name = {
    _sym.name: _section for _section, _syms in syms_by_section.items() for _sym in _syms
}

vram_start_by_section: dict[str, int] = {}
for section, section_syms in syms_by_section.items():
    vram_start_by_section[section] = min(_s.vram_start for _s in section_syms)


colw = 10
x_by_section = {
    "text": 0 * colw,
    "data": 1 * colw,
    "rodata": 2 * colw,
    "bss": 3 * colw,
}


def gprint(l: str):
    print(l)


gprint("digraph {")

for section, section_syms in syms_by_section.items():
    section_vram_start = vram_start_by_section[section]
    x = x_by_section[section]
    filtered_syms: list[Sym] = []
    for sym in sorted(section_syms, key=lambda sym: sym.vram_start):
        if sym.type in {"label", "jtbl_label"}:
            continue
        filtered_syms.append(sym)
    cur_subsegment = None
    i = 0
    dy = 0
    for sym in filtered_syms:
        if cur_subsegment != sym.subsegment:
            if cur_subsegment is not None:
                gprint("}")
            cur_subsegment = sym.subsegment
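            # quote the cluster ID so that subsegment names containing
            # characters such as "/" still form a valid dot identifier
            gprint(f'subgraph "cluster_{cur_subsegment}_{section}" ' "{")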
            y = -i / len(filtered_syms) * 100 + dy - 0.2
            gprint(f'"{cur_subsegment} {section}"' " [" f' pos = "{x},{y}!"' f' color="none"' " ]")
            dy -= 0.8
        assert cur_subsegment is not None
        if 0:
            # y = vram position
            y = -(sym.vram_start - section_vram_start) / 500
        y = -i / len(filtered_syms) * 100 + dy
        i += 1
        color = None
        if section == "text":
            color = color_by_subsegment[cur_subsegment]
        elif section == "rodata":
            if sym.type == "jtbl":
                color = "magenta"
        gprint(
            f'"{sym.name}"'
            " ["
            f' pos = "{x},{y}!"'
            + (f' color="{color}"' if color is not None else "")
            + " ]"
        )
    if cur_subsegment is not None:
        gprint("}")

for section, section_syms in syms_by_section.items():
    for sym in section_syms:
        for sym_ref_by in sym.referenced_by:
            if (
                # ignore references from outside the segment
                sym_ref_by in section_by_sym_name
                # ignore same-section references
                and section_by_sym_name[sym_ref_by] != section
                # only show
                and (
                    # references from text
                    section_by_sym_name[sym_ref_by] == "text"
                    # or references from data to rodata
                    or (
                        section_by_sym_name[sym_ref_by] == "data"
                        and section_by_subsegment_type[sym.subsegment_type] == "rodata"
                    )
                )
            ):
                try:
                    color = color_by_subsegment[sym_by_name[sym_ref_by].subsegment]
                except KeyError:
                    color = "black"
                gprint(f'"{sym_ref_by}" -> "{sym.name}"' f' [ color = "{color}" ]')

gprint("}")

This script reads the contents of .splat/splat_symbols.csv and, given a segment name, outputs a graph in dot language of the symbols of that segment and their relationships. The symbols are also visually clustered by subsegment. Optionally, the script takes a --section section1 [section2 ...] argument to restrict the visualization to text plus the given sections (the known sections are text, data, rodata and bss).
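Stripped of layout, clustering and coloring, the heart of that output is one dot edge per reference, pointing from the referencing symbol to the referenced one. The sketch below shows only that step; the function name edges_to_dot and its input mapping are hypothetical and are not taken from the script above.

```python
def edges_to_dot(referenced_by: dict[str, tuple[str, ...]]) -> str:
    """Render a {symbol: referrers} mapping as a minimal dot digraph."""
    lines = ["digraph {"]
    for name, referrers in referenced_by.items():
        for referrer in referrers:
            # The arrow goes from the referencing symbol to the referenced one.
            lines.append(f'  "{referrer}" -> "{name}"')
    lines.append("}")
    return "\n".join(lines)


dot = edges_to_dot({"D_801D95F0": ("leomain", "LeoReset"), "leomain": ()})
print(dot)
```

Quoting every node name keeps the output valid even when a symbol name contains characters that a bare dot identifier does not allow.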

As an example, here is the result of

./graph_cross_sections_refs.py n64dd --section rodata > n64dd.dot && neato -Tsvg -O n64dd.dot

(using Graphviz's neato to render the dot output to SVG)

n64dd.dot.svg:

(image: n64dd dot)

The graph has two columns: the left column is the text section and the right column is the rodata section (only text and rodata are shown because the script received --section rodata).

Text symbols and their outgoing references (the arrows) are colored per subsegment.

Each subsegment is also clustered inside a black rectangle, with the subsegment name at the top.
