stdlib: base64 String methods and JSON/YAML/TOML codec namespaces #131

Open: vito wants to merge 18 commits into main from string-conversion-apis

vito (Owner) commented Jun 14, 2026

Addresses #105 — string/value conversion APIs that scale past two top-level functions per format, without sprawling the precious top-level namespace or minting object-module types that collide with schema types.

base64

base64 is String → String in both directions, so it lives on the String type: "x".toBase64 / "...".fromBase64, alongside toUpper/trim, staying out of the top-level namespace where names risk colliding with imported APIs. fromBase64 reports a source-located error on invalid input.
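
As a rough illustration of the String → String shape described above, here is a minimal Go sketch. It assumes the standard padded base64 alphabet (the description does not say which variant Dang uses), and the function names `toBase64`/`fromBase64` are borrowed from the Dang method names purely for illustration; this is not the actual implementation.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// toBase64 mirrors the encode direction: a string in, a string out.
// Assumption: standard alphabet with padding.
func toBase64(s string) string {
	return base64.StdEncoding.EncodeToString([]byte(s))
}

// fromBase64 mirrors the decode direction. Invalid input is returned as an
// error instead of being silently truncated; the real method also attaches a
// source location, which this sketch does not model.
func fromBase64(s string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return "", fmt.Errorf("fromBase64: invalid input: %w", err)
	}
	return string(raw), nil
}

func main() {
	enc := toBase64("x")
	dec, err := fromBase64(enc)
	fmt.Println(enc, dec, err)

	// Characters outside the alphabet produce an error.
	_, err = fromBase64("not base64!")
	fmt.Println(err != nil)
}
```

Both directions take and return a string, which is why a receiver method fits here while the structured formats below need a namespace.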

JSON / YAML / TOML codec scalars

Each format is a codec namespace with encode/decode static methods: JSON.encode(value), JSON.decode(str) :: T, and the same for YAML and TOML. Encode takes an arbitrary value — there is no universal receiver, so it can't live on String — while decode is type-driven via the :: T hint.

The key design move: Dang owns JSON/YAML/TOML as ScalarKind scalars installed in both namespaces — the type namespace (so :: JSON resolves in type position with no import) and the value namespace (so JSON.encode resolves). A scalar is opaque (a tagged string), so attaching encode/decode to it is purely additive.
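
The encode/decode pairing can be sketched in Go as follows. This is only an analogy under stated assumptions: `Codec`, `DecodeAs`, and `Point` are hypothetical names, only the JSON entry is filled in (using the standard library), and a generic type parameter stands in for the `:: T` hint.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Codec is a hypothetical stand-in for one format namespace: a name plus an
// encode and a decode function.
type Codec struct {
	Name   string
	Encode func(v any) (string, error)
	Decode func(src string, into any) error
}

// JSON is the only entry sketched here; YAML and TOML would be further
// values of the same type backed by their own libraries.
var JSON = Codec{
	Name: "JSON",
	Encode: func(v any) (string, error) {
		b, err := json.Marshal(v)
		return string(b), err
	},
	Decode: func(src string, into any) error {
		return json.Unmarshal([]byte(src), into)
	},
}

// DecodeAs plays the role of the `:: T` hint: the type the caller asks for
// drives how the input is decoded.
func DecodeAs[T any](c Codec, src string) (T, error) {
	var out T
	err := c.Decode(src, &out)
	return out, err
}

// Point is an illustrative record type.
type Point struct {
	X int `json:"x"`
	Y int `json:"y"`
}

func main() {
	s, err := JSON.Encode(Point{X: 1, Y: 2})
	fmt.Println(s, err)

	p, err := DecodeAs[Point](JSON, s)
	fmt.Println(p, err)
}
```

Encode accepts any value, so it has no natural receiver; decode needs a target type, so the caller supplies one.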

Merge, not collide

A schema that defines a scalar of the same name — Dagger ships scalar JSON — would otherwise shadow the namespace, since a scalar binds its name. Instead, a small format-codec registry merges: encode/decode are grafted onto the in-scope scalar (user-declared or imported), so the scalar doubles as the namespace. :: JSON resolves the scalar type while JSON.encode resolves the grafted method. This is safe because a same-named scalar almost certainly denotes the same thing.

Because Dang owns the scalars by default, this is uniform across formats: Dagger defines scalar JSON but no YAML/TOML scalar, yet all three behave identically — encode, decode, and :: Format in type position all work with no import and no scalar declaration in scope. (An earlier iteration grafted reactively onto whatever scalar the schema happened to provide, which worked for JSON but left YAML/TOML degenerate; ownership was inverted to fix that.)
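
A toy model of the merge behaviour, written in Go. Everything here is hypothetical (`Env`, `Scalar`, `Declare`, the origin labels); it only illustrates the idea that a redeclared scalar shadows the prelude binding yet still receives the same grafted methods, and does not reflect the real type checker.

```go
package main

import "fmt"

// Scalar is an opaque named type. Methods records what has been grafted on.
type Scalar struct {
	Name    string
	Origin  string // illustrative labels: "prelude", "user", "import"
	Methods map[string]bool
}

// Env models the two namespaces: one for type position, one for values.
type Env struct {
	Types  map[string]*Scalar
	Values map[string]*Scalar
}

// codecFormats is a stand-in for the format-codec registry.
var codecFormats = map[string]bool{"JSON": true, "YAML": true, "TOML": true}

// Declare binds a scalar in both namespaces, shadowing any earlier binding,
// and grafts encode/decode when the name matches a registered format.
func (e *Env) Declare(name, origin string) *Scalar {
	s := &Scalar{Name: name, Origin: origin, Methods: map[string]bool{}}
	if codecFormats[name] {
		s.Methods["encode"] = true
		s.Methods["decode"] = true
	}
	e.Types[name] = s
	e.Values[name] = s
	return s
}

// NewEnv installs the three codec scalars up front, as the prelude would.
func NewEnv() *Env {
	e := &Env{Types: map[string]*Scalar{}, Values: map[string]*Scalar{}}
	for _, name := range []string{"JSON", "YAML", "TOML"} {
		e.Declare(name, "prelude")
	}
	return e
}

func main() {
	env := NewEnv()

	// A schema import that defines its own JSON scalar shadows the prelude
	// one but ends up with the same methods.
	env.Declare("JSON", "import")

	for _, name := range []string{"JSON", "YAML", "TOML"} {
		s := env.Values[name]
		fmt.Println(s.Name, s.Origin, s.Methods["encode"], s.Methods["decode"])
	}
}
```

In this model YAML and TOML keep their prelude binding because nothing redeclares them, while JSON comes from the import; all three expose encode and decode either way.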

fromYAML → YAML.decode

The last top-level conversion function, fromYAML, is replaced by YAML.decode, keeping the top-level namespace clear of codecs (matching the removal of toJSON/fromJSON).

Tests

  • tests/test_codec.dang — bare case: encode/decode + :: JSON/:: YAML/:: TOML with no schema import and no scalar declaration.
  • tests/test_json.dang — a user scalar JSON in scope (the merge case).
  • tests/test_json_import.dang — the test schema's imported scalar JSON (the real Dagger shape).
  • tests/test_from_yaml.dang and the from_yaml_* error fixtures migrated to YAML.decode.

Notes

  • Example modules (mod/apko, mod/doug) use toString rather than JSON.encode: they build against a pinned dang SDK that predates this change, where JSON binds to Dagger's core JSON scalar. toString gives the same serialization the old toJSON did and exists in both SDKs.
  • TOML's top level must be a table; TOML.encode of a non-record reports that rather than emitting something surprising. Encode routes arbitrary values through their JSON form so YAML/TOML marshalling gets clean Go types; decode keeps json.Number shape for the existing materializer.
  • Editor highlighting: dropped toJSON from the VSCode builtin list. The tree-sitter highlight queries live in the zed/nvim submodules and stay internally consistent with their corpus, so they're left to a focused editor follow-up.

🤖 Generated with Claude Code

vito added 3 commits June 14, 2026 14:53
base64 conversion is String -> String in both directions, so it belongs on
the String type rather than the global namespace: "x".toBase64 and
"...".fromBase64 sit alongside toUpper/trim and stay out of the top-level
namespace where they could collide with imported APIs. fromBase64 reports a
source-located error on invalid input.

This is the first of the string-conversion APIs explored in #105. Structured
formats like JSON, whose encode side takes an arbitrary value and so cannot
live on String, are handled separately.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
Replace the top-level toJSON/fromJSON functions with JSON.encode and
JSON.decode static methods. Encode takes an arbitrary value, so it cannot
live on the String type (there is no universal receiver); grouping both
directions under a JSON namespace keeps them together and off the top level,
where names risk colliding with imported APIs.

A "JSON" scalar imported from a schema (Dagger defines one) would otherwise
shadow the namespace, since a scalar binds its name as a value. A small
format-codec registry resolves this by merging instead of colliding: the
encode/decode methods are grafted onto an in-scope scalar of the same name,
user-declared or imported, so the scalar doubles as the namespace --
:: JSON resolves the scalar type while JSON.encode resolves the grafted
method. Codec scalars bind non-null so member access keeps String! intact.

base64 stays on String; YAML and TOML can join the registry the same way.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
The apko and doug modules build against a pinned dang SDK (see engineVersion
in their dagger.json) that predates the JSON.encode/decode namespace, so
JSON.encode does not resolve there -- bare JSON binds to Dagger's core JSON
scalar. toString produces the same serialization these modules relied on from
the old toJSON and exists in both the old and new SDKs.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
cloudflare-workers-and-pages Bot commented Jun 14, 2026

Deploying dang with Cloudflare Pages

Latest commit: 351720d
Status: ✅  Deploy successful!
Preview URL: https://cdf29d67.dang-3kk.pages.dev
Branch Preview URL: https://string-conversion-apis.dang-3kk.pages.dev

Invert the format-codec design so it is uniform across formats. Instead
of reactively grafting encode/decode onto a JSON scalar the schema
happens to provide, Dang now owns JSON, YAML, and TOML as ScalarKind
codec scalars installed in both namespaces in the Prelude: AddObject so
`:: JSON` resolves in type position with no import, and a non-null value
binding so `JSON.encode`/`.decode` resolve. A user- or schema-declared
scalar of the same name shadows these and grafts the identical codec, so
the two merge rather than collide.

This closes the gap where `:: JSON` failed without an imported scalar,
and makes YAML/TOML first-class: Dagger defines `scalar JSON` but no
YAML/TOML scalar, yet all three now behave identically.

Generalize stdlib_json.go into a one-entry-per-format codec registry
(stdlib_codec.go). Encode routes arbitrary values through their JSON
form so yaml/toml marshalling gets clean Go types; decode keeps
json.Number shape for the existing materializer. TOML requires a table
at the top level.

Replace the last top-level conversion, fromYAML, with YAML.decode,
keeping the top-level namespace clear of codecs.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
vito changed the title from "stdlib: base64 String methods and a JSON encode/decode namespace" to "stdlib: base64 String methods and JSON/YAML/TOML codec namespaces" on Jun 15, 2026
vito force-pushed the string-conversion-apis branch from ea4e439 to 68adbdd on June 15, 2026 15:19
vito added 14 commits June 15, 2026 23:57
The codec change removed the top-level toJSON function, but this test in
the separate docs/go module still asserted toJSON links to
stdlib-fn-toJSON, so it failed. The root `go test ./...` skips this module
and the docs build does not run go test, so CI never flagged it. Point the
snippet at JSON.encode and assert the static link stdlib-JSON-encode,
which keeps the test's intent (free functions link to the functions page,
static module methods link via their host module).

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
DeferredValue and the return-coercion logic still named fromJSON/fromYAML
in their doc comments after those top-level functions were replaced by the
JSON/YAML/TOML codec namespaces. Refer to JSON.decode/YAML.decode/
TOML.decode so the comments match the current API.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
The dang-language skill reference and a nullability example still
documented the removed toJSON/fromJSON/fromYAML top-level functions and
omitted the new JSON/YAML/TOML codec namespaces and String base64 methods.
Since the skill is the agent-facing source of truth for writing Dang, the
stale entries would lead to code that no longer compiles. Update the
top-level list, the JSON/YAML/TOML section, the base64 String methods, and
the nullable-cast examples to the encode/decode API.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
The docs site renderer is a separate Go module (github.com/vito/dang/docs),
so the test pipeline's `otelgotest ./...` from the repo root never reached it
and its linker/literate tests could break unnoticed — as one just did when a
stdlib function was removed. Add a docsGo check that runs `go test ./...` in
/src/docs with cgo enabled so the highlighter tests run too.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
TOML has no null, so TOML.encode silently omits record fields whose value is
null, unlike JSON/YAML which keep them. Call this out next to the existing
top-level-table requirement so the difference is not a surprise.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
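
One way the null-omission described in the commit above could look, sketched in Go. `dropNulls` is a hypothetical helper, not a name from the change; it assumes records are represented as `map[string]any` and that nested records are cleaned as well, and it leaves list elements untouched.

```go
package main

import "fmt"

// dropNulls returns a copy of a table without keys whose value is nil,
// recursing into nested tables. TOML has no null, so such fields have no
// representation to emit.
func dropNulls(t map[string]any) map[string]any {
	out := make(map[string]any, len(t))
	for k, v := range t {
		switch val := v.(type) {
		case nil:
			continue // omitted entirely rather than written as some placeholder
		case map[string]any:
			out[k] = dropNulls(val)
		default:
			out[k] = v
		}
	}
	return out
}

func main() {
	in := map[string]any{
		"name":  "demo",
		"owner": nil,
		"build": map[string]any{"target": "linux", "cache": nil},
	}
	fmt.Println(dropNulls(in))
}
```

A JSON or YAML encoder would keep `owner` and `cache` as explicit nulls, which is the difference the commit calls out.
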
TOML.encode rejects a non-table top level, but the message interpolated the
Go type with %T, leaking names like "[]interface {}" or "int64" that mean
nothing to a Dang user. Name the value in Dang terms instead ("a list", "an
integer", ...) via plainKindName, and add an error fixture covering it.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
JSON.decode("") errors, YAML.decode("") is null (an empty YAML document), and
TOML.decode("") is an empty table that fills declared defaults. These differ
by format by design; spell them out so the YAML null case is not a surprise.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
The reference page renders every receiver and static module via the
\stdlib-statics / \stdlib-methods directives, but only Random and UUID had
sections — the JSON/YAML/TOML codec modules were absent, so the page was no
longer exhaustive. Add a section per codec module so their encode/decode
entries render alongside the rest, with a cross-link to the in-depth page.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
The reference page had no in-page navigation, so reaching a method meant
scrolling the whole list. Add \stdlib-toc: a generated index that links each
section heading and, beneath it, every documented entry as a compact mono
word cloud, so a reader can jump straight to a method.

It is built from the same builtin registry as the cards, so it stays
exhaustive — a receiver or static module with no matching group fails the
build (and a unit test catches it sooner). Section and entry links are
booklit references, so a stale anchor fails the build rather than rotting
silently.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
Many entries read as `.foo`, which looks awkward next to a middot. Remove the
separator and rely on spacing between the monospace links instead.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
Match only arises from String/Regexp APIs, so a top-level section in the
stdlib reference read as a misfit. Move \stdlib-methods{Match} under a Regex
section on the Strings page, replacing the stale "Future: regex" note. The
reference ToC drops the Match group, and an exclusion keeps the
exhaustiveness guard satisfied now that Match is documented elsewhere.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
Make .stdlib-toc-entries a flex container with flex-wrap so a section's entry
links flow cleanly onto multiple lines, with the flex gap providing the
spacing the per-link margin used to.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
Match the main go test check: run the docs module's tests through otelgotest
(with privileged nesting) instead of bare `go test`, so they emit the same
OpenTelemetry traces and output the rest of the suite already does.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
The reference ToC hardcoded its section list (tocGroups) and re-enumerated the
builtin registry, duplicating what the page already declares. Replace the
custom \stdlib-toc directive with the built-in \table-of-contents and teach
toc.tmpl to list each entry's anchor tags, so the index falls out of the
content: the rows are the page's subsections and the word cloud under each is
that subsection's card anchors. This also picks up the Error types section the
hardcoded list had omitted.

Each anchor's link text is its tag title, which is now the qualified name
(e.g. List.uniq) rather than the full signature; that title doubles as the
search-result title, and the full signature still renders on the card itself.
Anchor tags live on leaf subsections, so the category-page ToCs (which list
whole pages) pick up nothing new.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>