JSON LD for wasmcloud.com #1180

Merged
LiamRandall merged 6 commits into main from jsonld on Jun 3, 2026
Conversation

@LiamRandall
Member

Summary

Build status: ✅ Green (826 HTML files generated, no errors)
Validation: ✅ Green (4,695 JSON-LD payloads, 0 errors, 800 expected warnings for back-catalog about/mentions retrofits — by design, warnings don't block CI per M10)

Schema types now emitted across the corpus:
Answer · Audience · Blog · BlogPosting · BreadcrumbList · Clip · Course · CourseInstance · DefinedTerm · DefinedTermSet · EntryPoint · Event · FAQPage · ImageObject · ItemList · ListItem · Organization · Person · Question · SearchAction · SeekToAction · SiteNavigationElement · TechArticle · VideoObject · VirtualLocation · WebSite

Milestone-by-milestone

  1. M1 │ helper, BreadcrumbList on every non-homepage, SearchAction on WebSite, showLastUpdateTime enabled on docs + blog, ItemList of SiteNavigationElement, contributor guide │ ✅
  2. M2 │ swizzle (replaces Docusaurus default): emits BlogPosting/NewsArticle/TechArticle selector, Person author array with sameAs, image[], wordCount, audience, about/mentions, Speakable (NewsArticle only); blog/_template/index.mdx author template │ ✅
  3. M3 │ transform-blog-images.mjs (sharp-based 16:9 / 4:3 / 1:1 derivation, cached by mtime), generate-default-heroes.mjs (8-topic nano-banana prompt manifest at static/default-heroes/PROMPTS.md), image-spec.md author guide. Actual image generation requires a Gemini AI Studio API key; pre-staged for operator run │ ✅ scaffolded
  4. M4 │ DocPageSchema emitting TechArticle on every doc, with proficiencyLevel, dependencies, programmingLanguage, runtimePlatform, interactivityType, audience, about/mentions, articleSection. SoftwareSourceCode deferred per plan. swizzled to mount inside DocProvider │ ✅
  5. M5 │ FAQPageSchema driven by src/data/faq.json (19 professionally-voiced questions + the existing v1→v2 migration content); docs/faq.mdx rewritten with proper anchor IDs │ ✅
  6. M6 │ GlossarySchema emitting DefinedTermSet with M12 entity cross-links via sameAs, driven by src/data/glossary.json; docs/glossary.mdx rewritten │ ✅
  7. M7 │ Course schema on /docs/quickstart/, LearningResource schema on each step page, mounted via the swizzle │ ✅
  8. M8 │ extended to emit Article on transcript pages with transcribes link, mentions: [Person] for speakers, isPartOf: Series, M12 entity refs │ ✅
  9. M9 │ helper for blog/community listings (default Docusaurus already valid Blog/BlogPosting); ContactPage schema on /contact/ │ ✅
  10. M10 │ scripts/validate-structured-data.mjs (full + PR-incremental modes, ran successfully against 4,695 payloads), npm run validate:structured-data / :pr script wires, monitoring runbook │ ✅
  11. M11 │ VideoObject enriched with SeekToAction, duration, keywords, actor[] + contributor[] Person+affiliation, producer, recordedAt, reciprocal transcript: link to Article; parallel Event schema with VirtualLocation, performer[], recordedIn back-reference │ ✅
  12. M12 │ src/data/entities.json (~38 canonical entities: WebAssembly, Wasi, ComponentModel, Wasmtime, Kubernetes, MCP, AiSandbox, etc.), entity resolver helper, frontmatter-explicit about/mentions consumed by every Article-family component │ ✅

Bugs
Three real bugs caught + fixed by the validator during integration

  1. dateModified was producing year 58346: Docusaurus's lastUpdatedAt is already in ms, and my code was multiplying it by 1000.
  2. Blog post image field was generating https://wasmcloud.com/./images/foo.webp — leading ./ not normalized.
  3. The Helmet nesting issue (function components inside ) — now owns its own wrapper.
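
The first two bugs can be illustrated with a minimal sketch. The helper names `toIsoDate` and `absoluteImageUrl` are hypothetical and are not taken from the PR; only the failure modes (a second multiplication by 1000, and an unnormalized leading `./`) come from the notes above.

```typescript
// Hypothetical helpers; only the two bugs themselves come from the PR notes.

/** Convert a millisecond timestamp to an ISO 8601 string. */
function toIsoDate(lastUpdatedAtMs: number): string {
  // The value is described as already being in milliseconds, so there is no `* 1000`.
  // Multiplying again pushes a 2026 timestamp tens of thousands of years ahead.
  return new Date(lastUpdatedAtMs).toISOString();
}

/** Join a site URL and an image path without leaving "/./" in the result. */
function absoluteImageUrl(siteUrl: string, imagePath: string): string {
  // Drop any leading "./" runs, then any leading slashes, from the path.
  const cleanPath = imagePath.replace(/^(\.\/)+/, "").replace(/^\/+/, "");
  // Drop trailing slashes from the base so exactly one "/" joins the two parts.
  const cleanBase = siteUrl.replace(/\/+$/, "");
  return `${cleanBase}/${cleanPath}`;
}
```

For example, `absoluteImageUrl("https://wasmcloud.com", "./images/foo.webp")` yields `https://wasmcloud.com/images/foo.webp`.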

What's left for the team (vs. code work)

  • M3 nano-banana run: needs a Google AI Studio API key (the OAuth-style token in NANOBANANA_GEMINI_API_KEY was rejected by the MCP tool). Once an AIza... key is set as NANOBANANA_API_KEY, the operator runs the 8 prompts from static/default-heroes/PROMPTS.md.
  • M2 back-catalog about/mentions retrofit on the 94 existing blog posts (validator surfaces these as warnings).
  • M4 docs frontmatter (proficiency, languages, platforms) retrofit on the 388 doc pages.
  • CI wiring of validate:structured-data and validate:structured-data:pr per the M10 monitoring runbook.

Signed-off-by: Liam Randall <liam@cosmonic.com>
@netlify

netlify Bot commented Jun 3, 2026

Deploy Preview for dreamy-golick-5f201e ready!

Name Link
🔨 Latest commit 67f72bd
🔍 Latest deploy log https://app.netlify.com/projects/dreamy-golick-5f201e/deploys/6a208e20b6254e0008406baa
😎 Deploy Preview https://deploy-preview-1180--dreamy-golick-5f201e.netlify.app

Member Author

@LiamRandall LiamRandall left a comment


Will modify the scripts to create the headers as an addition to this pull request. The first scaffold looks good; it is mostly changes to the core template generator and importing the correct schema.org objects.

Signed-off-by: Liam Randall <liam@cosmonic.com>
@github-actions
Contributor

github-actions Bot commented Jun 3, 2026

🔎 Structured data validation

  • Mode: PR-incremental (base: origin/main)
  • Files checked: 827
  • JSON-LD payloads: 4700
  • Errors: 0
  • Warnings: 641

⚠️ Warnings by kind

Kind Count
no about or mentions (M12 entity refs) 641

Warnings don't fail the build — they surface authoring gaps that should be backfilled when convenient (typically missing M12 about/mentions entity refs).
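
The error/warning split reported above could be computed along these lines. This is a sketch under assumptions: the `Finding` shape and the `summarize` function are hypothetical and do not describe the actual contents of scripts/validate-structured-data.mjs. The only behavior taken from this thread is that errors fail the run while warnings are merely counted by kind.

```typescript
// Hypothetical types; the real validator's data model is not shown in this PR thread.
type Severity = "error" | "warning";

interface Finding {
  severity: Severity;
  kind: string; // e.g. "no about or mentions (M12 entity refs)"
  file: string;
}

interface Summary {
  errors: number;
  warnings: number;
  warningsByKind: Record<string, number>;
  exitCode: 0 | 1;
}

function summarize(findings: Finding[]): Summary {
  const warningsByKind: Record<string, number> = {};
  let errors = 0;
  let warnings = 0;
  for (const finding of findings) {
    if (finding.severity === "error") {
      errors += 1;
    } else {
      warnings += 1;
      warningsByKind[finding.kind] = (warningsByKind[finding.kind] ?? 0) + 1;
    }
  }
  // Only errors produce a non-zero exit code; warnings never fail the build.
  return { errors, warnings, warningsByKind, exitCode: errors > 0 ? 1 : 0 };
}
```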

⚙️ Generated by ci_structured_data.yml · structured-data spike

categories: ['webassembly', 'wasmcloud', 'developer experience']
slug: deploying-wasmcloud-actors-from-github-packages
about: wasmCloud
mentions: [WebAssembly, OCIRegistry, wash, Docker]
Member Author


Will iterate from OCIRegistry here

Comment thread docs/contributing/contributing-guide.mdx Outdated
Comment thread src/data/entities.json
Member Author


This one is interesting

Member Author

@LiamRandall LiamRandall left a comment


Outside of the crazy header images, which I will replace in a follow-up pull request, this looks good to me.

@LiamRandall LiamRandall marked this pull request as ready for review June 3, 2026 18:46
@LiamRandall LiamRandall requested a review from a team as a code owner June 3, 2026 18:46
… + FAQ guard

Four fixes surfaced by the structured-data PR code review.

1. M6 glossary entity cross-link was silently broken — 11 of 14
   entity_slug values in src/data/glossary.json used wrong casing vs the
   keys in src/data/entities.json, so getEntityBySlug() returned undefined
   and no sameAs edge ever landed. The 12th entry referenced 'WashCli',
   which doesn't exist as an entity (the 'wash' key does). Corrected:
   CapabilityBasedSecurity -> capabilityBasedSecurity, AmbientAuthority
   -> ambientAuthority, DenyByDefault -> denyByDefault, BlastRadius ->
   blastRadius, Mcp -> MCP, Nats -> NATS, AiSandbox -> AISandbox,
   VibeCoding -> vibeCoding, AgenticAi -> agenticAI, WashCli -> wash,
   Wasi -> WASI. All 14 cross-links now resolve.

2. doc-page-schema.tsx claimed in its JSDoc to support a frontmatter
   'author:' override but the code hardcoded author to PUBLISHER_REF.
   Implemented buildDocAuthor() that accepts a string, an object with
   name/title/url/image, or an array of either — falling back to the
   wasmCloud project Organization when nothing parseable is present.
   Mirrors the blog buildAuthors() pattern at a lower verbosity since
   docs don't use authors.yml.

3. breadcrumbs.tsx produced visually wrong intermediate crumbs:
     /docs/v1/concepts/  ->  Home / Docs / V1 / Concepts
     /blog/page/2/       ->  Home / Blog / Page / 2
   Versioned-docs version segments and blog/community pagination
   markers are URL path components, not real navigable pages. They now
   contribute to URL accumulation for subsequent crumbs but are absent
   from the visible chain — except when a version segment is the LAST
   segment (e.g. landing on '/docs/v1/' itself). Version labels also
   preserve their original lowercase casing so 'v1' renders as 'v1' not
   'V1' on the landing-page case. Position numbering recomputed after
   skips so the BreadcrumbList list-item positions stay 1..N contiguous.

4. faq-schema.tsx had no mounting guard — would emit FAQPage JSON-LD on
   any page that imported it. Risk #6 in the spike is exactly this
   (FAQPage schema on a non-Q&A page is a manual-action risk). Added a
   useDoc-based path check that allows /docs/faq/ (and, defensively,
   /docs/{version}/faq/) and silently returns null elsewhere.
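
The four fixes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the repository. `getEntityBySlug`, `buildDocAuthor`, and `PUBLISHER_REF` are named in the commit message, but their bodies here are guesses. Every other name, the `v<number>` version pattern, the `title` to `jobTitle` mapping, and the example.com URLs are hypothetical. The real FAQ guard reads the path through useDoc; the sketch takes the path as a plain string.

```typescript
// Fix 1: entity lookup is case-sensitive, so "Wasi" misses the "WASI" key.
type Entity = { name: string; sameAs: string[] };
const entities: Record<string, Entity> = {
  // Stand-in for src/data/entities.json; the URL is a placeholder.
  WASI: { name: "WASI", sameAs: ["https://example.com/wasi"] },
  wash: { name: "wash", sameAs: [] },
};
function getEntityBySlug(slug: string): Entity | undefined {
  return Object.prototype.hasOwnProperty.call(entities, slug) ? entities[slug] : undefined;
}
// Hypothetical audit: report glossary entries whose entity_slug does not resolve.
function findUnresolvedSlugs(glossary: { term: string; entity_slug: string }[]): string[] {
  return glossary
    .filter((g) => getEntityBySlug(g.entity_slug) === undefined)
    .map((g) => `${g.term}: ${g.entity_slug}`);
}

// Fix 2: accept a string, an object, or an array of either; otherwise fall back.
type SchemaRef = { "@id": string };
type SchemaPerson = { "@type": "Person"; name: string; jobTitle?: string; url?: string; image?: string };
const PUBLISHER_REF: SchemaRef = { "@id": "https://example.com/#organization" }; // placeholder @id
function toPerson(input: unknown): SchemaPerson | undefined {
  if (typeof input === "string") {
    return input.trim() ? { "@type": "Person", name: input.trim() } : undefined;
  }
  if (typeof input !== "object" || input === null || Array.isArray(input)) return undefined;
  const { name, title, url, image } = input as Record<string, unknown>;
  if (typeof name !== "string" || name.trim() === "") return undefined;
  return {
    "@type": "Person",
    name: name.trim(),
    ...(typeof title === "string" ? { jobTitle: title } : {}), // title -> jobTitle is assumed
    ...(typeof url === "string" ? { url } : {}),
    ...(typeof image === "string" ? { image } : {}),
  };
}
function buildDocAuthor(author: unknown): SchemaPerson[] | SchemaRef {
  const inputs: unknown[] = Array.isArray(author) ? author : [author];
  const people = inputs.map(toPerson).filter((p): p is SchemaPerson => p !== undefined);
  return people.length > 0 ? people : PUBLISHER_REF;
}

// Fix 3: skip version segments (unless last) and pagination markers, then renumber.
type Crumb = { position: number; name: string; item: string };
const VERSION_SEGMENT = /^v\d+$/i; // assumption: versions look like "v1", "v2"
const PAGINATED_ROOTS = new Set(["blog", "community"]);
function buildCrumbs(pathname: string, siteUrl: string): Crumb[] {
  const segments = pathname.split("/").filter((s) => s.length > 0);
  const visible: Array<{ name: string; item: string }> = [{ name: "Home", item: `${siteUrl}/` }];
  let url = siteUrl;
  segments.forEach((seg, i) => {
    url += `/${seg}`; // skipped segments still accumulate into later crumb URLs
    const prev = segments[i - 1] ?? "";
    const prevPrev = segments[i - 2] ?? "";
    const isLast = i === segments.length - 1;
    const isVersion = VERSION_SEGMENT.test(seg);
    const isPageMarker = seg === "page" && PAGINATED_ROOTS.has(prev);
    const isPageNumber = /^\d+$/.test(seg) && prev === "page" && PAGINATED_ROOTS.has(prevPrev);
    if ((isVersion && !isLast) || isPageMarker || isPageNumber) return;
    // Version labels keep their original casing; other labels are capitalized.
    const name = isVersion ? seg : seg.charAt(0).toUpperCase() + seg.slice(1);
    visible.push({ name, item: `${url}/` });
  });
  // Positions are assigned after skipping, so they stay contiguous from 1.
  return visible.map((crumb, idx) => ({ position: idx + 1, ...crumb }));
}

// Fix 4: emit FAQPage JSON-LD only on /docs/faq/ or /docs/{version}/faq/.
const FAQ_PATH = /^\/docs\/(?:v\d+\/)?faq\/?$/;
function shouldEmitFaqSchema(permalink: string): boolean {
  return FAQ_PATH.test(permalink);
}
```

With these definitions, `buildCrumbs("/docs/v1/concepts/", "https://wasmcloud.com")` produces Home / Docs / Concepts with positions 1 to 3, while the last crumb still points at the full `/docs/v1/concepts/` URL.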

Signed-off-by: Eric Gregory <eric@cosmonic.com>
1. SearchAction URL now derived from siteBaseUrl() instead of hardcoded
   to production. The Algolia /search route URL was set to
   'https://wasmcloud.com/search?q={search_term_string}' regardless of
   environment, so the sitelinks searchbox silently posted against prod
   on every Netlify deploy preview. siteBaseUrl() already exists in
   docusaurus.config.ts for exactly this purpose (it picks DEPLOY_PRIME_URL
   on Netlify previews and localhost in dev); SearchAction now uses it.

2. SiteNavigationSchema gains an early-exit when themeConfig.navbar.items
   is missing or empty. The existing '?? []' fallback already prevented
   crashes, but the function would still walk an empty array, allocate
   the elements list, and compute baseUrl before bailing out at the
   bottom. Explicit early-exit makes the intent clearer and short-circuits
   on misconfigured navbars.

3. Course / LearningResource educationalLevel now reads from the M4
   'proficiency:' frontmatter field on each page (Beginner | Intermediate
   | Expert), defaulting to Beginner only when unset. This matches the
   spike doc's M7 frontmatter template and the M4 TechArticle convention
   (doc-page-schema.tsx already does the same for proficiencyLevel /
   educationalLevel). Quickstart steps that are non-beginner now reflect
   that in their LearningResource schema.
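
A rough sketch of the three changes above, under stated assumptions. The `siteBaseUrl` here is a stand-in that takes the environment as an argument; the real function lives in docusaurus.config.ts and its exact rules are not shown in this thread. The `CONTEXT` check, the localhost port, the fallback for unrecognized proficiency values, and all other names are hypothetical.

```typescript
// Fix 1: derive the SearchAction target from the environment, not a hardcoded host.
type Env = { CONTEXT?: string; DEPLOY_PRIME_URL?: string; NODE_ENV?: string };
function siteBaseUrl(env: Env): string {
  // Assumed rules: use DEPLOY_PRIME_URL on Netlify deploy previews, localhost in dev.
  if (env.CONTEXT === "deploy-preview" && env.DEPLOY_PRIME_URL) return env.DEPLOY_PRIME_URL;
  if (env.NODE_ENV === "development") return "http://localhost:3000";
  return "https://wasmcloud.com";
}
function buildSearchAction(baseUrl: string) {
  return {
    "@type": "SearchAction",
    target: { "@type": "EntryPoint", urlTemplate: `${baseUrl}/search?q={search_term_string}` },
    "query-input": "required name=search_term_string",
  };
}

// Fix 2: bail out before doing any work when the navbar has no items.
type NavItem = { label?: string; to?: string; href?: string };
type NavElement = { "@type": "SiteNavigationElement"; position: number; name: string; url: string };
type NavList = { "@type": "ItemList"; itemListElement: NavElement[] };
function buildSiteNavigation(items: NavItem[] | undefined, baseUrl: string): NavList | null {
  if (!items || items.length === 0) return null; // explicit early exit
  const root = baseUrl.replace(/\/+$/, "");
  const itemListElement: NavElement[] = [];
  for (const { label, to, href } of items) {
    const url = href ?? (to !== undefined ? `${root}${to}` : undefined);
    if (!label || !url) continue; // skip items without a label or a link
    itemListElement.push({
      "@type": "SiteNavigationElement",
      position: itemListElement.length + 1,
      name: label,
      url,
    });
  }
  return { "@type": "ItemList", itemListElement };
}

// Fix 3: read educationalLevel from the 'proficiency:' frontmatter field.
const LEVELS = ["Beginner", "Intermediate", "Expert"] as const;
type Level = (typeof LEVELS)[number];
function educationalLevel(frontMatter: { proficiency?: unknown }): Level {
  // Unset falls back to Beginner; treating unrecognized values the same way is an assumption.
  return LEVELS.find((level) => level === frontMatter.proficiency) ?? "Beginner";
}
```

Passing the environment in as a parameter keeps the sketch testable without touching `process.env`.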

Signed-off-by: Eric Gregory <eric@cosmonic.com>
Signed-off-by: Eric Gregory <eric@cosmonic.com>
The M12 retrofit added about:/mentions: frontmatter to the back-catalog,
but several posts whose topics squarely match canonical M12 entities
were emitting generic WebAssembly/wasmCloud triplets instead of the
cluster-level entities that maximize AI-Overview / Knowledge Graph
reinforcement.

  + ComponentModel  -> 2024-01-25 WASI 0.2.0 (WASI 0.2 is built on the
                       Component Model)
  + runtimeOperator -> 2024-04-16 wasmCloud Operator (the entity literally
                       exists in the dictionary for this topic)
  + AISandbox       -> 2024-04-30 Llama 3 + wit2wadm hackathon
  + ComponentModel  -> 2024-07-02 TinyGo + WASI P2 (the post is about
                       producing components)
  + CSharp          -> 2024-09-05 .NET + C# Wasm components (literal title
                       match — Java is already represented for the other
                       JVM-language coverage)
  + AIInfrastructure,
    AISandbox       -> 2025-01-15 Running distributed ML/AI workloads
                       (the AI cluster was being missed entirely)
  + capabilityBasedSecurity
                    -> 2025-03-04 Adopting SPIFFE for workload identity
                       (SPIFFE is fundamentally capability-based identity)
  + ComponentModel  -> 2025-09-02 Cosign signing components
  + ComponentModel  -> 2026-04-15 WASI P3 async components on wasmCloud

All added slugs resolve against src/data/entities.json (re-audited
after the edits; zero unresolved references in the modified files).

Signed-off-by: Eric Gregory <eric@cosmonic.com>
@LiamRandall
Member Author

Boom! thank you @ericgregory !!!!

@LiamRandall LiamRandall merged commit 12e4ded into main Jun 3, 2026
8 checks passed
@LiamRandall LiamRandall deleted the jsonld branch June 3, 2026 22:00