
Automatically generate table of contents in text #2213

Open: wants to merge 7 commits into base: master

Contributor

ed-kung commented Jun 7, 2025

Description

Closes #2208

Writing {:toc} on its own line will automatically render a table of contents as a bulleted list, using the headings (#, ##, etc) contained in the post.
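
As a rough illustration of the "on its own line" rule, the check below works on raw text. This is only a sketch: the PR itself handles the marker in a remark plugin (remarkToc), and the names `TOC_MARKER` and `hasTocMarker` are hypothetical.

```javascript
// Hypothetical helper, not the PR's code: true when `{:toc}` is the only
// content on some line (surrounding spaces or tabs are tolerated).
const TOC_MARKER = /^[ \t]*\{:toc\}[ \t]*$/m

function hasTocMarker (text) {
  return TOC_MARKER.test(text)
}
```

With this check, `{:toc}` embedded in the middle of a sentence is not treated as a marker.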

Screenshots

auto-toc.mov

Additional Context

n/a

Checklist

Are your changes backwards compatible? Please answer below:

yes

On a scale of 1-10 how well and how have you QA'd this change and any features it might affect? Please answer below:

  1. If an existing post already contains {:toc} somewhere, it could render in unexpected ways.

The behavior of links that start with a hash (i.e. #section-1) has also changed, but I can't see why anyone would've wanted such links to open in a new tab.
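
A minimal sketch of the hash-link behavior described above, assuming external links are the ones that get `target="_blank"`. The function name `linkProps` and the returned shape are hypothetical and are not taken from the PR.

```javascript
// Hypothetical sketch: decide link attributes from the href alone.
function linkProps (href, rel) {
  // In-page anchors such as "#section-1" stay in the current tab.
  if (href.startsWith('#')) return { href }
  // Assumption: every other link opens in a new tab with the given rel.
  return { href, target: '_blank', rel }
}
```

For example, `linkProps('#section-1', 'nofollow')` carries no `target`, so the browser scrolls within the page instead of opening a tab.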

For frontend changes: Tested on mobile, light and dark mode? Please answer below:

n/a

Did you introduce any new environment variables? If so, call them out explicitly here:

no

Member

huumn commented Jun 18, 2025

just a heads up, I probably won't get around to deeply reviewing this until next week. at a glance it looks good though.

Member

Soxasora left a comment

It passed my stress test, leaving behind just a nitpick.

Your code is high quality, especially buildToc which addresses the complexity of the problem in an elegant way.
There's only the problem of {:toc} rendering in comments, but even then, we're talking about UX that can be fixed easily.

I also mentioned the duplicate-heading issue, but that doesn't depend on your code, nor was it in scope for this PR/issue; it's just something to keep in mind.

It works really well, nice job! ^^


return toc
}, [text])
const toc = useMemo(() => extractHeadings(text), [text])
Member

I noticed that we don't give IDs to nodes inside comments, only on full posts:

h1: ({ node, id, ...props }) => <h1 id={topLevel ? id : undefined} {...props} />,

Because of this, the table of contents won't work if used in comments. I personally think that the ToC doesn't make that much sense in comments, so we can just disable {:toc} for them with topLevel awareness. What do you think? ^^

Contributor Author

ed-kung, Jun 21, 2025

I agree, it wouldn't make sense to use a ToC in comments, so I changed it so that remarkToc is only processed for topLevel items.

const str = toString(node)
headings.push({
  heading: str,
  slug: slug(str.replace(/[^\w\-\s]+/gi, '')),
Member

note for the future, unrelated to review:
It seems that we don't handle duplicate headings, because we use slug instead of the GithubSlugger class.
That's probably because slug is faster: GithubSlugger would have to track headings in memory and check every heading against the previous ones to count repeats 🤔.
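
To make the counting idea concrete, here is a small sketch of slug de-duplication with a `Map`. It is not GithubSlugger and not code from this repository; `dedupeSlugs` is a hypothetical name, and the `-1`, `-2` suffix scheme is an assumption chosen for illustration.

```javascript
// Hypothetical sketch: append a numeric suffix to repeated slugs.
// The first occurrence is kept as-is; later ones get -1, -2, ...
function dedupeSlugs (slugs) {
  const seen = new Map()
  return slugs.map(s => {
    const count = seen.get(s) ?? 0
    seen.set(s, count + 1)
    return count === 0 ? s : `${s}-${count}`
  })
}
```

This naive version can still collide if a heading literally slugs to something like `intro-1`, and it keeps state for every heading in the document, which is the memory trade-off raised above.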

Member

We don't use GithubSlugger because of #1405

Contributor Author

Ok, will leave this as is for now

@@ -49,6 +54,9 @@ export function SearchText ({ text }) {

// this is one of the slowest components to render
export default memo(function Text ({ rel = UNKNOWN_LINK_REL, imgproxyUrls, children, tab, itemId, outlawed, topLevel }) {
  // include remarkToc if topLevel
  const remarkPlugins = topLevel ? [...baseRemarkPlugins, remarkToc] : baseRemarkPlugins
Member

Unfortunately, topLevel cannot be used to exclude this plugin from comments because topLevel is also true for comments when they are the root of a page:

2025-07-25.02-08-24.mp4

not sure how to best deal with this

Contributor Author

Interesting. One possibility is to remove the topLevel check for giving headings node ids:

h1: ({ node, id, ...props }) => <h1 id={topLevel ? id : undefined} {...props} />,

I'm not sure why that check is there and if doing so would break anything.

I suppose another possibility is to only process {:toc} if the item has no parent, right? Will topLevel always be true for posts? (and thus headings always get assigned node ids in posts)
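
The "no parent" option could look roughly like the sketch below. The `parentId` field and the `selectRemarkPlugins` name are assumptions, and the plugin values are string stand-ins; this does not settle whether headings in posts always receive node ids.

```javascript
// Stand-ins for the real plugin list; only the selection logic matters here.
const baseRemarkPlugins = ['gfm', 'mention']
const remarkToc = 'toc'

// Hypothetical sketch: enable the ToC plugin only for items without a parent.
function selectRemarkPlugins (item) {
  // Assumption: posts have no parent, while comments carry a `parentId`.
  const isPost = item.parentId == null
  return isPost ? [...baseRemarkPlugins, remarkToc] : baseRemarkPlugins
}
```

Unlike a `topLevel` check, this would also exclude a comment that happens to be rendered as the root of its own page.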

const str = toString(node)
headings.push({
  heading: str,
  slug: slug(str.replace(/[^\w\-\s]+/gi, '')),
Member

what is the replacement for?

Contributor Author

ed-kung, Jul 28, 2025

Was just copying from line 22 of the old components/table-of-contents.js, which uses the same pattern:

toc.push({ heading: str, slug: slug(str.replace(/[^\w\-\s]+/gi, '')), depth: node.depth })

But is it redundant, since we're already passing the string to slug? I usually copy the old code whenever I can, to break as few things as possible.
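
For reference, this is what the replacement does by itself, independent of whatever slug does afterwards. The wrapper name `stripPunctuation` is hypothetical; the regex is the one quoted above.

```javascript
// The pattern from the snippet: remove every run of characters that is not
// a word character (A-Z, a-z, 0-9, _), a hyphen, or whitespace.
const STRIP = /[^\w\-\s]+/gi

function stripPunctuation (str) {
  return str.replace(STRIP, '')
}
```

Note that without the `u` flag, `\w` covers ASCII only, so accented letters are dropped as well: `stripPunctuation('Café')` gives `'Caf'`.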

while (stack.length && depth <= stack[stack.length - 1].depth) {
  stack.pop()
}
let parent = stack[stack.length - 1].node

Bug: Stack Underflow in buildToc Function

The buildToc function is vulnerable to a stack underflow. If a heading with a depth of 0 or less is encountered in the AST, the while loop will pop all elements from the stack. This causes stack[stack.length - 1] to become undefined, leading to a TypeError when its .node property is accessed. While standard markdown headings have depths 1-6, this edge case can crash the application if malformed AST data is processed.
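
One way to guard against the underflow is to keep a sentinel root on the stack and never pop it. The sketch below is not the PR's buildToc: the name `buildTocSafe`, the input shape (`heading`, `slug`, `depth`), and the returned tree shape are all assumptions.

```javascript
// Hypothetical sketch: turn a flat list of headings into a nested tree.
function buildTocSafe (headings) {
  const root = { children: [] }
  // Sentinel entry with depth 0; the loop below never removes it.
  const stack = [{ depth: 0, node: root }]
  for (const { heading, slug, depth } of headings) {
    // `stack.length > 1` keeps the sentinel in place even when depth <= 0.
    while (stack.length > 1 && depth <= stack[stack.length - 1].depth) {
      stack.pop()
    }
    const parent = stack[stack.length - 1].node
    const node = { heading, slug, children: [] }
    parent.children.push(node)
    stack.push({ depth, node })
  }
  return root.children
}
```

With this guard, a malformed heading of depth 0 is attached at the top level instead of causing a TypeError.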


Successfully merging this pull request may close these issues.

Automated generation of table of contents
4 participants