This repository was archived by the owner on Oct 3, 2025. It is now read-only.

Conversation

@sunshineplan
Contributor

The previous approach to adding spaces was overly mechanical, indiscriminately inserting spaces without considering the context of surrounding characters. This resulted in unexpected spaces in the output.

This commit refactors the space insertion logic to be context-aware. It now checks if adjacent characters belong to unicode.Punct or unicode.Symbol categories. Spaces are only inserted if the neighboring characters are not punctuation or symbols. This eliminates the need for a separate replacement step to remove redundant spaces added by the previous mechanical approach.
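As a rough illustration of the context-aware rule described above, here is a minimal sketch in Go. It is not the code from this commit: `joinTokens` and `isPunctOrSymbol` are hypothetical names, and the input is assumed to be already split into tokens (Pinyin syllables and single punctuation marks). Only `unicode.In`, `unicode.Punct`, and `unicode.Symbol` come from the standard library.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// isPunctOrSymbol reports whether r belongs to the Unicode punctuation
// or symbol categories (unicode.Punct or unicode.Symbol).
func isPunctOrSymbol(r rune) bool {
	return unicode.In(r, unicode.Punct, unicode.Symbol)
}

// joinTokens is a hypothetical helper, not the function from this commit.
// It concatenates tokens and inserts a single space between two tokens
// only when neither character at the boundary is punctuation or a symbol.
func joinTokens(tokens []string) string {
	var b strings.Builder
	var prev rune // last rune of the previously written token
	first := true
	for _, tok := range tokens {
		if tok == "" {
			continue // nothing to write, and no boundary to inspect
		}
		head, _ := utf8.DecodeRuneInString(tok)
		if !first && !isPunctOrSymbol(prev) && !isPunctOrSymbol(head) {
			b.WriteByte(' ')
		}
		b.WriteString(tok)
		prev, _ = utf8.DecodeLastRuneInString(tok)
		first = false
	}
	return b.String()
}

func main() {
	tokens := []string{"《", "hóng", "lóu", "mèng", "》", "hěn", "hǎo", "。"}
	fmt.Println(joinTokens(tokens)) // prints: 《hóng lóu mèng》hěn hǎo。
}
```

Because the decision is made at each boundary while the output is being built, no space is ever written next to punctuation, so there is nothing to strip out afterwards.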

Additionally, the "allowed characters" setting has been removed. All content from the original text now appears in the Pinyin output, so characters that the filtering mechanism previously excluded, such as book title marks (《》) and French characters, are no longer lost.
