-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support human readable slugs for non-English locales #9892
Conversation
5308c32
to
210b738
Compare
210b738
to
47cdbf4
Compare
assert_equal "foo", attachment.slug | ||
end | ||
|
||
# This behaviour is built into the friendly_id gem |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: I did spot in the docs:
Previous versions of FriendlyId appended a numeric sequence to make slugs unique, but this was removed to simplify using FriendlyId in concurrent code.
So I'd expect this test to need an update when we update finder_id. I did try updating finder_id locally but got Bundler attempted to update friendly_id but its version stayed the same
and it wasn't immediately clear why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have questions. Nice work though!
This PR: - No longer sets `slug` to `nil` when the locale is non-English. Every locale is considered "sluggable". - This also means we no longer need the `sluggable_locale?`, `slug_eligible?` methods and ASCII checks. (The latter has caused a number of Zendesk tickets, whereby even English HTML Attachments have been given a content-ID based URL, because they contain a non-ASCII character like a fancy single quote). - `sluggable_string` doesn't appear to have been in use, so it was deleted. (`sluggable_string` is used in the Document class: `friendly_id :sluggable_string`. But the Attachment class uses `friendly_id :title`, so overriding the `sluggable_string` method had no effect). - Adds a test for friendly_id's native "uniqueness" feature (which we've opted into via `use: :scoped, scope: :attachable` on the Attachment class), so in theory we should be protected from clashes in the attachment slug namespace. Therefore the issue alluded to in #5321 (which reverts the last time this was attempted) should not happen again. - Adds a basic safeguard that the slug-generation results in a slug that contains at least 4 a-z or A-Z characters. This is to protect against generating a slug like `-`, which could happen from sluggifying a title like `{chinese}-{chinese}`. - That said, if we somehow DO get into a state whereby an attachment has what we'd call an invalid slug (<4 azAZ chars), we don't want to fix it if the attachment is already live. So we check for `safely_resluggable?`, introduced in #5942. Result: - Most HTML attachments should now have human-readable slugs. Only languages that are non-ASCII and cannot be converted to ASCII, such as Chinese, will continue to fall back to the content-ID based URLs. Next steps: - Now that all Whitehall tables use the UTF8MB4 charset (#9767), we could potentially use CJK character sets natively in slugs - but it may not always show nicely in the browser. Compare `https://example.com/你好` with `https://example.com/%E4%BD%A0%E5%A5%BD`. - The option we'd likely want to go with is conversion to their Pinyin representations, e.g. `https://example.com/ni-hao`, probably using the `pinyin` or `ruby-pinyin` gem. Trello: https://trello.com/c/YGsEG9YV/3418-improve-html-attachment-slug-creation
47cdbf4
to
ac898d0
Compare
See code review comment: #9892 (comment) `safely_resluggable?` returns `true` for newly created attachments and `false` for attachments that have been copied over from a previous edition. New attachments will start off with a `nil` slug, or in any case a slug that is derived from the attachment's current title. It can be safely re-generated an infinite number of times before the attachment is made live. So we don't need to be checking the value of the slug in this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff, this makes a lot more sense now
This PR:
slug
tonil
when the locale is non-English. Every locale is considered "sluggable".sluggable_locale?
,slug_eligible?
methods and ASCII checks. (The latter has caused a number of Zendesk tickets, whereby even English HTML Attachments have been given a content-ID based URL, because they contain a non-ASCII character like a fancy single quote).sluggable_string
doesn't appear to have been in use, so it was deleted. (sluggable_string
is used in the Document class:friendly_id :sluggable_string
. But the Attachment class usesfriendly_id :title
, so overriding thesluggable_string
method had no effect).use: :scoped, scope: :attachable
on the Attachment class), so in theory we should be protected from clashes in the attachment slug namespace. Therefore the issue alluded to in Bump friendly #5321 (which reverts the last time this was attempted) should not happen again.-
, which could happen from sluggifying a title like{chinese}-{chinese}
.safely_resluggable?
, introduced in Re-slug HTML attachments if they are unpublished #5942.Result:
Next steps:
https://example.com/你好
withhttps://example.com/%E4%BD%A0%E5%A5%BD
.https://example.com/ni-hao
, probably using thepinyin
orruby-pinyin
gem.Trello: https://trello.com/c/YGsEG9YV/3418-improve-html-attachment-slug-creation
Follow these steps if you are doing a Rails upgrade.