Vague character definition #135

@est31

Currently the specification reads:

A character is defined by this document for each cell in the coding space described by Unicode, regardless of whether or not Unicode allocates a character to that cell.

This comes with a link to the Unicode 2017 standard. I think this is vague in two ways. First, the term "cell" is used, and I don't know which Unicode concept it translates to. Display Cell, maybe? Second, I am wondering what "character" means in the second sentence. The Unicode glossary provides four different definitions of what a character can be.

I guess this is relevant in many ways, including the multiple ways the compiler exposes column numbers to user code (panic column info, the column!() macro, and the nightly proc macro span introspection API). For example, what is the output of this program?

fn main(){
let (a,b) = ("🇪🇺", column!());
println!("{b}");
}

It's a non-trivial question because 🇪🇺 is a grapheme cluster made up of multiple code points.
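To make the ambiguity concrete, here is a minimal sketch that computes the 1-based column of the column!() invocation above under three candidate definitions of "character": UTF-8 bytes, Unicode scalar values, and UTF-16 code units. The helper function names are hypothetical and chosen only for illustration. This does not claim to show what the compiler actually reports, only that the candidates disagree. A grapheme-cluster count is left out because the standard library has no segmentation support; it would need an external crate.

```rust
// Candidate 1-based column numbers for the position right after `prefix`,
// one per possible meaning of "character". All names here are hypothetical.

/// Column if every UTF-8 byte counts as one column.
fn column_in_bytes(prefix: &str) -> usize {
    prefix.len() + 1
}

/// Column if every Unicode scalar value (Rust `char`) counts as one column.
fn column_in_scalar_values(prefix: &str) -> usize {
    prefix.chars().count() + 1
}

/// Column if every UTF-16 code unit counts as one column.
fn column_in_utf16_units(prefix: &str) -> usize {
    prefix.encode_utf16().count() + 1
}

fn main() {
    // The source text that precedes `column!()` on its line in the example.
    // The flag is written as two escapes: REGIONAL INDICATOR SYMBOL LETTER E
    // (U+1F1EA) followed by REGIONAL INDICATOR SYMBOL LETTER U (U+1F1FA).
    let prefix = "let (a,b) = (\"\u{1F1EA}\u{1F1FA}\", ";

    println!("bytes:         {}", column_in_bytes(prefix));
    println!("scalar values: {}", column_in_scalar_values(prefix));
    println!("UTF-16 units:  {}", column_in_utf16_units(prefix));
}
```

The three helpers return 26, 20, and 22 for this line, and counting the flag as a single grapheme cluster would give 19. So there are at least four plausible answers, depending on which definition of "character" the specification intends.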
