⚡ is a wide symbol #100

Open
ghost opened this issue Apr 18, 2017 · 18 comments

Comments

@ghost

ghost commented Apr 18, 2017

11

@chubin
Owner

chubin commented Apr 18, 2017

Is the screenshot from a console or from a browser?
Could you please check it in the console also?

@ghost
Author

ghost commented Apr 18, 2017

In a terminal emulator.
From http://www.unicode.org/Public/9.0.0/ucd/EastAsianWidth.txt:
26A1;W # So HIGH VOLTAGE SIGN
If I am not mistaken, W means East Asian Wide. I believe that it basically means that the character occupies two columns in a terminal.

East Asian Wide (W): ... This category includes characters that have explicit halfwidth counterparts, along with characters that have the [UTR51] property Emoji_Presentation

Also from http://unicode.org/Public/emoji/4.0/emoji-data.txt:
26A1 ; Emoji_Presentation # 4.0 [1] (⚡) high voltage
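
For anyone who wants to check the property locally, here is a minimal sketch in Python 3 (not code from wttr.in). It reads the East Asian Width class from the interpreter's bundled Unicode tables via the standard `unicodedata` module; the helper name `east_asian_width_class` is made up for illustration. The answer depends on which Unicode version the interpreter ships, which is why the version is printed as well.

```python
import unicodedata

# U+26A1 HIGH VOLTAGE SIGN, the character discussed in this issue.
HIGH_VOLTAGE = "\u26a1"


def east_asian_width_class(ch: str) -> str:
    """Return the East Asian Width class ('W', 'F', 'Na', 'N', 'H' or 'A')."""
    return unicodedata.east_asian_width(ch)


# The class for U+26A1 is 'W' only if the bundled data is Unicode 9.0 or newer.
print("Unicode data version:", unicodedata.unidata_version)
print("U+26A1:", east_asian_width_class(HIGH_VOLTAGE))
print("U+0078:", east_asian_width_class("x"))
```

If the first line reports a version older than 9.0, the character will most likely come back as neutral rather than wide, which matches the kind of disagreement seen in this thread.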

@chubin
Owner

chubin commented Apr 18, 2017

Yes, you are right, that is a bug and it should be fixed. Thank you for reporting!
Could you please check (if it is possible) the report in the browser?
Does it look the same?

@chubin
Owner

chubin commented Apr 18, 2017

What location is that?

@ghost
Author

ghost commented Apr 18, 2017

@chubin
Owner

chubin commented Apr 18, 2017

Strangely enough, it works in the browser.

@ghost
Author

ghost commented Apr 18, 2017

Maybe because the browser is built with old Unicode data and defaults to width 1.
I don't know where the terminal emulator gets the information that 26A1 has a width of 2, because wcwidth(0x26A1) returns -1, i.e. the current version of glibc doesn't know about this character (this should be fixed in 2.26, which introduces Unicode 9.0 support). I think it uses the font's data about glyph widths. (BTW, wcwidth == -1 leads to editing glitches in bash (readline?) if there is a ⚡ in the current line.)
utf8proc_charwidth(0x26a1) (function from utf8proc library) correctly returns 2.

UPD: I think the terminal emulator uses g_unichar_iswide from glib. g_unichar_iswide(0x26a1) returns TRUE.
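
To make the comparison between these width functions concrete, below is a rough Python 3 sketch of a wcwidth-style lookup. It is an approximation built on the standard `unicodedata` module, not the algorithm used by glibc, glib or utf8proc, and the name `char_width` is hypothetical. It follows the usual convention: -1 for control characters, 0 for combining and format characters, 2 for East Asian Wide or Fullwidth, 1 otherwise.

```python
import unicodedata


def char_width(ch: str) -> int:
    """Approximate the number of terminal columns used by one character."""
    if ch == "\0":
        return 0  # NUL takes no columns
    category = unicodedata.category(ch)
    if category == "Cc":
        return -1  # control characters have no printable width
    if unicodedata.combining(ch) or category in ("Mn", "Me", "Cf"):
        return 0  # combining marks and format characters add no columns
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # East Asian Wide / Fullwidth
    return 1


for ch in ("a", "\u4e2d", "\u0301", "\u26a1"):
    print("U+%04X -> %d" % (ord(ch), char_width(ch)))
```

With Unicode 9.0 or newer tables this should give 2 for U+26A1; with older tables it falls through to 1, which would explain a program that treats ⚡ as narrow.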

@DirtYiCE

Please note that the built-in png output is also wrong, http://wttr.in/Budapest.png?lang=en currently generates:
budapest

@chubin
Owner

chubin commented Jun 8, 2018

@DirtYiCE @BerserkerTroll @MAGICCC @cig0 @slump64
The problem is fixed. Please test.
Does it work for you?

A test location with everlasting thunder:

curl wttr.in/:test-thunder
https://wttr.in/:test-thunder (in browser)
http://de.wttr.in/:test-thunder.png

@ghost
Author

ghost commented Jun 9, 2018

browser:
image
terminal:
image

@chubin
Owner

chubin commented Jun 9, 2018

@slump64 So it is still broken in your Chrome, but as far as I can see it works fine in one of my Firefox installations (though not in all of them). I'm curious about other browsers.

@cig0

cig0 commented Jun 11, 2018

Hey @chubin,

  • Working well in iTerm2 and Terminal.app
  • Broken in all browsers I tried, each on its latest version: Safari (macOS + iOS), Chrome, Brave (macOS + iOS), TorBrowser, Firefox

@Pimon1

Pimon1 commented Aug 9, 2018

I just tested it: it works neither in iTerm2 or Terminal.app nor in the browser (Firefox 61). Besides, the voltage signs/emojis are not shown in the terminal screenshot from @slump64 (or are they currently blinking?).

This is wttr.in/:test-thunder in iTerm2:

wttr in

@cig0

cig0 commented Aug 10, 2018

This is weird, it's working well on my iTerm2:

screen shot 2018-08-09 at 21 04 42

screen shot 2018-08-09 at 21 07 13

@Pimon1

Pimon1 commented Aug 10, 2018

It's a little weird, as it always is when it comes to character encoding. It seems that the "string lengths" of the emojis are determined in wttr.py:137. A quick test in Python (combining lines 137, 50 and 48):

print len(re.compile(r'(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]').sub('', 'xy').decode('utf8'))
2

print len(re.compile(r'(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]').sub('', '😉').decode('utf8'))
2

print len(re.compile(r'(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]').sub('', '⚡').decode('utf8'))
1

I'm not familiar with Python, but this seems to be wrong. But, as usual, we're not alone: https://stackoverflow.com/questions/42783173/getting-proper-length-of-emojis

(And in PHP, the suggestion is to use utf-16le to determine the string length of emojis, strlen(iconv('utf-8', 'utf-16le', $str)) / 2, see https://stackoverflow.com/questions/30604427/php-length-of-string-containing-emojis-special-chars)
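
The numbers above mix up two different quantities: how many code units a string has, and how many terminal columns it occupies. The Python 3 sketch below separates them, reusing the escape-stripping pattern quoted in the test above. It is an illustration under assumptions, not the wttr.in fix: `utf16_units` and `display_width` are hypothetical helper names, and the column count relies on whatever Unicode tables the interpreter bundles. The results of 2 for 😉 and 1 for ⚡ in the Python 2 test are consistent with an interpreter that stores strings as UTF-16 code units, which is what `utf16_units` (and the PHP suggestion) counts.

```python
import re
import unicodedata

# Same pattern as in the quick test above: strips ANSI/CSI escape sequences.
ANSI_ESCAPE = re.compile(r'(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]')


def utf16_units(text: str) -> int:
    """Count UTF-16 code units (characters outside the BMP count as two)."""
    return len(text.encode('utf-16-le')) // 2


def display_width(text: str) -> int:
    """Estimate terminal columns: strip escapes, then sum per-character widths."""
    plain = ANSI_ESCAPE.sub('', text)
    width = 0
    for ch in plain:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no extra column
        # East Asian Wide / Fullwidth characters take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1
    return width


for sample in ('xy', '\U0001F609', '\u26a1', '\x1b[1;33m\u26a1\x1b[0m'):
    plain = ANSI_ESCAPE.sub('', sample)
    print(repr(sample), len(plain), utf16_units(plain), display_width(sample))
```

Neither `len()` nor the UTF-16 count answers the layout question for ⚡: both report 1, while a terminal following Unicode 9.0 or newer reserves two columns. Padding based on something like `display_width` is one way to keep a text table aligned, assuming the terminal and font agree with the Unicode tables.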

@ghost
Author

ghost commented Aug 10, 2018

You should definitely do your next project in Golang if you want to work with Unicode.

@Nikorasu

Nikorasu commented Aug 28, 2018

I'm still seeing this broken in both my terminals and my browsers, with various fonts. It looks the same in either.
weatherglitch

@ghost
Author

ghost commented Sep 18, 2018

@chubin ok in mate-terminal (version 1.20), but Firefox thinks that it is a narrow character.
