WAI
Status Update
Comments
na...@google.com <na...@google.com>
kl...@google.com <kl...@google.com> #2
Is hyphenation done in line layout? Anyone know about our hyphenation dictionary?
[Monorail components: -Blink Blink>Layout]
st...@gmail.com <st...@gmail.com> #3
[Empty comment from Monorail migration]
st...@gmail.com <st...@gmail.com> #4
Thanks for filing the issue!
@Reporter: Could you please share a sample test file satisfying the conditions mentioned in https://crbug.com/chromium/973102#c0? That would help us triage this further.
st...@gmail.com <st...@gmail.com> #5
On which platform is this? The dictionaries are OS specific.
st...@gmail.com <st...@gmail.com> #6
It's simple, create an HTML file that contains these words and change its content so that the words are at the end of the line. As I said, you'll have to play depending on your screen size and everything, I cannot provide a universal test case. You're probably used to doing this, I only do this on my local computer which is inaccessible to you.
My OS is Android 7.1.1 on a Sony Xperia Z5 Compact, but the issue has been reported to me from a user of Samsung Internet 9.2.10.15 on some Samsung device. So it does affect multiple devices.
st...@gmail.com <st...@gmail.com> #7
Thank you for providing more feedback. Adding the requester to the cc list.
For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
kl...@google.com <kl...@google.com> #8
je...@gmail.com <je...@gmail.com> #9
Unable to reproduce.
kl...@google.com <kl...@google.com> #10
Still occurs here. What did you try when you failed to reproduce it? I just opened my website in Chrome on Windows (latest stable) and I can see this.
bo...@gmail.com <bo...@gmail.com> #11
PS: Chrome on Windows now supports hyphenation. And it's just as broken as it was before on other platforms when I initially reported this.
na...@google.com <na...@google.com> #12
Reporter, it would greatly help our analysis if you could share any reproducing URLs.
I guess the page the reporter is seeing has lang="en" or lang="en-uk".
Tests: https://jsbin.com/boxamav/edit?html,output
(you may need to shorten "12345" or make it longer, depending on fonts)
* Win/Linux/ChromeOS Chrome hyphenates at:
- "start-up" for "en-us"
- "star-tup" for "en-uk" and "en"
* Mac Chrome and Safari hyphenate at:
- "start-up" for "en-us" and "en"
- "star-tup" for "en-uk"
* Firefox doesn't hyphenate at either point.
Two points need investigation:
* I'm not sure if "star-tup" is the correct hyphenation for UK English.
* When the lang is "en", should it be "en-us" or "en-uk"? Currently, Chrome matches Android behavior, but from the tests, Safari seems to use "en-us" for "en".
[Monorail components: -Blink>Layout Blink>Layout>Inline]
st...@gmail.com <st...@gmail.com> #13
My page lang value is set to "en". Does "en-uk" exist after all? AFAIK the ISO country code for UK is still "GB". At least all browsers offer me to send "en-GB" in the accepted languages header.
Anyway, you can see the effect here: https://ygoe.de/en
Both words appear in the content. Please note that hyphenation is only active for page widths below 420 pixels. You may also need to edit the content to place the words at a line end (open developer tools, find the parent <p> element and add the contenteditable attribute, then edit the text on the page before each word to place it where you want).
st...@gmail.com <st...@gmail.com> #14
Thanks, sorry, you're right, not "en-uk", but "en-gb".
Confirmed a few things:
1. "star-tup" is not a correct hyphenation even for "en-gb".
2. Android, and Chrome on Android/Win/Linux/ChromeOS, use "en-GB" when "en" is set.
https://android.googlesource.com/platform/frameworks/base/+/master/core/jni/android_text_Hyphenator.cpp#143
From the test result, it looks like macOS uses "en-US" when "en" is set.
The "hyphenator.js" uses "en-us" when "en" is set.
https://github.com/mnater/Hyphenator/blob/master/Hyphenator.js#L93
3. Our "en-gb" dictionary is up-to-date with the TeX hyphenation dictionary.
https://github.com/hyphenation/tex-hyphen/tree/master/misc
4. The "en-gb" dictionary has the "r1tu" entry, meaning to hyphenate as "r-tu". This entry does not exist in the "en-us" dictionary. The "hyphenator.js" has this entry too.
https://github.com/mnater/Hyphenator/blob/master/patterns/en-gb.js
5. Firefox and hyphenator.js do not hyphenate "startup" at all.
I'm still not sure whether the issue is in the "en-gb" TeX dictionary (it reproduces on macOS too) or in the hyphenator code (it is not reproducible in Firefox and hyphenator.js), and I'm also not sure whether to map "en" to "en-gb" or to "en-us".
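To make the "r1tu" entry from point 4 concrete, here is a minimal Python sketch of how Liang-style TeX patterns are applied to a word. It is not the Minikin or Chromium code. The function names, the minimum prefix/suffix lengths of 2, and the pattern "ar2t" in the usage lines are assumptions made up for illustration; only "r1tu" comes from the dictionary discussed above.

```python
import re


def parse_pattern(pattern):
    """Split a pattern such as 'r1tu' into its letters and the levels between them."""
    letters = re.sub(r"\d", "", pattern)
    levels = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            levels[pos] = int(ch)  # level of the gap before letters[pos]
        else:
            pos += 1
    return letters, levels


def hyphenate(word, patterns, left_min=2, right_min=2):
    """Insert '-' at every gap whose highest matching level is odd."""
    padded = "." + word.lower() + "."  # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)   # scores[i] = level of the gap before padded[i]
    for pattern in patterns:
        letters, levels = parse_pattern(pattern)
        start = padded.find(letters)
        while start != -1:
            for offset, level in enumerate(levels):
                scores[start + offset] = max(scores[start + offset], level)
            start = padded.find(letters, start + 1)
    pieces, last = [], 0
    # A break after word[:k] is the gap before padded[k + 1].
    for k in range(left_min, len(word) - right_min + 1):
        if scores[k + 1] % 2 == 1:
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return "-".join(pieces)


print(hyphenate("startup", ["r1tu"]))          # star-tup
print(hyphenate("startup", []))                # startup
print(hyphenate("startup", ["r1tu", "ar2t"]))  # startup ("ar2t" is a made-up pattern)
```

The last line shows the usual way such a break gets suppressed: a longer pattern with a higher even level at the same gap wins over the odd level from "r1tu".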
kl...@google.com <kl...@google.com> #15
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src/+/a2c814f8d7db0008fc653d99532e5e7b8ff64732
commit a2c814f8d7db0008fc653d99532e5e7b8ff64732
Author: Koji Ishii <kojii@chromium.org>
Date: Tue Jun 22 01:06:11 2021
Change "en" hyphenation to use "en-us" instead of "en-gb"
When the specified language is "en", this patch changes to use
the "en-us" hyphenation dictionary instead of the "en-gb".
It looks like this behavior matches the other browsers.
Android maps "en" to "en-gb", but because Android takes the
language from the system, it is usually more specific (i.e.,
"en-us" or "en-gb", not "en".) On the other hand, CSS
prohibits using the system language <crbug.com/676270 > that
the use of "en" is more common.
Bug: 973102
Change-Id: I7547725b9d30fc137f987fb200fa2e4b699d2c21
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2975039
Reviewed-by: Kent Tamura <tkent@chromium.org>
Commit-Queue: Koji Ishii <kojii@chromium.org>
Cr-Commit-Position: refs/heads/master@{#894484}
[modify] https://crrev.com/a2c814f8d7db0008fc653d99532e5e7b8ff64732/third_party/blink/renderer/platform/text/hyphenation/hyphenation_minikin.cc
[modify] https://crrev.com/a2c814f8d7db0008fc653d99532e5e7b8ff64732/third_party/blink/renderer/platform/text/hyphenation_test.cc
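As a rough illustration of the behavior change described in this commit message (not the actual code in hyphenation_minikin.cc, which is not reproduced here), the mapping from a lang value to a hyphenation dictionary could be sketched in Python like this. The function name and the `bare_en` parameter are hypothetical.

```python
def dictionary_locale(lang, bare_en="en-us"):
    """Pick the hyphenation dictionary locale for a lang attribute value.

    Only a bare "en" is remapped; more specific tags are passed through
    in normalized (lowercase, hyphen-separated) form.
    """
    tag = lang.strip().lower().replace("_", "-")
    if tag == "en":
        return bare_en
    return tag


print(dictionary_locale("en"))                   # en-us (behavior after the patch)
print(dictionary_locale("en", bare_en="en-gb"))  # en-gb (behavior before the patch)
print(dictionary_locale("en-GB"))                # en-gb (explicit tags are unaffected)
```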
ma...@gmail.com <ma...@gmail.com> #16
The message #14 changes the dictionary for "en" to "en-us", which hyphenates "startup" correctly.
The issue in the "en-gb" hyphenation dictionary is not addressed yet though.
nf...@google.com <nf...@google.com> #17
I don't know whether it's advisable to default from "en" to "en-US" given the worldwide spread of GB-based English (including Australia, India, South Africa, Singapore and others) over the very regional home of US-based English (including Canada only).
Even if you suggest using en-US here, it's probably wrong because my content uses en-GB spelling. And if I set en-GB instead of en, nothing has improved for me. It might even get worse on Macs.
Anyway, it shouldn't matter which is used if "star-tup" is invalid everywhere. So if the system produces that hyphenation, *something* is broken for sure. And please don't forget "JavaScript" as well.
I don't understand the note about Android knowing a more specific language from the system. The language definition comes from the lang attribute in the HTML document. It can change to anything and is completely unrelated to the current system's locale setting. Most websites have a language selector that redirects the visitor to another language version of the site, setting another lang attribute.
da...@gmail.com <da...@gmail.com> #18
Thanks for the comment.
As noted in https://crbug.com/chromium/973102#c14, the switch to "en-us" was made to be interoperable with Safari and Firefox. As you point out, it's not related to this issue, but we found that we were not interoperable with other browsers, so we made the change.
As you might have figured out, the "star-tup" issue reproduces in Safari too when you set lang="en-gb". All browsers use the TeX hyphenation dictionaries:
https://www.tug.org/tex-hyphen/
or one derived from TeX. I just learned its format as part of the investigation for this issue. As far as I understand, it is an issue in the dictionary itself. So I assume it reproduces in TeX too, though I don't have an environment to test it.
On the other hand, it does not reproduce in Firefox and hyphenator.js even when I set lang="en-gb", so I'm going to look into why the difference appears. Hopefully that will reveal the real cause of the issue.
That's where I am now. Sorry for the slow progress, and thank you for your understanding.
Description
Steps to reproduce:
* On an Android 5 device (should also work on emulator) create an AndroidHttpClient and connect to a server that chooses "TLS_DHE_RSA_WITH_AES_256_GCM_SHA384" as the preferred cipher.
What happens:
* An exception is thrown (e.g. SSLHandshakeException, depending on the httpClient)
What should happen:
* A successfully established connection.
Further observations:
To isolate the error I overrode the socket factory in use to selectively change the ciphers that are offered to the server. I observed that the connection failed when using getDefaultCipherSuites(); and worked when using getSupportedCipherSuites();. My guess is that there's one cipher in the SupportedCipherSuites that is preferred over TLS_DHE_RSA_WITH_AES_256_GCM_SHA384.
Next I tried to establish a connection using single ciphers from the list of default ciphers and logged which ones work; in my case:
"TLS_RSA_WITH_AES_128_CBC_SHA", "TLS_RSA_WITH_AES_256_CBC_SHA", "SSL_RSA_WITH_RC4_128_SHA"
So there are ciphers in the list of default ciphers which do work; but when offering the whole list of default ciphers, the connection fails. When offering only those 3, the connection succeeds.
With a bit more trial and error I found out that the array of default ciphers works if you remove "TLS_DHE_RSA_WITH_AES_256_GCM_SHA384" from it.
My guess is that the implementation of TLS_DHE_RSA_WITH_AES_256_GCM_SHA384 is broken either on the Android side or on the server side (no idea what server this is).
Note: establishing a connection with "TLS_DHE_RSA_WITH_AES_128_GCM_SHA256" works.
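The trial-and-error procedure above can be sketched as a small Python helper. This is not the code I used on the device. `find_breaking_suites`, `can_connect` and `fake_server` are hypothetical names, and the fake server simply models the observed behavior (the handshake fails whenever the problematic suite is offered). On Android, `can_connect` would wrap a real connection attempt with the given list passed to SSLSocket.setEnabledCipherSuites().

```python
BAD_SUITE = "TLS_DHE_RSA_WITH_AES_256_GCM_SHA384"


def find_breaking_suites(default_suites, can_connect):
    """Return the suites whose removal turns a failing offer into a working one."""
    if can_connect(default_suites):
        return []  # nothing to isolate, the full list already works
    culprits = []
    for suite in default_suites:
        remaining = [s for s in default_suites if s != suite]
        if can_connect(remaining):
            culprits.append(suite)
    return culprits


def fake_server(offered):
    """Stand-in for a real handshake: fails if the bad suite is offered (assumption)."""
    return bool(offered) and BAD_SUITE not in offered


suites = [
    BAD_SUITE,
    "TLS_DHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "SSL_RSA_WITH_RC4_128_SHA",
]
print(find_breaking_suites(suites, fake_server))
# ['TLS_DHE_RSA_WITH_AES_256_GCM_SHA384']
```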
So, to recap:
* support for TLS_DHE_RSA_WITH_AES_256_GCM_SHA384 was added in Android 5
* establishing a connection to servers preferring TLS_DHE_RSA_WITH_AES_256_GCM_SHA384 works in Android 4.4 and fails in Android 5.
There are already several bug reports mentioning the same error (often in connection with mail-servers though).