Apache OpenOffice (AOO) Bugzilla – Issue 50172
combining characters in indic and keyboard traversal
Last modified: 2005-10-20 13:00:22 UTC
As https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=157815 describes, and the following example shows, traversing combining characters in Indic scripts with the keyboard is problematic.
Created attachment 26788 [details] example tamil document
Created attachment 26789 [details] sample tamil font
Created attachment 26791 [details] a demo in 1.1.4 format
cmc->fme: do you know anything about this sort of combining character and keyboard traversal? It *seems* to work in 1.1.4: on opening the .sxc, three presses of "->" take the cursor from the left to the right of the full sentence, while in 1.9.106 it takes six.
yeah, works in the stock 1.1.4 from openoffice.org and not in a stock 1.9.106.
Created attachment 26856 [details] a simple standalone testcase for icu
Created attachment 26857 [details] build script
1.9.106 output (icu 2.6) is...
    Character Boundaries...
    ----- forward: -----------
    0 1 |AACD|
    1 3 |AACD|
    3 4 |AACD|
    4 5 |AACD|
    5 6 |AACD|
    6 7 |AACD|
while 1.1.4 (icu 2.2) output is...
    Character Boundaries...
    ----- forward: -----------
    0 1 |AACD|
    1 3 |AACD|
    3 4 |AACD|
    4 6 |AACD|
    6 7 |AACD|
i.e. 2.6 calls it 6 logical characters, while 2.2 calls it 5 characters
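For anyone who wants to see the effect without building icu, here is a rough Python sketch of how one extra break can appear. It is not the attached testcase and not icu's algorithm: it approximates "extends the previous character" by Unicode general category only, and the sample string and the names `cluster_spans`, `NARROW` and `WIDE` are made up for illustration. The real Grapheme_Extend property also lists some characters explicitly, which this ignores.

```python
import unicodedata

# Assumption: general category stands in for the real grapheme properties.
NARROW = {"Mn", "Me"}        # only non-spacing / enclosing marks extend a cluster
WIDE = {"Mn", "Me", "Mc"}    # spacing marks (e.g. many Tamil vowel signs) extend too


def cluster_spans(text, extend_categories):
    """Return (start, end) index pairs, one per logical character."""
    spans = []
    start = 0
    for i in range(1, len(text) + 1):
        # Close the current span at end of text, or when the next code point
        # is not treated as extending the previous one.
        if i == len(text) or unicodedata.category(text[i]) not in extend_categories:
            spans.append((start, i))
            start = i
    return spans


# Hypothetical sample: KA + VIRAMA (Mn) + KA + VOWEL SIGN I (Mc)
sample = "\u0b95\u0bcd\u0b95\u0bbf"

for label, cats in (("narrow", NARROW), ("wide", WIDE)):
    print(label)
    for start, end in cluster_spans(sample, cats):
        print(start, end)
```

With the narrow set the sample splits as 0 2, 2 3, 3 4 (three logical characters); with the wide set it splits as 0 2, 2 4 (two). That is the same kind of difference as the 4 5 / 5 6 versus 4 6 lines above.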
http://www.jtcsv.com/cgibin/icu-bugs?findid=1587 might be relevant
Created attachment 26860 [details] patch
Well, that patch reverts the behaviour to 1.1.X, but the current http://www.unicode.org/reports/tr29/ says that the Grapheme_Cluster_Break property values are defined in http://www.unicode.org/Public/UNIDATA/auxiliary/GraphemeBreakProperty.txt, and that list does not include the Tamil vowel signs, whereas the older http://www.unicode.org/Public/3.2-Update/DerivedCoreProperties-3.2.0.txt did. So *apparently* icu is following the spec, unless "Boundaries may be further tailored for requirements of different languages, such as the addition of “ch” for Slovak, or Indic, Thai or Tibetan character clusters." implies that it can be extended to give the patch behaviour. Dunno really.
I have created a local character breakiterator rule in i18npool for Tamil and applied the patch to the rule.
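To illustrate what a script-local tailoring means for cursor travel, here is a minimal Python sketch. It is not the i18npool rule (which is not shown in this issue) and not icu rule syntax. It assumes, purely for illustration, that spacing marks count as cluster-extending only inside the Tamil block, and the names `extends_cluster`, `next_cursor_position` and `presses_to_end` are hypothetical.

```python
import unicodedata

TAMIL_BLOCK = range(0x0B80, 0x0C00)  # Unicode Tamil block, U+0B80..U+0BFF


def extends_cluster(ch):
    """Assumed tailoring: Mn/Me always extend; Mc extends only for Tamil."""
    category = unicodedata.category(ch)
    if category in ("Mn", "Me"):
        return True
    return category == "Mc" and ord(ch) in TAMIL_BLOCK


def next_cursor_position(text, pos):
    """Index the cursor lands on after one right-arrow press from pos."""
    if pos >= len(text):
        return len(text)
    pos += 1
    # Skip over every code point that belongs to the same cluster.
    while pos < len(text) and extends_cluster(text[pos]):
        pos += 1
    return pos


def presses_to_end(text):
    """Count right-arrow presses needed to travel from start to end."""
    presses, pos = 0, 0
    while pos < len(text):
        pos = next_cursor_position(text, pos)
        presses += 1
    return presses


# Tamil KA + VOWEL SIGN I is crossed in one press under this tailoring,
# while the Devanagari equivalent is left untouched and still takes two.
print(presses_to_end("\u0b95\u0bbf"))
print(presses_to_end("\u0915\u093f"))
```

Keeping the rule local to one script limits the change to Tamil text, so the default behaviour for other scripts stays whatever the underlying break rules give.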
ready for QA. re-open issue and reassign to oc@openoffice.org
reassign to oc@openoffice.org
reset resolution to FIXED
Ready for QA. re-open issue and reassign to oc@openoffice.org
Hi Eric, please take over. re-open issue and reassign to es@openoffice.org
reassign to es@openoffice.org
Verified in CWS i18n20
Ok in src680m135