Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing |
Summary: | combining characters in indic and keyboard traversal
---|---|---|---
Product: | gsl | Reporter: | caolanm
Component: | code | Assignee: | eric.savary
Status: | CLOSED FIXED | QA Contact: | issues@gsl <issues>
Severity: | Trivial
Priority: | P3 | CC: | frank.meies, issues, ooo
Version: | 680m104
Target Milestone: | OOo 2.0.1
Hardware: | All
OS: | Linux, all
Issue Type: | DEFECT | Latest Confirmation in: | ---
Developer Difficulty: | ---
Attachments: |
|
Description
caolanm
2005-06-01 13:00:11 UTC
Created attachment 26788 [details]
example tamil document
Created attachment 26789 [details]
sample tamil font
Created attachment 26791 [details]
a demo in 1.1.4 format
cmc->fme: know anything about this sort of combining character and keyboard traversal? It *seems* to work in 1.1.4, where on opening the .sxc three presses of "->" take us from the left to the right of the full sentence, while in 1.9.106 it takes six.
Yeah, it works in the stock 1.1.4 from openoffice.org and not in a stock 1.9.106.
Created attachment 26856 [details]
a simple standalone testcase for icu
Created attachment 26857 [details]
build script
1.9.106 output (icu 2.6) is...
Character Boundaries...
----- forward: -----------
0 1 |AACD|
1 3 |AACD|
3 4 |AACD|
4 5 |AACD|
5 6 |AACD|
6 7 |AACD|
while 1.1.4 (icu 2.2) output is...
Character Boundaries...
----- forward: -----------
0 1 |AACD|
1 3 |AACD|
3 4 |AACD|
4 6 |AACD|
6 7 |AACD|
i.e. 2.6 calls it 6 logical characters, while 2.2 calls it 5 characters.
http://www.jtcsv.com/cgibin/icu-bugs?findid=1587 might be relevant.
Created attachment 26860 [details]
patch
Well, that patch reverts the behaviour to 1.1.X, but the current http://www.unicode.org/reports/tr29/ says that the Grapheme_Cluster_Break property values are defined in http://www.unicode.org/Public/UNIDATA/auxiliary/GraphemeBreakProperty.txt, and that list does not list the Tamil vowel signs, but the older http://www.unicode.org/Public/3.2-Update/DerivedCoreProperties-3.2.0.txt did. So *apparently* icu is following the spec. Unless "Boundaries may be further tailored for requirements of different languages, such as the addition of “ch” for Slovak, or Indic, Thai or Tibetan character clusters." implies that it can be extended to give the patch behaviour. Dunno really.
I have created a local character break iterator rule in i18npool for Tamil and applied the patch to the rule.
Ready for QA.
re-open issue and reassign to oc@openoffice.org
reassign to oc@openoffice.org
reset resolution to FIXED
Hi Eric, please take over
re-open issue and reassign to es@openoffice.org
reassign to es@openoffice.org
reset resolution to FIXED
Verified in CWS i18n20
Ok in src680m135
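
For readers who want to see why the two ICU versions disagree on the number of cursor stops, here is a minimal Python sketch of the idea. It is not ICU's rule set and not the attached testcase: it only glues a code point onto the previous cluster when its Unicode general category is in a chosen set. The names `cluster_boundaries`, `NARROW`, `WIDE` and `SAMPLE` are made up for this sketch, and `SAMPLE` is a hypothetical seven-code-point Tamil sequence (a virama at index 2, a spacing vowel sign at index 5), not the text from the attachment.

```python
import unicodedata


def cluster_boundaries(text, extend_categories):
    """Return (start, end) index pairs for a simplified character iterator.

    A code point whose general category is in extend_categories is attached
    to the preceding cluster. This is a deliberately simplified model, not
    the full grapheme cluster rules.
    """
    spans = []
    start = 0
    for i in range(1, len(text) + 1):
        at_end = i == len(text)
        if at_end or unicodedata.category(text[i]) not in extend_categories:
            spans.append((start, i))
            start = i
    return spans


# Assumption: only nonspacing and enclosing marks extend a cluster.
NARROW = {"Mn", "Me"}
# Assumption: spacing combining marks (e.g. Tamil vowel signs) extend it too.
WIDE = {"Mn", "Me", "Mc"}

# Hypothetical sample: A, KA, VIRAMA, KA, MA, VOWEL SIGN AA, LA.
SAMPLE = "\u0b85\u0b95\u0bcd\u0b95\u0bae\u0bbe\u0bb2"

if __name__ == "__main__":
    for label, policy in (("narrow", NARROW), ("wide", WIDE)):
        spans = cluster_boundaries(SAMPLE, policy)
        print(label, len(spans), spans)
```

Under the narrow policy the vowel sign at index 5 becomes its own cursor stop, which mirrors the extra `4 5` / `5 6` lines in the icu 2.6 output; under the wide policy it joins the preceding consonant, mirroring the `4 6` line in the icu 2.2 output.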