Apache OpenOffice (AOO) Bugzilla – Issue 56348
Special letter characters in first letter position is not handled by spell checking in Writer
Last modified: 2013-02-07 22:02:03 UTC
The new Hungarian breakiterator patch (http://www.openoffice.org/issues/show_bug.cgi?id=56347) declares some Unicode characters as ALetter so that they can appear in the first letter position of a word. The breakiterator does not handle these characters there. For example, suffixed forms of the euro sign are not accepted by the breakiterator: €-val (e.g. "25 €-val" = "with 25 €" in Hungarian) is broken into two parts, € and -val, despite the ALetter declaration.
Karl, Can we do something about this in the 2.0.4 code line? Eike
I thought I reassigned this issue.. so again, Karl, Can we do something about this in the 2.0.4 code line? Eike
I use the following StarBasic code to test the hu breakiterator. It works as expected: the boundary is 0, 5.

Sub Main
    ' Create the break iterator service and select the Hungarian locale
    breakiterator = createUnoService("com.sun.star.i18n.BreakIterator")
    Dim locale As New com.sun.star.lang.Locale
    locale.Language = "hu"
    ' DICTIONARY_WORD is the word type used by the spell checker
    wordtype = com.sun.star.i18n.WordType.DICTIONARY_WORD
    word = "€-val"
    ' Ask for the boundary of the word at position 0
    boundary = breakiterator.getWordBoundary(word, 0, locale, wordtype, true)
    Print word, boundary.startPos, boundary.endPos
End Sub

There are two breakiterators. DICTIONARY_WORD is for the spell checker; for cursor traveling, you should create edit_word_hu.txt.
SBA->Karl: As discussed via mail, this issue does not fit in the OOo 2.0.4 time frame. Set target to OOo 2.x. (This means "next milestone", but there is no 2.0.5 target available yet.)
nemeth->khong: many thanks for the instruction. I will make the edit_word_hu.txt file for OOo 2.0.5.
Fixed.
ready for QA.
To test this feature, enter a word with a dash, like "re-send", set the language to Hungarian, and move the cursor over the word with Ctrl+arrow keys. It should treat "re-send" as one word. The previous version treated it as 3 words.
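The difference between "one word" and "3 words" can be sketched outside of OpenOffice. The following Python example is an illustration only: it does not use the OpenOffice break iterator or its rule files, and both regular expressions are hypothetical stand-ins for a rule set that splits on punctuation and one that accepts the dash (and the euro sign) inside a word.

```python
import re

# Hypothetical patterns, for illustration only; these are not the
# rules from edit_word_hu.txt or any other OpenOffice rule file.
SPLIT_ON_PUNCT = re.compile(r"\w+|[^\w\s]")   # punctuation is its own token
KEEP_DASH_AND_EURO = re.compile(r"[\w€-]+")   # dash and euro sign join words


def tokens(pattern, text):
    """Return the list of tokens that pattern finds in text."""
    return pattern.findall(text)


for sample in ("re-send", "25 €-val"):
    print(sample, tokens(SPLIT_ON_PUNCT, sample), tokens(KEEP_DASH_AND_EURO, sample))
```

With the first pattern "re-send" comes out as three tokens, with the second as a single token, which mirrors what Ctrl+arrow cursor travel should show before and after the fix.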
SBA: Thanks, Karl. I was looking "via spellchecker" and saw no difference. But when I do the cursor-travelling as described, I can see the difference. Verified in CWS i18n27.
SBA: Correcting target to OOo 2.1
SBA: OK in OOE680m5 Build 9093. Closed.
This bug is not resolved by the breakiterator patterns. A simpler example is the bad handling of the Unicode fi ligature. Please check the following words in the default English language, each written with the U+FB01 ligature: ﬁnite and inﬁnite. Only the second form (ligature in an inner position) is recognized by the break iterator for spell checking. This is a real bug for the new Hungarian spelling dictionary with automatic ligature handling (input character conversion).
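As a small cross-check of the two test words, the Python sketch below builds them with an explicit U+FB01 and inspects them with the standard `unicodedata` module. It only reports Unicode properties and does not reproduce Writer's word breaking; the helper names `describe` and `expand_ligatures` are made up for this example.

```python
import unicodedata

FI_LIGATURE = "\ufb01"  # LATIN SMALL LIGATURE FI


def describe(word):
    """Return (code point, general category) for each character in word."""
    return [(f"U+{ord(ch):04X}", unicodedata.category(ch)) for ch in word]


def expand_ligatures(word):
    """Apply NFKC normalization, which maps U+FB01 to the letters 'fi'."""
    return unicodedata.normalize("NFKC", word)


leading = FI_LIGATURE + "nite"       # ligature in first letter position
inner = "in" + FI_LIGATURE + "nite"  # ligature in inner position

for w in (leading, inner):
    print(describe(w), expand_ligatures(w))
```

U+FB01 has the general category Ll (lowercase letter) in both positions, so nothing in its Unicode properties explains why only the inner position would be accepted.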
Created attachment 67833: Unicode ligatures are not recognized in first letter position by the word breaking algorithm for spell checking
This is a Writer-specific bug. Impress handles these Unicode characters well in spell checking.
Summary: -> in Writer
tl->nemeth,sba: Multiple different problems are mentioned in this issue. :-( As far as I can see, everything except the ligature problem mentioned in the posting from 'Mon Feb 15' is already fixed. If that is true and the ligature problem is the only item still missing, please either close this issue or set it as a duplicate of issue 113785, which was used to fix the ligature problem only.
Assigned to the default contact. rbircher > sba: feel free to reassign to yourself. @all: This issue may be solved. Can someone tell more about it?