Apache OpenOffice (AOO) Bugzilla – Issue 85074
X11: wrong display of vocalized justified arabic text
Last modified: 2009-02-12 12:47:42 UTC
Justified Arabic text is misaligned in the Linux versions of Writer. Once I know how to add attachments here I will post some screenshots and some text about it.
Created attachment 50730 [details] misaligned justified arabic text
Created attachment 50731 [details] correct arabic text (not justified)
I attached 2 screenshots of the PDF export: in attachment "arabicjustifiedtext.png" you can see the misalignments in the red circles. In the second attachment, "arabictext.png", you can see the same text in its correct form without justification. These misalignments occur only under Linux and Mac OS (NeoOffice), not under Windows (but there are other problems there). I am using the native OpenOffice 2.3.1 German version under Debian Etch with Complex Text Layout enabled. For Arabic script I am using the Scheherazade font from scripts.sil.org, but the same problem occurs with other fonts. I think the "main problem" is the vocalization marks like shadda, damma, kasra etc. (e.g. the little lines above and below the letters). When some of them are deleted, the misalignments move through the text. If you have further questions please contact me, also if you cannot read my English and want it in German ;-) Thank you.
Created attachment 50732 [details] odt-file example of justified arabic text
Created attachment 50733 [details] arabic font
MRU->HDU: see attached document (and ttf font). It looks as if the justification of Arabic text on Linux is slightly wrong. IME the space characters are smaller than on Windows (cemu, please correct me if I am wrong, thanks!).
*** Issue 83220 has been marked as a duplicate of this issue. ***
Created attachment 50741 [details] arabic text correct justified under windows
The file "arabictextwinstretched.png" shows an example of the correct justification mechanism in the Windows version. Arabic text is not justified by stretching the space characters; it is justified by making the words longer. To make the words longer, some "unnamed thing" is inserted between some letters, as in the red circles in the attached file. Maybe this, together with the vocalization, causes the problems under Linux. I do not know whether the space characters are smaller or not.
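The mechanism described in the comment above (lengthening words instead of widening spaces) can be illustrated with a toy sketch. This is not OOo code: it assumes every character has width 1, uses the Unicode tatweel character U+0640 as the kashida, and relies on a simplified, incomplete list of letters that do not connect to the following letter. The names `kashida_slots` and `justify_with_kashida` are hypothetical.

```python
import unicodedata

TATWEEL = "\u0640"  # ARABIC TATWEEL, the character used for kashida

# Assumption: simplified, incomplete set of letters that do not connect
# to the following letter (hamza, alef forms, teh marbuta, dal, thal,
# reh, zain, waw). Real engines use joining data and font information.
NON_CONNECTING = set(
    "\u0621\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648"
)


def is_arabic_letter(ch):
    """True for letters (category Lo) in the basic Arabic block."""
    return "\u0600" <= ch <= "\u06FF" and unicodedata.category(ch) == "Lo"


def kashida_slots(word):
    """Indices i where a kashida may follow word[i] in this toy model."""
    return [
        i
        for i in range(len(word) - 1)
        if is_arabic_letter(word[i])
        and is_arabic_letter(word[i + 1])
        and word[i] not in NON_CONNECTING
    ]


def justify_with_kashida(words, target_len):
    """Pad a line to target_len by lengthening words, not spaces."""
    line = " ".join(words)
    extra = target_len - len(line)
    stretchable = [w for w in range(len(words)) if kashida_slots(words[w])]
    if extra <= 0 or not stretchable:
        return line
    # Hand out the missing width one kashida at a time, round robin.
    counts = {w: 0 for w in stretchable}
    for k in range(extra):
        counts[stretchable[k % len(stretchable)]] += 1
    out = []
    for w, word in enumerate(words):
        if w in counts:
            pos = kashida_slots(word)[-1] + 1  # after the last usable slot
            word = word[:pos] + TATWEEL * counts[w] + word[pos:]
        out.append(word)
    return " ".join(out)
```

With the unvocalized words "\u0643\u062A\u0628" and "\u062F\u0631\u0633" and a target length of 10, only the first word has usable slots, so all three kashidas end up inside it and the single space keeps its width. The sketch deliberately ignores vocalization marks, which is exactly where this issue reports the trouble.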
Use Format->Character to set the text language to Arabic in order to get kashida justification with today's OOo. Future versions of OOo Writer will do the justification depending on the automatically detected script type. See issue 28203 for details. Use Tools->Options->Languages to set the language to Arabic for "complex text layout". For new text this removes the need to set the language manually via the Format->Character dialog. For other CTL text related justification for BiDi scripts see issue 77976.
*** This issue has been marked as a duplicate of 28203 ***
Thank you, but I am sorry to tell you that setting the language options to Arabic and also the character formatting does not help. It seems that this only helps when no vocalization marks are used, as in the screenshots from issue 28203, but not in my case (and/or with my OOo version?), because I have such vocalization marks over nearly every letter. I tested my file without the vocalization marks and it works well (justification correct), but with vocalization marks there is no chance (misaligned justification) :-/
Created attachment 50757 [details] comparison between vocalized and nonvocalized justified arabic linux
@cemu: Great illustration of the problem, thanks! => accepting. I'm afraid this needs a looong debug session.
Thank you for accepting :), but I think the debug session may not be so long: in the Windows version this problem doesn't exist (there the justification is correct with and without vocalization), but under Windows there is another problem (see issue 85089).
Created attachment 50760 [details] ODT-File: comparison vocalized and unvocalized arabic text
The debug session will be long: On Windows the Uniscribe engine handles layout and justification; on Unix, ICU handles layout but does not implement justification, so OOo has to do it itself. The bug with kashida justified vocalized text is exactly in that area, where even good layout engines don't like to venture, even though they specialize in the layout topic...
This bug is still unfixed in the 3.0 RC2 version of OpenOffice (posted from version 2.4.1). Additionally, I think this bug should get priority P2 (and target milestone OOo 3.1), because Arabic is one of the six international languages supported by the United Nations, and incomplete language support will make it impossible to replace Microsoft Office in truly international environments and, e.g., in the Arabic, Urdu and Farsi speaking world. (At this time I cannot recommend OpenOffice to anyone who needs BiDi language features, and unfortunately I must refer them to MS Office and Windows.) Thank you very much indeed.
I split off issue 97108 for Writer using the wrong justification method for vocalized text. The remainder of this issue is that justified Arabic text should look the same on X11 platforms as it does on other platforms. I updated the issue summary accordingly.
@hdu: Could you please post current screenshots for this issue with current kashidafix builds? Unfortunately, I still haven't tried to build OOo on Linux yet (: , and probably this won't happen very soon. But I might still have some useful thoughts: First of all, I did a (WinXP) PDF export from the bugdoc of issue 97108. At first, I almost fainted. Kashidas all over the place, but not where they belong. Then I changed the font back to Arial, and everything looked fine. (Relief) The kashida glyphs seem to jump on the y-axis, which can also be noticed in the old screenshots of this issue. So probably you'll have to split this issue once more: 1. Remaining X11 kashida problems (maybe also for non-vocalized text) 2. Scheherazade font specific issues (also on other platforms, when doing glyph injection) Attaching the PDFs...
Created attachment 58772 [details] PDF export (WinXP) Sheherazade
Created attachment 58773 [details] PDF export (WinXP) Arial
> Could you please post current screenshots for this issue with current (UNX/X11) kashidafix builds?
When working on this I already found several problems in our UNX platform's cluster handling for these complex cases. E.g. in RTL contexts the vocalization glyphs were assigned to the left consonant instead of the right one, where they seem to belong. Fixing all this ICU+OOo+RTL interaction without breaking any of the other CTL scripts is the main task of this issue.
> The kashida glyphs seem to jump on the y-Axis
This seems to happen when the kashida glyph is positioned relative to the wrong glyph, which in the current problem case is a vocalization glyph. On the UNX platform this should be fixed by the improved handling of glyph clusters mentioned above; for the USP case we can probably fix it by tweaking UniscribeLayout::GetNextGlyphs() to use the proper base glyph. Separating that second task into its own issue is a good idea. I plan to open a new CWS as soon as CWS kashidafix is integrated into a released MWS.
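To make the cluster idea in the comment above more concrete, here is a minimal character-level sketch. It is not the OOo/VCL fix, which works on glyphs rather than characters; it only assumes that a cluster is a base character followed by its combining marks, and shows why a kashida has to go after the whole cluster rather than directly after the base letter. The function names are hypothetical.

```python
import unicodedata

TATWEEL = "\u0640"  # ARABIC TATWEEL, the character used for kashida


def split_clusters(text):
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch  # vocalization mark stays with its base
        else:
            clusters.append(ch)
    return clusters


def insert_kashida(text, cluster_index, count):
    """Insert kashidas after a complete cluster (base plus marks)."""
    clusters = split_clusters(text)
    clusters.insert(cluster_index + 1, TATWEEL * count)
    return "".join(clusters)


def naive_insert(text, char_index, count):
    """Insert kashidas directly after one character, ignoring marks."""
    pos = char_index + 1
    return text[:pos] + TATWEEL * count + text[pos:]


# kaf + fatha, teh + fatha, beh
word = "\u0643\u064E\u062A\u064E\u0628"
good = insert_kashida(word, 1, 2)
bad = naive_insert(word, 2, 2)
```

In `good` the fatha still follows the teh, so the teh cluster is intact. In `bad` the fatha now follows a tatweel, so the mark belongs to the kashida instead of its consonant, which is loosely analogous to the misassignment between vocalization glyphs and base glyphs described in this issue.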
Created attachment 58783 [details] screenshot on UNX with CWS kashidafix+i97108fix and improved cluster handling
@hdu: looks nice... The switching of the extra justification width to the vowel glyph I'd done for usp puzzled me, and I thought that probably something similar is needed for this case. You probably covered that now with the cluster handling.
Fixed in CWS kashidafix02.
Looks very good :-), but I found an error in the screenshot and marked it in the next attachment. The position of "la" (لا) is not correct; it is too far to the left (red line). It must be where the green line is. "La" is merged from the two letters alif and lam (a + l = la). I did not see the bug earlier, and I do not know whether this issue should be reopened or whether this is a new problem that needs a new issue. Is it possible to download an OOo development version with kashidafix somewhere for testing? Thank you.
Created attachment 59970 [details] screenshot on UNX with CWS kashidafix+i97108fix and improved cluster handling with "La" (لا) displacement
I found another error in the screenshot: one letter on top of another. Look at the orange circles in the next attachment. In the word "wa ila" (وإلى) the 2nd (alif with hamza) and the 3rd (lam) letters are on top of each other. The "word" has 4 letters. I think it is best to reopen the bug.
Created attachment 59971 [details] screenshot on UNX with CWS kashidafix+i97108fix and improved cluster handling with "La" (لا) and "wa ila" (وإلى) displacement
Created attachment 59979 [details] screenshot on UNX with CWS kashidafix02
@cemu: thanks for checking. The issue you highlighted in unx_current_2.png is already fixed, and I'm quite sure the other one in unx_current_3.png also looks much better. Please see my latest attached screenshot. The CWS kashidafix02, which has all these fixes, is about to be integrated, and the next milestone announced on http://blogs.sun.com/gullfoss/ should have it. Please test it once it becomes available. If you find remaining problems there, please file new issues for them instead of reopening this one. Once the changes for an issue have got into the code trunk, appending new aspects of the problem to the old issue makes work considerably more difficult.
Setting issue status back to latest status in CWS (http://wiki.services.openoffice.org/wiki/ChildWorkSpace#The_Process_Flow *16)
Got into OOO310_m1 with CWS kashidafix02. Closing.