Apache OpenOffice (AOO) Bugzilla – Issue 71804
PDF: Wrong kashida export
Last modified: 2009-04-22 10:41:11 UTC
Kashidas (in justified alignment) are not exported correctly to pdf, see attached documents.
Created attachment 40790 [details] bugdoc
Created attachment 40791 [details] pdf
Accepting. Problem on Win only.
set target 3.x
Fixed in CWS kashidafix.
@hdu: haven't noticed that there was a separate issue for this (: seems that there are still kashida-gaps in PDF export. Attaching sample.
Created attachment 57537 [details] Sample pdf file
@hennerdrewes: thank you for checking this. Is this with the latest CWS code? Can you attach the corresponding document?
@hdu: I checked out from CVS this morning. Attaching source file. From my previous tryouts, it always seemed a good practice to change formatting a lot in the source documents (fonts, font size, line width) to discover possible problems. Therefore my source document is not the same as the previously attached PDF, but you will see the problem also in this version.
Created attachment 57539 [details] kashida document
Indeed, if there was not enough space for a full injected glyph it wasn't emitted. This is fixed now with the new commit to winlayout.cxx in CWS kashidafix. The more general problem that the extra space should be filled with something even when the minimal kashida width is way too big for that extra space is almost unsolvable. Any ideas? In the PDF case one could stretch the font matrix, but that is IMHO a horrible workaround and the effort would be quite enormous, especially compared to the benefit.
This is what I was trying to accomplish in KashidaWordFix: straightening out the character spacing as much as possible. But maybe this is all that can be done. Another horrible scenario came to my mind: applying expanded or condensed character spacing...
The stupid bug was that the glyph injection only worked correctly when the extra space was an exact multiple of the injected-glyph width. This is fixed now and works if there is reasonable room for it. AFAIK KashidaWordFix also stopped when there was not sufficient space for rearranging? The scenario you mentioned is indeed horribly complex... but thankfully it is not nearly as important to fix as the changes needed for the PDF export that we have now in that CWS.
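To illustrate why an extra space that is not an exact multiple of the injected-glyph width leaves a gap, here is a minimal sketch. It is not the winlayout.cxx code; the function name `uncoveredGap` and the assumption that only whole, non-overlapping glyphs are injected are hypothetical and only model the behaviour described above.

```cpp
// Hypothetical model (not the actual winlayout.cxx code): inject only
// whole, non-overlapping kashida glyphs into the extra space and report
// how many pixels remain uncovered.
int uncoveredGap(int extraWidth, int kashidaWidth)
{
    if (extraWidth <= 0)
        return 0;                       // nothing to fill
    if (kashidaWidth <= 0)
        return extraWidth;              // no usable glyph, everything stays empty
    int count = extraWidth / kashidaWidth;      // whole glyphs that fit
    return extraWidth - count * kashidaWidth;   // remainder shows up as a gap
}
```

With a glyph width of 20, an extra space of 40 is covered completely, while an extra space of 22 leaves 2 pixels uncovered.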
I know I am a (what they call in Hebrew) nudnik (Nervensäge)... The glyph overlapping works quite well for e.g. Arial. With TraditionalArabic (with its huge kashida glyph widths) you still get a lot of gaps.
@nudnik: ;-) @hennerdrewes: Thanks for checking! We really appreciate your expertise and your patches. When the minimal kashida width is still way too big to fill the gaps, then what to do? Uniscribe's ScriptTextOut() seems to have the exact same problem. AFAICT it was not introduced by your Kashida*Fix() rearrangement. Do you happen to know how related apps handle this? Do they use line-drawing instead?
The TraditionalArabic font reports at point size 24 a minimal kashida width of 20! To compare, other fonts (Arial, ArabicTypesetting) report 4 at the same size. ScriptTextOut has problems with justification widths larger than the glyph advance width (usually one or two extra pixels don't cause much harm) and smaller than the minimal kashida width. These cases are pretty much covered by the writer engine and by the KashidaWordFix. If we now have an extra width of, let's say, 22, we need to put out two overlapping kashida glyphs, one justified to the left and one justified to the right in this 22 pixel space, to cover the two pixel gap. I can only guess that this is what ScriptTextOut does.
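The overlapping placement described above can be sketched as follows. This is only an illustration under stated assumptions: `placeKashidas` is a hypothetical helper, offsets are measured in pixels from the left edge of the extra space, and every kashida glyph is assumed to have the same fixed width. It is not taken from winlayout.cxx or from Uniscribe.

```cpp
#include <vector>

// Hypothetical helper: return the x offsets at which kashida glyphs of a
// fixed width are drawn so that the whole extra space is covered.
// All glyphs but the last are packed from the left; the last one is
// right-aligned and may overlap its predecessor.
std::vector<int> placeKashidas(int extraWidth, int kashidaWidth)
{
    std::vector<int> offsets;
    // A space narrower than one glyph cannot be covered this way.
    if (kashidaWidth <= 0 || extraWidth < kashidaWidth)
        return offsets;
    // Number of glyphs needed, rounded up.
    int count = (extraWidth + kashidaWidth - 1) / kashidaWidth;
    for (int i = 0; i + 1 < count; ++i)
        offsets.push_back(i * kashidaWidth);
    // Right-align the last glyph so that no gap remains at the end.
    offsets.push_back(extraWidth - kashidaWidth);
    return offsets;
}
```

For an extra width of 22 and a kashida width of 20 this yields offsets 0 and 2, i.e. two glyphs overlapping by 18 pixels. An extra width below the kashida width yields no glyphs at all, which is the case the thread calls almost unsolvable.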
> ScriptTextOut has problems with justification widths larger than the glyph
> advance width (usually one or two extra pixels don't cause much harm) and
> smaller than the minimal kashida width. These cases are pretty much covered by
> the writer engine and by the KashidaWordFix.
I'm seeing gaps in the screen output for that case too (for some formattings). Did I break KashidaWordFix?
> If we now have an extra width of let's say 22, we need to put out two
> overlapping kashida glyphs, one justified to the left and one justified to the
> right in this 22 pixel space, to cover the two pixel gap.
Yes, this is done exactly that way in the current version (now resynced to m32, i.e. on SVN). The problem is if the gap is e.g. less than 5 pixels.
> I'm seeing gaps in the screen output for that case too
> (for some formattings). Did I break KashidaWordFix?
This is not good... Well this happened all the time, and I did many rounds of debug sessions to improve things. I haven't encountered the problems in the current version though. But I am not surprised that it still can happen.
> Yes, this is done exactly that way in the current version
> (now resynced to m32, i.e. on SVN).
> The problem is if the gap is e.g. less than 5 pixels.
I am not sure if I fully understand everything in the GetNextGlyphs() (yet), but why is there this if( 4*nExtraWidth >= mnMinKashidaWidth ) ? It looks like you won't get the needed overlap, if the nExtraWidth is small. Apart from the gaps the PDF export looked fine. Now I tested it again on my Vista machine, and everything turns out pretty garbled. (: Attaching sample.
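A small sketch of the condition quoted above, `4*nExtraWidth >= mnMinKashidaWidth`. Only the expression itself comes from the thread. What GetNextGlyphs() does in each branch is not shown here, so the helper names and the reading that a failing check means no glyph is injected are assumptions.

```cpp
// The comparison quoted in the thread, isolated as a predicate.
// Assumption: when it fails, no kashida glyph is injected for that space.
bool passesInjectionCheck(int nExtraWidth, int mnMinKashidaWidth)
{
    return 4 * nExtraWidth >= mnMinKashidaWidth;
}

// Smallest positive extra width that still passes the check,
// i.e. mnMinKashidaWidth / 4 rounded up (for positive widths).
int smallestPassingExtraWidth(int mnMinKashidaWidth)
{
    return (mnMinKashidaWidth + 3) / 4;
}
```

With the minimal kashida width of 20 reported for TraditionalArabic, the check first passes at an extra width of 5, which matches the "less than 5 pixels" remark. With a minimal width of 4, as reported for Arial, it already passes at 1.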
Created attachment 57606 [details] PDF export on Vista
Now also fixed for vista on the CWS kashidafix (caveat: CWS is on subversion now)
@hdu: wait a second...
Created attachment 58316 [details] small fix
@hennerdrewes: nice one, thanks! Of course glyph injection should fill the whole kashida-space even if the last injected glyph needs to overlap with its predecessor.
With this fix, the PDF output of the bugdoc looks much better. With Arial, there were still a lot of tiny but noticeable gaps. They seem to be gone completely now. With TraditionalArabic, however, there are far fewer problems, but once in a while the gaps are still there. Attaching samples.
Created attachment 58318 [details] bugdoc
Created attachment 58319 [details] result of pdf export
@hdu: for now, I cannot see a reason why this is happening. The glyph injection code seems to be fine now.
I suspect a discrepancy between the MinKashidaWidth and the actual glyph width for this font, testing...
What is strange: both kashida stretched words at the beginning of the line are expanded by exactly the same value (as expected). But the glyph to the right of the kashida is much wider in the problematic case. Maybe the advance width is incorrect (too wide)? And then the glyph is right aligned (XP) too far away from the kashida? I don't think that there is a problem with MinKashidaWidth in this case.
Created attachment 58322 [details] pdf by ghostscript
Some more thoughts: It may be interesting to look at the PDF produced by ghostscript. What I find most peculiar: I opened both PDFs in the reader and enlarged the zoom to 800%. Measuring the widths of words and glyphs, I find that both outputs match, except for the problematic word. There is definitely an additional width. Where does it come from?
The output of the two PDFs does indeed match, except that there is an offset starting at the problematic glyph (initial form of U+0644). The width of the isolated U+0644 is quite different, and the offset is in the same range as that width difference... @pl: since CWS glyphadv there is an optimization in the PDF export which might be related: The registerGlyphs() method checks the glyph width for the corresponding isolated codepoint instead of the actual glyph?
@hdu: you are mistaken. That happens only for embeddable -> type1 fonts where GF_ISCHAR is set on glyph ids anyway.
Created attachment 58334 [details] OOo pdf on Vista: surprise
@hdu: It looks as if we have to focus on the manual right alignment. On Vista everything looks perfect.
Yup, there was a thinko in the glyph injection case. This code-complexity-reduction didn't trigger reliably...
@hi/@sba: please verify on XP and Vista, please check for regressions on other platforms
Verified in CWS kashidafix.
Verified with 3.1rc1