Issue 124375 - Active optional hyphens invisible in exported tagged-PDF on Mac
Summary: Active optional hyphens invisible in exported tagged-PDF on Mac
Status: CLOSED FIXED
Alias: None
Product: Writer
Classification: Application
Component: printing
Version: 4.1.0-beta
Hardware: Mac Mac OS X, all
Importance: P3 Major
Target Milestone: 4.1.0
Assignee: hdu@apache.org
QA Contact:
URL:
Keywords: regression
Depends on: 123951
Blocks:
 
Reported: 2014-03-07 10:46 UTC by Gerhard Ochsenfeld
Modified: 2017-05-20 10:35 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---
jsc: 4.1.0_release_blocker+


Attachments
optional syllabification/hyphen not shown in PDF/print (455.99 KB, image/jpeg)
2014-03-07 10:46 UTC, Gerhard Ochsenfeld
zip with 2 documents (odt + pdf) (39.21 KB, application/x-zip-compressed)
2014-03-17 08:21 UTC, Gerhard Ochsenfeld
patch to force soft-hyphen visibility with CoreText (1.10 KB, patch)
2014-03-21 13:29 UTC, hdu@apache.org
jsc: review+

Description Gerhard Ochsenfeld 2014-03-07 10:46:47 UTC
Created attachment 82813
optional syllabification/hyphen not shown in PDF/print

For "Version" I selected "4.1.0-dev" because there is no option for the latest beta, and this page forced me to choose some version. What I actually describe is a problem with AOO 4.1 beta, which I installed early this morning (2014-03-07).
As far as I could find, this problem was fixed once and has returned with AOO 4.
See attachment: I wrote syllabification!

Mac OS X 10.9.2:
Problem with the optional hyphens. Please see the jpg attachment: it shows the results with AOO 3.0.0 and AOO 4.1 beta.
With AOO 4.1 beta, NONE (!) of the optional hyphens is shown in print (or in the PDF, which is needed for print orders and book printing). Only those hyphens that are not needed, because the word is not positioned at the end of a line, should be suppressed.
Comment 1 Oliver-Rainer Wittmann 2014-03-07 11:48:10 UTC
Thanks Gerhard for pointing out that a corresponding entry for field version is missing.

Thanks Herbert for new entry '4.1.0-beta'
Comment 2 Oliver-Rainer Wittmann 2014-03-07 11:55:29 UTC
I checked the described scenario in AOO 4.1.0 Beta under Windows 7 and the optional hyphen is visible in the exported PDF.

Currently, it looks like a platform-dependent defect on OS X --> putting Herbert in CC
Comment 3 Oliver-Rainer Wittmann 2014-03-07 12:54:22 UTC
I could not reproduce the described defect under Ubuntu 11.10 (32bit) in AOO 4.1.0 Beta
Comment 4 jsc 2014-03-07 13:03:37 UTC
I have tested it on my 10.7.5 system and can't reproduce the problem with the Beta build 

OO410m14(Build:9760)  -  Rev. 1573601
2014-03-03 17:49:03 (Mon, 03 Mar 2014)

I will do a further test with my 10.9 system at home later
Comment 5 jsc 2014-03-10 07:19:41 UTC
I have tested the Beta RC at home on my 10.9 system and was not able to reproduce this problem.
Comment 6 Rainer Bielefeld 2014-03-10 10:43:00 UTC
I see a difference between 3.0.0 and 4.1 in the reporter's screenshot: the 4.1 test is "Justified". But even with "Justified" text it is NOT reproducible with a server installation of "AOO 4.1.0-Beta – German UI / German locale - [AOO410m14(Build:9760)  -  Rev. 1573601 2014-03-03 17:47:48]" on German WIN7 Home Premium (64bit), own separate user profile.

@Gerhard Ochsenfeld
It may take some time until we find a Mac tester, so please
b) Attach a sample document.odt (odf source, as short as possible) 
  If you want to attach a test kit with multiple documents zip them into
  a single testkit.zip and attach the  testkit.zip
d) Attach PDF result from sample document above
f) add information 
  f1) Whether problem persists after you have changed font to 
      something different
  f8) Whether your problem persists after you have renamed your user profile 
     (Quit Quickstart before!) before you launch AOO (please see
     <http://www.openoffice.org/development/releases

Please do not cite these hints in your reply, but cite the items like:
f7): Desktop-icon for soffice.exe / AOO File menu -> Open
Comment 7 Rainer Bielefeld 2014-03-10 10:53:04 UTC
@ Gerhard Ochsenfeld
Only for the sake of completeness: do you also see the problem in Draw and Calc with auto-hyphenation?
Comment 8 Gerhard Ochsenfeld 2014-03-17 08:21:59 UTC
Created attachment 82881
zip with 2 documents (odt + pdf)
Comment 9 Gerhard Ochsenfeld 2014-03-17 08:24:07 UTC
Comment on attachment 82881
zip with 2 documents (odt + pdf)

You asked for a document that shows the problem with the optional hyphenation. Inside the zip: the Writer document (.odt) and the PDF result.
Comment 10 hdu@apache.org 2014-03-17 14:15:07 UTC
Thanks for the sample. What's interesting about it is the use of the "Nueva Std" font, which is responsible for most of the text. Only the page number uses something different (Times New Roman). NuevaStd's 'endash' glyph is embedded but is not visible for some reason.

Acrobat Reader complains about the PDF with "Cannot extract the embedded font NuevaStd. Some characters may not display or print correctly."

It would be interesting to know whether fonts other than NuevaStd have the same problem.
Comment 11 hdu@apache.org 2014-03-17 14:34:44 UTC
Update: it is not the endash (U+2013) that causes the problem; rather, the subsetting of NuevaStd's U+00AD seems to have failed. As I don't have the font, I can't debug it.
Comment 12 Oliver-Rainer Wittmann 2014-03-20 15:46:45 UTC
Opening the PDF reveals what may be an important piece of information: it is PDF/A.

Could someone on the Mac OS X platform check the export to PDF/A with an optional hyphen?
Comment 13 jsc 2014-03-20 15:51:23 UTC
OK, Oliver noticed that the export was to PDF/A; with this option I can reproduce the problem
Comment 14 jsc 2014-03-20 15:52:09 UTC
not font specific, I used the default font
Comment 15 Rainer Bielefeld 2014-03-20 16:17:46 UTC
NOT reproducible with a server installation of "AOO 4.1.0-Beta – English UI / German locale - [AOO410m14(Build:9760)  -  Rev. 1577236 2014-03-15]" on German WIN7 Home Premium (64bit), own separate user profile, export option A-1a, 80%, 150dpi

@jsc
Reproduced with what OS and version?
Really new 4.1 problem or already a problem with 4.0?
Comment 16 jsc 2014-03-20 16:22:08 UTC
it seems to be MacOS only, I reproduced it on MacOS 10.7
Comment 17 hdu@apache.org 2014-03-21 10:05:11 UTC
It turns out that the problem happens with text laid out by CoreText when exporting to tagged PDF. The root cause is the differing interpretations of whether a soft-hyphen should be visible or not [1].

[1] http://www.cs.tut.fi/~jkorpela/shy.html

CoreText follows the modern interpretation of keeping it invisible. With Writer's "portion concept" it is impossible for lower layers to know whether a portion is at the line end. So Writer usually doesn't emit the soft-hyphen at all, but replaces it with a dash. Writer doesn't do this replacement when it exports to the tagged PDF file format, though.

To reconcile these different interpretations of what each layer of AOO and the platform should do, the solution is to either
- force Writer to emit the same Unicode regardless of whether the target is a display, a printer, a plain PDF or a tagged PDF
- or force CoreText to treat the soft-hyphen the same as a dash character
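
The mismatch described in this comment can be modelled with a small sketch. This is only an illustration in Python, not AOO code: the function names `emit_for_target` and `visible_text` are hypothetical, and the model assumes that only a soft-hyphen at the end of a line is affected.

```python
SOFT_HYPHEN = "\u00ad"   # U+00AD SOFT HYPHEN
HYPHEN_MINUS = "\u002d"  # U+002D HYPHEN-MINUS, the "dash"


def emit_for_target(line: str, tagged_pdf: bool) -> str:
    """Hypothetical model of the upper layer: what text reaches the renderer.

    For tagged PDF the soft-hyphen is passed through unchanged; for every
    other target a soft-hyphen at the line end is replaced with a dash.
    """
    if tagged_pdf:
        return line
    if line.endswith(SOFT_HYPHEN):
        return line[:-1] + HYPHEN_MINUS
    return line


def visible_text(text: str, shy_is_visible: bool) -> str:
    """Hypothetical model of the renderer: which characters end up visible.

    shy_is_visible=False models the interpretation attributed to CoreText
    above (U+00AD stays invisible); True models drawing it as a dash.
    """
    replacement = HYPHEN_MINUS if shy_is_visible else ""
    return text.replace(SOFT_HYPHEN, replacement)


line = "syllabifi" + SOFT_HYPHEN  # a line that ends at an active optional hyphen

plain = visible_text(emit_for_target(line, tagged_pdf=False), shy_is_visible=False)
tagged = visible_text(emit_for_target(line, tagged_pdf=True), shy_is_visible=False)
print(repr(plain), repr(tagged))
```

In this model the hyphen survives for the plain target because the renderer never sees U+00AD, while for the tagged target it disappears as soon as the renderer treats U+00AD as invisible.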
Comment 18 hdu@apache.org 2014-03-21 10:43:40 UTC
The link referenced above [1] points to the PDF specification's chapter on soft hyphens (14.8.2.2.3) and the conclusion is that
"Thus, in PDF, the soft hyphen has unambiguously a meaning as a visible character, i.e. the meaning presented in this document as the original one, and incompatible with its Unicode semantics."

With the incompatible semantics between the Unicode specification and the PDF specification on that soft-hyphen topic, and Writer doing its own thing, the least risky and least inconvenient solution is to have CoreText treat the soft-hyphen as a dash while still pretending to the software layers above that the soft-hyphen remains unmolested. That's a difficult balancing act but it seems to be possible.
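
One way to picture this balancing act is a substitution that happens only in a private copy handed to the layout engine. The following Python sketch is an assumption for illustration and is not the attached patch; `text_for_layout` is a hypothetical name.

```python
SOFT_HYPHEN = "\u00ad"   # U+00AD SOFT HYPHEN
HYPHEN_MINUS = "\u002d"  # U+002D HYPHEN-MINUS


def text_for_layout(text: str) -> str:
    """Return a private copy in which each U+00AD is swapped for a dash.

    Both characters are a single code point, so the copy has the same
    length as the original and character indexes map one to one.
    """
    return text.replace(SOFT_HYPHEN, HYPHEN_MINUS)


original = "hy" + SOFT_HYPHEN + "phen"
shaped = text_for_layout(original)

# The upper layers keep working with `original`, which still contains
# U+00AD; only the layout copy carries the visible dash.
print(len(original) == len(shaped), SOFT_HYPHEN in original, shaped)
```

Because the replacement is one character for one character, positions computed on the layout copy can be reported back against the original string without any index translation.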
Comment 19 hdu@apache.org 2014-03-21 13:29:28 UTC
Created attachment 82940
patch to force soft-hyphen visibility with CoreText
Comment 20 jsc 2014-03-24 07:40:09 UTC
grant showstopper flag because it is a regression
Comment 21 jsc 2014-03-24 07:41:25 UTC
Comment on attachment 82940
patch to force soft-hyphen visibility with CoreText

reviewed, the fix looks good to me
Comment 22 SVN Robot 2014-03-24 07:48:45 UTC
"hdu" committed SVN revision 1580779 into trunk:
#i124375# force soft-hyphen visibility for CoreText to meet Writer+EEng expec...
Comment 23 SVN Robot 2014-03-24 07:51:47 UTC
"hdu" committed SVN revision 1580780 into branches/AOO410:
#i124375# force soft-hyphen visibility for CoreText to meet Writer+EEng expec...
Comment 24 hdu@apache.org 2014-03-24 07:56:27 UTC
Fixed with the commits above on trunk and in the AOO 4.1 release branch.