Issue 126883 - PDF export: Invalid characters depending on fonts
Status: CONFIRMED
Alias: None
Product: General
Classification: Code
Component: ui
Version: 4.1.2
Hardware: PC Linux 64-bit
Importance: P5 (lowest) Normal with 3 votes
Target Milestone: ---
Assignee: AOO issues mailing list
Duplicates: 127866
Reported: 2016-03-23 16:55 UTC by david.vogt
Modified: 2018-08-30 09:22 UTC
CC List: 7 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Document for reproducing the issue (9.61 KB, application/vnd.oasis.opendocument.text)
2016-03-23 16:55 UTC, david.vogt
Broken PDF (with font embedding enabled) (80.23 KB, application/pdf)
2016-03-23 16:56 UTC, david.vogt
Working PDF (font embedding disabled) (1.66 KB, application/pdf)
2016-03-23 16:56 UTC, david.vogt
Test document and PDF output coming from reproduction of the steps in Comment 11 (331.35 KB, application/zip)
2016-06-07 13:23 UTC, david.vogt

Description david.vogt 2016-03-23 16:55:22 UTC
Created attachment 85364
Document for reproducing the issue

There is an issue with 4.1.2 on Linux, where apostrophes (') are rendered as copyright signs (©) in the generated PDF output.

It seems to be related to font embedding. When embedding is enabled, the problem occurs; when it is disabled, the PDF looks clean.

This only seems to happen with some fonts, but not others. I've successfully reproduced this with the official AOO build on CentOS 6.7 with the "Helvetica" font. Other fonts did not exhibit the problem however.

I'm attaching a test document as well as the generated PDF for reference.

This *could* be related to #125012, but I'm not 100% sure, so I'm creating a new issue instead. Feel free to merge if it is indeed the same.
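
A possible mechanism for the apostrophe-to-copyright swap, offered as an assumption rather than a confirmed diagnosis: Adobe StandardEncoding, the usual built-in encoding of Type 1 text fonts, places the straight apostrophe (glyph name "quotesingle") at code 0xA9, whereas WinAnsiEncoding places the copyright sign at that same code. If the exporter wrote the built-in code but declared the font as WinAnsi, a viewer would show © where ' was meant. This would be consistent with the Builtin versus WinAnsi difference reported in comment 17. The Python sketch below uses the cp1252 codec as a stand-in for WinAnsiEncoding; the function and constant names are illustrative only.

```python
# Assumption: the exporter emits the font's built-in (StandardEncoding) code
# for the apostrophe while labelling the font as WinAnsiEncoding.
STANDARD_ENCODING_QUOTESINGLE = 0xA9  # code of "quotesingle" in Adobe StandardEncoding
WINANSI_QUOTESINGLE = 0x27            # code of "quotesingle" in WinAnsiEncoding


def shown_under_winansi(code: int) -> str:
    """Return the character a viewer displays for `code` when it applies WinAnsi.

    cp1252 is used here as an approximation of WinAnsiEncoding.
    """
    return bytes([code]).decode("cp1252")


# Correct case: the WinAnsi code for the apostrophe is written.
print(repr(shown_under_winansi(WINANSI_QUOTESINGLE)))
# Suspected faulty case: the built-in code is written but read as WinAnsi.
print(repr(shown_under_winansi(STANDARD_ENCODING_QUOTESINGLE)))
```

If this is what happens, other characters whose StandardEncoding and WinAnsi codes differ would be affected too, which may be worth checking with the test document.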
Comment 1 david.vogt 2016-03-23 16:56:05 UTC
Created attachment 85365
Broken PDF (with font embedding enabled)
Comment 2 david.vogt 2016-03-23 16:56:53 UTC
Created attachment 85366
Working PDF (font embedding disabled)
Comment 3 oooforum (fr) 2016-03-29 08:38:45 UTC
I was not able to reproduce this problem with 4.1.2 and Win7.
Are you sure that the Helvetica font is installed on your CentOS system?
Comment 4 mroe 2016-03-29 13:39:02 UTC
Confirmed with AOO 4.1.2 on Ubuntu 64-bit.
The issue does not occur with AOO 4.1.1, so it is a regression.
Comment 5 oooforum (fr) 2016-04-01 10:09:33 UTC
(In reply to mroe from comment #4)
> Confirmed with AOO 4.1.2 at Ubuntu 64bit.
The Helvetica font is not free and does not exist on Linux, so could you explain how you did this?
Comment 6 mroe 2016-04-01 17:22:17 UTC
Hmm. It's strange.
I have installed AOO 4.1.1 and 4.1.2 both with the same printer settings.
In 4.1.1 it shows me Helvetica, Times and some other fonts as installed printer fonts. But in 4.1.2 it does not. :-( (I verified it only in 4.1.1 ... Sorry.)
I need to look for the reason, so I am withdrawing the regression claim.

But I see the problem: why is there no warning for fonts that are not installed?
It is a long-standing wish of mine that the font list box show whether a font is installed (perhaps with another colour or a coloured background).
Comment 7 Kay 2016-04-01 21:06:48 UTC
(In reply to mroe from comment #6)
> Hmm. It's strange.
> I have installed AOO 4.1.1 and 4.1.2 both with the same printer settings.
> In 4.1.1 it shows me Helvetica, Times and some other fonts as installed
> printer fonts. But in 4.1.2 it does not. :-( (I verified it only in 4.1.1
> ... Sorry.)
> I must search for the reason. So I erase the regression.
> 
> But I see the problem: Why is there no warning for not installed fonts?
> It is a long time wish of mine, that the font listbox shows me if a font is
> installed or not (maybe with an other colour or a coloured background).

Are you saying that when a document specifies a font that is NOT installed on the user's system, AOO should warn the user that the font is not available?
Comment 8 mroe 2016-04-02 09:07:36 UTC
(In reply to Kay from comment #7)
> (In reply to mroe from comment #6)
> > But I see the problem: Why is there no warning for not installed fonts?
> > It is a long time wish of mine, that the font listbox shows me if a font is
> > installed or not (maybe with an other colour or a coloured background).
> 
> Are you saying you want AOO to give a warning that a document which is
> specifying a font which is NOT installed on the user's system should tell
> the user that the font used in the document is not available?

Yes. The user expects the same visual result when exporting the document as PDF, whether or not a font is embedded.

So there are two choices: AOO embeds the available font that is actually used and shows a warning, or it embeds nothing (and exports nothing) and warns that the user should disable embedding or reformat the document with available fonts.

But for the latter the user needs a visual hint that a font in use is not installed. (I searched for an existing issue on this, but it seems I sent this wish to StarDivision/Sun a long time ago.)
Comment 9 Kay 2016-04-03 21:18:30 UTC
(In reply to mroe from comment #8)
> (In reply to Kay from comment #7)
> > (In reply to mroe from comment #6)
> > > But I see the problem: Why is there no warning for not installed fonts?
> > > It is a long time wish of mine, that the font listbox shows me if a font is
> > > installed or not (maybe with an other colour or a coloured background).
> > 
> > Are you saying you want AOO to give a warning that a document which is
> > specifying a font which is NOT installed on the user's system should tell
> > the user that the font used in the document is not available?
> 
> Yes. The user expect the same visual result if one export the document as
> PDF whether a font is embedded or not.
> 
> So there are 2 choices: AOO embeds the used available font with a warning or
> it embeds nothing (and exports nothing) with the warning that the user
> should disable embedding or reformat the document with available fonts.
> 
> But for the last point the user needs a visual hint that a used font is not
> installed. (I have searched whether there exists an issue for that. But it
> seems that I have sent this wish long time ago to StarDivision/Sun.)

This MAY be possible, but we would need to assess the difficulty. Many applications work in much the same way, attempting to find the "closest" match for an unavailable font.

In any case, there is an outdated, though still useful, wiki page that delves into this very issue. The page could use some updates. I think this situation might apply to more platforms and not just *nix.

https://wiki.openoffice.org/wiki/Font-FAQ#What_is_Font_Fallback_in_OpenOffice.org_2.3F
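
To check from outside AOO whether a requested font would be substituted on Linux, one option is to ask fontconfig. This is only a rough sketch: it assumes the `fc-match` tool is installed, and AOO's own fallback logic (and any printer fonts it lists) may differ from what fontconfig reports. The helper names `is_substituted` and `best_match` are illustrative.

```python
import shutil
import subprocess
from typing import Optional


def is_substituted(requested: str, matched_families: str) -> bool:
    """Return True if none of the matched family names equals the requested one.

    fontconfig may report several comma-separated family names for one font.
    """
    wanted = requested.strip().casefold()
    names = [name.strip().casefold() for name in matched_families.split(",")]
    return wanted not in names


def best_match(requested: str) -> Optional[str]:
    """Ask fc-match which family it would use; None if the tool is unavailable or fails."""
    if shutil.which("fc-match") is None:
        return None
    done = subprocess.run(
        ["fc-match", "--format", "%{family}", requested],
        capture_output=True,
        text=True,
    )
    if done.returncode != 0:
        return None
    return done.stdout.strip()


if __name__ == "__main__":
    family = best_match("Helvetica")
    if family is None:
        print("fc-match is not available on this system")
    elif is_substituted("Helvetica", family):
        print(f"Helvetica is not installed; fontconfig would use: {family}")
    else:
        print("Helvetica appears to be installed")
```

A warning of the kind requested above could in principle be driven by a comparison like `is_substituted`, applied to each font family used in the document.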
Comment 10 orcmid 2016-04-03 21:44:15 UTC
(In reply to Kay from comment #9)
> (In reply to mroe from comment #8)
> > (In reply to Kay from comment #7)
> > > (In reply to mroe from comment #6)
> This MAY be possible but we would need to ascertain difficulty. Many
> applications work in much this same way -- attempting to find the "closest"
> match for a non-available font,.  
> 
> In any case, there is an outdated, though still useful wiki page that delves
> into this very issue. The page could use some updates. I think this
> situation might apply to more platforms and not just *nix.
> 
> https://wiki.openoffice.org/wiki/Font-FAQ#What_is_Font_Fallback_in_OpenOffice.org_2.3F

It's also a little weird in this case because PDFs are expected to have a soft substitution for classes of fonts too. So there may be more going on in how the PDFs are being produced as well, or there is a settings error in the export options.
Comment 11 david.vogt 2016-06-07 13:20:55 UTC
Hi all

I've just reproduced it on a plain CentOS 6.7 machine using RPM packages from upstream, to verify that no other influence causes the problem.

Reproduced as follows:

Preparations
============

1) Ensure Helvetica font is NOT installed
2) Install 4.1.1 packages (Source: [1])
3) Create test document with several test characters (' " ` ´), and set the font deliberately to Helvetica (despite it not being installed)


Test case 1
===========

1) Export as PDF with defaults (no font embedding, no PDF/A-1a)
2) Export as PDF with font embedding
3) Export as PDF with PDF/A-1a


Test case 2
===========

1) Uninstall all 4.1.1 packages (yum remove openoffice-*)
2) Install 4.1.2 packages (Source: [2])
3) Repeat steps from test case 1 above


Results
=======

4.1.1 - defaults      - all look good
4.1.1 - PDF/A-1a      - all look good
4.1.1 - font embedded - all look good
4.1.2 - defaults      - all look good
4.1.2 - PDF/A-1a      - broken chars
4.1.2 - font embedded - broken chars


Please let me know if you need anything more to move this to "CONFIRMED" stage.



[1]: http://downloads.sourceforge.net/project/openofficeorg.mirror/4.1.1/binaries/en-US/Apache_OpenOffice_4.1.1_Linux_x86-64_install-rpm_en-US.tar.gz
[2]: http://downloads.sourceforge.net/project/openofficeorg.mirror/4.1.2/binaries/en-US/Apache_OpenOffice_4.1.2_Linux_x86-64_install-rpm_en-US.tar.gz
Comment 12 david.vogt 2016-06-07 13:23:56 UTC
Created attachment 85565
Test document and PDF output coming from reproduction of the steps in Comment 11
Comment 13 david.vogt 2016-06-14 07:30:09 UTC
I searched the changes between 4.1.1 and 4.1.2 and found this one that could be the culprit:

   http://svn.apache.org/viewvc?view=revision&revision=1705192

I'll do some tests to verify this.
Comment 14 Oliver Sauder 2016-06-16 12:13:37 UTC
I have tested this: after reverting the change in http://svn.apache.org/viewvc?view=revision&revision=1705192, the issue disappears.

The change in revision 1705192 seems to have caused this regression.
Comment 15 Kay 2016-06-27 15:51:29 UTC
Thank you all for your work on this issue. I will investigate what r1705192 was attempting to fix and see what can be done.
Comment 16 oooforum (fr) 2016-11-03 08:53:10 UTC
(In reply to Kay from comment #15)
> Thanks you all for your work on this issue. I will investigate what r1705192
> was attempting to fix and see what can be done.
@Kay: have you found a fix?
Comment 17 c.kruk 2018-08-23 11:30:49 UTC
I checked valid and invalid PDFs using the pdffonts program. They differ in encoding: valid ones use the Builtin encoding, while invalid ones use WinAnsi. It seems I found the culprit.

$ pdffonts URW_Palladio_L-OpenOffice_4.1.1.pdf 
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
URWPalladioL-Bold                    Type 1            Builtin          yes no  yes     19  0
URWPalladioL-Roma                    Type 1            Builtin          yes no  yes     24  0
URWPalladioL-Ital                    Type 1            Builtin          yes no  yes      9  0
URWPalladioL-BoldItal                Type 1            Builtin          yes no  yes     14  0

$ pdffonts URW_Palladio_L-OpenOffice_4.1.2.pdf 
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
URWPalladioL-Bold                    Type 1            WinAnsi          yes no  yes     19  0
URWPalladioL-Roma                    Type 1            WinAnsi          yes no  yes     24  0
URWPalladioL-Ital                    Type 1            WinAnsi          yes no  yes      9  0
URWPalladioL-BoldItal                Type 1            WinAnsi          yes no  yes     14  0

$ pdffonts URW_Palladio_L-OpenOffice_4.1.5.pdf 
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
URWPalladioL-Bold                    Type 1            WinAnsi          yes no  yes     19  0
URWPalladioL-Roma                    Type 1            WinAnsi          yes no  yes     24  0
URWPalladioL-Ital                    Type 1            WinAnsi          yes no  yes      9  0
URWPalladioL-BoldItal                Type 1            WinAnsi          yes no  yes     14  0
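
The comparison above can be automated. The following Python sketch parses `pdffonts` text output and reports fonts whose encoding column differs between two listings. It assumes the column layout shown above (font names and encodings without spaces, three yes/no flags, then two numbers); the function names are illustrative and are not part of pdffonts or AOO.

```python
import re
from typing import Dict, Tuple

# One pdffonts data row: name, type (may contain a space, e.g. "Type 1"),
# encoding, three yes/no flags, then object number and generation.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<type>.+?)\s+(?P<encoding>\S+)"
    r"\s+(?:yes|no)\s+(?:yes|no)\s+(?:yes|no)\s+\d+\s+\d+\s*$"
)


def font_encodings(pdffonts_output: str) -> Dict[str, str]:
    """Map each font name to its encoding; header and separator lines do not match."""
    result = {}
    for line in pdffonts_output.splitlines():
        match = ROW.match(line)
        if match:
            result[match["name"]] = match["encoding"]
    return result


def changed_encodings(old: str, new: str) -> Dict[str, Tuple[str, str]]:
    """Return fonts present in both listings whose encoding differs."""
    before, after = font_encodings(old), font_encodings(new)
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# One row from each listing above, with whitespace collapsed for brevity.
OLD = "URWPalladioL-Bold  Type 1  Builtin  yes no  yes  19  0"
NEW = "URWPalladioL-Bold  Type 1  WinAnsi  yes no  yes  19  0"
print(changed_encodings(OLD, NEW))
```

In practice the two input strings would be the captured output of `pdffonts` run on the PDFs exported by the two AOO versions.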
Comment 18 c.kruk 2018-08-23 11:32:39 UTC
Sorry. I posted the above in the wrong thread.
Comment 19 Matthias Seidel 2018-08-23 11:39:12 UTC
(In reply to c.kruk from comment #18)
> Sorry. I posted the above in the wrong thread.

No problem, both issues are related.
Comment 20 oooforum (fr) 2018-08-30 09:22:06 UTC
*** Issue 127866 has been marked as a duplicate of this issue. ***