Apache OpenOffice (AOO) Bugzilla – Issue 10313
OO fails to create proper ToUnicode tables when exporting to PDF
Last modified: 2013-08-07 15:00:23 UTC
Proper ToUnicode tables are essential for creating workable PDF documents (documents that support searching, bookmarking, copy/paste, conversion to other formats, and TouchUp editing). When a document is exported to PDF via the built-in PDF tool, the ToUnicode tables are not created properly, which breaks this functionality.

An easy way to test whether the tables are correct: open the PDF document in Adobe Acrobat 5 ME, select a Hebrew word, right-click, and choose "create bookmark". If the bookmark title is displayed properly in Hebrew, everything is OK. If you get junk or blanks, it is broken.

To reproduce this bug:
* Create a Hebrew document in OO.
* Use the "PDF Export" button to create a PDF (attached here as bugzilla_oo.pdf).
* From the same OO document, print to Acrobat 5 ME Distiller (attached here as bugzilla_acrobat_me.pdf).
* Open both in Acrobat 5 ME and run the simple bookmark test above.
* Note the differences: the distilled document is correct, the OO native export is not.

For more information about creating ToUnicode tables, see:
Adobe Systems Incorporated, "PDF Reference, Third Edition, Version 1.4", p. 368
http://partners.adobe.com/asn/developer/acrosdk/docs/filefmtspecs/PDFReference.zip
Adobe Systems Incorporated, "ToUnicode Mapping File Tutorial", Technical Note #5411
http://partners.adobe.com/asn/developer/pdfs/tn/5411.ToUnicode.pdf
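For readers unfamiliar with the format, here is a minimal sketch of what a ToUnicode CMap looks like, modelled on the example layout in the PDF Reference. It is not taken from the OO sources; the function name build_tounicode_cmap, the 2-byte code space and the sample glyph codes are assumptions made for illustration only.

```python
def build_tounicode_cmap(glyph_to_unicode):
    """Return the text of a ToUnicode CMap for 2-byte glyph codes.

    Hypothetical helper for illustration: glyph_to_unicode maps a glyph
    code (int, 0..0xFFFF) to the Unicode string that glyph stands for.
    """
    lines = [
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def",
        "/CMapName /Adobe-Identity-UCS def",
        "/CMapType 2 def",
        "1 begincodespacerange",
        "<0000> <FFFF>",
        "endcodespacerange",
    ]
    items = sorted(glyph_to_unicode.items())
    # Assumption: sections are kept to at most 100 entries each,
    # which is the limit commonly cited for CMap files.
    for start in range(0, len(items), 100):
        chunk = items[start:start + 100]
        lines.append(f"{len(chunk)} beginbfchar")
        for code, text in chunk:
            # The destination string is written as UTF-16BE in hex.
            dst = text.encode("utf-16-be").hex().upper()
            lines.append(f"<{code:04X}> <{dst}>")
        lines.append("endbfchar")
    lines += [
        "endcmap",
        "CMapName currentdict /CMap defineresource pop",
        "end",
        "end",
    ]
    return "\n".join(lines) + "\n"


# Sample subset font: glyph 1 is HEBREW LETTER SHIN, glyph 2 is LAMED.
print(build_tounicode_cmap({1: "\u05e9", 2: "\u05dc"}))
```

If the exported file maps the Hebrew glyph codes to anything other than code points in the U+05xx Hebrew block, the bookmark test above would be expected to show junk.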
Created attachment 4162 [details] the file created directly from OO pdf export
Created attachment 4163 [details] The file created via print to adobe acrobat 5 me distiller
DL->SBA: Would you please take over?
=> new
SBA->HI: Please take over. An enhancement or a defect?
This is not a defect. The feature will not be supported in the current version.
Contrary to what hi said, ToUnicode tables are part of our PDF export. I'll have a look whether they are broken. But they cannot be that bad since searching works flawlessly.
Pasting from an OOo document also works as it should. Therefore the ToUnicode tables cannot be wrong: the embedded fonts are subsets, so the PDF viewer has no other source for the information about which glyph represents which character. Wherever your problem comes from, I doubt it is the ToUnicode tables.
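To illustrate that reasoning, here is a rough sketch of how a viewer could turn glyph codes from a subset font back into text using only the bfchar entries of a ToUnicode CMap. The names parse_bfchar and decode_glyphs are hypothetical, bfrange sections are ignored, and real viewers are certainly more involved.

```python
import re


def parse_bfchar(cmap_text):
    """Collect glyph code -> Unicode string from all bfchar sections."""
    mapping = {}
    for section in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        pairs = re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", section)
        for src, dst in pairs:
            # Destination values are UTF-16BE strings written in hex.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


def decode_glyphs(codes, mapping):
    """Map glyph codes to text; unmapped codes become U+FFFD."""
    return "".join(mapping.get(code, "\ufffd") for code in codes)


# Assumed sample data: two Hebrew letters in a subset font.
sample = "2 beginbfchar\n<0001> <05E9>\n<0002> <05DC>\nendbfchar\n"
table = parse_bfchar(sample)
print(decode_glyphs([1, 2, 3], table))
```

Glyph code 3 has no entry in the sample table, so it comes out as the replacement character, which is roughly what "junk or blanks" in a search or paste result would correspond to.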
> But they cannot be that bad
> since searching works flawlessly

Did it work properly for you with Hebrew on Acrobat ME (not Acrobat US, as it does not support proper searching in Hebrew)? Here both search and copy/paste of Hebrew are completely broken.
Created attachment 5032 [details] When exported directly from OO- search fails for Hebrew text
Created attachment 5033 [details] When exported using Acrobat ME- Search works correctly with Hebrew
fixed in vcl06
sorry, wrong issue
I tried some more: I entered "lalelu" (which undoubtedly is nonsense in Hebrew :-) ) while the Hebrew IME was active, exported the result to PDF, and tried to search in the resulting document on Windows XP as well as Mac OS X in Acrobat Reader 5.05; both were able to find "la" inside the document. I tried with my current version as well as OpenOffice 644m4. What version did you use?
pl->hi: please try to reproduce. We'll have to either reproduce or close it.
reassign
We had similar issues, which have been fixed.
.