Apache OpenOffice (AOO) Bugzilla – Issue 104735
PDF files sizes too large, filter options have no effect
Last modified: 2013-02-24 21:07:32 UTC
*****************************************
I'm unsure what issue type to set for this, but PDF files exported from the API are too large. Here is a comparison to other exports:

Word Template File             75264 bytes
Word Export                    28729 bytes
Word Print to Adobe (B&W)      23638 bytes
Word Print to Adobe (Color)    27390 bytes
OOo Writer Export             136327 bytes
OOo Print to Adobe (Color)     87062 bytes

The above examples all use a Word Template File (dot). The files below use OTT files previously converted from DOT files.
*****************************************

I'm unable to reduce the size of my PDF files. Below is code and comparison sizes. BTW, I'm testing with OOo 2.4.1.

Without any filter data information:

    PropertyValue[] storeProps = new PropertyValue[3];
    storeProps[0] = new PropertyValue();
    storeProps[0].Name = "FilterName";
    storeProps[0].Value = "writer_pdf_Export";
    storeProps[1] = new PropertyValue();
    storeProps[1].Name = "Pages";
    storeProps[1].Value = "All";
    storeProps[2] = new PropertyValue();
    storeProps[2].Name = "Overwrite";
    storeProps[2].Value = Boolean.TRUE;

With compression:

    PropertyValue[] filterData = new PropertyValue[5];
    filterData[0] = new PropertyValue();
    filterData[0].Name = "UseLosslessCompression";
    filterData[0].Value = Boolean.FALSE;
    filterData[1] = new PropertyValue();
    filterData[1].Name = "Quality";
    filterData[1].Value = new Integer(50);
    filterData[2] = new PropertyValue();
    filterData[2].Name = "ReduceImageResolution";
    filterData[2].Value = Boolean.TRUE;
    filterData[3] = new PropertyValue();
    filterData[3].Name = "MaxImageResolution";
    filterData[3].Value = new Integer(150);
    filterData[4] = new PropertyValue();
    filterData[4].Name = "ExportFormFields";
    filterData[4].Value = Boolean.FALSE;

    PropertyValue[] storeProps = new PropertyValue[4];
    storeProps[0] = new PropertyValue();
    storeProps[0].Name = "FilterName";
    storeProps[0].Value = "writer_pdf_Export";
    storeProps[1] = new PropertyValue();
    storeProps[1].Name = "Pages";
    storeProps[1].Value = "All";
    storeProps[2] = new PropertyValue();
    storeProps[2].Name = "Overwrite";
    storeProps[2].Value = Boolean.TRUE;
    storeProps[3] = new PropertyValue();
    storeProps[3].Name = "FilterData";
    storeProps[3].Value = filterData;

File #   Compressed (bytes)   Uncompressed (bytes)   Original OTT (bytes)
1                  73,639                 71,807                 23,476
2                  67,858                 66,726                 41,727
3                  56,849                 56,849                 23,478
4                  66,551                 65,994                 13,454
5                  71,648                 71,648                 25,728
6                  60,724                 60,724                 22,919
7                  53,569                 53,162                 14,382
8                  63,805                 63,805                 14,425
9                  61,136                 61,136                 18,023
10                 56,550                 56,550                 13,269
11                 57,676                 57,676                 12,832
12                 66,222                 66,222                 16,106
13                 71,423                 71,104                 19,140
14                 60,393                 60,393                 13,663
15                 59,455                 58,786                 12,770
16                 65,497                 65,497                 17,668
17                 61,422                 61,422                 13,459
Total:          1,074,417              1,014,221                316,519

As you can see, my files are about the same size either way. What's more striking is that my uncompressed total beats the compressed total by 60,196 bytes. So, I'm better off without compression.

All these files are single page, with the exception of three files which are 2 pages. As a comparison, I have a PDF from Apple (the Objective-C language) that is 133 pages, but only weighs in at 1,185,911 bytes.

There are 3 types of PDFs with embedded fonts:

#1
CourierNewPSMT (Embedded Subset)
    Type: TrueType
    Encoding: Built-in
TimesNewRomanPS-ItalicMT (Embedded Subset)
    Type: TrueType
    Encoding: Built-in
TimesNewRomanPSMT (Embedded Subset)
    Type: TrueType
    Encoding: Built-in

#2
Symbol
    Type: Type1
    Encoding: Built-in
    Actual Font: Symbol
    Actual Font Type: Type 1
TimesNewRomanPS-BoldMT (Embedded Subset)
    Type: TrueType
    Encoding: Built-in
TimesNewRomanPSMT (Embedded Subset)
    Type: TrueType
    Encoding: Built-in

#3
TimesNewRomanPS-BoldMT (Embedded Subset)
    Type: TrueType
    Encoding: Built-in
TimesNewRomanPSMT (Embedded Subset)
    Type: TrueType
    Encoding: Built-in

Below are my files and their type:

File #   Type
1        1
2        2
3        3
4        3
5        3
6        3
7        3
8        3
9        3
10       2
11       2
12       2
13       2
14       2
15       2
16       2
17       3
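The two property arrays in the report above differ only in the optional "FilterData" entry, so they can be built with less repetition. The following is a minimal sketch, not the reporter's code: the class name PdfExportProps and the helpers prop, filterData and storeProps are hypothetical, and the nested PropertyValue class is a stand-in for com.sun.star.beans.PropertyValue (assumed here to need only its public Name and Value fields) so that the example compiles without the UNO jars. The property names and values are the ones used in the report.

```java
import java.util.ArrayList;
import java.util.List;

public class PdfExportProps {

    // Stand-in for com.sun.star.beans.PropertyValue (assumption: only the
    // public Name and Value fields matter for this illustration).
    static class PropertyValue {
        public String Name;
        public Object Value;
    }

    // Hypothetical helper: create one name/value pair.
    static PropertyValue prop(String name, Object value) {
        PropertyValue p = new PropertyValue();
        p.Name = name;
        p.Value = value;
        return p;
    }

    // Image-compression options, with the same names as in the report.
    static PropertyValue[] filterData(int quality, int maxImageResolution) {
        return new PropertyValue[] {
            prop("UseLosslessCompression", Boolean.FALSE),
            prop("Quality", Integer.valueOf(quality)),
            prop("ReduceImageResolution", Boolean.TRUE),
            prop("MaxImageResolution", Integer.valueOf(maxImageResolution)),
            prop("ExportFormFields", Boolean.FALSE)
        };
    }

    // Store properties; pass null to omit the "FilterData" entry.
    static PropertyValue[] storeProps(PropertyValue[] filterData) {
        List<PropertyValue> props = new ArrayList<>();
        props.add(prop("FilterName", "writer_pdf_Export"));
        props.add(prop("Pages", "All"));
        props.add(prop("Overwrite", Boolean.TRUE));
        if (filterData != null) {
            props.add(prop("FilterData", filterData));
        }
        return props.toArray(new PropertyValue[0]);
    }

    public static void main(String[] args) {
        // Same settings as the "With compression" case: quality 50, 150 DPI.
        PropertyValue[] props = storeProps(filterData(50, 150));
        for (PropertyValue p : props) {
            System.out.println(p.Name);
        }
    }
}
```

With the real UNO classes, the stand-in would be replaced by an import of com.sun.star.beans.PropertyValue and the resulting array passed to the store call as in the report.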
It looks like the embedded fonts are the problem. Apparently, OOo embeds all non-generic fonts into the generated PDF file. This issue should be closed and marked invalid.

Two questions:

Does anyone have documentation on which fonts OOo does not embed on export?

Would it be worth asking for an enhancement to force exclusion of non-generic fonts, or to force conversion of fonts to their generic counterparts where one exists?

Thanks.
invalid -> closed