Bug 52255 - RFE: XWPFPictureData should allow registration of new image formats
Alias: None
Product: POI
Classification: Unclassified
Component: XWPF
Version: 3.8-dev
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2011-11-28 13:06 UTC by pslaby
Modified: 2012-02-26 06:38 UTC
CC List: 0 users

Sample with various image formats (40.88 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2011-11-29 14:06 UTC, pslaby

Description pslaby 2011-11-28 13:06:54 UTC
I use XWPF to create docx files. In addition to the image formats hardcoded in XWPFPictureData, I need to be able to insert TIFF images, and possibly others. The hardcoded array of known image formats is incomplete. Ideally, XWPFPictureData would provide an API for listing the known formats and adding new ones. Up to 3.8 beta 3 I created the image reference myself and added it to the pictures List, but in 3.8 beta 4 the pictures field in XWPFHeaderFooter is package-private.
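For illustration only, a minimal sketch of the kind of API requested here: one call to list the known formats and one to add new ones. The class `PictureFormatRegistry` and all of its methods are hypothetical and are not part of POI; the content types passed in would be whatever the caller supplies.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical registry, NOT part of POI: maps a file extension to a content type.
final class PictureFormatRegistry {
    private final Map<String, String> contentTypesByExtension = new LinkedHashMap<>();

    // Register (or overwrite) a format. Extensions are stored lower-case, without a leading dot.
    void register(String extension, String contentType) {
        contentTypesByExtension.put(normalize(extension), contentType);
    }

    // List the extensions currently known, in registration order.
    Set<String> knownExtensions() {
        return Collections.unmodifiableSet(contentTypesByExtension.keySet());
    }

    // Content type for an extension, or null if the format was never registered.
    String contentTypeFor(String extension) {
        return contentTypesByExtension.get(normalize(extension));
    }

    private static String normalize(String extension) {
        String e = extension.toLowerCase(Locale.ROOT);
        return e.startsWith(".") ? e.substring(1) : e;
    }
}
```

With such a registry, a caller could do `registry.register("tiff", "image/tiff")` and later look the type up with `registry.contentTypeFor(".TIFF")`.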
Comment 1 Nick Burch 2011-11-28 13:42:56 UTC
We need to take care around IDs, relation types, content types etc. However, these are generally known constants.

I think the proper fix is probably just to identify all the other image types that can be supported, and add these to the constants list.

Are you able to identify which ones we're missing, and supply the appropriate types?
Comment 2 pslaby 2011-11-28 14:08:38 UTC
Personally, I am mainly missing TIFF. Of the formats listed at http://support.microsoft.com/kb/320314/en-us the following are missing:

But I am not sure whether it is possible to enhance the available formats by installing an image format filter into Word.
Comment 3 Nick Burch 2011-11-28 17:50:37 UTC
Any chance you could upload a file with all those image types in it? That'd make the work of identifying the details of the types very quick!
Comment 4 pslaby 2011-11-29 14:06:38 UTC
Created attachment 27999
Sample with various image formats

I have tried to insert a Windows Bitmap (BMP), TIFF, Encapsulated PostScript (EPS), WordPerfect Graphics (WPG) and Compressed Enhanced Metafile (EMZ) image in Word 2007; the result is attached. My version of Word did not accept PCX. I was not able to find or produce examples of the other image formats originally listed in my post.

It seems that, with the exception of TIFF, Word immediately converts the images to EMF or PNG. TIFF is the only one that remains as TIFF in the docx file and (probably) the only one missing from the XWPFPictureData constants (<Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image2.tiff"/>).
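One way to check which formats a docx actually stores is to read the image relationships from a part's .rels XML. The sketch below uses only the JDK's XML parser; the class and method names are made up for this example, and it assumes the XML has already been read from the package into a string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative helper (not POI code): lists the targets of image relationships.
final class ImageRelationships {
    // Namespace of the elements inside an OPC .rels part.
    static final String RELS_NS =
        "http://schemas.openxmlformats.org/package/2006/relationships";
    // Relationship type used for embedded images, as quoted in the comment above.
    static final String IMAGE_TYPE =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image";

    // Returns the Target attribute of every image relationship in the given XML.
    static List<String> imageTargets(String relsXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(relsXml.getBytes(StandardCharsets.UTF_8)));
            NodeList rels = doc.getElementsByTagNameNS(RELS_NS, "Relationship");
            List<String> targets = new ArrayList<>();
            for (int i = 0; i < rels.getLength(); i++) {
                Element rel = (Element) rels.item(i);
                if (IMAGE_TYPE.equals(rel.getAttribute("Type"))) {
                    targets.add(rel.getAttribute("Target"));
                }
            }
            return targets;
        } catch (Exception e) {
            // Wrap parser errors so callers do not have to handle checked exceptions.
            throw new IllegalArgumentException("Could not parse relationships XML", e);
        }
    }
}
```

The file extension of each returned target (for example `media/image2.tiff`) then shows whether Word kept the original format or converted it.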
Comment 5 Yegor Kozlov 2012-02-26 06:38:55 UTC
As of r1293748, POI supports TIFF, EPS, BMP and WPG images. The fix applies to all OOXML modules: XSSF, XWPF and XSLF.
The full list of supported formats is emf|wmf|pict|jpeg|png|dib|gif|tiff|eps|bmp|wpg. DXF, CGM and CDR are not supported by MS Office 2007 / 2010 by default: you have to install an image filter to import those formats, and internally MS Office converts them to either PNG or EMF. So if you need to import a file from AutoCAD or CorelDraw, first convert it to one of the formats supported by POI.

> It seems that, with the exception of tiff, Word immediately converts the images
> to emf or png.

Yes, MS Office does so, but if you programmatically insert images in raw form, MS Office displays them fine. I guess that internally MS Office handles all vector formats as EMF/WMF. Other formats are converted either on insertion or on the fly.
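As a rough sketch, the format list quoted in this comment can serve as a pre-check before handing a file to POI. The class below is not POI code: it treats the names in the list as file extensions, and the `jpg` and `tif` aliases are an assumption added for convenience.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative pre-check (not POI code) based on the format list in this comment.
final class SupportedImageFormats {
    static final Set<String> FORMATS = new LinkedHashSet<>(
        Arrays.asList("emf|wmf|pict|jpeg|png|dib|gif|tiff|eps|bmp|wpg".split("\\|")));

    // True if the file name ends in one of the listed formats (case-insensitive).
    static boolean isSupported(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        // Assumed aliases: common short spellings of two listed formats.
        if (ext.equals("jpg")) ext = "jpeg";
        if (ext.equals("tif")) ext = "tiff";
        return FORMATS.contains(ext);
    }
}
```

A file such as `drawing.dxf` fails this check and, as described in this comment, would need to be converted to one of the listed formats first.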