Bug 60406 - Image extensions getting altered.
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: XWPF
Version: 3.15-FINAL
Hardware: PC All
Importance: P2 major
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2016-11-23 06:38 UTC by Subhra Jyoti Lahiri
Modified: 2016-11-25 15:16 UTC

Added the output snapshot which validates the bug. (3.20 KB, image/png)
2016-11-23 06:38 UTC, Subhra Jyoti Lahiri

Description Subhra Jyoti Lahiri 2016-11-23 06:38:42 UTC
Created attachment 34468
Added the output snapshot which validates the bug.

I am using the Apache POI library to extract images from a Word (.docx) file. The picture type (image extension) reported for an extracted image depends on how the image was embedded:
   * An image embedded via the Insert tab (Insert Object) is reported with the wrong extension.
   * An image embedded via drag and drop is reported with the correct extension.

Steps to reproduce:
1. Create a docx file with two images embedded in it.
   a. Embed the first image (say, 1.jpeg) via Insert Tab -> Insert Object -> Create from file -> Add object.
   b. Embed the second image (say, 2.png) by dragging and dropping it into the Word document.

2. Write a Java program that reads this file, extracts the embedded images, and prints each image's "image type" and "name" to the console.

Expected Result:
5 => //Document.PICTURE_TYPE_JPEG = 5
6 => //Document.PICTURE_TYPE_PNG = 6

Current Result:
2 => //Document.PICTURE_TYPE_EMF = 2
6 => //Document.PICTURE_TYPE_PNG = 6
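
To make the numeric codes above easier to read in console output, here is a minimal sketch of a lookup helper. The class PictureTypeNames and its name() method are hypothetical and not part of Apache POI; the table holds only the three codes quoted in this report (2, 5 and 6) and reports any other value as unknown.

```java
import java.util.Map;

// Hypothetical helper for this report; not an Apache POI class.
final class PictureTypeNames {

    // Only the codes quoted in this report; other picture types are left out on purpose.
    private static final Map<Integer, String> NAMES = Map.of(
            2, "EMF",   // Document.PICTURE_TYPE_EMF
            5, "JPEG",  // Document.PICTURE_TYPE_JPEG
            6, "PNG");  // Document.PICTURE_TYPE_PNG

    // Returns a readable name for a picture type code, or UNKNOWN(code) if it is not in the table.
    static String name(int pictureType) {
        return NAMES.getOrDefault(pictureType, "UNKNOWN(" + pictureType + ")");
    }

    public static void main(String[] args) {
        int[] expected = {5, 6}; // codes from "Expected Result"
        int[] current = {2, 6};  // codes from "Current Result"
        for (int i = 0; i < expected.length; i++) {
            System.out.println("image " + (i + 1)
                    + ": expected " + name(expected[i])
                    + ", got " + name(current[i]));
        }
    }
}
```

Run against the values above, the first image shows up as a JPEG/EMF mismatch while the second matches, which is the discrepancy this report describes.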

Code used for extracting the image information:

InputStream content = new BufferedInputStream(new FileInputStream(filePath));
XWPFDocument doc = new XWPFDocument(content);
List<XWPFPictureData> pics = doc.getAllPictures();
for(XWPFPictureData pic : pics)