Bug 53568

Summary: Pictures in XSSF have NULL anchors
Product: POI Reporter: Toni Helenius <toni.helenius>
Component: XSSF Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal CC: onionhead0708
Priority: P2    
Version: 3.8-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: Test Excel file
Temporary fix
Example for wrong part.getPackageRelationship().getId()

Description Toni Helenius 2012-07-19 05:49:28 UTC
Hello,

I'm converting some HSSF specific code to use the generic SS one. I keep getting NPEs from XSSFPicture functions (using the Interface, Picture) since they don't contain anchors. Anchors are null. This generic code works just fine on HSSF.

I also need the locations of the images, so the getAllImages() just doesn't cut it.

import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.*;

FileInputStream file = new FileInputStream(new File("c:\\temp\\test.xlsx"));
Workbook wb = WorkbookFactory.create(file);
Sheet sheet = wb.getSheetAt(0);

XSSFDrawing drawing = ((XSSFSheet) sheet).createDrawingPatriarch();

// Get the shapes
for (XSSFShape shape : drawing.getShapes()) {
	if (shape instanceof Picture) {
		Picture picture = (Picture) shape;
		ClientAnchor anchor = picture.getPreferredSize(); // Fails on XSSF, NPE
	}
}
Comment 1 Toni Helenius 2012-07-19 06:02:00 UTC
Created attachment 29077 [details]
Test Excel file

It contains 2 pictures of a cute walrus (our swamp soccer team logo :) ). Both PNG & JPEG. Both result in null anchors.
Comment 2 OnionHead 2012-10-03 13:08:41 UTC
Created attachment 29440 [details]
Temporary fix


As I've also encountered the same problem, I tried to fix it, and the fix seems to work in my case.

Please find attached a temporary fix for the problem.

Please replace XSSFDrawing.class and XSSFPicture.class in poi-ooxml-3.8-20120326.jar.

The changes I made are as follows (code changes are based on version poi-ooxml-3.8-20120326.jar):

In folder src\ooxml\java\org\apache\poi\xssf\usermodel:

File: XSSFPicture.java
Add new constructor: protected XSSFPicture(final XSSFDrawing drawing, final CTPicture ctPicture, final CTTwoCellAnchor anchor)
Description: also assigns the anchor to the protected class variable

Update function: public XSSFPictureData getPictureData()
Changes: Before looping over the drawing.relations collection to find the part, use getDrawing().getRelationById(blipId) to find the docPart.
This is done because part.getPackageRelationship().getId() sometimes seems to return a wrong relation Id.

File: XSSFDrawing.java
Update function: public List<XSSFShape> getShapes()
Changes: If the obj is a CTPicture and its parent XML object is a CTTwoCellAnchor, add a new XSSFPicture using the anchor info
Comment 3 Yegor Kozlov 2012-10-04 11:28:09 UTC
Thanks for this, I applied your patch in r1393992 with some tweaks. The problem with null anchors was generic: when reading from an existing drawing, all shapes had a null anchor (pictures, text boxes, lines, etc.). So the fix should apply to all shapes, not only to pictures.
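
To picture the generic problem, here is a small stand-alone model. It is only a sketch and does not use POI: AnchorModel, Anchor, AnchorElement, Shape and both read methods are hypothetical names invented for illustration, and the coordinates are made up. It shows the difference between building shapes without their enclosing anchor and handing the anchor to every shape regardless of its kind.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AnchorModel {
    /** Hypothetical stand-in for a two-cell anchor: from (col1,row1) to (col2,row2). */
    static final class Anchor {
        final int col1, row1, col2, row2;
        Anchor(int col1, int row1, int col2, int row2) {
            this.col1 = col1; this.row1 = row1; this.col2 = col2; this.row2 = row2;
        }
    }

    /** Hypothetical stand-in for one anchor element in a drawing and the shape it wraps. */
    static final class AnchorElement {
        final Anchor anchor;
        final String shapeKind; // "picture", "textbox", "line", ...
        AnchorElement(Anchor anchor, String shapeKind) {
            this.anchor = anchor; this.shapeKind = shapeKind;
        }
    }

    /** Hypothetical stand-in for the shape object handed back to the caller. */
    static final class Shape {
        final String kind;
        final Anchor anchor; // null when the reader did not pass it along
        Shape(String kind, Anchor anchor) {
            this.kind = kind; this.anchor = anchor;
        }
    }

    /** Models the reported behavior: shapes are built without their anchor. */
    static List<Shape> readWithoutAnchors(List<AnchorElement> elements) {
        List<Shape> shapes = new ArrayList<>();
        for (AnchorElement e : elements) {
            shapes.add(new Shape(e.shapeKind, null));
        }
        return shapes;
    }

    /** Models the fix: every shape, whatever its kind, receives the enclosing anchor. */
    static List<Shape> readWithAnchors(List<AnchorElement> elements) {
        List<Shape> shapes = new ArrayList<>();
        for (AnchorElement e : elements) {
            shapes.add(new Shape(e.shapeKind, e.anchor));
        }
        return shapes;
    }

    public static void main(String[] args) {
        List<AnchorElement> elements = Arrays.asList(
            new AnchorElement(new Anchor(1, 1, 4, 6), "picture"),
            new AnchorElement(new Anchor(5, 2, 8, 3), "textbox"));
        for (Shape s : readWithAnchors(elements)) {
            System.out.println(s.kind + " from [" + s.anchor.col1 + ", " + s.anchor.row1 + "]");
        }
    }
}
```

With readWithoutAnchors, any caller that dereferences shape.anchor hits an NPE, which mirrors the symptom in the original report.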


> Update function: public XSSFPictureData getPictureData()
> Changes: Before looping to find the part in the drawing.relations
> collection, use getDrawing().getRelationById(blipId) to find the docPart. 
> Doing this because seems sometimes the part.getPackageRelationship().getId()
> return a wrong relation Id.
> 


Can you upload a unit test demonstrating that part.getPackageRelationship().getId() can be wrong?

Yegor
Comment 4 OnionHead 2012-10-04 17:55:27 UTC
Created attachment 29447 [details]
Example for wrong part.getPackageRelationship().getId()

Hi, Yegor,

Thanks for applying the fix.

Please find attached the test case (sample-image.xlsx) in which I found part.getPackageRelationship().getId() to be wrong.

I use my test class TestReadExcelImage to check the problem manually.

Please check the worksheet IMAGE4.
It contains 3 pictures that reference 2 images.
When getting the pictureData of the first picture using picture.getPictureData(), a null value is returned:

>>output: picture.getPictureData() is null, try finding in relations

Then I loop over picture.getDrawing().getRelations() to print and check part.getPackageRelationship().getId():

>>output: part.getPackageRelationship().getId()=rId2
>>output: part.getPackageRelationship().getId()=rId3

I then extracted the Excel file and checked the XML file: \xl\drawings\_rels\drawing3.xml.rels

The relationship ids are rId2 and rId1, which don't match the ones returned by part.getPackageRelationship().getId().

So, I think that part.getPackageRelationship().getId() must be wrong.

I then used picture.getDrawing().getRelationById(embedId) to get the corresponding pictureData. This time, the information returned is correct:

>>output: pictureName=/xl/media/image2.jpeg
>>output: Fm position = [20, 3]
>>output: To position = [22, 3]

Afterward, I got the pictureData of the 2nd and 3rd pictures using picture.getPictureData(). Again, the information returned is incorrect:

>>output: pictureName=/xl/media/image2.jpeg
>>output: Fm position = [2, 4]
>>output: To position = [12, 7]
>>output: pictureName=/xl/media/image2.jpeg
>>output: Fm position = [15, 5]
>>output: To position = [25, 8]

The name of the image should be image3.jpg instead of image2.jpeg.

After I applied the fix to picture.getPictureData(), the problem seems to be gone.
I still don't know why part.getPackageRelationship().getId() returns a wrong value, but fixing picture.getPictureData() to use getDrawing().getRelationById(blipId) seems to be a quick fix.
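
The difference between the two lookups can be sketched with a small stand-alone model. This is not POI code: RelationLookupModel, Part, findByScan and findById are hypothetical names, and the id-to-image mapping is an assumption that is merely consistent with the output above (the first picture embeds rId1, the other two embed rId2, and the full path of image3.jpg is guessed). Why the parts report shifted ids is not modeled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RelationLookupModel {
    /** Hypothetical stand-in for a related part: its name and the id it reports for itself. */
    static final class Part {
        final String name;
        final String reportedId;
        Part(String name, String reportedId) {
            this.name = name; this.reportedId = reportedId;
        }
    }

    /** Relations keyed by the id written in the .rels file (assumed mapping). */
    static Map<String, Part> relations() {
        Map<String, Part> rels = new LinkedHashMap<>();
        rels.put("rId1", new Part("/xl/media/image2.jpeg", "rId2"));
        rels.put("rId2", new Part("/xl/media/image3.jpg", "rId3"));
        return rels;
    }

    /** Old approach: scan all parts and compare the id each part reports. */
    static String findByScan(Map<String, Part> rels, String embedId) {
        for (Part part : rels.values()) {
            if (part.reportedId.equals(embedId)) {
                return part.name;
            }
        }
        return null; // no part reports this id
    }

    /** Quick fix: look the embed id up directly in the relation map. */
    static String findById(Map<String, Part> rels, String embedId) {
        Part part = rels.get(embedId);
        return part == null ? null : part.name;
    }

    public static void main(String[] args) {
        Map<String, Part> rels = relations();
        for (String embedId : new String[] {"rId1", "rId2"}) {
            System.out.println(embedId + ": scan=" + findByScan(rels, embedId)
                    + ", byId=" + findById(rels, embedId));
        }
    }
}
```

In this model the scan finds nothing for rId1 and the wrong image for rId2, while the direct lookup resolves both, which matches the behavior I observed before and after the fix.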

Hope this helps. Thanks.
Comment 5 Yegor Kozlov 2012-10-10 10:50:21 UTC
Thanks for the good catch, applied in svn in r1396539

Regards,
Yegor
