Hello, I'm converting some HSSF-specific code to use the generic SS interfaces. I keep getting NPEs from XSSFPicture methods (called through the Picture interface) because the pictures have no anchors: the anchors are null. The same generic code works fine on HSSF. I also need the locations of the images, so getAllPictures() alone doesn't cut it.

    import java.io.*;
    import org.apache.poi.ss.usermodel.*;
    import org.apache.poi.xssf.usermodel.*;

    FileInputStream file = new FileInputStream(new File("c:\\temp\\test.xlsx"));
    Workbook wb = WorkbookFactory.create(file);
    Sheet sheet = wb.getSheetAt(0);
    XSSFDrawing drawing = ((XSSFSheet) sheet).createDrawingPatriarch();

    // Get the shapes
    for (XSSFShape shape : drawing.getShapes()) {
        if (shape instanceof Picture) {
            Picture picture = (Picture) shape;
            ClientAnchor anchor = picture.getPreferredSize(); // Fails on XSSF, NPE
        }
    }
Created attachment 29077: Test Excel file

It contains 2 pictures of a cute walrus (our swamp soccer team logo :) ), in PNG and JPEG format. Both result in null anchors.
Created attachment 29440: Temporary fix

I ran into the same problem, tried to fix it, and the fix seems to work in my case. Attached is a temporary fix for the problem. To use it, replace XSSFDrawing.class and XSSFPicture.class in poi-ooxml-3.8-20120326.jar.

These are the changes I made (based on version poi-ooxml-3.8-20120326.jar), in folder src\ooxml\java\org\apache\poi\xssf\usermodel:

File: XSSFPicture.java
  New constructor: protected XSSFPicture(final XSSFDrawing drawing, final CTPicture ctPicture, final CTTwoCellAnchor anchor)
    Description: also assigns the anchor to the protected class variable.
  Updated function: public XSSFPictureData getPictureData()
    Changes: before looping over the drawing.relations collection to find the part, use getDrawing().getRelationById(blipId) to find the docPart. I did this because part.getPackageRelationship().getId() sometimes seems to return a wrong relation id.

File: XSSFDrawing.java
  Updated function: public List<XSSFShape> getShapes()
    Changes: if the object is a CTPicture and its parent XML object is a CTTwoCellAnchor, add a new XSSFPicture built with the anchor info.
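The change to the lookup order in getPictureData() can be illustrated with a small stand-alone model. This is only a sketch: Part and Drawing below are hypothetical stand-ins, not POI classes, and the relation ids and file names are made up. It assumes, as described above, that the id a part reports for itself may differ from the id under which the drawing registered it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RelationLookupSketch {

    /** Hypothetical stand-in for a document part (not a POI class). */
    static final class Part {
        final String name;
        final String recordedRelId; // the id the part reports for itself

        Part(String name, String recordedRelId) {
            this.name = name;
            this.recordedRelId = recordedRelId;
        }
    }

    /** Hypothetical stand-in for a drawing and its relation table. */
    static final class Drawing {
        private final Map<String, Part> byId = new LinkedHashMap<>();

        void addRelation(String relId, Part part) {
            byId.put(relId, part);
        }

        Part getRelationById(String relId) {
            return byId.get(relId);
        }

        List<Part> getRelations() {
            return new ArrayList<>(byId.values());
        }
    }

    /** Scan all parts and compare the id each part reports for itself. */
    static Part findByScanning(Drawing drawing, String blipId) {
        for (Part part : drawing.getRelations()) {
            if (blipId.equals(part.recordedRelId)) {
                return part;
            }
        }
        return null;
    }

    /** Look the id up in the drawing's own table first, then fall back to scanning. */
    static Part findWithDirectLookup(Drawing drawing, String blipId) {
        Part part = drawing.getRelationById(blipId);
        return part != null ? part : findByScanning(drawing, blipId);
    }

    public static void main(String[] args) {
        Drawing drawing = new Drawing();
        // Made-up data: the drawing registers the parts as rId2 and rId1,
        // while the parts themselves report rId2 and rId3.
        drawing.addRelation("rId2", new Part("imageB.jpeg", "rId2"));
        drawing.addRelation("rId1", new Part("imageA.jpeg", "rId3"));

        Part scanned = findByScanning(drawing, "rId1");
        Part direct = findWithDirectLookup(drawing, "rId1");
        System.out.println("scan:   " + (scanned == null ? "not found" : scanned.name));
        System.out.println("direct: " + (direct == null ? "not found" : direct.name));
    }
}
```

In this model the scan misses rId1 because no part reports that id for itself, while the direct lookup uses the drawing's own table and finds it.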
Thanks for this, I applied your patch in r1393992 with some tweaks. The problem with null anchors was generic: when reading from an existing drawing, all shapes had a null anchor (pictures, text boxes, lines, etc.). So the fix should apply to all shapes, not only to pictures.

> Update function: public XSSFPictureData getPictureData()
> Changes: Before looping to find the part in the drawing.relations
> collection, use getDrawing().getRelationById(blipId) to find the docPart.
> Doing this because seems sometimes the part.getPackageRelationship().getId()
> return a wrong relation Id.

Can you upload a unit test demonstrating that part.getPackageRelationship().getId() can be wrong?

Yegor
Created attachment 29447: Example for wrong part.getPackageRelationship().getId()

Hi Yegor,

Thanks for applying the fix. Attached is the test case (sample-image.xlsx) in which I found part.getPackageRelationship().getId() to be wrong. I used my test class TestReadExcelImage to check the problem manually.

Please look at the worksheet IMAGE4. It contains 3 pictures that reference 2 images. Getting the picture data of the first picture with picture.getPictureData() returns null:

>>output: picture.getPictureData() is null, try finding in relations

I then looped over picture.getDrawing().getRelations() and printed part.getPackageRelationship().getId() for each part:

>>output: part.getPackageRelationship().getId()=rId2
>>output: part.getPackageRelationship().getId()=rId3

I then extracted the Excel file and checked the XML file \xl\drawings\_rels\drawing3.xml.rels. The relationship ids there are rId2 and rId1, which do not match the ones returned by part.getPackageRelationship().getId(). So I think part.getPackageRelationship().getId() must be wrong.

Next I used picture.getDrawing().getRelationById(embedId) to get the corresponding picture data. This time the information returned is correct:

>>output: pictureName=/xl/media/image2.jpeg
>>output: Fm position = [20, 3]
>>output: To position = [22, 3]

Afterwards, I got the picture data of the 2nd and 3rd pictures using picture.getPictureData(). Again, the information returned is incorrect:

>>output: pictureName=/xl/media/image2.jpeg
>>output: Fm position = [2, 4]
>>output: To position = [12, 7]
>>output: pictureName=/xl/media/image2.jpeg
>>output: Fm position = [15, 5]
>>output: To position = [25, 8]

The name of the image should be image3.jpg instead of image2.jpeg.

After I applied the fix to picture.getPictureData(), the problem no longer appears.
I still don't know why part.getPackageRelationship().getId() returns a wrong value, but changing picture.getPictureData() to use getDrawing().getRelationById(blipId) seems to be a quick fix.

Hope this helps. Thanks.
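For anyone who wants to repeat the manual check of the .rels file without unzipping by hand, the ids can be listed with plain JDK classes. This is a rough sketch, not part of the patch: the class and method names are made up, and the sample XML in main uses invented targets rather than the contents of the attached file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class RelsDump {

    /** Reads a .rels part and returns relationship Id -> Target, in file order. */
    static Map<String, String> readRelationships(InputStream in) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(in);

            // "*" matches any namespace; only <Relationship> elements are selected.
            NodeList nodes = doc.getElementsByTagNameNS("*", "Relationship");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element rel = (Element) nodes.item(i);
                result.put(rel.getAttribute("Id"), rel.getAttribute("Target"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse .rels part", e);
        }
    }

    public static void main(String[] args) {
        // Invented sample content, shaped like a drawing .rels part.
        String xml =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
            + "<Relationship Id=\"rId2\" Type=\"example-type\" Target=\"../media/imageB.jpeg\"/>"
            + "<Relationship Id=\"rId1\" Type=\"example-type\" Target=\"../media/imageA.jpeg\"/>"
            + "</Relationships>";

        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (Map.Entry<String, String> e : readRelationships(in).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

To run it against a real workbook, the stream for xl/drawings/_rels/drawing3.xml.rels can be obtained from the .xlsx with java.util.zip.ZipFile and passed to readRelationships.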
Thanks for the good catch, applied in svn in r1396539.

Regards,
Yegor