Bug 58325 - XSSFDrawing.getShapes() returns zero if sheet has more than one embedded OLE object
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF
Version: 3.12-FINAL
Hardware: PC All
Importance: P2 major
Target Milestone: ---
Assignee: POI Developers List
Reported: 2015-09-03 13:29 UTC by ltamura
Modified: 2017-06-04 19:05 UTC
Description ltamura 2015-09-03 13:29:35 UTC
I am trying to find out the position (sheet, row, column) of each embedded file in my workbook.

I am testing my code with two different files, both created and edited exclusively in Excel 2010:
1) lt.xlsx : only one embedded file in a single sheet
2) db.xlsx : three embedded files in a single sheet

Code:
	import java.io.FileInputStream;
	import java.io.IOException;
	import java.util.List;
	import org.apache.poi.EncryptedDocumentException;
	import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
	import org.apache.poi.ss.usermodel.Workbook;
	import org.apache.poi.ss.usermodel.WorkbookFactory;
	import org.apache.poi.xssf.usermodel.*;

	public static void bugcheck(String path) throws EncryptedDocumentException, InvalidFormatException, IOException {
		// try-with-resources closes both the stream and the workbook
		try (FileInputStream in = new FileInputStream(path);
				Workbook wbook = WorkbookFactory.create(in)) {
			if (!(wbook instanceof XSSFWorkbook)) {
				return; // only .xlsx workbooks are checked
			}
			// test each sheet
			for (int n = 0; n < wbook.getNumberOfSheets(); n++) {
				XSSFSheet sheet = ((XSSFWorkbook) wbook).getSheetAt(n);
				System.out.print("sheet " + sheet.getSheetName() + " - ");
				XSSFDrawing drawing = sheet.getDrawingPatriarch();
				//drawing = sheet.createDrawingPatriarch();

				List<XSSFShape> shapes = drawing.getShapes();
				System.out.println("drawing.getShapes().size() = " + shapes.size());
				for (XSSFShape shape : shapes) {
					// anchor of a top-level shape: the cell range it covers
					XSSFClientAnchor anchor = (XSSFClientAnchor) shape.getAnchor();
					System.out.println("Col1:" + anchor.getCol1());
					System.out.println("Col2:" + anchor.getCol2());
					System.out.println("Row1:" + anchor.getRow1());
					System.out.println("Row2:" + anchor.getRow2());
				}
			}
		}
	}


The code prints the following:
Testing: lt.xlsx
sheet MetasNM001 - drawing.getShapes().size() = 1
Col1:1
Col2:1
Row1:2
Row2:2

Testing: db.xlsx
sheet MetasNM001 - drawing.getShapes().size() = 0
sheet Plan1 - drawing.getShapes().size() = 0


The tested files are attached to this bug submission.
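
When getShapes() reports zero, one way to confirm that the embedded files are really present in the workbook is to inspect the .xlsx container directly, since it is an ordinary zip archive. The following is only a rough sketch using the JDK, not POI. It assumes the embedded objects live under the "xl/embeddings/" prefix, which is where Excel conventionally writes them but is not guaranteed for every producer. The class name, the prefix constant and the demoArchive() helper are made up for illustration; demoArchive() builds a tiny stand-in archive, not a valid workbook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class EmbeddedPartLister {
    // Assumption: Excel conventionally stores embedded objects under this prefix.
    static final String EMBEDDINGS_PREFIX = "xl/embeddings/";

    /** Returns the names of all zip entries that look like embedded objects. */
    static List<String> listEmbeddedParts(InputStream xlsx) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory() && entry.getName().startsWith(EMBEDDINGS_PREFIX)) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }

    /** Hypothetical helper: builds a small stand-in archive for the demo. */
    static byte[] demoArchive() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            String[] entryNames = {
                "xl/workbook.xml",
                "xl/embeddings/oleObject1.bin",
                "xl/embeddings/oleObject2.bin"
            };
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(new byte[] {0}); // placeholder content
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Replace the demo archive with a FileInputStream to inspect a real file.
        System.out.println(listEmbeddedParts(new ByteArrayInputStream(demoArchive())));
    }
}
```

If this lists several entries for db.xlsx while getShapes() still returns 0, the objects are in the file and the problem is in how the drawing part is read.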
Comment 1 ltamura 2015-09-03 13:33:40 UTC
The attachments can be found at: https://dl.dropboxusercontent.com/u/24010836/excel%20files.zip
Comment 2 Dominik Stadler 2016-01-02 21:14:14 UTC
Some initial analysis:
* Usually shapes are stored in elements of <xdr:twoCellAnchor editAs="oneCell">
* The provided document with multiple shapes instead wraps them in a VML/DrawingML compatibility structure, "AlternateContent":
<mc:AlternateContent xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006">
    <mc:Choice Requires="a14" xmlns:a14="http://schemas.microsoft.com/office/drawing/2010/main">
      <xdr:twoCellAnchor editAs="oneCell">

So in order to support these types of documents, we would need to add support for the AlternateContent section. However, the version of the XML Schema that we use does not seem to include it. For a related description, see section 2.17.4, "Roundtripping Alternate Content", in the "WordprocessingML Reference Material" spec.

I have added a disabled unit test in r1722665 which can be used as reproducer.
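
To illustrate the difference described in the analysis above, here is a small sketch that counts twoCellAnchor elements sitting directly under the drawing root versus those nested inside mc:Choice. It uses only the JDK DOM parser and does not reflect how POI itself reads drawings. The class name and the SAMPLE document are hypothetical; SAMPLE is a hand-written, heavily simplified drawing, and the xdr namespace URI is assumed to be the standard spreadsheetDrawing one.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class AlternateContentProbe {
    static final String XDR_NS =
            "http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing";
    static final String MC_NS =
            "http://schemas.openxmlformats.org/markup-compatibility/2006";

    // Hand-written sample: one plain anchor and one wrapped in AlternateContent.
    static final String SAMPLE =
            "<xdr:wsDr xmlns:xdr='" + XDR_NS + "' xmlns:mc='" + MC_NS + "'>"
            + "<xdr:twoCellAnchor editAs='oneCell'/>"
            + "<mc:AlternateContent>"
            + "<mc:Choice Requires='a14'"
            + " xmlns:a14='http://schemas.microsoft.com/office/drawing/2010/main'>"
            + "<xdr:twoCellAnchor editAs='oneCell'/>"
            + "</mc:Choice>"
            + "<mc:Fallback/>"
            + "</mc:AlternateContent>"
            + "</xdr:wsDr>";

    /** Returns {anchors directly under the root, anchors nested in mc:Choice}. */
    static int[] countAnchors(String drawingXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the *NS lookups below
        Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(drawingXml.getBytes(StandardCharsets.UTF_8)));

        NodeList anchors = doc.getElementsByTagNameNS(XDR_NS, "twoCellAnchor");
        int direct = 0;
        int inChoice = 0;
        for (int i = 0; i < anchors.getLength(); i++) {
            Node parent = anchors.item(i).getParentNode();
            if (parent.isSameNode(doc.getDocumentElement())) {
                direct++;
            } else if (MC_NS.equals(parent.getNamespaceURI())
                    && "Choice".equals(parent.getLocalName())) {
                inChoice++;
            }
            // Anchors elsewhere (for example under mc:Fallback) are not counted.
        }
        return new int[] {direct, inChoice};
    }

    public static void main(String[] args) throws Exception {
        int[] counts = countAnchors(SAMPLE);
        System.out.println("direct=" + counts[0] + ", inChoice=" + counts[1]);
    }
}
```

A reader that only looks at direct children of the root would see one anchor in this sample and miss the wrapped one, which matches the symptom reported for db.xlsx.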
Comment 3 PJ Fanning 2017-06-04 16:34:43 UTC
The tests in TestUnfixedBugs.java for this test case are passing now.
Comment 4 PJ Fanning 2017-06-04 16:51:18 UTC
https://github.com/apache/poi/pull/57 enables the tests
Comment 5 Javen O'Neal 2017-06-04 19:03:09 UTC
This appears to have been fixed.
Moved disabled unit tests into production in r1797600 and r1797602.

Thanks for the help!