Bug 57423 - shiftRows() produces a corrupted xlsx file
Summary: shiftRows() produces a corrupted xlsx file
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF (show other bugs)
Version: 3.11-FINAL
Hardware: PC All
: P2 blocker with 22 votes (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
: 59581 (view as bug list)
Depends on:
Blocks:
 
Reported: 2015-01-07 15:35 UTC by Andrew Wdziekonski
Modified: 2019-04-09 17:14 UTC (History)
3 users (show)



Attachments
java test code (2.76 KB, text/x-java-source)
2015-01-07 15:35 UTC, Andrew Wdziekonski
Details
xlsx file used for testing (9.21 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2015-01-07 15:37 UTC, Andrew Wdziekonski
Details
JUnit test class. (1.92 KB, text/plain)
2017-12-10 16:51 UTC, Luca
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Wdziekonski 2015-01-07 15:35:34 UTC
Created attachment 32353 [details]
java test code

XSSF implementation of Sheet.shiftRows() causes corrupted output xlsx file when the shift value is bigger than the number of row(s) being shifted. This is regardless of whether the shift value is negative or positive.

Excel 2010 on opening the output file says:
"Excel found unreadable content" and offers recovering the file by removing the unreadable content.

This can be observed in examples like the following:
negative shift of 1 row by less than -1
negative shift of 2 rows by less than -2
positive shift of 1 row by 2 or more 
positive shift of 2 rows by 3 or more

Please find java test code attached along with xlsx template file used for testing.
Comment 1 Andrew Wdziekonski 2015-01-07 15:37:07 UTC
Created attachment 32354 [details]
xlsx file used for testing
Comment 2 Dominik Stadler 2015-03-01 18:40:49 UTC
In a quick test on Linux with LibreOffice I could not reproduce this, the shifted rows looked fine with (4, 5, -3), let's try if it is reproducible on Windows with Excel...
Comment 3 Andrew Wdziekonski 2015-03-06 09:50:43 UTC
This is the behaviour I am having on windows excel 2010.
Comment 4 Dominik Stadler 2015-04-30 09:11:47 UTC
I verified that Excel complains about such a sheet. Analysis indicates that the cause is that we do not fix the order of rows in the data structure XSSFSheet->worksheet->sheetdata, which holds an array of rows. 

The shifting currently leaves this array unordered, which seems to be fine for LibreOffice/OpenOffice, but causes Excel to complain.

Unfortunately fixing seems to be a bit harder as the XmlBeans interfaces does not easily allow to change array contents without actually cloning the whole object, let's see if there is a way to do it correctly.
Comment 5 Dominik Stadler 2015-04-30 11:55:18 UTC
I have now added a testcase to class TestUnfixedBugs which verifies this bug, unfortunately this is complicated to fix because XmlBeans makes it hard to reorder things in an array...
Comment 6 Mark Murphy 2016-05-19 19:40:02 UTC
*** Bug 59581 has been marked as a duplicate of this bug. ***
Comment 7 Dominik Stadler 2016-11-08 21:27:12 UTC
Probably one of the XmlCursor.move... methods could be useful here... 

See https://xmlbeans.apache.org/docs/2.0.0/reference/org/apache/xmlbeans/XmlCursor.html#moveXml(org.apache.xmlbeans.XmlCursor)
Comment 8 Luca 2017-12-08 21:07:28 UTC
Not fixed in 3 years?
Comment 9 Javen O'Neal 2017-12-09 00:36:11 UTC
Feel free to submit a patch! Everything is open source.
Comment 10 Luca 2017-12-10 16:51:46 UTC
Created attachment 35597 [details]
JUnit test class.

Add JUnit test class.
Comment 11 Javen O'Neal 2017-12-13 05:15:03 UTC
Before:    After:
  A        A
1 a        <empty>
2 b        <empty>
3 c        c
4          a
5          b

sh.createRow(0).createCell(0).setCellValue("a");
sh.createRow(1).createCell(0).setCellValue("b");
sh.createRow(2).createCell(0).setCellValue("c");
sh.shiftRows(0, 1, 3); //move "a" and "b" 3 rows down

Output: xl/sheet1.xlsx
<?xml version="1.0" encoding="UTF-8"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <dimension ref="A3:B5"/>
  <sheetViews><sheetView workbookViewId="0" tabSelected="true"/></sheetViews>
  <sheetFormatPr defaultRowHeight="15.0"/>
  <sheetData>
    <row r="4"><c r="A4" t="s"><v>0</v></c></row> <!-- "a" -->
    <row r="5"><c r="A5" t="s"><v>1</v></c></row> <!-- "b" -->
    <row r="3"><c r="A3" t="s"><v>2</v></c></row> <!-- "c" -->
  </sheetData>
  <pageMargins bottom="0.75" footer="0.3" header="0.3" left="0.7" right="0.7" top="0.75"/>
</worksheet>
Comment 12 Javen O'Neal 2017-12-13 06:23:07 UTC
(In reply to Luca from comment #10)
> Created attachment 35597 [details]
> JUnit test class.
> 
> Add JUnit test class.

Thanks for the unit test. Applied in r1817975 using `sh.shiftRows(0, 1, 3)`
Comment 13 David Gauntt 2018-09-21 17:04:33 UTC
This bug has reappeared with version 4.0.0.  The following code demonstrates the bug, and demonstrates a workaround.  If bFixBug is set to false, Excel 2011 for Mac indicates that the destination file is corrupted.  If bFixBug is set to true, the file opens without trouble.

	public static void doUnitTest(File file) {
		final XSSFWorkbook workbook = new XSSFWorkbook();
		final XSSFSheet sheet = workbook.createSheet();
		final boolean bFixBug = false;
		final int numRows = 5;
		final int numCols = 5;

		System.out.println("doUnitTest: starting");

		for (int nRow = 0; nRow < numRows; ++nRow) {
			final XSSFRow row = sheet.createRow(nRow);
			for (int nCol = 0; nCol < numCols; ++nCol) {
				final XSSFCell cell = row.createCell(nCol);
				cell.setCellType(CellType.STRING);
				cell.setCellValue(String.format("Row %d col %d", nRow, nCol));
			}
		}

		final int nFirstRow = 3;
		final int nLastRow = 4;
		final int nNumToShift = 1;
		final int nFirstDstRow = nFirstRow + nNumToShift;
		final int nLastDstRow = nLastRow + nNumToShift;

		sheet.shiftRows(nFirstRow, nLastRow, nNumToShift);
		if (bFixBug) {
			for (int nRow = nFirstDstRow; nRow <= nLastDstRow; ++nRow) {
				final XSSFRow row = sheet.getRow(nRow);
				if (row != null) {
					String msg = "Row[rownum=" + row.getRowNum()
							+ "] contains cell(s) included in a multi-cell array formula. "
							+ "You cannot change part of an array.";
					for (Cell c : row) {
						((XSSFCell) c).updateCellReferencesForShifting(msg);
					}
				}
			}

		}

		try (OutputStream fileOut = new FileOutputStream(file)) {
			workbook.write(fileOut);
		} catch (Exception e) {
			System.err.println(e.getMessage());
		} finally {
			try {
				workbook.close();
			} catch (IOException e) {
				System.err.println(e.getMessage());
			}
		}
		System.out.println("doUnitTest: done");
	}