Bug 62711 - Calling shiftRows corrupts file in POI 4.0
Summary: Calling shiftRows corrupts file in POI 4.0
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF (show other bugs)
Version: 4.0.0-FINAL
Hardware: PC All
: P2 regression with 3 votes (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-12 14:39 UTC by Norman
Modified: 2019-04-02 20:06 UTC (History)
2 users (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Norman 2018-09-12 14:39:38 UTC
Hi

This code produces a corrupt XLSX with POI 4.0 (works fine in 3.17):

ByteArrayOutputStream os = new ByteArrayOutputStream();
try (Workbook workbook = new XSSFWorkbook()) {
   Sheet worksheet = workbook.createSheet(WorkbookUtil.createSafeSheetName("Export"));
   for (int rowNum = 0; rowNum < 900; rowNum++) {
      Row row = worksheet.createRow(rowNum);
      int colNum = 0;
      for (Object cellValue : Arrays.asList(new Object(),0,0,0,0,0,0,0,0)) {
         row.createCell(colNum++);
      }
   }
   worksheet.shiftRows(0, worksheet.getLastRowNum(), 3);
   workbook.write(os);
}

If I leave out the shiftRows, everything works fine.
It doesn't matter if I write values into the cells or not, the result is the same.

running Java 8, POI 4.0, Windows 10, Excel 2016

Error when opening in Excel:
Removed Records: Cell information from /xl/worksheets/sheet1.xml part
Comment 2 Wolfgang Fahl 2019-02-13 12:54:43 UTC
while at it id love to see a proper removeRow function as outline in the stackoverflow answer. 50.000 people have viewed that question already so I assume there is quite some need.
Comment 3 bschuhmacher 2019-02-27 11:53:57 UTC
I can confirm this bug. Excel started complaining about generated files after we upgraded to 4.0. 

I've been digging to see what was wrong with the file. For Norman's example, in xl/worksheets/sheet1.xml you will find : 
    <row r="4">
      <c r="A1"/>
      <c r="B1"/>
      <c r="C1"/>
      <c r="D1"/>
      <c r="E1"/>
      <c r="F1"/>
      <c r="G1"/>
      <c r="H1"/>
      <c r="I1"/>
    </row>

Which is obviously wrong.

This looks related to this other corruption bug connected to shiftRows : https://bz.apache.org/bugzilla/show_bug.cgi?id=57423

Although I do also confirm that this was working fine in 3.17 whereas this other bug is way older.
Comment 4 Jesper Jørgensen 2019-03-04 11:24:05 UTC
It seem that in XSSFRow:

    protected void shift(int n) {
        int rownum = getRowNum() + n;
        String msg = "Row[rownum=" + getRowNum() + "] contains cell(s) included in a multi-cell array formula. " +
                "You cannot change part of an array.";
        for(Cell c : this){
            ((XSSFCell)c).updateCellReferencesForShifting(msg);
          }
        setRowNum(rownum);
    }

the setRowNum(rownum) is called after the updateCellReferencesForShifting and this causes the cells to have a reference to the old row number and this is what Excel is complaining about. This means that the row might have a row number 5 but the cells could have a cell reference that points to A1. Whether this is the entire issue I cannot say, but moving the setRowNum call up before the for loop will create a document the Excel can read and which have the rows shifted correctly.