Bug 57494 - Discrepancy between using XSSFWorkbook.getSheetAt(0).getRowIterator() and XSSFWorkbook.getSheetAt(0).getRow(#)
Summary: Discrepancy between using XSSFWorkbook.getSheetAt(0).getRowIterator() and XSS...
Status: RESOLVED INVALID
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall (show other bugs)
Version: 3.8-FINAL
Hardware: PC All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-26 01:50 UTC by Hugo Tagle
Modified: 2015-01-28 14:53 UTC (History)
0 users



Attachments
10 row spreadsheet. Second row missing data (9.12 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2015-01-26 01:50 UTC, Hugo Tagle
Details
Four top rows contain data. 158 empty ones at the bottom (11.47 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2015-01-26 14:45 UTC, Hugo Tagle
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Hugo Tagle 2015-01-26 01:50:45 UTC
Created attachment 32396 [details]
10 row spreadsheet. Second row missing data

The spreadsheet attached is missing the first row of data after the header row. Creating a XSSFWorkbook(FileInputStream fis) object eliminates the empty row. The sheet rowsIterator contains only the header and the rest of valid rows in the spreadsheet. However, trying to access the rows using the sheet getRow(#) method throws a NullPointerException when trying getSheetAt(0).getRow(1).

Used a testcase to quickly show the issue:

        FileInputStream fis = new FileInputStream(new File("2nd_Row_Missing.xlsx"));
        
        Workbook wb = new XSSFWorkbook(fis);                

        logger.debug("rowIterator()");

        Iterator x = wb.getSheetAt(0).rowIterator();
        int i = -1;
        while (x.hasNext()) {
            i++;
            Row row = (Row) x.next();

            Iterator y = row.cellIterator();
            
            y.next(); // don't care for first numeric cell
            Cell cell = (Cell) y.next(); // I want the second one (string)
            
            logger.debug("row = " + i + ", cell = 1, value =  " + cell.getStringCellValue());
        }

        logger.debug("getRow()");
        logger.debug("row = 0, cell = 1, value = " + wb.getSheetAt(0).getRow(0).getCell(1).toString());
        logger.debug("row = 1, cell = 1, value = null; throws NullPointerException");
        logger.debug("row = 2, cell = 1, value = " + wb.getSheetAt(0).getRow(2).getCell(1).toString());
        logger.debug("row = 3, cell = 1, value = " + wb.getSheetAt(0).getRow(3).getCell(1).toString());
        logger.debug("row = 4, cell = 1, value = " + wb.getSheetAt(0).getRow(4).getCell(1).toString());
        logger.debug("row = 5, cell = 1, value = " + wb.getSheetAt(0).getRow(5).getCell(1).toString());
        logger.debug("row = 6, cell = 1, value = " + wb.getSheetAt(0).getRow(6).getCell(1).toString());
        logger.debug("row = 7, cell = 1, value = " + wb.getSheetAt(0).getRow(7).getCell(1).toString());
        logger.debug("row = 8, cell = 1, value = " + wb.getSheetAt(0).getRow(8).getCell(1).toString());
        logger.debug("row = 9, cell = 1, value = " + wb.getSheetAt(0).getRow(9).getCell(1).toString());


==Standard Output==
rowIterator()
row = 0, cell = 1, value =  ADDR
row = 1, cell = 1, value =  1220 STATE CAPITOL
row = 2, cell = 1, value =  121 S 3RD ST
row = 3, cell = 1, value =  1419 ADAMS BLVD
row = 4, cell = 1, value =  408 NORFOLK AVE
row = 5, cell = 1, value =  704 PROSPECT AVE
row = 6, cell = 1, value =  U S HWY 30 WEST
row = 7, cell = 1, value =  1600 GREAT PLAINS CENTRE
row = 8, cell = 1, value =  1623 HARNEY ST
getRow()
row = 0, cell = 1, value = ADDR
row = 1, cell = 1, value = null; throws NullPointerException
row = 2, cell = 1, value = 1220 STATE CAPITOL
row = 3, cell = 1, value = 121 S 3RD ST
row = 4, cell = 1, value = 1419 ADAMS BLVD
row = 5, cell = 1, value = 408 NORFOLK AVE
row = 6, cell = 1, value = 704 PROSPECT AVE
row = 7, cell = 1, value = U S HWY 30 WEST
row = 8, cell = 1, value = 1600 GREAT PLAINS CENTRE
row = 9, cell = 1, value = 1623 HARNEY ST
Comment 1 Andreas Beeker 2015-01-26 08:33:16 UTC
This is by design ... please have a look at the API:
http://poi.apache.org/apidocs/org/apache/poi/xssf/usermodel/XSSFSheet.html#rowIterator()

And the NullPointerException is thrown because ... well getRow(1) is null, as you mentioned it, so getCell(1) fails
Comment 2 Hugo Tagle 2015-01-26 14:45:30 UTC
Created attachment 32398 [details]
Four top rows contain data. 158 empty ones at the bottom
Comment 3 Hugo Tagle 2015-01-26 14:46:08 UTC
Great. Thx for the quick resolution.

As I understand then, the attached spreadsheet is 9 physical rows which is the result of 10 logical rows minus the empty one.

Why would a second spreadsheet with 4 physical rows and 158 empty ones at the bottom return 162? Should the 158 empty ones at the bottom NOT be considered physical ones as in the case where the empty one appeared in the second logical row?

wb.getSheetAt(0).getPhysicalNumberOfRows() = 162

I attached the second spreadsheet I'm referring to just in case.
Comment 4 Nick Burch 2015-01-26 16:26:44 UTC
POI tells you the values that Excel wrote into the file, you'd need to ask Microsoft if you're struggling to make sense of what POI is telling you the file claims to have...
Comment 5 Hugo Tagle 2015-01-28 14:53:58 UTC
Ok. That's fair. 

I assumed, from POI perspective, that if an empty row in the middle of a spreadsheet was not considered a physical row, empty rows at the end would not be either.

Nothing else from me about this bug.

Best regards.