Bug 51781 - Blank rows vs content deleted rows
Summary: Blank rows vs content deleted rows
Status: RESOLVED INFORMATIONPROVIDED
Alias: None
Product: POI
Classification: Unclassified
Component: POIFS (show other bugs)
Version: unspecified
Hardware: PC All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-07 12:37 UTC by Murthy
Modified: 2022-02-21 17:29 UTC (History)
1 user (show)



Attachments
This file have empty rows (12.80 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2022-02-03 13:42 UTC, Anupriya
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Murthy 2011-09-07 12:37:12 UTC
Hi,

We have an issue while reading the Excel file.

If a Row is deleted by right click and selecting delete from options retrieved, the row is not being retrieved while reading. (I assume this is the correct way)

But if the complete row is selected and clicked on Delete button (from keyboard) ( i mean content deletion), the row is being treated as Empty values and this row is being considered. (I assume this is wrong)

Is there anyway the later section be work similar to first way?

Thanks
Murthy
Comment 1 Nick Burch 2011-09-09 14:51:10 UTC
POI gives you the data that is actually present in the file. What excel chooses to write / not write is up to it

Look at MissingCellPolicy and MissingRowPolicy if you want more control of how empty or missing rows/cells are treated by the API
Comment 2 Anupriya 2022-02-03 13:02:38 UTC
Where is this MissingRowPolicy ? 
I could only found MissingCellPolicy. 

I am also facing the same issue. Row with BLANK Cells should not be considered in getPhysicalNumberOfRows() API call but it is being considered. 
So at the end, getLastRowNum() and getPhysicalNumberofRows() are returning same data in my case and hence not fulfilling my usecase.
Comment 3 PJ Fanning 2022-02-03 13:18:51 UTC
Can you add a test case to prove that getPhysicalNumberOfRows() is returning wrong value?

Have you looked at the XML inside the xlsx? Perhaps, there are rows with no cells or rows with cells that only have styles but blank data.
Comment 4 Anupriya 2022-02-03 13:42:52 UTC
Created attachment 38184 [details]
This file have empty rows

This file has an empty row. and total of 11 rows
So I am expecting below response :
1. getLastRowNum() --> 10 (as the count starts from 0 + it do consider the empty rows as mentioned in the doc : https://poi.apache.org/apidocs/dev/org/apache/poi/xssf/usermodel/XSSFSheet.html#getLastRowNum--) 

2. getPhysicalNumberOfRows() --> 11 (actual output) but 10 (expected) as this should not consider the row with empty data. Otherwise there is no diff in these both instead of starting value. 

Empty row is not physical row, right?

Also, while fetching data in a loop for a row, it is giving the empty row as null and so the NPE.
                Row currentRow = sheet.getRow(rowNum); // this is null for empty row.

My point is: if the row is null, why it is considered in count for physical rows.
Comment 5 Anupriya 2022-02-03 13:48:05 UTC
Hi Fanning,

Thank you for your quick response. In addition to my above comment, I would like to mention that I do understand there can be any use case where you actually want to fetch empty row (may be I can not see any use case though) but there should be any API that should return only nonEmptyRowCount.
Comment 6 PJ Fanning 2022-02-03 13:53:02 UTC
In XSSFWorkbook, we have this:

private final SortedMap<Integer, XSSFRow> _rows = new TreeMap<>();

public int getPhysicalNumberOfRows() {
    return _rows.size();
}

You have not provided a test case - just a vague description of something that gives us very little to work with. Even an xlsx file would be a start.
Comment 7 Anupriya 2022-02-03 13:56:59 UTC
I have attached the xlsx file, in attachment section. 

Following is my test case : 

@Test(expectedExceptions = TransformationException.class)
    public void ExcelRowValidationTest() throws IOException, TransformationException {
        Sheet sheet = new XSSFWorkbook(new FileInputStream(getFile(TransformationConstants.TEST_MAPPING_FILE_ROWS_INVALID))).getSheetAt(0);
        Mockito.doReturn(sheet).when(utilSpy).getExcelSheetAt(Mockito.anyString(), Mockito.anyInt());
        Assert.assertNull(utilSpy.convertExcelColumnsToMap(TransformationConstants.TEST_MAPPING_FILE_ROWS_INVALID, TransformationConstants.COLUMN_3, TransformationConstants.COLUMN_4, TransformationConstants.CALLER_ID_MAPPING));
    }


API being called up :

public Map<String, String> convertExcelColumnsToMap(String excelFileName, int keyColIndex, int valueColIndex, String callerId) throws TransformationException {

        try {
            logger.info("Entering convert from Excel to Map API");
            Map<String, String> map = new HashMap<>();
            Sheet sheet = getExcelSheetAt(excelFileName, 0);
            int rowCounter = 1; //because rowCounter in excel starts from 0

            System.out.println("last : " + sheet.getLastRowNum());
            System.out.println("physical rows : " + sheet.getPhysicalNumberOfRows());


            if (StringUtils.equalsIgnoreCase(TransformationConstants.CALLER_ID_MAPPING, callerId) && sheet.getPhysicalNumberOfRows() != TransformationConstants.FIXED_ROWS_COUNT) {
                logger.error("Number of rows in mapping definition are not as per fixed value." + TransformationConstants.FIXED_ROWS_COUNT);
                throw new TransformationException(ValidationMessages.MAPPING_FILE_INVALID_ROW_COUNT + TransformationConstants.FIXED_ROWS_COUNT);
            }
            for (int rowNum = 1; rowNum <= sheet.getLastRowNum(); rowNum++) {
                logger.info("Processing row : " + (rowNum + 1));
                Row currentRow = sheet.getRow(rowNum);

                if (currentRow != null) {
                    logger.info("This api is called up from : " + callerId);
                    //System.out.println(StringUtils.isBlank(currentRow.getCell(keyColIndex).getStringCellValue()));
                    //System.out.println(StringUtils.equals("BLANK",currentRow.getCell(keyColIndex).getCellType().toString()));
                    if (StringUtils.equalsIgnoreCase(TransformationConstants.CALLER_ID_MAPPING, callerId) && currentRow.getCell(keyColIndex) != null && !StringUtils.equals("BLANK",currentRow.getCell(keyColIndex).getCellType().toString()) && currentRow.getCell(valueColIndex) != null) {
                        logger.info("Validating the content in mapping definition");
                        if (TransformationConstants.baseElementList.contains(currentRow.getCell(keyColIndex).getStringCellValue()) && TransformationConstants.standardElementList.contains(currentRow.getCell(valueColIndex).getStringCellValue())) {
                            map.put(currentRow.getCell(keyColIndex).getStringCellValue(), currentRow.getCell(valueColIndex).getStringCellValue());
                        } else {
                            logger.error("Either Base Form element or Standard Form element is invalid");
                            throw new TransformationException(ValidationMessages.MAPPING_FILE_INVALID_ELEMENT + (rowNum + 1) + ", refer input file : " + excelFileName);
                        }
                    } else if (StringUtils.equalsIgnoreCase(TransformationConstants.CALLER_ID_FORM, callerId) && currentRow.getCell(keyColIndex) != null) {
                        logger.info("Setting the Form and mapping file name in a map");
                        String value = null;
                        if (currentRow.getCell(valueColIndex) != null) {
                            value = currentRow.getCell(valueColIndex).getStringCellValue();
                        }
                        map.put(currentRow.getCell(keyColIndex).getStringCellValue(), value);
                    } else {
                        logger.error(ValidationMessages.EMPTY_CELL + (rowNum + 1) + " refer input file : " + excelFileName);
                        throw new TransformationException(ValidationMessages.EMPTY_CELL + (rowNum + 1) + ", refer input file : " + excelFileName);
                    }
                }

                rowCounter++;
            }

            // Double check as getPhysicalNumberofRows() API is considering rows with BLANK Cell Type as well.
            if(rowCounter != TransformationConstants.FIXED_ROWS_COUNT){
                logger.error("Number of rows in mapping definition are not as per fixed value." + TransformationConstants.FIXED_ROWS_COUNT);
                throw new TransformationException(ValidationMessages.MAPPING_FILE_INVALID_ROW_COUNT + TransformationConstants.FIXED_ROWS_COUNT);
            }

            return map;
        } catch (TransformationException transformationException) {
            logger.error(transformationException.getMessage());
            throw new TransformationException(transformationException.getMessage());
        } catch (FileNotFoundException fileNotFoundException) {
            logger.error("An error occurred, file {} doesn't exist in directory", excelFileName, fileNotFoundException);
            throw new TransformationException(ValidationMessages.FILE_NOT_FOUND + excelFileName);
        } catch (IOException ioException) {
            logger.error("Some IOException occurred : " + ioException.getMessage());
            throw new TransformationException(ValidationMessages.GENERAL_EXCEPTION + excelFileName);
        } catch (Exception exception) {
            logger.error(ValidationMessages.GENERAL_CONVERSION_EXCEPTION + excelFileName);
            throw new TransformationException(ValidationMessages.GENERAL_CONVERSION_EXCEPTION + excelFileName);
        }
    }


RESULTS :

D:\openJDK\bin\java.exe -agentlib:jdwp=transport=dt_socket,address=127.0.0.1:60822,suspend=y,server=n -ea -Didea.test.cyclic.buffer.size=1048576 -javaagent:C:\Users\c-anupriya.garg\AppData\Local\JetBrains\IdeaIC2021.3\groovyHotSwap\gragent.jar -javaagent:C:\Users\c-anupriya.garg\AppData\Local\JetBrains\IdeaIC2021.3\captureAgent\debugger-agent.jar -Dfile.encoding=UTF-8 @C:\Users\c-anupriya.garg\AppData\Local\Temp\idea_arg_file180201787 com.intellij.rt.testng.RemoteTestNGStarter -usedefaultlisteners false -socket60821 @w@C:\Users\c-anupriya.garg\AppData\Local\Temp\idea_working_dirs_testng1.tmp -temp C:\Users\c-anupriya.garg\AppData\Local\Temp\idea_testng1.tmp
Connected to the target VM, address: '127.0.0.1:60822', transport: 'socket'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Users/c-anupriya.garg/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-log4j12/1.7.30/c21f55139d8141d2231214fb1feaf50a1edca95e/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Users/c-anupriya.garg/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-simple/1.7.30/e606eac955f55ecf1d8edcccba04eb8ac98088dd/slf4j-simple-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Users/c-anupriya.garg/.gradle/caches/modules-2/files-2.1/ch.qos.logback/logback-classic/1.2.3/7c4f3c474fb2c041d8028740440937705ebb473a/logback-classic-1.2.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Users/c-anupriya.garg/.gradle/caches/modules-2/files-2.1/ch.qos.logback/logback-classic/1.2.8/22d21c4dfc77adf6f2f24bf3991846792de50b48/logback-classic-1.2.8.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[TestNG] Running:
  C:\Users\c-anupriya.garg\AppData\Local\JetBrains\IdeaIC2021.3\temp-testng-customsuite.xml

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.mockito.cglib.core.ReflectUtils$2 (file:/C:/Users/c-anupriya.garg/.gradle/caches/modules-2/files-2.1/org.mockito/mockito-all/1.10.19/539df70269cc254a58cccc5d8e43286b4a73bf30/mockito-all-1.10.19.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of org.mockito.cglib.core.ReflectUtils$2
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
log4j:WARN No appenders could be found for logger (com.metricstream.appstudio.transformation.utils.FileUtils).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.


last : 10
physical rows : 11

===============================================
Default Suite
Total tests run: 1, Failures: 0, Skips: 0
===============================================

Disconnected from the target VM, address: '127.0.0.1:60822', transport: 'socket'

Process finished with exit code 0
Comment 8 PJ Fanning 2022-02-03 14:14:05 UTC
I wrote a test for your file and there are no null rows.

    @Test
    void rowCount() throws IOException {
        try (XSSFWorkbook wb1 = openSampleWorkbook("FILE_WITH_EMPTY_ROWS.xlsx")) {
            XSSFSheet sheet = wb1.getSheetAt(0);
            assertEquals(11, sheet.getPhysicalNumberOfRows());
            assertEquals(10, sheet.getLastRowNum());
            for (int i = 0; i < sheet.getPhysicalNumberOfRows(); i++) {
                XSSFRow row = sheet.getRow(i);
                assertNotNull(row);
                assertEquals(i, row.getRowNum());
                System.out.println("row " + i + " cells count: " + row.getPhysicalNumberOfCells());
            }
        }
    }

All rows have 5 cells.
Comment 9 PJ Fanning 2022-02-03 14:15:20 UTC
Sorry - I misread the output - rownum 4 (0 based count) has 0 cells but there is a row even though the cell has 0 cells.