Bug 47245 - HSSFSheet - RuntimeException: Unexpected missing row
Summary: HSSFSheet - RuntimeException: Unexpected missing row
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 3.5-dev
Hardware: PC Linux
: P2 normal with 9 votes (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
: 49312 54392 (view as bug list)
Depends on:
Blocks: 49312
  Show dependency tree
 
Reported: 2009-05-22 10:32 UTC by Jonathan Holloway
Modified: 2016-02-24 22:40 UTC (History)
4 users (show)



Attachments
Pricelist file from http://www.gigatronshop.com/download/Cenovnik.xls (771.50 KB, application/vnd.ms-excel)
2012-04-12 14:01 UTC, Sime Essert
Details
Problematic file to test (205.85 KB, application/vnd.ms-excel)
2014-08-18 12:30 UTC, Triqui
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jonathan Holloway 2009-05-22 10:32:16 UTC
This exception is thrown in HSSFSheet when dealing with some spreadsheets:

Exception caught
java.lang.RuntimeException: Unexpected missing row when some rows already present
	at org.apache.poi.hssf.usermodel.HSSFSheet.setPropertiesFromSheet(HSSFSheet.java:171)
	at org.apache.poi.hssf.usermodel.HSSFSheet.<init>(HSSFSheet.java:118)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:289)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:202)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:184)

Assumptions:
Relates to bug 41187
Row is null - correct
rowRecordsAlreadyPresent = true

At this point in time _rows contains:
[ROW]
    .rownumber      = 0
    .firstcol       = 0x0000
    .lastcol        = 0x0000
    .height         = 0x0000
    .optimize       = 0x0000
    .reserved       = 0x0000
    .optionflags    = 0x0160
        .outlinelvl = 0
        .colapsed   = false
        .zeroheight = true
        .badfontheig= true
        .formatted  = false
    .xfindex        = f
[/ROW]

I can't see anything out of the ordinary with the spreadsheet, if I comment our the 41887 workaround (starting on line 170) that throws a runtime exception the spreadsheet is output sucessfully, starting as follows:

SP041808
effective 04/18/08...

Please let me know if you need any more info.
Comment 1 Yegor Kozlov 2009-05-23 01:35:35 UTC
Please attach the problem file.

Yegor
Comment 2 Jonathan Holloway 2009-05-23 11:47:33 UTC
I can't provide you with the initial file because of client confidentility reasons.  I'll take a look at this bug myself on Monday.  Many thanks.
Comment 3 Yegor Kozlov 2009-05-24 23:19:02 UTC
(In reply to comment #2)
> I can't provide you with the initial file because of client confidentility
> reasons.  I'll take a look at this bug myself on Monday.  Many thanks.

Do you know the origin of this file? Which version of Excel produced it?
Open the file in Excel and do "save as".  Does the problem still persist? 

Yegor
Comment 4 Sime Essert 2012-04-12 14:01:48 UTC
Created attachment 28594 [details]
Pricelist file from http://www.gigatronshop.com/download/Cenovnik.xls

Here is one more file that produces the same reported error.
Comment 5 Triqui 2012-10-11 10:22:05 UTC
Commenting these lines fixes the issue (as has already been stated):

//   if (rowRecordsAlreadyPresent) {
//      // if at least one row record is present, all should be present.
//      throw new RuntimeException("Unexpected missing row when some rows already present");
//   }

The question is that I don't see the reason why if one row record is present, all should be.

Cells already contain row and column indexes, so creating the missing rows should be no problem, and if there's a conflict between an existing row record and a row record created internally from an existing cell, probably the exception should be thrown from HSSFRow.createCellFromRecord or HSSFRow.addCell if the cell being added already exist.
Comment 6 Triqui 2012-10-11 10:25:24 UTC
*** Bug 49312 has been marked as a duplicate of this bug. ***
Comment 7 Matthias Nott 2014-07-22 21:31:49 UTC
I don't know why this runtime exception was thrown; maybe the assumption is to correctly assume this case should never happen - but appears to do with Business Objects. As I ran into this issue, I've removed the error - not understanding it really - to see how POI behaves.

This unblocked my project; see here how I did it:

http://scn.sap.com/message/15236988#15236988
Comment 8 Deepesh Ramrakhyani 2014-08-18 08:16:23 UTC
I am also facing the above mentioned issue.

I tried using the workaround specified above and it works for me.

Does commenting this line can lead to any future complication?

Please advise.

The link from which the file creating the error can be downloaded is
http://www.bursamalaysia.com/market/derivatives/market-statistics/historical-data/

The file creating the problem is under Daily Trading Summary section of the webpage.


Thanks and Regards,
Deepesh Ramrakhyani
Comment 9 Triqui 2014-08-18 09:16:22 UTC
We have had that fix working for some time now (almost 2 years) and haven't found any issue with it.
Comment 10 Nick Burch 2014-08-18 09:18:21 UTC
It would be great if someone could confirm what the problematic section of records are. BiffViewer should be able to help with this

It would also be good if someone could run one of these problematic files through the Microsoft Binary File Format Validator to see if Microsoft consider them valid or not
Comment 11 Triqui 2014-08-18 12:10:08 UTC
The file provided (Cenovnik.xls) gives a different exception when using BiffViewer. I'm gonna attach one of the files metioned by Deepesh.
Comment 12 Triqui 2014-08-18 12:30:00 UTC
Created attachment 31927 [details]
Problematic file to test

This file contains the bug, but runs properly with BiffViewer.
What I found after debugging is that in HSSFSheet.setPropertiesFromSheet there is a line where it looks for the row 149 and the file contains 149 RowRecords.

This line:
                hrow = getRow(cval.getRow());

cval is the CellIterator. The first problematic cval is a LabelSSTRecord with row number 149.

From the javadoc:
HSSFRow org.apache.poi.hssf.usermodel.HSSFSheet.getRow(int rowIndex)

Returns the logical row (not physical) 0-based. If you ask for a row that is not defined you get a null. This is to say row 4 represents the fifth row on a sheet.

So, since there are only 149 RowRecords, and getRow(149) represents the 150th row on the sheet, which doesn't exists, then hrow becomes null and then the exception is thrown.

Maybe it's as easy as changing that line to:

hrow = getRow(cval.getRow() - 1);

[LABELSST]
    .row    = 0x0095
    .col    = 0x0002
    .xfindex= 0x0045
  .sstIndex = 0x0053
[/LABELSST]
Comment 13 Deepesh Ramrakhyani 2014-08-18 13:32:19 UTC
Hi Triqui,

Please Note that poi jar is also used for loading other .xls and .xlsx files also so how will this code change affect other files.

Because after doing the code changes you mentioned above and trying to access zeroth row by using 

Sheet s=workBook.getSheetAt(0);
Row r=s.getRow(0);
Iterator<Cell> ci=r.cellIterator();
while(ci.hasNext()){
	Cell c=ci.next();
System.out.println("Data => "+c);
			}

Please Advise.

Thanks and Regards,
Deepesh Ramrakhyani.
Comment 14 Deepesh Ramrakhyani 2014-08-18 13:39:15 UTC
Hi Triqui,

The file attached has 150 rows so 149th(Logically) exists

Plus after doing the changes you mentioned while i am trying to access the Zeroth row its giving me the content of the second row in the file.

Please have a look into it.

Thanks and Regards,
Deepesh Ramrakhyani.
Comment 15 Deepesh Ramrakhyani 2014-08-18 13:42:49 UTC
Sorry for the similar comment twice.

Thanks and Regards,
Deepesh Ramrakhyani.
Comment 16 Deepesh Ramrakhyani 2014-08-18 13:46:19 UTC
First comment i.e. Comment 13 is incomplete Please consider Comment no 14.
Sorry again.
Comment 17 Triqui 2014-08-18 14:09:30 UTC
Well, I stated my opinion before (comment 5). If no RowRecords are present, POI creates them, but if some are present and some not, then this exception says it won't do it. I can't see the logic behind this. So, to me, it looks like the best solution is to comment out that exception.

On the other hand, since I've been trying to find out what is wrong with those files, I've noticed some problems with the row number and the row index.

You are right. Adding -1 is wrong.
But the file has 149 RowRecords (from 0 to 148). They are stored in a TreeMap with a 0-based KeySet. When I debug it it has the following KeySet [0 ... 148].
Some keys may be missing if a row is empty. But, please note that there is no row 149, which is the one this cell belongs to:
[LABELSST]
    .row    = 0x0095
    .col    = 0x0002
    .xfindex= 0x0045
  .sstIndex = 0x0053
[/LABELSST]

Again, I think this is a problem with the file and somewhat unrelated to the bug. The real question is what to do when a row records is missing? If no row records exists it's ok, POI creates them. But if there are some row records but not all of them, then it's a problem. Is it really a problem?
I think POI can create the missing row records and carry on, but I would like it if someone with more knowledge would comment on this.
Comment 18 Deepesh Ramrakhyani 2014-08-18 14:35:53 UTC
The excel file on BFFValidation gives positive result so the file is proper hence according to me the fix is correct.

And I don't think that if a file has some row records but not all then it should be a problem.

Thanks and Regards,
Deepesh Ramrakhyani.
Comment 19 Dominik Stadler 2016-02-23 13:36:07 UTC
*** Bug 54392 has been marked as a duplicate of this bug. ***
Comment 20 Triqui 2016-02-23 17:18:35 UTC
The bug 41187 was what led to the modification being discussed here.
The patch can be seen here: https://svn.apache.org/viewvc/poi/trunk/src/java/org/apache/poi/hssf/usermodel/HSSFSheet.java?r1=655278&r2=655277&pathrev=655278

It's nice, and it's working. Except for the fact it throws an exception when everything suggest that it shouldn't.

The problem appears when there is one or more RowRecords [ROW] but then there are also one or more cells for a row which has not been defined/declared.
For instance, in the document attached there is a single row record:
[ROW]
    .rownumber      = 3
    .firstcol       = 0x0000
    .lastcol        = 0x0000
    .height         = 0x0960
    .optimize       = 0x0000
    .reserved       = 0x0000
    .optionflags    = 0x0140
        .outlinelvl = 0
        .colapsed   = false
        .zeroheight = false
        .badfontheig= true
        .formatted  = false
    .optionsflags2  = 0x000F
        .xfindex       = f
        .topBorder     = false
        .bottomBorder  = false
        .phoeneticGuide= false
[/ROW]

And then there are several LABELSST records for different rows without a corresponding row record.

[LABELSST]
    .row    = 0x0000
    .col    = 0x0000
    .xfindex= 0x0016
  .sstIndex = 0x0000
[/LABELSST]

[LABELSST]
    .row    = 0x0001
    .col    = 0x0000
    .xfindex= 0x0016
  .sstIndex = 0x0001
[/LABELSST]


Every comment in this bug and duplicates say that removing the line throwing the exception fixes the problem with no side effects.

I really can't see a reason to keep that exception.
Maybe Josh Micich could shed some light on this issue. I don't know if he is still involved in the project, though.
Comment 21 Triqui 2016-02-23 17:20:45 UTC
The problem still persists in 3.14-beta1
Comment 22 Triqui 2016-02-24 10:10:55 UTC
I made a mistake yesterday, sorry.

The attachment I was referring to in comment #20 is the file from bug 54392, not the one attached here.
That file contains only one [ROW] record but there are several [LABELSST] records with a different row number.

The file I attached in this bug some years ago is a bit different. There is only one error in the first sheet. It contains 149 [ROW] records, but there is a [LABELSST] record for the 150th row.
As mentioned in the code if there is at least one [ROW] record, all of them should be present too, otherwise it throws an exception. This is what we are challenging here, and I say that there is no need for that exception.
When that throw is removed, the code goes on to create a RowRecord for the 'orphan' cell and everything works perfectly.

In fact, if you open the file with LibreOffice (haven't had a chance to try it in Excel) it shows 150 rows, and I have been working with that line commented out for some years now and haven't seen a single error caused by that change.

I keep patching HSSFSheet for every release, but I would like to stop already.
Is there anything I can do to get this fixed?
Comment 23 Dominik Stadler 2016-02-24 22:40:09 UTC
Thanks for the thorough review of the problem, I have now commented out the exception and verified that our full test-suite still runs fine, so I think it is ok to change this as it makes POI behave more like LibreOffice/Excel do.

So this is applied as r1732235 now.