Bug 49761 - Double.NaN can be written but not read with POI
Summary: Double.NaN can be written but not read with POI
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 3.6-FINAL
Hardware: All All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-08-17 09:13 UTC by Phil Dunlea
Modified: 2010-11-09 10:35 UTC (History)
0 users



Attachments
Contains 3 unit tests written using your framework to show the bug (12.19 KB, application/octet-stream)
2010-08-17 09:13 UTC, Phil Dunlea
Details
Spreadsheet containing NaN (4.00 KB, application/vnd.ms-excel)
2010-08-17 15:05 UTC, Phil Dunlea
Details
Excel Generated Excel file with NaN (17.00 KB, application/vnd.ms-excel)
2010-08-17 15:10 UTC, Phil Dunlea
Details
This is the xls file I tested with that threw the exception. (77.50 KB, application/vnd.ms-excel)
2010-10-13 03:11 UTC, Theodor Mazilu
Details
3 Unit Tests showing error still occurs (6.16 KB, application/octet-stream)
2010-10-13 11:42 UTC, Phil Dunlea
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Phil Dunlea 2010-08-17 09:13:27 UTC
Created attachment 25896 [details]
Contains 3 unit tests written using your framework to show the bug

Overview: 

Part of our software takes spreadsheets created by a 3rd party scientific device that are generated using POI.   These spreadsheets at times can enter the equivalent of Double.NaN. The files are written successfully, can be read by Excel, saved by excel and still work properly. However, when you try to read these files (written by POI) a RuntimeException is thrown.

This means not all files written by POI can be read by POI.



Steps to Reproduce: 

Read in any cell within an Excel file containing the equivalent of Double.NaN.

Additionally, you could also run the attached Unit Test class.  It saves Double.NaN into an excel file (which passes). It fails to read using the Event based and direct methods of reading the file. 



Actual Results: 

RuntimeException is thrown 



Expected Results: 

We expected Double.NaN to be returned when a cell containing the equivalent was encountered.



Build Date & Platform: 

Every build / platform since.

Date: Sat Oct  4 21:43:48 2008
New Revision: 701747



Additional Information:

While this bug may seem trivial, it is a bit of a blocker for our software. 

When reading in a file using POI that contains Double.NaN, the software specifically throws a RuntimeException during the initial reading that we can not recover from.  

The fix that would help us out the best would be to return Double.NaN instead of throwing the RuntimeException.  Since Double.NaN can be written by POI, you should also be able to read it.

This RuntimeException is thrown  in the following method: 

public double readDouble() {
    long valueLongBits = readLong();
    double result = Double.longBitsToDouble(valueLongBits);
    if (Double.isNaN(result)) {
        throw new RuntimeException("Did not expect to read NaN"); // (Because Excel typically doesn't write NaN
    }

    return result;
}



Log results for Unit test: This was run against 3.6

Testsuite: org.apache.poi.hssf.record.TestDoubleNotANumber
Tests run: 3, Failures: 0, Errors: 2, Time elapsed: 0.009 sec
------------- Standard Output ---------------

the sheet [1]:
------------- ---------------- ---------------

Testcase: testWriteNaNToFileSystem took 0.001 sec
Testcase: testEventBasedDoubleNaNError took 0.004 sec
	Caused an ERROR
Unable to construct record instance
org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
	at org.apache.poi.hssf.record.RecordFactory$ReflectionRecordCreator.create(RecordFactory.java:64)
	at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:263)
	at org.apache.poi.hssf.record.RecordFactoryInputStream.readNextRecord(RecordFactoryInputStream.java:270)
	at org.apache.poi.hssf.record.RecordFactoryInputStream.nextRecord(RecordFactoryInputStream.java:236)
	at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:122)
	at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:85)
	at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:56)
	at org.apache.poi.hssf.record.TestDoubleNotANumber$NaNSpreadsheetParser.process(TestDoubleNotANumber.java:206)
	at org.apache.poi.hssf.record.TestDoubleNotANumber.testEventBasedDoubleNaNError(TestDoubleNotANumber.java:68)
Caused by: java.lang.RuntimeException: Did not expect to read NaN
	at org.apache.poi.hssf.record.RecordInputStream.readDouble(RecordInputStream.java:270)
	at org.apache.poi.hssf.record.NumberRecord.<init>(NumberRecord.java:43)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at org.apache.poi.hssf.record.RecordFactory$ReflectionRecordCreator.create(RecordFactory.java:56)

Testcase: testDirectDoubleNaNError took 0.001 sec
	Caused an ERROR
Unable to construct record instance
org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
	at org.apache.poi.hssf.record.RecordFactory$ReflectionRecordCreator.create(RecordFactory.java:64)
	at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:263)
	at org.apache.poi.hssf.record.RecordFactoryInputStream.readNextRecord(RecordFactoryInputStream.java:270)
	at org.apache.poi.hssf.record.RecordFactoryInputStream.nextRecord(RecordFactoryInputStream.java:236)
	at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:392)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:276)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:201)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:317)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:298)
	at org.apache.poi.hssf.HSSFTestDataSamples.openSampleWorkbook(HSSFTestDataSamples.java:46)
	at org.apache.poi.hssf.record.TestDoubleNotANumber.testDirectDoubleNaNError(TestDoubleNotANumber.java:83)
Caused by: java.lang.RuntimeException: Did not expect to read NaN
	at org.apache.poi.hssf.record.RecordInputStream.readDouble(RecordInputStream.java:270)
	at org.apache.poi.hssf.record.NumberRecord.<init>(NumberRecord.java:43)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at org.apache.poi.hssf.record.RecordFactory$ReflectionRecordCreator.create(RecordFactory.java:56)
Comment 1 Nick Burch 2010-08-17 09:22:35 UTC
If you create a file using Excel, and put in that a NaN, can poi read that, or does it fail in the same way as a poi written NaN ?

Also, if you could upload an excel created file with a NaN in it, that'd be great as we can use it for a basis of additional unit tests once this is fixed
Comment 2 Phil Dunlea 2010-08-17 15:05:13 UTC
Created attachment 25898 [details]
Spreadsheet containing NaN

Attached is a POI created excel file containing Double.NaN
Comment 3 Phil Dunlea 2010-08-17 15:10:52 UTC
Created attachment 25899 [details]
Excel Generated Excel file with NaN

This was created using Excel
Comment 4 Phil Dunlea 2010-08-17 15:28:55 UTC
Q:  If you create a file using Excel, and put in that a NaN, can poi read that, or
does it fail in the same way as a poi written NaN ?

A:  It fails the same way as if it was written in POI.

Attached are both a spreadsheet written by POI and one written by Excel for testing purposes.
Comment 5 Nick Burch 2010-08-17 16:54:22 UTC
Thanks for all the digging and the files!

If no-one beats me to it, I'll take a look when I'm next near a computer with eclipse on it
Comment 6 Yegor Kozlov 2010-09-04 08:34:22 UTC
A very interesting case, thanks for your investigations.

The point is that Excel's implementation of floating-point arithmetic does not fully adhere to IEEE 754. In particular, Excel does not support the notion of Positive/Negative Infinities and Not-a-Number (NaN). 

In case of Infinities Excel generates a #DIV/0! error. This typically occurs when you divide by 0.
In case of NaN Excel generates an #NUM! error which indicates invalid number. For example, SQRT(-1) will result in a #NUM! error. 

More details can be found at http://support.microsoft.com/kb/78113


POI allows you to set Double.NaN, but Excel displays an unexpected value of 2.69653970229347E+308. If the result is referenced by a Excel formula then your scientific software may give incorrect results because any math operation involving NaN should result in NaN. 


To make POI compatible with Excel the following rules must be followed:

 - setting a cell value to Double.NaN should change the cell type to CELL_TYPE_ERROR and error value #NUM!
 - setting a cell value to Double.POSITIVE_INFINITY or Double.NEGATIVE_INFINITY should change the cell type to CELL_TYPE_ERROR and error value #DIV/0!

The rules should work both in HSSF and XSSF.

I applied this fix in r992591.

If you process the generated workbooks in Java you should check type of cells because double can be retrieved only from numeric cells. The code may look as follows:

        double value;
        switch(cell.getCellType()){
            case Cell.CELL_TYPE_ERROR:
                byte errorValue = cell.getErrorCellValue();
                if(errorValue == ErrorConstants.ERROR_DIV_0)
                    value = Double.POSITIVE_INFINITY;
                if(errorValue == ErrorConstants.ERROR_NUM)
                    value = Double.NaN;
                break;
            case Cell.CELL_TYPE_NUMERIC:
                value = cell.getNumericCellValue();
                break;
        }


Regards,
Yegor
Comment 7 Theodor Mazilu 2010-10-12 10:54:53 UTC
Hi
I had the same problem, saw that a bugfix was coming in 3.7 beta3. Now 
I downloaded the fix tried it and it didn't work.


Since English is not my mother tongue either I didn't understand
the fix or this seem to be only a partial fix.

My stack trace would be the same as the reporters but the bug
description would be slightly different.


What I try to do is create a HSSFWorkbook from an existing xls-file:
"
POIFSFileSystem poifs = new POIFSFileSystem(fin);
fin.close();
workbook = new HSSFWorkbook(poifs);
"
This operation fails already in the reading of the InputStream.
As the original reporter posted in the class
org.apache.poi.hssf.record.RecordInputStream
there is the posted readDouble () method which doesn't expect 
Excell to deliver a double with the value NaN.
If somehow it does get a NaN it throws a RuntimeException.
Since the xls file I try to read seems to have a cell containing NaN
the whole process fails and the HSSFWorkbook is not created.

As I understand the bugfix, it fixes the setCellValue(double value)
of the HSSFCell to accept NaN.
But this method is never called since already reading the cell from the
Stream throws a RuntimeException.

I privately rewrote org.apache.poi.hssf.record.RecordInputStream.readDouble()
to not check for NaN and it worked for my case, but I don't know if there are
any sideeffects, so I'd rather post it here.
(
	public double readDouble() {
		long valueLongBits = readLong();
		double result = Double.longBitsToDouble(valueLongBits);
		return result;
	}
)


ps.
Sorry for maybe posting about an already fixed bug, but
the stack trace of the bug is really like mine and it is still
happening with 3.7 beta 3.

Best Regards,
Theo
Comment 8 Nick Burch 2010-10-12 12:10:30 UTC
A unit test was added along with the fix, which shows the problem fixed for the original use case. If you're still having problems, please can you upload a file that demonstrates the problem when running with 3.7 beta 3, then we can use that for further testing + unit tests
Comment 9 Theodor Mazilu 2010-10-13 03:11:04 UTC
Created attachment 26165 [details]
This is the xls file I tested with that threw the exception.

That is the xls file.

Here is a sample code that throws the exception I'll post at the bottom.
The stack trace is identical to the one posted by the creator of the thread.
poi-3.7-beta3-20100924.jar is the jar I have added to my classpath.

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class XlsError {
	public static void main(String[] args) {
		HSSFWorkbook workbook = null;
		FileInputStream fin = null;
		try {
            fin = new FileInputStream("capacityAnalysis.xls");
        } catch (FileNotFoundException fnf){
        	fnf.printStackTrace();
        }
        try{
    	    POIFSFileSystem poifs = new POIFSFileSystem(fin);
            fin.close();
        	workbook = new HSSFWorkbook(poifs);
        }catch(Exception e){
        	e.printStackTrace();
        }
	}
}



org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
	at org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:65)
	at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:300)
	at org.apache.poi.hssf.record.RecordFactoryInputStream.readNextRecord(RecordFactoryInputStream.java:270)
	at org.apache.poi.hssf.record.RecordFactoryInputStream.nextRecord(RecordFactoryInputStream.java:236)
	at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:442)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:263)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:188)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:170)
	at XlsError.main(XlsError.java:18)
Caused by: java.lang.RuntimeException: Did not expect to read NaN
	at org.apache.poi.hssf.record.RecordInputStream.readDouble(RecordInputStream.java:276)
	at org.apache.poi.hssf.record.NumberRecord.<init>(NumberRecord.java:43)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
	at java.lang.reflect.Constructor.newInstance(Unknown Source)
	at org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:57)
	... 8 more
Comment 10 Phil Dunlea 2010-10-13 11:42:34 UTC
Created attachment 26170 [details]
3 Unit Tests showing error still occurs

I download from SVN the poi-3.7-beta3 tag the entire file structure and stuck my test into 
src/testcases  org.apache.poi.hssf.record so it could be run as part of "ant test" when building.   

We have a scientific instrument that exports Double.NaN into an excel file.  We can NOT upgrade the software or change how it works.  We must be able to read this file into our application.  The way you are handing Double.NaN is keeping this from happening.  The prior version of POI we were using allowed the reading of Double.NaN.

The error is still occurring in both the Event Based and direct modes of reading an excel file.

Test 1:  testWriteNaNToFileSystem()  

-  Shows that you can create a file with Double.NaN.  Properly saves the file with no errors.  
-  This test passes.

Test 2:  testEventBasedDoubleNaNError()

-  Creates the same file and reads it using the event based model.  Saves the Double read into a variable as part of the listener.   Asserts that the value read in is not null. 
-  This test fails.

Test 3:  testDirectDoubleNaNError()

-  Creates the same file and reads it using a direct approach.  Asserts that the value read in from the cell is equal the the value put into the file
-  This test fails.
Comment 11 David Fisher 2010-10-13 11:59:33 UTC
What version of Apache POI is the scientific instrument using? I am not saying that you must upgrade it, I just want to know which version so we can understand better what happened at r701747 that really causes this trouble.
Comment 12 Phil Dunlea 2010-10-13 14:19:05 UTC
We have these numbers coming from various places

Instrument 1: POI 3.2
Instrument 2: POI 3.6
Instrument 3: Not POI, but gives an excel file we need to read

All of them put the Excel equivalent to Double.NaN into the files.  For excel this is: 2.6965E+308 or 2.69653970229347E+308

An excel file can be created separately with that value and still cause the same exception.

What caused this is a change at some point in POI where instead of returning Double.NaN an exception is thrown.  If you look at the first comment, you can see where the check for Double.NaN was first added.  I'm not sure whether it has changed it is at this point or not.  But an exception for Double.NaN as a value is definitely breaking things for us.
Comment 13 Yegor Kozlov 2010-10-14 05:53:39 UTC
The failing test cases demonstrate expected behavior, they are not bugs.

When dealing with NaNs and Infinities POI mimics Excel, see my comment above. Setting Double.NaN changes cell type to FORMULA and cell.getNumericCellValue() can only be called for numeric cells. The correct version of testDirectDoubleNaNError()  is as follows:



    public void testDirectDoubleNaNError() throws IOException {

        // Write the file with Double.NaN in it
        createFile(SPREADSHEET_FILE_NAME);

        // Read the file
	HSSFWorkbook workbook = HSSFTestDataSamples.openSampleWorkbook(SPREADSHEET_FILE_NAME);
	HSSFCell cell = workbook.getSheet(THE_SHEET).getRow(currentSheetRow).getCell(currentSheetRow);

        Double value = null;
        switch(cell.getCellType()){
            case Cell.CELL_TYPE_ERROR:
                byte errorValue = cell.getErrorCellValue();
                if(errorValue == ErrorConstants.ERROR_DIV_0)
                    value = Double.POSITIVE_INFINITY;
                if(errorValue == ErrorConstants.ERROR_NUM)
                    value = Double.NaN;
                break;
            case Cell.CELL_TYPE_NUMERIC:
                value = cell.getNumericCellValue();
                break;
        }


        // We should be getting back the exact same value we put in.
        Assert.assertTrue(value.equals(VALUE_PRINTED));
    }


testEventBasedDoubleNaNError() fails for the same reason - Setting Double.NaN results in a formula cell and a NumberRecord is not written in the binary stream.


The real issue is that POI prior to 3.7-beta3 allowed writing NaNs and you want to process these files. I'm inclined to comment the exception in RecordInputStream.readDouble, but this fix will come after 3.7-FINAL.

Yegor
Comment 14 Yegor Kozlov 2010-11-09 10:35:57 UTC
I fixed POI to tolerate Double.NaN when reading .xls file. The fix was committed in r1033004. 

The fix is provided for backward compatibility. POI 3.7+ never writes Double.NaN, instead it converts the cell type to error. See my previous posts.

Yegor