Bug 58966

Summary: Row.getCell doesn't retrieve the same value than what VBA gives
Product: POI Reporter: jmclej
Component: SS CommonAssignee: POI Developers List <dev>
Status: REOPENED ---    
Severity: critical CC: jmclej
Priority: P2    
Version: 3.10-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   

Description jmclej 2016-02-03 15:52:54 UTC
Hello,
I am having a problem with retrieving the right data from excel:
I have the cell D5 that contains the formula "=3000000/D10" and D10 cell contains the formula "=1.41973/1.2939".
When I read this cell within VBA macro like this : Worksheets("My sheet").Cells(5, 4), I get 2734111.41555084 but when I read it via POI in Java:
XSSFSheet sheet = workbook.getSheetAt(1);
final Iterator<Row> rowIterator = sheet.iterator();
Cell cell;
while (rowIterator.hasNext()) {
	final Row row = rowIterator.next();
...
	cell = row.getCell(col, Row.RETURN_BLANK_AS_NULL);
...
}
cell contains here 2734111.4155508447 (see the extra 47), which is wrong.
Indeed, when I type in a calculator 3000000/(1.41973/1.2939), the result is 2734111.4155508441745965782226198 ()
I expect to get the same result as VBA gives.
Regards,
Comment 1 Nick Burch 2016-02-03 16:09:40 UTC
The number stored in the file is rarely what Excel shows. To get the value that Excel shows, as a String, you must use something like DataFormatter to apply the formatting rules to the raw cell value
Comment 2 jmclej 2016-02-03 16:25:04 UTC
Independently of what Excel shows, there is definitely a problem with POI as 2734111.4155508441745965782226198 cannot be rounded to 2734111.4155508447 if you look closely at the first number (...844174 cannot give ...8447)
Comment 3 Javen O'Neal 2016-02-03 16:33:38 UTC
(In reply to jmclej from comment #2)
> (...844174 cannot give ...8447)

Sure it can. You're thinking of base 10 rounding, while the computer does base 2 rounding. The binary representation of ...844174 and ...8447 might be adjacent, depending on how the floats are stored, what arithmetic is used to calculate the values, the rounding errors that are accumulated along the way, etc. Even simple 3*(1.0/3) could give an answer like 1.000000000007 or 0.9999999999994, which doesn't round to the correct answer
Comment 4 jmclej 2016-02-03 17:38:48 UTC
(In reply to Nick Burch from comment #1)
> The number stored in the file is rarely what Excel shows. To get the value
> that Excel shows, as a String, you must use something like DataFormatter to
> apply the formatting rules to the raw cell value

In VBA, I have :
Dim MyVal As Double
MyVal = Worksheets("My sheet").Cells(5, 4)
And I am talking about the number I see in the file into which I print MyVal without transformation or that I look at MyVal in the VBA spy window.
So I am expecting to get the same number via POI without the need to apply any DataFormatter in Java, which I couldn't know which it is anyway as I am using the default behavior of VBA. POI should reproduce the default behavior of Excel as well.
Comment 5 Javen O'Neal 2016-02-03 19:06:49 UTC
(In reply to jmclej from comment #4)
> (In reply to Nick Burch from comment #1)
> > The number stored in the file is rarely what Excel shows.

> POI should reproduce the default behavior
> of Excel as well.

Re-read Nick's comment. POI does have rhe samw behavior as Excel when you use the DataFormatter class.
https://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/DataFormatter.html
Comment 6 jmclej 2016-02-04 10:19:58 UTC
Ok, Indeed, I didn't get that DataFormatter was provided by POI.
But when I am doing this now :
XSSFSheet sheet = workbook.getSheetAt(1);
final Iterator<Row> rowIterator = sheet.iterator();
Cell cell;
while (rowIterator.hasNext()) {
	final Row row = rowIterator.next();
...
	cell = row.getCell(col, Row.RETURN_BLANK_AS_NULL);
	DataFormatter df = new DataFormatter();//same with DataFormatter(true)
	org.apache.poi.ss.usermodel.CellValue cv = evaluator.evaluate(cell);
	String val = df.formatCellValue(cell, evaluator);
...
}
I still get val that is equal to 2734111.4155508447
In https://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/DataFormatter.html, it says that the default format for decimal numbers is "#.##########", but it I can't reproduce that. Or should I use a different code than the one above to use DataFormatter ?
I also tried (new java.text.DecimalFormat("#.##########")).format(cell.getNumericCellValue()) but it gives also 2734111.4155508447 instead of 2734111.41555084 on Excel.
Comment 7 jmclej 2016-02-04 10:26:40 UTC
I forgot to declare and define evaluator :
final FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
Comment 8 jmclej 2016-02-04 11:26:56 UTC
Also, workbook is of type XSSFWorkbook. Can it have an effect on my issue ?
Comment 9 jmclej 2016-02-04 11:30:00 UTC
Sorry about the different small comments, I am trying to give as much relevant information as possible...
So I must say that I read a xlsm file.
Comment 10 jmclej 2016-02-04 11:41:14 UTC
(In reply to jmclej from comment #8)
> Also, workbook is of type XSSFWorkbook. Can it have an effect on my issue ?

I have tried using WorkBook instead of XSSFWorkbook with WorkbookFactory.create method but it still gives me the same result.
Comment 11 jmclej 2016-02-04 16:26:06 UTC
Additional info : I am using Excel 2013
Comment 12 jmclej 2016-02-04 17:02:32 UTC
Version of poi : 3.10.1
Comment 13 Javen O'Neal 2016-02-04 17:43:56 UTC
> text.DecimalFormat("#.##########"))
>               2734111.4155508447

Looks like the DataFormatter is working as expected. The number of digits is conaistenr with the format string.

We could add another 2 # characters to the format string, but someone would need to prove that every version of Excel on all platforms use the longer format string for any number or formula. I'm guessing that the default format string was chosen to be the length it is for a good reason, and wasn't selected at random.

Until then, if you need your code to return exactly the same value as Excel, you'll need to specify a custom format string.

If you have usage questions, please use the POI Users mailing list [1].

[1] https://poi.apache.org/mailinglists.html
Comment 14 jmclej 2016-02-05 14:21:58 UTC
When you read this for instance : https://support.microsoft.com/en-us/kb/269370
We understand that Excel never gives more than 15 digits, so neither should POI, whatever the format put with more than 15 '#'. It should only add non-significative zeros.
But in my case POI gives 17 digits (2734111.4155508447) instead of the 15 that I am talking about (2734111.41555084).
Comment 15 Javen O'Neal 2016-02-05 18:25:18 UTC
More authoritatively, [1] which specifies

> Feature            Maximum limit
> Number precision   15 digits

[1] https://support.office.com/en-us/article/Excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3
Comment 16 jmclej 2016-02-09 14:54:32 UTC
Hello,
Now that the bug has been prooved, how does it work? On which version can we expect to have the correction and do we have an idea when it will be delivered ?
Thanks for your work,
Comment 17 Javen O'Neal 2016-02-09 16:15:49 UTC
Refer to http://poi.apache.org/who.html

Please always remember: nobody is paid to work on POI, the team is a bunch of volunteers who look at things in their free time. Because of that developers might choose to work on things based on a different priority than yours! Especially the quality and maturity of bug reports will affect if somebody decides to look at it. So the best way to help a bug report see progress is to provide more information if available or supply patches together with unit-tests.