Bug 51339

Summary: Formula evaluation and floating point rounding problem
Product: POI Reporter: Michal Letynski <ml>
Component: XSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: major CC: kowalski.grzes, ml
Priority: P2    
Version: 3.8-dev   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: quick start
Patch fixing rounding issue, including a test case (test case fails with current POI rounding)

Description Michal Letynski 2011-06-08 09:12:39 UTC
Created attachment 27130 [details]
quick start

Hi.
We have a problem with different results from excel and POI calculation. Problem is connected with floating points and rounding.

Im attaching a quick start which reproduce that problem. We are suming 12 rows, multiply it by 50% and then rounding. In excel sheet we have 275,62 in POI 275,61.

(With sum we get some floating points like 0,499999995 etc.)

We have a lot of calculation and data precision is very important for us. So there is huge ddiference with 0,62 and 0.61.
I saw that #46156 cover similar problem but i can reproduce it also for HSSF implementation (which uses NumberToTextConverter).

Any clues how we can fast fix this without touching excel file ?
Comment 1 Nick Burch 2011-06-08 11:27:12 UTC
As discussed in bug #46156 both Excel and POI use floating points internally for everything. Excel tries to hide this from you, but it's still there

It's possible that excel and poi use slightly different precision floating point numbers

I'd suggest you start by breaking your problem down, and checking all the intermediate results in both excel and poi. When do they start to diverge? In excel, make sure you set your cells to show as many digits as possible, so you can see what's really there (instead of what excel pretends!)
Comment 2 Michal Letynski 2011-06-08 11:34:05 UTC
Yeah i know that i can fix this by modifing an excel formulas but this is not the case. If excel is hiding it for me, why POI cant just do the same ?
I already broke a problem down and provided a simple sumimg which shows this buggy effect.
Comment 3 Nick Burch 2011-06-08 11:55:57 UTC
Could you please use this problem file to identify where the error comes in? A difference of 0.01 on only a few operations seems rather large

You'll likely want to get excel to store some of the intermediate parts of your formula so you can look at them, figure out what the values really are (not what excel displays!), and then compare that with what poi calculates. Where does this large error creep in?

POI will happily format a numeric cell as text with the same rules as excel, see DataFormatter. This won't help if POI and Excel disagree on the underlying numeric value though!
Comment 4 Yegor Kozlov 2011-06-08 12:02:34 UTC
Firstly, you didn't provide enough information to research the problem. The attached file contains a Maven project which actually consists of the following code:

      InputStream is = new FileInputStream("c:\\excelbugs\\bug.xlsm");
      XSSFWorkbook wb = new XSSFWorkbook(is);
      FormulaEvaluator poiEvaluator = wb.getCreationHelper().createFormulaEvaluator();
      CellValue cellValue = poiEvaluator.evaluate(wb.getSheet("Sheet1").getRow(0).getCell(2));
      System.out.println("Value is " + cellValue.getNumberValue());

If you assert results from a file, please attach that file too.

Secondly, it is not a bug in POI, rather a way it performs the floating-number math.

POI performs strict 'Double' arithmetics while Excel seems to use a mixed BigDecimal / Double. By 'mixed' I mean that for most formulas Excel uses 'double' math and for some formulas it *may* use BigDecimal, in particular, for financial functions. 

How do you assert the results? In Excel you see 275,62 which is a formatted value. The evaluator returns 275,61 with some trailing flowing points. How do you compare these two values?  

Hint: To get a formatted value in Java you should call DataFormatter#formatCellValue(Cell cell). 

Yegor

(In reply to comment #2)
> Yeah i know that i can fix this by modifing an excel formulas but this is not
> the case. If excel is hiding it for me, why POI cant just do the same ?
> I already broke a problem down and provided a simple sumimg which shows this
> buggy effect.
Comment 5 Michal Letynski 2011-06-08 18:13:55 UTC
Thanks for your comments.

So im doing an assertion in following way. I created a visual basic script which simply do the CSV dump. Values in csv are the same as in excel. Then i load this cvs in java and simply assert it with values from POI. I debug it and when poi read excel before calculation this values does not have any floating points). 
I can't use DataFormatter for wrong calculated value it wont do ,62 from ,61. 

I also found a problwm with round function. Specialy round half up. I saw that POI is using MathX.round method for rounding. Here is a code:

    public static double round(double n, int p) {
        double retval;

        if (Double.isNaN(n) || Double.isInfinite(n)) {
            retval = Double.NaN;
        }
        else {
            if (p != 0) {
                double temp = Math.pow(10, p);
                retval = Math.round(n*temp)/temp;
            }
            else {
                retval = Math.round(n);
            }
        }

        return retval;
    }

Unfortunetly we have different results in excel and POI when we have 5 in the end. Take a look at following examples:

round(2162.615, 2) -> 2162.61
round(262.615, 2) -> 262.62

In first case round down in second round up. Excel always do a round up. To fix it we can change implementation of this function to:

BigDecimal temp = new BigDecimal(String.valueOf(n)).setScale(p, RoundingMode.HALF_UP);
retval = temp .doubleValue();
Comment 6 Nick Burch 2011-06-08 19:47:42 UTC
You still seem to be talking about display values. Before we can discuss those, we need to worry about the underlying raw values. If they're wrong, it doesn't matter how we round things for user facing display, as rounding the wrong thing is unlikely to ever work!

Please start looking at the values Excel actually stores and computes with, rather than what it happens to choose to display to you. Do they match POI? If not, at what point in your calculation chain does it diverge?

Only once we have the correct floating point values is it time to discuss how they get rounded to user facing numbers.
Comment 7 Marcel May 2011-06-09 08:26:00 UTC
I think there's a misunderstanding - its not about displayed values, but about the rounding function implementation of POI.

The ROUND(...) function of Excel uses the rounding mode HALF_UP, also known as 'arithmetic rounding' (see http://support.microsoft.com/kb/196652/EN-US).

The issue is: The current POI implementation does not do arithmetic rounding (Michal pasted the code in a previous comment), im citing the critical part again: 

...
if (p != 0) {
                double temp = Math.pow(10, p);         // NOT CORRECT ARITHMETIC
                retval = Math.round(n*temp)/temp;    // ROUNDING
...

This must be replaced with the previously suggested code using BigDecimal, to get a correct 'arithmetic rounding' as Excel uses:

retval = java.math.BigDecimal.valueOf(2162.615).setScale(2, java.math.RoundingMode.HALF_UP).doubleValue()

Cheers,
Marcel
Comment 8 Marcel May 2011-06-12 07:41:18 UTC
Created attachment 27151 [details]
Patch fixing rounding issue, including a test case (test case fails with current POI rounding)
Comment 9 Yegor Kozlov 2011-06-13 10:37:36 UTC
Good catch! Applied in r1135079.

Regards,
Yegor

(In reply to comment #8)
> Created attachment 27151 [details]
> Patch fixing rounding issue, including a test case (test case fails with
> current POI rounding)
Comment 10 Yegor Kozlov 2011-06-13 10:40:57 UTC
Michal,

Please test your code against the latest build from trunk. Nightly builds can be downloaded from here: http://encore.torchbox.com/poi-cvs-build/

Yegor