Bug 53101 - FormulaEvaluator incorrectly evaluates sum over cell range > 255
Summary: FormulaEvaluator incorrectly evaluates sum over cell range > 255
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF (show other bugs)
Version: 3.8-FINAL
Hardware: PC All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-04-18 15:52 UTC by malquitob
Modified: 2012-04-23 08:40 UTC (History)
0 users



Attachments
Spreadsheet for use with the test case. (9.19 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2012-04-18 15:52 UTC, malquitob
Details

Note You need to log in before you can comment on or make changes to this bug.
Description malquitob 2012-04-18 15:52:23 UTC
Created attachment 28636 [details]
Spreadsheet for use with the test case.

Overview:

Summing a range of cells greater the 255 gives an incorrect result.

Steps to reproduce:

Create an .xlsx, fill a range of columns, eg B1:IZ1 with a value, say "1"
Insert a formula to sum these cells, ie SUM(B1:IZ1)
Using POI, read the spreadsheet and evaluate the formula.

Expected result: 259.0, actual result: 258.0

Example test case:

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.junit.Test;

import java.io.IOException;
import java.io.InputStream;

import static org.junit.Assert.assertEquals;

public class PoiTest {

  @Test
  public void evaluateExcelUsingPoiApiOnly_expectCorrectEvaluation() {
    // Arrange
    InputStream inputStream = this.getClass().getResourceAsStream("/test.xlsx");
    Workbook workbook = null;
    try {
      workbook = WorkbookFactory.create(inputStream);
    } catch (IOException e) {
      e.printStackTrace();
    } catch (InvalidFormatException e) {
      e.printStackTrace();
    } finally {
      try {
        inputStream.close();
      } catch (IOException e) {
        e.printStackTrace();
      }
    }

    // Act
    // evaluate SUM('Skye Lookup Input'!B2:IZ2), cells in range B2:IZ2 each contain "1"
    FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
    double numericValue = evaluator.evaluate(workbook.getSheetAt(0).getRow(0.getCell(0)).getNumberValue();

    // Assert
    assertEquals(259.0, numericValue, 0.0);
  }

}

Attached is a test.xlsx
Comment 1 malquitob 2012-04-18 16:10:29 UTC
There's a typo here:

evaluator.evaluate(workbook.getSheetAt(0).getRow(0.getCell(0)).getNumberValue();

It should be:

evaluator.evaluate(workbook.getSheetAt(0).getRow(0).getCell(0)).getNumberValue();

apologies...
Comment 2 malquitob 2012-04-18 17:04:21 UTC
I have a fix for this. I've modified pgetRelativeValue(..) in LazyAreaEval as follows:

public ValueEval getRelativeValue(int relativeRowIndex, int relativeColumnIndex) {

  int rowIx = (relativeRowIndex + getFirstRow() ) & 0xFFFF;
  int colIx = (relativeColumnIndex + getFirstColumn() ) & 0x3FFF;

  return _evaluator.getEvalForCell(rowIx, colIx);
}

The existing code contains:

int colIx = (relativeColumnIndex + getFirstColumn() ) & 0x00FF;

which assigns 0 to colIx if relativeColumnIndex + getFirstColumn() > 255, thus cell numbers > 255 are not evaluated.
Comment 3 Nick Burch 2012-04-18 17:11:47 UTC
Would the proposed fix break things for XLS (HSSF) files though?

We do have a set of constants available for the maximum number of rows, columns etc. I wonder if the masking should be set based on the max column count, rather than hard coded?
Comment 4 Yegor Kozlov 2012-04-18 18:40:01 UTC
I confirmed the problem in trunk.

At first glance the fix looks safe, at least it does not break existing tests and fixes evaluation of the attached workbook.

 I'm leaving this bug as NEW. LazyEval is pretty fundamental and I'd like to research it deeper before applying the proposed fix.

Yegor
Comment 5 malquitob 2012-04-19 07:49:36 UTC
It might be a good idea to look at rowIdx as well:

int rowIx = (relativeRowIndex + getFirstRow() ) & 0xFFFF;

as this limits the number of rows to 65535.
Comment 6 malquitob 2012-04-19 09:03:43 UTC
How about removing the &'s altogether?:

public ValueEval getRelativeValue(int relativeRowIndex, int relativeColumnIndex) {

    int rowIx = relativeRowIndex + getFirstRow();
    int colIx = relativeColumnIndex + getFirstColumn();

    return _evaluator.getEvalForCell(rowIx, colIx);
}

This passes all the test as well.
Comment 7 Yegor Kozlov 2012-04-19 11:39:13 UTC
I thought about removing all &'s too. The purpose of these operators is to guarantee that the the result fits in the .xls grid bounds (2^8 cols and 2^16 rows). We don't need them in general case, but I want research the problem and make sure it is really so. 

Yegor

(In reply to comment #6)
> How about removing the &'s altogether?:
> 
> public ValueEval getRelativeValue(int relativeRowIndex, int
> relativeColumnIndex) {
> 
>     int rowIx = relativeRowIndex + getFirstRow();
>     int colIx = relativeColumnIndex + getFirstColumn();
> 
>     return _evaluator.getEvalForCell(rowIx, colIx);
> }
> 
> This passes all the test as well.
Comment 8 Yegor Kozlov 2012-04-21 12:49:20 UTC
Fixed in r1328647, junit added.

Regards,
Yegor
Comment 9 malquitob 2012-04-23 08:40:24 UTC
Cheers!