Bug 62834 - Regression: FormulaEvaluator.evaluateInCell() throws exception
Summary: Regression: FormulaEvaluator.evaluateInCell() throws exception
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF (show other bugs)
Version: 4.0.0-FINAL
Hardware: PC All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-10-18 14:26 UTC by Maarten Coene
Modified: 2018-10-22 07:23 UTC (History)
0 users



Attachments
xslx file having a problem in the A3 cell (9.10 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2018-10-18 14:26 UTC, Maarten Coene
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Maarten Coene 2018-10-18 14:26:20 UTC
Created attachment 36199 [details]
xslx file having a problem in the A3 cell

Hi,

we have an xlsx file for which the FormulaEvaluator.evaluateInCell() throws an exception for one particular cell. This used to work with Apache POI v3.17 and now fails after upgrading to v4.0.0.

The file is attached (problem.xlsx), the problematic cell is A3, the other cells are ok. The issue can be reproduced with the following code:

Workbook wb = WorkbookFactory.create(new File(/path/to/problem.xlsx));
Sheet sheet = wb.getSheet("problem");
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
evaluator.evaluateInCell(sheet.getRow(1).getCell(0)); // A2 -> OK
evaluator.evaluateInCell(sheet.getRow(2).getCell(0)); // A3 -> FAIL

The stacktrace is:
Exception in thread "main" java.lang.IllegalStateException: Cannot get a FORMULA value from a STRING cell
	at org.apache.poi.xssf.usermodel.XSSFCell.typeMismatch(XSSFCell.java:1075)
	at org.apache.poi.xssf.usermodel.XSSFCell.getCellFormula(XSSFCell.java:483)
	at org.apache.poi.xssf.usermodel.XSSFCell.getCellFormula(XSSFCell.java:470)
	at org.apache.poi.xssf.usermodel.XSSFSheet.onDeleteFormula(XSSFSheet.java:4639)
	at org.apache.poi.xssf.usermodel.XSSFCell.setCellType(XSSFCell.java:969)
	at org.apache.poi.ss.formula.BaseFormulaEvaluator.setCellType(BaseFormulaEvaluator.java:218)
	at org.apache.poi.ss.formula.BaseFormulaEvaluator.evaluateInCell(BaseFormulaEvaluator.java:140)
	at org.apache.poi.xssf.usermodel.XSSFFormulaEvaluator.evaluateInCell(XSSFFormulaEvaluator.java:85)
	at org.apache.poi.xssf.usermodel.XSSFFormulaEvaluator.evaluateInCell(XSSFFormulaEvaluator.java:34)

kind regards,
Maarten Coene
Comment 1 Greg Woolsey 2018-10-18 23:26:14 UTC
Thanks for the simple test case!  I've added it as a failing unit test to my local build, and will commit it for regression testing once I figure out what's going on.

One interesting point of note in the stack trace, the line in 

XSSFSheet.onDeleteFormula(cell)

is only reached for shared formulas valid for more than one cell.  I don't know much about those (yet), but it seems plausible that calling evaluateInCell() for the "master" formula cell somehow doesn't update the formula references properly in this case.

That code comes from #58106, which was committed after 3.17 was released, so that seems like a likely place to start looking.  I'll keep digging, as I'm likely to need this also once my work project can upgrade to 4.0.
Comment 2 Greg Woolsey 2018-10-19 00:41:39 UTC
Hmm. Along the way I noticed this call flow:

XSSFSheet.onDeleteFormula(XSSFCell) 
calls XSSFCell.getCellFormula()
calls XSSFCell.getCellFormula(XSSFEvaluationWorkbook) 

with a null parameter, so it creates a new one, but only when it needs to call convertSharedFormula().

Looks like the next step is to see just how bad it will be to get the actual evaluation object from the formula evaluator down into that context.  I suspect that's where things really go wrong, using a different evaluation instance than the one in use higher in the call stack.

I'm not a fan of passing nulls, but since part of the code already expects that possible input, getCellFormula(), I suppose it isn't too awful.  Better ideas welcome.

That's not the bug here, though, that's just something that will bite someone down the road if it isn't handled. I'm implementing passing the needed value from the FormulaEvaluator, and adding JavaDoc about invalidating evaluator instances for other callers, without changing the public API for now.
Comment 3 Greg Woolsey 2018-10-19 00:45:45 UTC
And finally, the problem is a strangely formed XLSX file - it has, in its sheet1.xml, the following:

    <row r="3" spans="1:1" x14ac:dyDescent="0.25">
      <c r="A3" s="1" t="str">
        <f t="shared" ref="A3:A5" si="0">A2</f>
        <v>a value</v>
      </c>
    </row>
    <row r="4" spans="1:1" x14ac:dyDescent="0.25">
      <c r="A4" s="1" t="s">
        <v>1</v>
      </c>
    </row>
    <row r="5" spans="1:1" x14ac:dyDescent="0.25">
      <c r="A5" s="1" t="str">
        <f t="shared" si="0"/>
        <v>another value</v>
      </c>
    </row>

Note how A3 defines itself as a shared formula covering A3:A5 with id=0.  A5 references this properly, but A4 defines itself as a plain string cell, not a formula.

Excel seems to handle this just fine, allowing cells to override a shared formula definition.  Looks like POI needs to follow that behavior.

I wonder where else this might cause trouble.
Comment 4 Greg Woolsey 2018-10-19 01:12:57 UTC
Fixed in r1844295.
Comment 5 Maarten Coene 2018-10-22 07:23:40 UTC
Thanks for fixing this so fast! :-)

cheers,
Maarten