Bug 55906 - [PATCH] Correctly evaluate/parse multisheet reference for XLS documents
Summary: [PATCH] Correctly evaluate/parse multisheet reference for XLS documents
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: unspecified
Hardware: PC All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
Keywords: PatchAvailable
Depends on:
Reported: 2013-12-18 20:21 UTC by Radoslav
Modified: 2014-07-26 15:20 UTC (History)
0 users

Patch to correct HSSF evaluator/parser (114.85 KB, text/plain)
2013-12-18 20:21 UTC, Radoslav

Note You need to log in before you can comment on or make changes to this bug.
Description Radoslav 2013-12-18 20:21:27 UTC
Created attachment 31132 [details]
Patch to correct HSSF evaluator/parser

This patch modifies the parse/evaluator to enable multisheet references in formulae to be properly evaluated. Such formulae take the form SheetA:SheetB, such as SUM(Sheet1:Sheet3!A1). POI usually ignores the second sheet and only thinks the formulae is SUM(Sheet1!A1).

Steps to reproduce
1. Create a workbook with two sheets (Sheet1, Sheet2)
2. On each tab enter 1 for cell A1
3. On the first sheet enter formula '=SUM(Sheet1:Sheet2!A1)' for the formula for cell A2

Expected Result:
A2 should be equal to 2

Actual Result:
A2 equals 1

Fixed checked with 3.8, 3.9, 3.10-beta1, 3.10-beta2, 3.10-beta3. It works with all versions.

Note: I have only fixed the XLS evaluator/parser (HSSF), the XLSX (XSSF) does not yet work.

Note2: There are unit tests for XSSF that currently as well (they fail since the fix is not ported to XSSF).
Comment 1 Mark Golden 2014-07-01 12:57:48 UTC

I am using poi-3.10-FINAL-20140208.jar and am encountering this 55906 error. I see there is a patch for this problem but I would not have a clue where to start as I'm not a JAVA programmer.

Is there any date set that it will be included in the '.jar' file?

Comment 2 Nick Burch 2014-07-01 14:19:01 UTC
This is a fairly hefty patch, so it needs some review, ideally by someone who knows about formulas

For now, it would be helpful if someone could try applying it to a svn checkout from trunk, then report:
 * Does it apply cleanly to trunk?
 * Do all the unit tests pass with it applied?
 * Do normal formulas still work with it?
 * Do multisheet ones work?

If we know all of those are good, it'll hopefully be quicker for one of our formula experts to then review + apply the patch
Comment 3 Mark Golden 2014-07-01 14:39:35 UTC
Hi Nick,

Thanks for the reply. I'll wait for the patch to be applied.  

Comment 4 Nick Burch 2014-07-24 22:54:58 UTC
I've done some work on this today, cherrypicking bits out of this patch, and doing some fairly liberal refactoring of the formula code too. (Now we have external name / sheet support in the XSSF formula parser, it's much clearer which bits of the hssf external sheet support can be refactored and which ones matter). It's quite a lot of work, but without this patch it's more than I'd have considered even starting!

As of r1613317, POI can now correctly read and render a multi-sheet formula string from a cell, i.e. cell.getCellFormula() now returns the right thing. If you ask FormulaParser to parse a multi-sheet reference, it no longer objects, and returns useful ptgs, but it doesn't yet do enough to support a round-tripping of a formula

Next step - get the round-tripping working, so that the last part of TestFormulaParser.testMultiSheetReference() can be enabled

Future step - evaluator support
Comment 5 Nick Burch 2014-07-25 16:43:59 UTC
As of r1613467, I believe that full support (including evaluation, for those functions which support it) is now in place

The remaining this is to add in the unit tests kindly provided by Radoslav in the patch, and use those to verify it's completely implemented now
Comment 6 Nick Burch 2014-07-26 15:20:32 UTC
Unit tests added in r1613654, thanks for providing such a good set!