Bug 50209 - Evaluation of SUBTOTAL function seems to be incorrect
Summary: Evaluation of SUBTOTAL function seems to be incorrect
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 3.7-FINAL
Hardware: PC All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2010-11-04 05:10 UTC by Toshihiko Saka
Modified: 2011-07-25 16:19 UTC (History)
1 user (show)

sample spreadsheet (10.85 KB, application/zip)
2010-11-04 05:10 UTC, Toshihiko Saka
code, test and excel-sheet (90.00 KB, application/octet-stream)
2011-06-06 14:26 UTC, uliwen
patch (8.67 KB, patch)
2011-06-06 14:27 UTC, uliwen
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Toshihiko Saka 2010-11-04 05:10:56 UTC
Created attachment 26253 [details]
sample spreadsheet

Sample spreadsheet:
    A1: 1
    A2: =SUBTOTAL(9,A1)
    A3: =SUBTOTAL(9,A1:A2)

Sample code;
    Workbook wb = WorkbookFactory.create(new FileInputStream("subtotal.xlsx"));
    FormulaEvaluator eval = wb.getCreationHelper().createFormulaEvaluator();
    FileOutputStream fout = new FileOutputStream("subtotal_output.xlsx");

If you execute this sample code, you will get subtotal_output.xlsx.
When opening this file in Excel 2007, you will get next spreadsheet.
    A1: 1
    A2: 1
    A3: 2

Correctly, cell A3 should be 1.
It seems that the cell containing SUBTOTAL (A2) is summed up while evaluating SUBTOTAL in A3, though it should be ignored.
Comment 1 Nick Burch 2010-11-05 14:13:03 UTC
For anyone coming back to this bug later, please see today's discussions on this bug on the user mailing list:
Comment 2 uliwen 2011-06-06 14:24:38 UTC
Bugfix: Evaluation of Subtotals does not consider nested subtotals anymore

- test-files and test-data contained in new-files.tar
- java-subtotals.patch does not include test-files

Attachment: java-subtotals.patch and new-files.tar

1) Double-Values of referenced cells (not subtotals) are collected and used for evaluation in class Subtotal
2) The Method getFunctionIndex is added to classes LazyAreaEval/AreaEvalBase, SheetRefEvaluator and WorkbookEvaluator which seems to be the shortes path to the WorkbookEvaluator
3) Functions Count and CountA are implemented directly in Subtotal.
Comment 3 uliwen 2011-06-06 14:26:37 UTC
Created attachment 27117 [details]
code, test and excel-sheet
Comment 4 uliwen 2011-06-06 14:27:54 UTC
Created attachment 27118 [details]
Comment 5 Yegor Kozlov 2011-07-25 16:19:00 UTC
(In reply to comment #4)
> Created attachment 27118 [details]
> patch

Fixed in r1150673

The fix is based on the proposed patch, but I chose a bit different strategy. Instead of pulling DoubleList and iteration logic from MultiOperandNumericFunction and thus duplicating code, it is better to inject this logic right into MultiOperandNumericFunction.collectValues.
I think I found an elegant way how to do that.

P.S. You patch assumes that the last ptg in nested cells is FuncVarPtg:

+	public int getFunctionIndex( EvaluationCell srcCell ){

+		Ptg[] ptgs = _workbook.getFormulaTokens(srcCell);

+		int index = -1;

+		if( ptgs.length > 0  &&  ptgs[ptgs.length-1] instanceof FuncVarPtg){

+			FuncVarPtg fVar = (FuncVarPtg)ptgs[ptgs.length-1];

+			index = fVar.getFunctionIndex();

+		}

+		return index;

+	}

It is not always so. Consider two use cases:

SUBTOTAL(9, A1:A2)      ptgs: [9, A1:A2, SUBTOTAL]
SUBTOTAL(9, A1:A2) + 1  ptgs: [9, A1:A2, SUBTOTAL, 1, '+']

Your implementation ignores the second case which is wrong. The correct logic is to iterate over all ptgs and check if there is a FuncVarPtg for the SUBTOTAL function.