Bug 50209

Summary: Evaluation of SUBTOTAL function seems to be incorrect
Product: POI Reporter: Toshihiko Saka <toshihiko.saka>
Component: HSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal CC: ulrich.wenzel
Priority: P2    
Version: 3.7-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: sample spreadsheet
code, test and excel-sheet
patch

Description Toshihiko Saka 2010-11-04 05:10:56 UTC
Created attachment 26253 [details]
sample spreadsheet

Sample spreadsheet:
    A1: 1
    A2: =SUBTOTAL(9,A1)
    A3: =SUBTOTAL(9,A1:A2)

Sample code;
    Workbook wb = WorkbookFactory.create(new FileInputStream("subtotal.xlsx"));
    FormulaEvaluator eval = wb.getCreationHelper().createFormulaEvaluator();
    eval.evaluateFormulaCell(wb.getSheetAt(0).getRow(2).getCell(0));
    FileOutputStream fout = new FileOutputStream("subtotal_output.xlsx");
    wb.write(fout);
    fout.close();

If you execute this sample code, you will get subtotal_output.xlsx.
When opening this file in Excel 2007, you will get next spreadsheet.
    A1: 1
    A2: 1
    A3: 2

Correctly, cell A3 should be 1.
It seems that the cell containing SUBTOTAL (A2) is summed up while evaluating SUBTOTAL in A3, though it should be ignored.
Comment 1 Nick Burch 2010-11-05 14:13:03 UTC
For anyone coming back to this bug later, please see today's discussions on this bug on the user mailing list:
http://mail-archives.apache.org/mod_mbox/poi-user/201011.mbox/%3Calpine.DEB.1.10.1011051553340.11599@urchin.earth.li%3E
Comment 2 uliwen 2011-06-06 14:24:38 UTC
Bugfix: Evaluation of Subtotals does not consider nested subtotals anymore

- test-files and test-data contained in new-files.tar
- java-subtotals.patch does not include test-files

Attachment: java-subtotals.patch and new-files.tar

Therefore:
1) Double-Values of referenced cells (not subtotals) are collected and used for evaluation in class Subtotal
2) The Method getFunctionIndex is added to classes LazyAreaEval/AreaEvalBase, SheetRefEvaluator and WorkbookEvaluator which seems to be the shortes path to the WorkbookEvaluator
3) Functions Count and CountA are implemented directly in Subtotal.
Comment 3 uliwen 2011-06-06 14:26:37 UTC
Created attachment 27117 [details]
code, test and excel-sheet
Comment 4 uliwen 2011-06-06 14:27:54 UTC
Created attachment 27118 [details]
patch
Comment 5 Yegor Kozlov 2011-07-25 16:19:00 UTC
(In reply to comment #4)
> Created attachment 27118 [details]
> patch

Fixed in r1150673

The fix is based on the proposed patch, but I chose a bit different strategy. Instead of pulling DoubleList and iteration logic from MultiOperandNumericFunction and thus duplicating code, it is better to inject this logic right into MultiOperandNumericFunction.collectValues.
I think I found an elegant way how to do that.

P.S. You patch assumes that the last ptg in nested cells is FuncVarPtg:


+	public int getFunctionIndex( EvaluationCell srcCell ){

+		Ptg[] ptgs = _workbook.getFormulaTokens(srcCell);

+		int index = -1;

+		if( ptgs.length > 0  &&  ptgs[ptgs.length-1] instanceof FuncVarPtg){

+			FuncVarPtg fVar = (FuncVarPtg)ptgs[ptgs.length-1];

+			index = fVar.getFunctionIndex();

+		}

+		return index;

+	}



It is not always so. Consider two use cases:

SUBTOTAL(9, A1:A2)      ptgs: [9, A1:A2, SUBTOTAL]
SUBTOTAL(9, A1:A2) + 1  ptgs: [9, A1:A2, SUBTOTAL, 1, '+']

Your implementation ignores the second case which is wrong. The correct logic is to iterate over all ptgs and check if there is a FuncVarPtg for the SUBTOTAL function.

Regards,
Yegor