Bug 61859

Summary: Limited support for array operands
Product: POI Reporter: Markus Kirsten <mkirsten>
Component: XSSFAssignee: POI Developers List <dev>
Severity: normal    
Priority: P2    
Version: 3.17-FINAL   
Target Milestone: ---   
Hardware: Macintosh   
OS: All   
Attachments: Simple file to reproduce the bug

Description Markus Kirsten 2017-12-05 15:03:41 UTC
Created attachment 35587 [details]
Simple file to reproduce the bug

POI does not correctly support array operands and collapses e.g., the
expression inside the INDEX function in INDEX(($B$2:$B$11=F2)*$A$2:$A$11) into a number. The formula INDEX expects an array, and instead of evaluating

INDEX({1,2,2,3,0,0,0,4,0,0}, 0)

POI evaluates it as

INDEX(1, 0)

1. Evaluate all cells in the attached file (or even just G4) with Apache POI
2. The following exception is thrown WARNING: Incomplete code - cannot handle first arg of type (org.apache.poi.ss.formula.eval.NumberEval) for cell G4

See original description of the problem below. Identification that this had to do with how POI handles arguments that are arrays was kindly provided by Yegor Kozlov.

For a spreadsheet I am trying to update and evaluate with POI I am having some difficulties. The problem I am trying to solve in its simplest form is the following; From a data sheet with two columns - one with a numeric value, and one with a group, I would like to find the maximum value for each group. So from the table below, I’d like to calculate that max for A is 4, max for B is 10 and max for C is 20.

Value	Group
1	A
2	A
2	A
3	A
5	B
3	B
10	B
4	A
10	C
20	C

One way to do this would be to create the following table, where the column max value is to;


The logic is basically; find where column B (the group) matches the specified group. This is an array with 0/1s. Multiply that array with the value array. Now we have an array with either 0s or the values from the specified group. Take max of that.

Group	Max value
A	4
B	10
C	20

However, when evaluating the formula above, I get the following error. I have uploaded a minimal test file on https://ufile.io/z6qg4 The 3 red cells are the only ones with formulas and it is when I evaluate them that I get this error.

WARNING: Incomplete code - cannot handle first arg of type (org.apache.poi.ss.formula.eval.NumberEval) for cell G4
java.lang.RuntimeException: Incomplete code - cannot handle first arg of type (org.apache.poi.ss.formula.eval.NumberEval)
	at org.apache.poi.ss.formula.functions.Index.convertFirstArg(Index.java:106)
	at org.apache.poi.ss.formula.functions.Index.evaluate(Index.java:50)
	at org.apache.poi.ss.formula.functions.Index.evaluate(Index.java:114)
	at org.apache.poi.ss.formula.OperationEvaluatorFactory.evaluate(OperationEvaluatorFactory.java:132)
	at org.apache.poi.ss.formula.WorkbookEvaluator.evaluateFormula(WorkbookEvaluator.java:523)
	at org.apache.poi.ss.formula.WorkbookEvaluator.evaluateAny(WorkbookEvaluator.java:290)
	at org.apache.poi.ss.formula.WorkbookEvaluator.evaluate(WorkbookEvaluator.java:232)
	at org.apache.poi.xssf.usermodel.BaseXSSFFormulaEvaluator.evaluateFormulaCellValue(BaseXSSFFormulaEvaluator.java:65)
	at org.apache.poi.ss.formula.BaseFormulaEvaluator.evaluateFormulaCellEnum(BaseFormulaEvaluator.java:192)
	at org.apache.poi.ss.usermodel.DataFormatter.formatCellValue(DataFormatter.java:887)
Comment 1 Yegor Kozlov 2017-12-18 16:13:28 UTC
Should be fixed in r1818587

The fix includes two important changes:

1. Support for evaluating relational operators in array mode. Before this fix POI always de-referenced the first value  and $B$2:$B$11=F2 evaluated as $B$2=F2 . Now it should correctly evaluate it to an array of {TRUE, FALSE, ..... }. 

2. Recognize 'array' evaluation mode from the formula context. Normally, array arithmetic is used in array formulas (those saved with Ctrl-Shift-Enter).  The INDEX function seems to be a special case: if row_num or column_num are omitted or set to 0, INDEX returns an array of the entire column or row. I had to tweak the evaluator to recognize it.