Hello All, Description ----------- If an INDEX() formula contains its own location in the data array range, parsing this spreadsheet with POI causes a recursive exception with an eventual java.lang.StackOver. Steps to Reproduce ------------------ 1/ Create a simple spreadsheet as follows A B ------------- 1 | | 1 | 2 | | 2 | 3 | | 3 | 4 | ** | | ** contains formula =INDEX(A1:B4,2,2) 2/ Notice in Excel (2003), cell A4 evaluates to 2. 3/ Parse the Excel file using POI to evaulate cell formulas (i.e. using HSSFFormulaEvaluator.CellValue cellValue = evaluator.evaluate(cell); for example) 4/ Evaluating cell A4 causes an exception. Actual Results -------------- As stated in 4/ above. Exception is: Exception in thread "main" java.lang.StackOverflowError at java.lang.Character.digit(Character.java:4531) at java.lang.Character.digit(Character.java:4490) at java.lang.Integer.parseInt(Integer.java:445) at java.lang.Integer.parseInt(Integer.java:497) at org.apache.poi.hssf.util.CellReference.<init>(CellReference.java:53) at org.apache.poi.hssf.util.AreaReference.<init>(AreaReference.java:43) at org.apache.poi.hssf.record.formula.AreaPtg.<init>(AreaPtg.java:55) at org.apache.poi.hssf.model.FormulaParser.Ident(FormulaParser.java:290) at org.apache.poi.hssf.model.FormulaParser.Factor(FormulaParser.java:522) at org.apache.poi.hssf.model.FormulaParser.Term(FormulaParser.java:660) at org.apache.poi.hssf.model.FormulaParser.Expression(FormulaParser.java:708) at org.apache.poi.hssf.model.FormulaParser.Arguments(FormulaParser.java:486) at org.apache.poi.hssf.model.FormulaParser.function(FormulaParser.java:358) at org.apache.poi.hssf.model.FormulaParser.Ident(FormulaParser.java:280) at org.apache.poi.hssf.model.FormulaParser.Factor(FormulaParser.java:522) at org.apache.poi.hssf.model.FormulaParser.Term(FormulaParser.java:660) at org.apache.poi.hssf.model.FormulaParser.Expression(FormulaParser.java:708) at org.apache.poi.hssf.model.FormulaParser.parse(FormulaParser.java:812) at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.internalEvaluate(HSSFFormulaEvaluator.java:361) at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.getEvalForCell(HSSFFormulaEvaluator.java:557) at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.internalEvaluate(HSSFFormulaEvaluator.java:361) at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.getEvalForCell(HSSFFormulaEvaluator.java:557) <REPEATED 100's OF TIMES> Expected Results ---------------- Same as Excel as stated in 2/ above. Build Date & Platform --------------------- 2008-02-13 on Windows XP Additional Information ---------------------- I can't think of the top of my head what other Excel functions are a) implemented in POI, b) use data ranges, and c) are *legal* in Excel (for instance, you can't use =SUM in a cell which is part of its own range) so as far as I know, this only affects the INDEX() function. Cheers and thanks again! Dave
Created attachment 21518 [details] Excel file showing INDEX function specified within its own range, which causes Java stack overflow
Created attachment 21522 [details] svn diff of 1 changed file (HSSFFormulaEvaluator)
Created attachment 21523 [details] tar bz2 of two new classes and one test case
Bugfix patch submitted. The root problem is that the scratchpad evaluator does(/did) not handle circular references (it crashes with SOE). Some functions like INDEX, OFFSET, VLOOKUP are distinguished by the characteristic that not every cell in the range arguments has to be evaluated. This is in contrast to more basic functions like SUM and COUNTIF where every input cell must be evaluated. Functions like INDEX can be set up to potentially cause cyclic references, but nonetheless behave OK due to careful choice of other parameters (like in this example). POI's evaluation strategy is to fully evaluate all operator arguments prior to calling the operator. So the situation described here causes SOE due to a cyclic reference on cell A4 even though cell A4 should eventually get ignored by this call to INDEX. The patch supplied here allows HSSFFormulaEvaluator to handle cyclic references (and *potential* cyclic references) gracefully. No change was made to the INDEX function. A junit test case was added to show this bug (exactly as described), and also some simpler cyclic errors involving plain cell references.
Thanks for this patch Josh, I've applied to to svn (I also enabled the error code test, since the patch it depends on is now in svn too)