Bug 45376

Summary: HSSFFormulaEvaluator#evaluate functions need to cache intermediate cell values
Product: POI Reporter: Nick Burch <apache>
Component: HSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal CC: mail
Priority: P2    
Version: 3.0-dev   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   

Description Nick Burch 2008-07-10 15:00:15 UTC
From Meandron <Matthias.Igel@gmx.net> on the mailing list:

I've uploaded a sheet called "Test.xls" 
http://www.nabble.com/file/p18357579/Test.xls Test.xls , which I query using
the following program:

import java.io.FileInputStream;

import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

public class Test {

        public static void main(String[] args) throws Exception{
                final HSSFWorkbook book = new HSSFWorkbook(new
FileInputStream("c:\\Test.xls"));
                final String SHEET = "Eingabe";
                final int row = 6;
                final HSSFSheet document = book.getSheet(SHEET);

                final HSSFRow excelRow = document.getRow(row);
                for (short i = 4; i < 15; i++) {
                        final HSSFCell excelCell = excelRow.getCell(i);
                        final HSSFFormulaEvaluator evaluator = new
HSSFFormulaEvaluator(book.getSheet(SHEET), book);

                        evaluator.setCurrentRow(excelRow);
                        long now = System.currentTimeMillis();
                        evaluator.evaluate(excelCell);
                        System.out.println("evaluation took: " +
(System.currentTimeMillis() -
now));
                }
        }
}

And here are the results I've received (in millis):

evaluation for cell (6, 4) took: 0
evaluation for cell (6, 5) took: 15
evaluation for cell (6, 6) took: 0
evaluation for cell (6, 7) took: 31
evaluation for cell (6, 8) took: 94
evaluation for cell (6, 9) took: 156
evaluation for cell (6, 10) took: 391
evaluation for cell (6, 11) took: 1406
evaluation for cell (6, 12) took: 5188
evaluation for cell (6, 13) took: 20265
evaluation for cell (6, 14) took: 81766

Perhaps "evaluateInCell" would help me out here, but I think that poi should
be able to evaluate such kind of formulas in an acceptable way.
Comment 1 Nick Burch 2008-07-10 15:23:59 UTC
Are you sure that your formulas aren't recursive? That would explain the timings we're seeing on evaluating them

(I can't actually figure out what your formula does, clearly I don't know enough about what can be done with excel formulas....)

There's a test in src/testcases/org/apache/poi/hssf/usermodel/TestFormulaEvaluatorBugs.java for anyone who's interested
Comment 2 Matthias 2008-07-11 00:12:10 UTC
The formula is indeed a recursive one, but the time needed to evaluate it, especially for column 9 to 15, is not acceptable, I think (compared to the time excel needs to evaluate it, which is almost null).

When you have a look at the sheet, you can see that the formula is also set for columns with index greater than 15. The time to evaluate these formulas would of course grow exponentially. 

What the formula does, can be expressed in the following way:

- get the date (year and month) of your left neighbour cell and add "1" to the month.
- look, if that date is smaller or equal to the date in column (3,4).
- if so, put it the new date into the actual cell
- else, write down #nv

It is clear, that the recursive formula evaluation is executed twice for each cell, but compared to the low complexicity of the formula, that should not result in such a long running evaluation.


Comment 3 Sebastian Brocks 2008-07-15 04:34:36 UTC
In my case, it's also recursive formulas.
Comment 4 Josh Micich 2008-09-04 16:17:49 UTC
Fixed in svn r692300.

Junit modfied slightly and re-enabled