Bug 44636 - [patch] HSSF formula cells not calculating
Summary: [patch] HSSF formula cells not calculating
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 3.0-FINAL
Hardware: PC Windows XP
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2008-03-19 10:22 UTC by J Keller
Modified: 2008-03-20 06:24 UTC (History)
0 users

Input spreadsheet (created in Excel) (13.50 KB, application/vnd.ms-excel)
2008-03-19 10:22 UTC, J Keller

Note You need to log in before you can comment on or make changes to this bug.
Description J Keller 2008-03-19 10:22:08 UTC
Created attachment 21689 [details]
Input spreadsheet (created in Excel)

Using the sample "recalculate all" code from this page:
does recalculate the cells, but does not seem to correctly handle the formulas in all cases.

If you call cell.setCellForumla after evaluating the cell, it seems to work fine.

Here's a simple test case.

1) Run the code below. It will open the attached simple.xls, change one cell, and save it as changed.xls.
2) Open changed.xls in Excel.
3) Change the same cell (C1, which should now contain 25).
4) Note how the calculated cell (D1) does not recalc.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.Iterator;

import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

public class Recalc
   public static void main (String[] args)
         File ssFile = new File ("simple.xls");
         FileInputStream ssIn = new FileInputStream (ssFile);
         HSSFWorkbook wb = new HSSFWorkbook (ssIn);
         HSSFSheet sheet = wb.getSheetAt (0);
         HSSFRow row = sheet.getRow (0);
         HSSFCell cell = row.getCell ((short) 2);
         cell.setCellValue (25);
         // recalc
         HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb);
         for (Iterator rit = sheet.rowIterator(); rit.hasNext();)
            HSSFRow r = (HSSFRow)rit.next();
            for (Iterator cit = r.cellIterator(); cit.hasNext();)
               HSSFCell c = (HSSFCell)cit.next();
               if (c.getCellType() == HSSFCell.CELL_TYPE_FORMULA)
                  evaluator.evaluateFormulaCell (c);
         FileOutputStream ssOut = new FileOutputStream ("changed.xls");
         wb.write (ssOut);
      catch (Exception x)
         System.err.println (x);
Comment 1 Josh Micich 2008-03-19 11:59:22 UTC
I tried your test code on the latest svn trunk, and got a different error. All formulas (in D1:D6) displayed as '#VALUE!' (my excel is 2007).  I found one difference which was that POI currently writes tRef PTGs involved in a shared formula back as tRefV PTGs.  There is a one line change to RefVPtg that will fix that problem, after which all D1:D6 display ok:

$ svn diff src/java/org/apache/poi/hssf/record/formula/RefVPtg.java
Index: src/java/org/apache/poi/hssf/record/formula/RefVPtg.java
--- src/java/org/apache/poi/hssf/record/formula/RefVPtg.java    (revision 638958)
+++ src/java/org/apache/poi/hssf/record/formula/RefVPtg.java    (working copy)
@@ -32,6 +32,7 @@

   public RefVPtg(int row, int column, boolean isRowRelative, boolean isColumnRelative) {
     super(row, column, isRowRelative, isColumnRelative);
+    setClass(CLASS_VALUE);

There have been a lot of changes in POI formula evaluation since v3.0, so I'm not sure if the same patch will fully solve the problem in that version.  Furthermore, I know I haven't fully isolated this '#VALUE!' problem that I observed in the latest POI trunk, because a simple test case would not reproduce it (i.e. writing a spreadsheet formula just with tRefV PTGs instead of tRef PTGs did not upset excel). There must be some other detail (in combination with the wrong PTGs) that causes '#VALUE!' to appear.

If you attempt to apply this patch directly to v3.0, please post back if if doesn't fix your problem.  

The patch so far is tiny, but there are a few issues that need much more investigation:

1 - Up until now, I had not seen any evidence of why POI bothers with PTG token classes at all.  All junit tests continue to run when that code (setClass/getPtgClass) is disabled.  The attached spreadsheet and test code seems to be the first concrete example of why it might be necessary.  

2 - It does not make sense to have ptg-class based java sub-classes of Ptg in the presence of a method "setClass(byte)" which can change the ptg-lass.

3 - POI unpacks shared formula records, but doesn't seem to re-pack them together when the spreadsheet is re-written.

Comment 2 J Keller 2008-03-19 13:25:35 UTC
I'm actually using version 3.0.2-FINAL.  (3.0 was the closest in the drop-down.)

Do I still need the patch?
Comment 3 J Keller 2008-03-19 13:32:21 UTC
I should also point out that the problem is a bit tricky.  I started with a very large, complex spreadsheet - and kept removing data until the problem went away. The attached xls is the simplest version in which I could reproduce it.

If you simplify that spreadsheet at all - remove a row or column for example - the code works fine.
Comment 4 Josh Micich 2008-03-19 14:05:35 UTC
(In reply to comment #2)
I just tried the patch in 3.0.2-FINAL.  Both the before and after behaviour was as noted above.  I.E. it should work for you.

Perhaps the observation of '#VALUE!' in the formula cells is due to my version of Excel.  Which version are you using?  Just for reference, can you describe more clearly what your Excel does with 'changed.xls'?  My observation of Excel 2007 is:
 - The formulas all appear as '#VALUE!'.
 - The correct formula text is still visible in the formulas.
 - Pressing the <enter> key after selecting the formula causes it to evaluate properly.  This action seems to translate the tRef PTGs into tRefV PTGs (observable after re-saving).

(In reply to comment #3)
> I should also point out that the problem is a bit tricky...
> If you simplify that spreadsheet at all - remove a row or column for example -
> the code works fine.

That makes sense. The specific bug that this one-line-fix addresses is the loss of the ptg-class when translating a 'shared formula'.  From what I remember on previous bugs (bug 44449), excel has a minimum number of rows before it will use a shared.  I noticed a threshold of 6 but I'm not sure if that's universal.  POI does not erroneously disturb the ptg-class when reading/writing non-shared formulas.

Comment 5 Nick Burch 2008-03-20 03:14:39 UTC
I started looking at this last night, but didn't finish before Josh also took a look...

I've added a unit test to svn -

Like Josh, if I start with your file, excel gives #VALUE if I change things. If I start with an empty file, it seems fine.

Interestingly, gnumeric and openoffice have no such problems with the files.

With Josh's one line fix applied, the file from my unit test works fine in excel. So, I've committed Josh's fix to svn
Comment 6 J Keller 2008-03-20 06:24:22 UTC
My Excel behaves exactly the same as yours.  I have version 2003 (11.8206.8202).

I'll try the patch (I don't have the POI source yet; I had just downloaded the jars.  Thanks.