Bug 55747

Summary: nested function in IF's in formulas produce #VALUE! error in excel
Product: POI Reporter: Guoshun Wu <Guoshun.Wu>
Component: HSSFAssignee: POI Developers List <dev>
Status: NEW ---    
Severity: normal CC: dron007, Guoshun.Wu
Priority: P2    
Version: 3.9-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Bug Depends on: 55324    
Bug Blocks:    
Attachments: The code in description produce this file.
File with #VALUE! error
Fixed by Excel file
BiffViewer result of BadFile.xls
BiffViewer result of GoodFile.xls
Biff Viewer results with xls files and source

Description Guoshun Wu 2013-11-06 07:58:32 UTC
Created attachment 31015 [details]
The code in description produce this file.

The formula document on the HSSF website instructed to report all occurrences of #VALUE! when 
attempting to programmatically assign a formula to a cell, so here goes:

IF(A1 > 5, "big number", "small number")

This formula works, and displays correctly when opened in Excel.

#VALUE! is only displayed if the outer IF branches to an inner CONCATENATE and it reference another cell.  For example:

> =IF(ISBLANK(A1)," not blank a1",CONCATENATE(A1," - %s."))

-The cell displays #VALUE!
-but if I type it in directly in Excel, the formula works
-also, clicking in the formula bar of POI-generated formula, then hitting Enter, fixes the formula

Here iis the sample code(in groovy):
==============================Code Start=================================

HSSFWorkbook wb = new HSSFWorkbook()
String ext ="xls";
if(wb instanceof  SXSSFWorkbook) ext +="x";
Sheet sheet =wb.createSheet("Test1")
Row row =sheet.createRow(0)
CellUtil.createCell(row, 0, "Hello world.")
row = sheet.createRow(1)
Cell cell = row.createCell(0)
String refCell = "A1"
String formula = refCell
formula = String.format "IF(isblank(%s),\" not blank a1\",CONCATENATE(%s, \" - %%s.\"))", refCell, refCell

wb.forceFormulaRecalculation = true
wb.write new FileOutputStream( "d:/test/test.${ext}")

==============================Code End=================================
Comment 1 Dominik Stadler 2015-06-01 21:50:20 UTC
Java version of test:

	public void test55747() throws IOException {
	    HSSFWorkbook wb = new HSSFWorkbook();
	    Sheet sheet =wb.createSheet("Test1");
	    Row row =sheet.createRow(0);
	    CellUtil.createCell(row, 0, "Hello world.");
	    row = sheet.createRow(1);
	    Cell cell = row.createCell(0);
	    String refCell = "A1";
	    String formula = refCell;
	    formula = "IF(isblank(" + refCell + "),\" not blank a1\",CONCATENATE(" + refCell + ", \" - %s.\"))";

	    wb.write(new FileOutputStream("C:\\temp\\55747.xls"));
Comment 2 Andrew 2015-07-31 12:06:24 UTC
Have same problem with MID and REPLACE funciton (maybe all text)?
    public static void main(String[] args) throws IOException {
        Workbook wb = new HSSFWorkbook();
        FormulaEvaluator ev = wb.getCreationHelper().createFormulaEvaluator();
        Sheet ws = wb.createSheet();
        Row row = ws.createRow(0);
        Cell cell;
        cell = row.createCell(0);

        cell = row.createCell(1);
        cell.setCellFormula("IF(A1<>\"\",MID(A1,1,2),\" \")");
        OutputStream os = new FileOutputStream("xx.xls");
Some analysis shows, that problem occurs on Windows using 2003 or 2010 excel (cannot test 2007 or 2013) but does not occurs on mac with 2013 excel or windows with open office 4.1.1.
Problem found in POI 3.8, 3.10, 3.12, 3.13dev and is not produced for XSSFWorkbook.
Comment 3 Nick Burch 2015-08-01 00:00:57 UTC
I've added a unit test in r1693674. The bad news is that POI can read the evaluated result just fine

Next step is for someone to generate a simple problematic formula cell with POI, save it, open it in Excel, get Excel to fix the formula, save that, then run BiffViewer against both files. There will be some differences anyway, but the interesting thing will be the cell-related Records for the formula cell in question. How do those differ?
Comment 4 Andrew 2015-08-05 15:53:27 UTC
Created attachment 32970 [details]
File with #VALUE! error
Comment 5 Andrew 2015-08-05 15:53:52 UTC
Created attachment 32971 [details]
Fixed by Excel file
Comment 6 Andrew 2015-08-05 15:54:24 UTC
Created attachment 32972 [details]
BiffViewer result of BadFile.xls
Comment 7 Andrew 2015-08-05 15:54:36 UTC
Created attachment 32973 [details]
BiffViewer result of GoodFile.xls
Comment 8 Andrew 2015-08-05 15:55:15 UTC
I created file "badFile.xls" using my java code (in POI 3.13beta1). After recalculating in Excel saved as "goodFile.xls".
BiffViewer result of "badFile.xls" in "badFileBiff.txt", "goodFile.xls" in "goodFileBiff.txt".
As you expected, there are differences in formula records:
    Ptg[4]=org.apache.poi.ss.formula.ptg.RefPtg [A1]R
in BAD file againist
    Ptg[4]=org.apache.poi.ss.formula.ptg.RefPtg [A1]V
in GOOD.
Comment 9 Nick Burch 2015-08-05 16:04:41 UTC
Looks like Excel wants it to be a Value not a Reference

Are you able to create a few different formulas in POI and Excel, eg if(expr,val,val), if(expr,val,func), if(exp,func,func), if(expr,func,val), if(func_expr,val,val), and check with BiffViewer what kind of Ptgs POI and Excel produces for these cases? (Need to work out if it's always one or the other, or if we need detection logic to work out which)
Comment 10 Andrew 2015-08-06 06:28:14 UTC
Created attachment 32975 [details]
Biff Viewer results with xls files and source

I tried so. Results in attachment as zip archive (source, xls-files, BiffViewer results).
As I can see, each "#VALUE!" cell has same issue:
    org.apache.poi.ss.formula.ptg.RefPtg [A1]R
in BAD file againist
    org.apache.poi.ss.formula.ptg.RefPtg [A1]V
in GOOD.
Comment 11 Dominik Stadler 2015-08-09 20:08:49 UTC
This bug and bug 55324 may be describing the same issue.
Comment 12 Nick Burch 2015-08-12 19:56:15 UTC
I've spent a "fun" evening on this, and I think I've now got a unit test (failing unless bits commented out) describing the required situation. 

In org.apache.poi.hssf.usermodel.TestFormulaEvaluatorBugs#test55747_55324 I've got tests based on your test file, expanded a little bit, checking standalone MID, along with MID-in-IF

As far as I can tell from that and the BiffViewer dumps, the rules seem to be:
 * MID used in IF 1st clause, or straight A1 ref in IF 1st clause: A1 = V
 * MID used in 2nd or 3rd clause that isn't used (evaluates other way): A1 = V
 * MID used in 2nd or 3rd clause that is used (evaluates that way): A1 = R <- thing we're doing wrong

Does that match with your testing? 

If so, we'll need to have HSSFFormulaEvalutor tweak the types on evaluation. (We'd also need to identify any other types like that and fix them too)
Comment 13 Andrew 2015-08-21 12:50:13 UTC
Yes, Nick, it seems to be so.