Bug 52462 - Add support for SUMIFS function
Summary: Add support for SUMIFS function
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 3.8-dev
Hardware: PC All
: P2 enhancement with 3 votes (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-01-13 06:24 UTC by Oleg Kuryan
Modified: 2012-02-03 10:21 UTC (History)
1 user (show)



Attachments
initial draft of SUMIFS (11.10 KB, patch)
2012-01-14 15:08 UTC, Yegor Kozlov
Details | Diff
Attachment to show generated file for note working test case (5.00 KB, application/vnd.ms-excel)
2012-01-23 08:17 UTC, Oleg Kuryan
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Oleg Kuryan 2012-01-13 06:24:10 UTC
Currently there is no implementation for SUMIFS function in POI library.
http://office.microsoft.com/en-us/excel-help/sumifs-function-HA010047504.aspx

Is it possible to add it?
Comment 1 Yegor Kozlov 2012-01-13 07:44:40 UTC
Of course it is possible, but no-one has volunteered to do it so far... 

POI supports SUMIF which is a close relative of SUMIFS and can share the code. Would you volunteer to write a patch ?

Yegor
Comment 2 Oleg Kuryan 2012-01-13 10:03:46 UTC
Hi Yegor,

I can try, but as I'm new to that lib, could you, please, point to start place.
As I see "SUMIFS" has 2differences from SUMIF:
1) sequence of arguments is different
2) number of input arguments is undefined.

Now I see that there is no Abstract function defined for function with undefined number of arguments. Also I can't find example of implementation for such type of functions to look at.

Could you, please, give a start point?
Comment 3 Yegor Kozlov 2012-01-13 13:06:46 UTC
Sure, I'm going to write an implementation sketch over weekend. 

Yegor

(In reply to comment #2)
> Hi Yegor,
> 
> I can try, but as I'm new to that lib, could you, please, point to start place.
> As I see "SUMIFS" has 2differences from SUMIF:
> 1) sequence of arguments is different
> 2) number of input arguments is undefined.
> 
> Now I see that there is no Abstract function defined for function with
> undefined number of arguments. Also I can't find example of implementation for
> such type of functions to look at.
> 
> Could you, please, give a start point?
Comment 4 Yegor Kozlov 2012-01-14 15:08:08 UTC
Created attachment 28157 [details]
initial draft of SUMIFS
Comment 5 Yegor Kozlov 2012-01-14 15:21:33 UTC
Here is initial implementation of SUMIFS. 

I think I covered most of it. The test case mimics the first example from http://office.microsoft.com/en-us/excel-help/sumifs-function-HA010047504.aspx. Please add Example 2 - Example 5, I added placeholders in TestSumifs.java

A tricky thing is to verify the behavior of SUMIFS against invalid or wrong arguments. Please carefully read the spec and verify that the implementation handles all cases (wrong type of arguments, ranges have different dimensions, etc.). 

Feel free to edit the code if you find any bugs. 

P.S. It would be great if you follow the pattern and add support for other *IFS functions from the AnalysisToolpack: AVERAGEIFS and COUNTIFS, they should be pretty much similar to SUMIFS.

Regards,
Yegor
Comment 6 Yegor Kozlov 2012-01-21 11:51:45 UTC
I committed support for SUMIFS() in r1234305

Yegor
Comment 7 Oleg Kuryan 2012-01-21 15:13:39 UTC
Hi Yegor,

Thank you for implementing this. Unfortunately, I wasn't able to do that this week. I just synchronized from trunk and tested changes. 
I've tried to create excel file with  SUMIFS function using formula cell  cell.setCellFormula

But in newly created excel file formula is saved as string and not formula. Do you know how to fix this?
Comment 8 Oleg Kuryan 2012-01-22 13:35:27 UTC
Need to reopen bug because of the above comments
Comment 9 Yegor Kozlov 2012-01-23 07:35:09 UTC
Please post sample code you are using. Even better if you upload a failing junit test that demonstrates what's wrong. 

I checked with trunk and setting formulas with SUMIFS works OK to me, for both HSSF and XSSF. 

Yegor
Comment 10 Oleg Kuryan 2012-01-23 08:15:52 UTC
I'm Using for verification Excel 2010 in Russian language. Below is code that create excel file with SUM and SUMIFS functions. Just to compare results.

As result I get attached excel file test.xls. Below are screenshots on how formulas looks like in Russian version:
1) http://clip2net.com/s/1vOP4 - note that SUMIFS is not displayed in excel in russian variant ("СУММАЕСЛИМН")
2) http://clip2net.com/s/1vOQ5 - note that here SUM function is properly converted to russian variant "СУММ"

Code used for testing
---------------------
    Workbook wb = new HSSFWorkbook();
    Sheet sheet = wb.createSheet();
    Row row = sheet.createRow((short) 0);
    Cell cell = row.createCell((short) 0);
    cell.setCellValue((String) "Number");
    cell = row.createCell((short) 1);
    cell.setCellValue((String) "Sum");
    cell = row.createCell((short) 2);
    cell.setCellValue((String) "Status");
    cell = row.createCell((short) 3);
    cell.setCellValue((String) "isDeleted");
    short maxSize = 10;
    for(short counter = 1; counter <= maxSize; counter++) {
      row = sheet.createRow(counter);
      cell = row.createCell((short) 0);
      cell.setCellValue(counter);
      cell = row.createCell((short) 1);
      cell.setCellValue(10000);
      cell = row.createCell((short) 2);
      cell.setCellValue("Opened");
      cell = row.createCell((short) 3);
      cell.setCellValue("Yes");

      if(counter == 1) {
        cell = row.createCell((short) 5);
        cell.setCellValue((String) "Total Not Working");
        cell = row.createCell((short) 6);
        String cellFormula = "SUMIFS(B2:B" + (maxSize + 1) + ", D2:D" + (maxSize + 1) + ", \"=Yes\", C2:C" + (maxSize + 1) + ", \"=Opened\")";
        cell.setCellFormula(cellFormula);
      }  else if(counter == 2) {
        cell = row.createCell((short) 5);
        cell.setCellValue((String) "Total Working");
        cell = row.createCell((short) 6);
        String cellFormula =  "SUM(B2:B" + (maxSize + 1) + ")";
        cell.setCellFormula(cellFormula);
      }



    }

    try {
      FileOutputStream out = new FileOutputStream(new File("D:/test.xls"));
      wb.write(out);
      out.flush();
      out.close();
    } catch(IOException e) {
      e.printStackTrace();
    }
Comment 11 Oleg Kuryan 2012-01-23 08:17:02 UTC
Created attachment 28188 [details]
Attachment to show generated file for note working test case
Comment 12 Yegor Kozlov 2012-01-23 09:44:39 UTC
Can you check with XSSF ? Run the same code,only replace
 Workbook wb = new HSSFWorkbook();

with
Workbook wb = new XSSFWorkbook();

Yegor

(In reply to comment #11)
> Created attachment 28188 [details]
> Attachment to show generated file for note working test case

(In reply to comment #11)
> Created attachment 28188 [details]
> Attachment to show generated file for note working test case

(In reply to comment #11)
> Created attachment 28188 [details]
> Attachment to show generated file for note working test case
Comment 13 Oleg Kuryan 2012-01-23 10:05:27 UTC
With XSSFWorkbook everything works fine. So something wrong with HSSFWorkbook.
Comment 14 Yegor Kozlov 2012-02-03 10:21:07 UTC
Unfortunately I cannot easily fix it in HSSF. 

Strictly speaking, the SUMIFS function is not supported by the binary .xls format and this is why POI writes incorrect data and the resulting formula is not recognized by Excel.

SUMIFS is not a 'normal' function like MAX, SUM or SUMIF: it is included in the Excel Analysis Toolpack (ATP) which is an Add-On and installed separately. From this point of view, the Analysis Toolpack is like a VBA module, only in pre-compiled form. 
The primary scope of the support for ATP functions in POI is evaluation, that is, POI recognizes functions with SUMIFS and can evaluate them. Setting a formula with ATP functions is another task and, I should say, a difficult one. The main difficulty is when you create a formula with ATP functions then Excel inserts supplemental data structures - in my test files these were defined named ranges with names like "_xlfn." and these named ranges prefix the ATP function names. To make it clear, the internal name of formula with ATP functions look like _xlfn.SUMIFS(A2:A9,B2:B9,"=A*"). POI writes "pure" tokens and the formula is written as SUMIFS(A2:A9,B2:B9,"=A*"). Evidently it is not enough to make Excel happy.

This feature is not documented in the .xls spec and I don't see how to easily implement it. 


Regards,
Yegor