Bug 48703 - sum spanning multiple worksheets
Summary: sum spanning multiple worksheets
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 3.5-FINAL
Hardware: PC Windows XP
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords: PatchAvailable
: 51941 (view as bug list)
Depends on:
Blocks:
 
Reported: 2010-02-08 11:05 UTC by Thomas Lane
Modified: 2016-10-19 07:39 UTC (History)
4 users (show)



Attachments
Trintech_POI examples.zip (874.19 KB, application/x-zip-compressed)
2010-03-26 21:14 UTC, dave.sprague
Details
multiple sheet SUM spanning, made with Excel. Both SUM syntaxes are shown (9.13 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2011-05-23 19:28 UTC, Thomas Lane
Details
XLS version of attachment 27048, created with Excel 2010 (23.50 KB, application/vnd.ms-excel)
2011-06-07 13:46 UTC, Thomas Themel
Details
.xls file showing the type of formulas which can't be evaluated (22.50 KB, application/vnd.ms-excel)
2011-10-03 20:16 UTC, mikkel
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Lane 2010-02-08 11:05:36 UTC
I've discovered a problem with formulas spanning multiple sheets, specifically, the sum operation.

I've got a workbook I need to build that contains 35 sheets.  I need to have a sum operation that spans all the sheets.  I tried to use SUM(sheet1:sheet35!A1) syntax, but setCellFormula throws an error with this syntax.  I have confirmed that his syntax is valid for Excel, because I can enter it in manually

I am successfully able to use the SUM(sheet1!A1,sheet2!a1,...) syntax, but this will not work for my problem, because SUM allows a maximum of 30 arguments.  I could compute two sums, and then sum the results as a workaround, but for the number of items I need to calculate, this can get really messy.

Is this syntax [SUM(sheet1:sheet35!A1)] going to be supported in a future build, or is there a workaround to avoid the error being thrown?

Thanks,
Thomas Lane
Comment 1 dave.sprague 2010-03-18 22:40:21 UTC
I've discovered an issue (that appears to be similar to this issue) within an application that we have build (using POI 3.6). I can provide examples as needed, but is this specific issue slated for inclusion within POI 3.7?

Thanks,

Dave Sprague
Director, Product Management
Trintech
Comment 2 dave.sprague 2010-03-18 22:40:39 UTC
I've discovered an issue (that appears to be similar to this issue) within an application that we have build (using POI 3.6). I can provide examples as needed, but is this specific issue slated for inclusion within POI 3.7?

Thanks,

Dave Sprague
Director, Product Management
Trintech
Comment 3 dave.sprague 2010-03-26 21:14:12 UTC
Trintech is supplying a zip file with examples. Please review file entitled
23255 Deferred revenue.xls to determine if this issue is related.
Comment 4 dave.sprague 2010-03-26 21:14:45 UTC
Created attachment 25195 [details]
Trintech_POI examples.zip
Comment 5 Peter 2011-05-20 20:09:37 UTC
Anyone look at this?  Seems that the multisheet reference is being parsed as a Ref3DPtg, which is wrong.  Actually, I don't even see a Ptg type that would correspond to this kind of reference with multiple sheets in it.
Comment 6 Nick Burch 2011-05-20 20:14:31 UTC
Are you able to use BiffViewer to identify what Ptg / Ptgs excel is using to encode this?
Comment 7 Peter 2011-05-23 17:56:06 UTC
It is coming in as 0x3A, which POI maps to Ref3DPtg.  

Microsoft's documents for PtgRef3d have the following:  
"The PtgRef3doperand specifies a reference to a single cell on one or more sheets.

If the formula containing this structure is part of a revision as specified in the Formulas overview, then there MUST be a RevExtern in the RgbExtra corresponding to this PtgRef3d, which specifies those sheets."

http://msdn.microsoft.com/en-us/library/dd921344(v=office.12).aspx

Is that the piece that's missing?  Does POI parse those "RevExtern" structs?
Comment 8 Nick Burch 2011-05-23 19:13:08 UTC
Any chance someone could upload a very small file produced with Excel that contains a cross-sheet sum? (They current example file is a bit big for unit testing with)
Comment 9 Nick Burch 2011-05-23 19:15:31 UTC
(In reply to comment #7)
> Is that the piece that's missing?  Does POI parse those "RevExtern" structs?

I don't think it does yet. The first step would likely be to work up a patch to read these from existing files, then we can look at what'd be needed to write them.
Comment 10 Thomas Lane 2011-05-23 19:28:56 UTC
Created attachment 27048 [details]
multiple sheet SUM spanning, made with Excel.  Both SUM syntaxes are shown
Comment 11 Thomas Lane 2011-05-23 20:04:24 UTC
(In reply to comment #8)
> Any chance someone could upload a very small file produced with Excel that
> contains a cross-sheet sum? (They current example file is a bit big for unit
> testing with)

I uploaded a small file for you.
Comment 12 Nick Burch 2011-05-23 20:25:54 UTC
Could you do the file as a .xls too? 

For the .xlsx file, the formula is stored nice and simply, which is good!
      <c r="B2">
        <f>SUM(Sheet1:Sheet4!C1)</f>
        <v>20</v>
      </c>
Comment 13 Thomas Themel 2011-06-07 13:46:38 UTC
Created attachment 27126 [details]
XLS version of attachment 27048 [details], created with Excel 2010
Comment 14 Nick Burch 2011-07-08 15:18:36 UTC
(In reply to comment #13)
> Created attachment 27126 [details]
> XLS version of attachment 27048 [details], created with Excel 2010

This seems to be another XLSX file, we really need a XLS (binary, OLE2) file
Comment 15 Nick Burch 2011-10-03 17:30:43 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > Created attachment 27126 [details]
> > XLS version of attachment 27048 [details], created with Excel 2010
> 
> This seems to be another XLSX file, we really need a XLS (binary, OLE2) file

Just a reminder that we need a .xls (old style binary) test file for this, before we can look into it more
Comment 16 Nick Burch 2011-10-03 18:42:09 UTC
*** Bug 51941 has been marked as a duplicate of this bug. ***
Comment 17 mikkel 2011-10-03 20:16:06 UTC
Created attachment 27680 [details]
.xls file showing the type of formulas which can't be evaluated

Here is an .xls attachement showing the bug.
Try to evaluate the formula on sheet1 and you should see the error
Comment 18 Nick Burch 2012-02-10 15:03:36 UTC
Looking at the supplied file, BiffViewer sees it as:

[FORMULA]
    .row    = 0x0000
    .col    = 0x0000
    .xfindex= 0x000F
  .value         = 4.0
  .options   = 0x0000
    .alwaysCalc= false
    .calcOnLoad= false
    .shared    = false
  .zero      = 0xFC002006
    Ptg[0]=org.apache.poi.ss.formula.ptg.Ref3DPtg [sheetIx=0 ! A1]R
    Ptg[1]=org.apache.poi.ss.formula.ptg.AttrPtg [sum ].
[/FORMULA]

If you fetch the Cell Formula back in POI, you get just: SUM(Sheet2!A1)

So, it looks like our handling of Ref3DPtg isn't coping with the multiple sheets case. Someone will need to fix that first, then when we can correctly handle the formula string, we can look at the evaluation. Finally, how XSSF formulas are evaluated can be sorted based on this.

I've added disabled unit tests for HSSF and XSSF in r1242807, they should hopefully help someone with working on the issue
Comment 19 Radoslav 2012-12-18 19:58:36 UTC
After debugging this issues - the fix for the old binary format is not that hard. The InternalWorkbook already contains these external references and they are already created during leading. So the fix becomes the following:

1. Handle the parsing of the formula format (now it think it is a named range)
2. Correctly use the external indices (now it just uses the first sheet index)
3. Iterate over in the evaluation routine (now it only does over a single sheet).
4. Fix for creating new formulae that span multiple sheets.

Unfortunately, this does not work with the ooxml as there are no such external indices (points 2/4 and 3 as a result of the missing indices).

So far I have a produced almost working fix for the binary files but not for the ooxml files. However, I am asking if anyone has any insights how to solve this for the ooxml files format.
Comment 20 Yegor Kozlov 2012-12-20 07:31:08 UTC
> 
> So far I have a produced almost working fix for the binary files but not for
> the ooxml files. However, I am asking if anyone has any insights how to
> solve this for the ooxml files format.

If you already implemented (4) and can create formulas spannig across multiple sheets then the OOXML part is almost done .  The only thing you need to to parse the formula string into Ptg[] and then re-use functionality from HSSF. 

I will be happy to help  with this task. Please go ahead and submit a patch for the binary format and I will try to figure out what to do with ooxml.  

Yegor
Comment 21 Radoslav 2013-12-18 20:27:52 UTC
I have added a patch to fix this for the HSSF formulae (not XSSF). If I anyone wants to look into it they are welcome to.

Bug 55906
Comment 22 Nick Burch 2014-07-26 15:31:22 UTC
As of r1613654 this is now supported, for both HSSF and XSSF.
Comment 23 Javen O'Neal 2016-10-19 07:39:52 UTC
I removed the @Ignore decoration from these tests in r1765544 so that this stays fixed.