Bug 59673 - HSSFWorkbook#setSheetOrder does not update external sheet indexes in formula references with ranges rather than single cells
Summary: HSSFWorkbook#setSheetOrder does not update external sheet indexes in formula ...
Status: NEEDINFO
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: unspecified
Hardware: PC Linux
: P2 regression (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-06-08 09:44 UTC by Matthew Wightman
Modified: 2016-07-05 10:23 UTC (History)
0 users



Attachments
Patch making treatment of Area3DPtg and Ref3DPtg consistent (4.07 KB, patch)
2016-06-08 09:44 UTC, Matthew Wightman
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Matthew Wightman 2016-06-08 09:44:06 UTC
Created attachment 33926 [details]
Patch making treatment of Area3DPtg and Ref3DPtg consistent

Since POI 3.11, the setSheetOrder method on HSSFWorkbook changes formulas and named ranges which refer to areas rather than single cells to point to the wrong sheet.

This happens because the move now changes the external sheet indexes for sheets, but when FormulaShifter is used to update formulas and named ranges that refer to those sheets, only references which consist of a single cell (Ref3DPtg) are updated, leaving references to an area (Area3DPtg) unchanged but pointing at the wrong external sheet index.

For example:

=========
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.hssf.usermodel.HSSFName;

public class Demo {

  public static void main(String ... args) throws Exception {
    final HSSFWorkbook test = new HSSFWorkbook();
    HSSFSheet sheetA = test.createSheet("A");
    HSSFSheet sheetB = test.createSheet("B");

    HSSFName name = test.createName();
    name.setNameName("cellsOnA");
    name.setRefersToFormula("B!A1:A2");
    HSSFName name2 = test.createName();
    name2.setNameName("cellOnA");
    name2.setRefersToFormula("B!A1");

    System.out.println("Formulas before sheet re-ordering:"); 
    System.out.println(name.getRefersToFormula());
    System.out.println(name2.getRefersToFormula());

    test.setSheetOrder("B", 0);

    System.out.println("Formulas after sheet re-ordering:"); 
    System.out.println(name.getRefersToFormula());
    System.out.println(name2.getRefersToFormula());
  }
}
=========

outputs

=========
Formulas before sheet re-ordering:
B!A1:A2
B!A1
Formulas after sheet re-ordering:
A!A1:A2
B!A1
=========

demonstrating that the behaviour for references to cells and references to ranges of cells are inconsistent.

Attached is a patch which changes the behaviour for FormulaShifter#adjustPtgDueToSheetMove to be consistent for Area3DPtg and Ref3DPtg, adding a unittest based on that for Bug 58746. However, I think this needs more work - it looks like HSSFWorkbook#setSheetOrder is passing *internal* sheet indexes in to FormulaShifter's constructor, but the values being passed to and from the Ptg appear to be *external* sheet indexes, and the two are not guaranteed to be in sync.
Comment 1 Nick Burch 2016-06-08 12:30:01 UTC
At first glance the patch looks good

Would you like us to fully review it and aim to apply, or hold off while you dig into the sheet indicies handling some more?
Comment 2 Matthew Wightman 2016-06-09 08:58:35 UTC
I have raised a new bug (Bug 59677) for the external indices issue with
reproduction steps now I have been able to find some.

Depending on the fix implemented for that bug, this patch may be
redundant (except for the additional test case coverage), but depending
on approach taken for that bug might still be needed.

As there is not yet a patch/fix for that bug, I would advise you to
review and apply this patch - the external indexing issue only applies
to some workbooks (where the two indices are not in sync), and corrects
the behaviour on all other workbooks.

For the workbooks with complicated external indices, the current
behaviour for area references is that they may end up pointing to the
wrong sheet after a change in sheet order. This patch will change which
sheet is incorrectly referenced, and can make some of the broken
references point to no sheet at all, instead becoming #REF, but this is
the same as the current treatment of single cell references (as reported
on the new bug).
Comment 3 Nick Burch 2016-06-10 08:55:07 UTC
Having reviewed the surrounding code some more, I'm wondering if we need to change the adjustFormula method to take both the old and the new external sheet indexes. That way, we'd have more context available for the re-order case, and might make the removal case simpler too

It would then be up to the calling Workbook class to work out the new external sheet index, fixing up records as needed, before doing the shift

Any thoughts?
Comment 4 Matthew Wightman 2016-07-05 10:23:49 UTC
That sounds plausible as an approach.

Sorry, but I am unlikely to have any time for further investigation of
this issue in the near future. I believe this bug should not be marked as blocked on needing more information from me?