Bug 45798

Summary: HSSFCell.toString() With 3D Refrence Causes IndexOutOfBoundsException
Product: POI Reporter: Skye Shaw <sshaw>
Component: HSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Attachments: Excel 2002 spreadsheet demonstrating a bug caused by cells with 3D references

Description Skye Shaw 2008-09-12 18:52:19 UTC
Created attachment 22561 [details]
Excel 2002 spreadsheet demonstrating a bug caused by cells with 3D references

Calling toString() on a cell with a 3D reference containing another sheet in the workbook causes an IndexOutOfBoundsException.


Example:

 FileInputStream fis = new FileInputStream(argv[0]);
 POIFSFileSystem fs  = new POIFSFileSystem(fis);
 HSSFWorkbook wb = new HSSFWorkbook(fs); 
 HSSFSheet sheet = wb.getSheetAt(0);
 HSSFRow row = sheet.getRow(1);
 HSSFCell cell = row.getCell((short)1);  
 System.out.println(cell);

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
        at java.util.ArrayList.RangeCheck(Unknown Source)
        at java.util.ArrayList.get(Unknown Source)
        at org.apache.poi.hssf.model.Workbook.getSheetName(Workbook.java:534)
        at
org.apache.poi.hssf.model.Workbook.findSheetNameFromExternSheet(Workbook.java:1867)
        at org.apache.poi.hssf.model.Workbook.getSheetReferences(Workbook.java:1849)
        at org.apache.poi.hssf.usermodel.HSSFWorkbook.getSheetReferences(HSSFWorkbook.java:788)
        at org.apache.poi.hssf.usermodel.HSSFWorkbook.getSheetReferences(HSSFWorkbook.java:79)
        at org.apache.poi.hssf.record.formula.Ref3DPtg.getSheetName(Ref3DPtg.java:169)
        at org.apache.poi.hssf.record.formula.Ref3DPtg.toFormulaString(Ref3DPtg.java:182)
        at org.apache.poi.hssf.model.FormulaParser.toFormulaString(FormulaParser.java:925)
        at org.apache.poi.hssf.model.FormulaParser.toFormulaString(FormulaParser.java:858)
        at org.apache.poi.hssf.usermodel.HSSFCell.getCellFormula(HSSFCell.java:686)
        at org.apache.poi.hssf.usermodel.HSSFCell.toString(HSSFCell.java:1031)
        at java.lang.String.valueOf(Unknown Source)
        at java.io.PrintStream.println(Unknown Source)
        at bs.main(bs.java:69)


The exception is raised in POI-3.1-FINAL and POI-3.5-beta2.

The problem originates in Workbook.findSheetNameFromExternSheet(), 
This method calls LinkTable.getIndexToSheet():

public short getIndexToSheet(short num) {
  return _externSheetRecord.getREFRecordAt(num).getIndexToFirstSupBook();
}

ExternSheetSubRecord.getIndexToFirstSubBook() returns 2, which is then used to get the 2nd element of the Workbook’s boundsheets, a list of length 2.

5 ExternSheetSubRecords are created from the attached spreadsheet, they are:

   supbookindex =0
   1stsbindex   =0
   lastsbindex  =0

   supbookindex =1
   1stsbindex   =2
   lastsbindex  =2

   supbookindex =2
   1stsbindex   =1
   lastsbindex  =1

   supbookindex =0
   1stsbindex   =-1
   lastsbindex  =-1

    supbookindex =0
   1stsbindex   =1
   lastsbindex  =1

It seems that the original DocumentInputStream could be off, 3 non -1 supbookindices are created, and the 2nd 1stsbindex returns 2...
Comment 1 Josh Micich 2008-09-14 12:04:54 UTC
Fixed in svn r695264.

Junit added.

The main problem seems to be that ref #1 in the EXTERNSHEET record, which refers to sheet index 2, even though the worbook has only 2 sheets:
        
Offset 0x3954 (14676)
recordid = 0x17, size = 32
[EXTERNSHEET]
   numOfRefs	 = 5
refrec		 #0: extBook=0 firstSheet=0 lastSheet=0
refrec		 #1: extBook=1 firstSheet=2 lastSheet=2
refrec		 #2: extBook=2 firstSheet=1 lastSheet=1
refrec		 #3: extBook=0 firstSheet=-1 lastSheet=-1
refrec		 #4: extBook=0 firstSheet=1 lastSheet=1
[/EXTERNSHEET]

        		
The fix for this bug involved removing the eager initialisation of the extern SheetReferences.  As it turns out, the formula in cell B1 has a Ref3DPtg with externSheetIndex=4 which resolves to sheetIndex 1.
        
Some investigation was done to see how Excel would handle a formula having externSheetIndex=1 (sheetIndex 2).  Excel displays the formula as ''!$A2, but also prompts the user about broken links.

Note - this fix is just for the sample code provided.  It's quite possible (since the LinkTable is not fully understood) that POI may still fail in other ways with this spreadsheet.

An unrelated issue was that BiffViewer crashed on the DrawingGroupRecord.  This was not investigated any further.