Bug 44548 - MSExcel files with formula fields.
Summary: MSExcel files with formula fields.
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall (show other bugs)
Version: 3.0-dev
Hardware: PC Windows XP
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-03-06 08:36 UTC by Sheetal
Modified: 2008-08-06 01:38 UTC (History)
0 users



Attachments
File which fails to be read with POI (666.50 KB, application/vnd.ms-excel)
2008-03-06 09:04 UTC, Sheetal
Details
Macro less file (633.50 KB, application/vnd.ms-excel)
2008-03-06 10:30 UTC, Sheetal
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Sheetal 2008-03-06 08:36:02 UTC
Hi,

Our application reads the contents of the MSExcel file uploaded at the server and generates HTML Table of each sheet in the excel file.

If the MSExcel sheet contains formula fields then, POI is not able to create the workbook.

But if the same MSExcel file is opened in Open Office and saved and then fed to POI then it works absolutely fine.

Do you have any idea why doesnt POI is able to read MSExcel sheet without it being opened and saved in Open Office.
Comment 1 Nick Burch 2008-03-06 08:42:33 UTC
POI should have no problems reading files containing formulas

Could you please upload one of your problem files, and a code snippet that's currently failing for this file? We'll also need to know what version of excel generated the file.
Comment 2 Sheetal 2008-03-06 09:04:47 UTC
Created attachment 21640 [details]
File which fails to be read with POI

This is the file which fails when read with POI 3.1
But if the same file is opened and saved in Open office and then fed to POI , then it read fine.
Comment 3 Sheetal 2008-03-06 09:05:37 UTC
WARN. Unread 77 bytes of record 0x18
WARN. Unread 103 bytes of record 0x18
WARN. Unread 103 bytes of record 0x18
WARN. Unread 103 bytes of record 0x18
WARN. Unread 104 bytes of record 0x18
WARN. Unread 77 bytes of record 0x18
WARN. Unread 103 bytes of record 0x18
WARN. Unread 103 bytes of record 0x18
WARN. Unread 7 bytes of record 0x1c1
WARN. Unread 6589 bytes of record 0xeb
WARN. Unread 8223 bytes of record 0xfc
WARN. Unread 7986 bytes of record 0x3c
Cannot create Workbook so reject this file.
java.lang.IndexOutOfBoundsException: Index: 571, Size: 0
        at java.util.ArrayList.RangeCheck(ArrayList.java:546)
        at java.util.ArrayList.get(ArrayList.java:321)
        at org.apache.poi.util.IntMapper.get(IntMapper.java:81)
        at org.apache.poi.hssf.record.SSTRecord.getString(SSTRecord.java:195)
        at org.apache.poi.hssf.model.Workbook.getSSTString(Workbook.java:729)
        at org.apache.poi.hssf.usermodel.HSSFRichTextString.<init>(HSSFRichTextString.java:57)
        at org.apache.poi.hssf.usermodel.HSSFCell.<init>(HSSFCell.java:202)
        at org.apache.poi.hssf.usermodel.HSSFRow.createCellFromRecord(HSSFRow.java:193)
        at org.apache.poi.hssf.usermodel.HSSFSheet.setPropertiesFromSheet(HSSFSheet.java:177)
        at org.apache.poi.hssf.usermodel.HSSFSheet.<init>(HSSFSheet.java:131)
        at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:226)
        at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:148)
        at com.ch2m.gsri.wizards.ExcelSheetBean.getExcelSheetNames(ExcelSheetBean.java:37)
        at org.apache.jsp.validate_jsp._jspService(validate_jsp.java:255)
        at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
        at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:328)
        at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
        at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:870)
        at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
        at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
        at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
        at java.lang.Thread.run(Thread.java:595)
Comment 4 Nick Burch 2008-03-06 09:11:43 UTC
The errors tend to indicate a very messed up file. Where did it come from?
Comment 5 Sheetal 2008-03-06 09:14:16 UTC
This file is frm our customers.
What special thing does Open office does, which makes sense to POI, any idea?
Comment 6 Nick Burch 2008-03-06 09:27:31 UTC
OpenOffice may be more tollerant of borderline corrupt files than POI is

Unless we know how the file was created, and that it's really valid to be in that messed-up a state, it's unlikely that we'll hack poi with workarounds for a file this broken

Please re-open the bug if you're able to confirm this file really came from excel, and hasn't been corrupted since then
Comment 7 Sheetal 2008-03-06 09:40:56 UTC
Microsoft Excel 97 - Excel 2003 & 5.0/95 Workbook 

Microsoft Office Excel Workbook.

These are the two forms in which the file could be saved as.
It is really valid to be in that messed up state.


Another thing is, if the same file (sheet) is copied and do a paste special with values only option then also even if it is not touched by open office POI reads is fine.

our apologies, but please let us know what best can u do for us.
Comment 8 Nick Burch 2008-03-06 09:49:59 UTC
You still haven't said how the file was created. What version of excel? Is this problem reproducable with other files, or is it just this one file?
Comment 9 Sheetal 2008-03-06 09:55:23 UTC
The file was created using MS Office Excel 2003 and is reproducable in all other files which contains the formulae  e.g. - Cell: H22 in PAF sheet.

Comment 10 David Fisher 2008-03-06 10:09:30 UTC
I opened the attached file using Excel 2004 for Mac.

It contains macros, so I clicked to open the file with Macros disabled. Excel then reported that the file contains Excel 4.0 Macros which cannot be disabled.

I don't think that POI supports the Excel file format from before Excel 5.0.

This could be the problem. What happens if you remove the macros?
Comment 11 Sheetal 2008-03-06 10:17:28 UTC
I deleted all the macros for this file and saved it and then fed to POI but again the same error. Problem persists even if i remove all the macros.
Comment 12 David Fisher 2008-03-06 10:28:49 UTC
No promises about progress, but would you attach the "macro-less" file as well?
Comment 13 Sheetal 2008-03-06 10:30:11 UTC
Created attachment 21642 [details]
Macro less file

Hi,
sending you the same file after deleting the macros.
Comment 14 David Fisher 2008-03-06 10:56:13 UTC
There are still macros on this sheet. This is easy to see. Tighten up you macro security so that it asks if you want to enable.

I believe remaining macros are on your "Site Details" worksheet attached to two buttons.

It looks like this work order template has had a long history and I think it is likely that there are old records within. You might try the BiffReader and look at the records that are unrecognized to see if they are described in any of the newly released documentation.
Comment 15 Sheetal 2008-03-07 01:19:25 UTC
I didnot understand the Biff Reader part. can u please elaborate.
Comment 16 Sheetal 2008-03-07 07:28:40 UTC
If i uncomment the code for unknownrecord(records.add(record)) then the Excel file is read by POI but fails to evaluate formula and to retrieve strings from the sheet.

in RecordFactory.java
----------------------
if (lastRecord instanceof UnknownRecord) {
    //Gracefully handle records that we dont know about,
   //that happen to be continued
     records.add(record);
} 
else {
  // throw new RecordFormatException("Unhandled Continue Record");
}

But if i again comment it out ie.
//records.add(record)

then POI fails to read my excel file.
Comment 17 David Fisher 2008-03-07 07:40:26 UTC
I'm sorry I should have written BiffViewer.

See http://poi.apache.org/hssf/how-to.html and look for org.apache.poi.hssf.dev.BiffViewer

This should help you see those records and what is in them. It does sound like they are some kind of embedded macro in an older form. Maybe the POI developers have never seen these before. Only now has Microsoft provided documentation.

Good luck.
Comment 18 Josh Micich 2008-08-06 01:38:18 UTC
Re-tested OK in 3.1 and svn trunk (r683126)

Both attachment (id=21640) and attachment (id=21642) can be read OK by POI.

Some things are not quite right though.  Comment #2 mentions POI version '3.1' but in March, only version 3.0.2 was available.  Perhaps '3.0.1' was intended.  I tried both attachments against various earlier versions (3.0, 3.0.1, 3.0.2) of POI, and none of them fail with the same error described in comment #3.  It seems that the most common error was "Unknown grbit '16'".  This seems to be related to bug 42564 which was fixed in May.  

If I've made a mistake, please specify the exact POI version and attachment that produce the error.