To reproduce: 1. Create an xls file using POI. This can be done by copying a regular xls file using a POI program E.g. Use the Copier program below to copy (from infile.xls to outfile.xls) 2. Create a regular Excel file (e.g. linkfile.xls) 3. Create, inside linkfile.xls: i) a link to the infile.xls (a regular xls file) ii) a link to the outfile.xls (xls file create by POI) 4. Close all files and reopen linkfile.xls to see if all links OK. Note that the 2nd link will fail. (problem here) ** The link will work again once the POI xls file is opened/saved in Excel. Copier program: import org.apache.poi.poifs.filesystem.POIFSFileSystem; import org.apache.poi.hssf.usermodel.*; import java.io.*; // Copy from infile.xls to outfile.xls public class Copier { public static void main(String[] args) throws IOException { if (args.length != 2) { System.err.println("Usage: " + Copier.class + " infile.xls outfile.xls"); System.exit(1); } String infile = args[0]; String outfile = args[1]; POIFSFileSystem pfs = new POIFSFileSystem(new FileInputStream(infile)); FileOutputStream fos = new FileOutputStream(outfile); HSSFWorkbook wb = new HSSFWorkbook(pfs); // Write the output to a file wb.write(fos); fos.close(); } }
I tried to reproduce this but was unable. 1) What do you mean by create a link? Do you mean Insert | Hyperlink or Insert | Object | Create from File, then check 'Link to File'? I tried it both ways and it worked in both cases 2) What version of Excel (I have Excel 2000 here) 3) Did you really name the files with the '.xls' extension? I have however managed to create what may be the same problem with different steps. 1) Create a file in POI. Any file will do, but it must be created by POI (as opposed to copied from an existing template which was created by Excel) 2) Rename the file created by POI to have either no extension, or an extension that is not recognized by Excel. For example, I used .xxx 3) Open a Word document and try to Insert | Object | Create From File, and choose the POI generated file (that does not have a .xls extension). The insert works, but it only gives a generic icon. 4) Now open the file in Excel. Excel does not mind that it has a non-standard extension, and the file opens fine. 5) Save the file from Excel with another non-standard extension. (I used .yyy) 6) Go back into the Word document, Insert | Object | Create From File, and choose the copy of the file which was saved from Excel. Now the file will open as an embedded Excel spreadsheet (even though the extension is non-standard) The reason for this behavior is because HSSF (and in particular POIFS) does not put in Excel's 'Storage Class ID' when the file is created from scratch by POI. Since the file has a non-standard extension, there is no way that Word can recognize the contents as an Excel object. However, once the file is opened and saved in Excel, even though you make no changes, Excel updates the Storage Class ID. This tells CompObj that the application which handles the root stream is Excel (as long as Excel is installed on the machine).
Thanks Michael for your quick response; I was away and didn't check till now. I'm new to Bugzilla hope I can get back to you via "Additional Comments", otherwise please let me know and I'll try again. To further follow up (in response to your questions): 1. To create a link: I probably used the wrong term, I meant to reference another Excel sheet using the formula syntax "=[file.xls]Sheet1!$A$1" (Will try to upload some sample files in a minute, please see Cell B1 and B2 of file linkfile.xls) 2. I used Excel 97 (will try with Excel 2000 a bit later) 3. the Excel files have standard extension .xls I didn't try the 2nd part as you said, since I think my problem will not be related to that. Anyway will try to upload my Excel file and hope that will clarify my question. Thanks. Daniel
Created attachment 8103 [details] Sample file/code to help depict the problem
I think I see the problem. Could this be because HSSF does not write out DBCELL records? However, it does write out INDEX records. But INDEX records contain a pointer to one or more DBCELL records. The INDEX record written by HSSF does contain a DBCELL pointer. But the pointer is invalid (it doesn't even point to the beginning of a record, and there are no DBCELL records in the output). If this is the problem, this report may be a duplicate of http://nagoya.apache.org/bugzilla/show_bug.cgi?id=9576
I looked at bug # 9576 and didn't see similarity from the problem description, though may be the underlining factors are the same, but I'm not familiar with DBCell and Index record (may investigate a bit further later). Anyway, I was able to tmp work around the problem by re-openning and saving the Excel file using a small non-POI app (e.g. PB/OLE), and the problem goes away. The open/save operation likely restored some internal Excel file structure (sorry, I'm novice to Excel file format and can't really comment on that)
You might want to try changing POI to not output any INDEX record. I believe it is optional (though recommended).
I checked through POI JavaDoc and didn't see how to turn off index record. From 9576, it seems like I have to download Jason's patch. I tried, but got a diff output listing instead. Would appreicate if someone tell me if that's the way to go, and how to get the patch. I would like to give it a try.
What I mean is I ran BiffViewer on the file you attached, which was generated by POI/HSSF. Here is the output. ... Offset 0x992 (2450) recordid = 0x20b, size =20 [INDEX] .firstrow = 0 .lastrowadd1 = 1 .dbcell_0 = a82 [/INDEX] ... Offset 0xa78 (2680) recordid = 0x203, size =14 [NUMBER] .row = 0 .col = 1 .xfindex = f .value = 222.0 [/NUMBER] ============================================ Offset 0xa8a (2698) recordid = 0x203, size =14 [NUMBER] .row = 0 .col = 2 .xfindex = f .value = 333.0 [/NUMBER] ... The INDEX record has a DBCELL pointer to offset 0xa82. (There could be more than one DBCELL pointer in each INDEX record. I don't think BiffViewer necessarily shows them all). But you can see that there is a NUMBER record beginning at 0xa78, and another NUMBER record beginning at 0xa8a. So the DBCELL pointer in the INDEX record is bad -- it doesn't even point to the beginning of any record. After you save the file from Excel, the INDEX records are fixed up, and the DBCELL records are inserted. BiffViewer will show the INDEX record pointing to the beginning of the DBCELL record, as expected.
Linking to external files works properly with the latest sources. Try POI-3.5-beta4 or download daily builds from http://encore.torchbox.com/poi-svn-build/ Yegor