Bug 9576

Summary: [PATCH] DBCELL, INDEX EXTSST (was Acess 97 import)
Product: POI Reporter: Guenther Grau <apache>
Component: HSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal CC: jaBernet
Priority: P3    
Version: 2.0-dev   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: Patch to allow Access, Word Mail Merge import the sheet
Final WORKING patch for DBCell, Index and EXTSST records!
Unified Diff Patch. This doesnt address the concerns that people had with the patch. I will do this shortly
Test file generated by HSSF
HSSF generated file loaded in Excel and saved

Description Guenther Grau 2002-06-03 14:39:49 UTC
I created a simple XLS file using the attached code.
The file is loaded by Excel 97 without any problems.
The same file imported into Access 97 fails with the
error message "External table ins't in the expected format"
If I load the file into Excel, save it in Excel and
the import it into Access, it works fine. If you need
more information or want me to test anything, please
let me know.

Thanx,

  Guenther


package test;

import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.poi.hssf.usermodel.*;

public class PoiTest {

  public static void main(String[] args) {
  	HSSFWorkbook workbook = new HSSFWorkbook();
  	HSSFSheet sheet = workbook.createSheet("TestSheet");
  	
  	HSSFRow row = sheet.createRow((short)0);
	HSSFCell cell = row.createCell((short)0);
	cell.setCellValue("Hallo");
	cell = row.createCell((short)1);
	cell.setCellValue(7);
	
  	row = sheet.createRow((short)1);
	cell = row.createCell((short)0);
	cell.setCellValue(42);
	cell = row.createCell((short)1);
	cell.setCellValue("end");
	
	try {
	  FileOutputStream outputStream = new FileOutputStream("c:\\test.xls");
	  workbook.write(outputStream);
	  outputStream.close();
	}
	catch (IOException ioException) {
	  ioException.printStackTrace();
	}
  }
}
Comment 1 Andy Oliver 2002-06-03 15:23:23 UTC
I don't plan to strive for MS Access compatibility (Excel/OpenOffice/Gnumeric
only), but if you'd like to track this down and contribute, we'll be happy to
accept contributions.
Comment 2 Andy Oliver 2002-09-20 01:32:20 UTC
Please confirm but I think this is fixed in 1.8  I think it was the same thing
that corrupted macros...
Comment 3 Jason Height 2003-01-24 03:51:13 UTC
Sorry but this is not fixed.

I am trying to perform a mail merge with a POI generated sheet and am getting a 
very similar error. 

The real reason for this failing is the DBCell and Index Records are not fully 
implemented. I will be looking into this shortly BUT the firewall at my 
employer currently blocks CVS access :-( So i cant download the source just yet.

Dont suppose someone has coincidently written DBCell support properly and not 
committed it to CVS ;-) ?

Jason
Comment 4 Andy Oliver 2003-01-24 03:58:48 UTC
no but I can certainly tar.gz an extract up and stick it on my server.  It would
be chill to have Index/DBCell support.  The catch is that its kind of a pain,
when you add a DBCell you have to change all the indexes in Index, but then
adding one moves them down.
Comment 5 Jason Height 2003-01-24 04:03:05 UTC
Yup Meaty little bug to fix eh? ;-) 

Yeah if you could place the code somewhere & let me know. I went looking on the 
website for a nightly tar ball of CVS but couldnt find one (only the nightly 
binary builds!). Perhaps this could be easily setup and then by Monday i will 
have a version i can at least download?

Jason
Comment 6 Jason Height 2003-01-31 10:50:21 UTC
Created attachment 4661 [details]
Patch to allow Access, Word Mail Merge import the sheet
Comment 7 Jason Height 2003-01-31 11:02:07 UTC
OK I see now why the DBCell and Index functionality hasnt been done up to 
now.. It was a real pain (but i still had fun doing it, well actually only 
when it started to work ;-)

This is a large patch so suggestions are welcome. 

There are some issues that I think need to be cleaned up after this patch is 
accepted, but i was thinking that i could include another patch at a later 
date if it is agreed that these issues should be resolved:
1) Removal of the old code that attempted to do this work from 
RowRecordsAggregate.
2) RowRecordsAggregate and ValueRecordsAggregate.serialize is no longer called 
so they can be removed.
3) I dont understand the need for the serialize method that returns the byte 
array. Since this doesnt seem to be called. (Although i could be wrong on this 
one, i didnt spend alot of time on it)
4) A unit test for this (although basically the unit tests that write a set of 
rows will already test these changes)
5) The lookup of rows from the RowRecordsAggregate and cells from 
ValueRecordAggregate seems sub optimal in certain situations (see comments in 
patch)

The patch will create the DBCell and Index records in the output stream but 
for most people to this point they have lived without this functionality. 
Maybe there should be a flag for the Sheet that tells it whether to write 
these records or not?

Thanks

Jason
Comment 8 Jason Height 2003-01-31 11:11:54 UTC
Actually i just did a little test and I suspect the quickest way to resolve 
the original BUG is simply to not write the INDEX and DBCell records at all. 
The problem with Access not being able to import was due to the fact that HSSF 
originally didnt ignore INDEX records but it did ignore DBCELL records (INDEX 
records reference DBCell records, so the INDEX record would have been 
referencing an invalid location in the resultant excel file). I am 95% sure of 
this (I dont have Access, i only have Word at home here, so i can only do a 
mail merge. I suspect that both mail merge and Access import both use DDE, 
hence my confidence. I will check on this when i get back to work)

However assuming that i am correct, Do you want me to modify the previous 
patch to have a flag on Sheet (and exposed in HSSFSheet) that will turn the 
Writing of DBCell & INDEX records on/off?

Let me know..

Jason
Comment 9 Guenther Grau 2003-01-31 12:19:19 UTC
Hi,

since you are requesting feedback:

First, let me thank you for looking into this problem!!!

Now for the comments:
I'd prefer if this file would be written in the exact format (or as close as
possible to it) that Excel would write. This would give it the most likelyhood
that it will work well with other programs depending on the format. 
I don't mind adding a method to strip off unneeded data if wanted but IMHO the
default behaviour should be as described above.

But as I don't have a clue what the format actually looks like, this might be
totally off topic. Anyways, these were just my 2 cents (pick currency yourself :-)

Thanx again,

  Guenther
Comment 10 Jason Height 2003-02-03 01:52:07 UTC
Please dont apply the current patch. I just tried it at work where i have MS 
Access available and it still doesnt import !

Investigating....

Jason
Comment 11 Jason Height 2003-02-04 12:06:19 UTC
Created attachment 4717 [details]
Final WORKING patch for DBCell, Index and EXTSST records!
Comment 12 Jason Height 2003-02-04 12:10:01 UTC
*** Can someone please apply this patch. ***

This patch implements the DBCellRecord, IndexRecord and ExtSSTRecord correctly.

These records are required for MailMerge to excel files, and Import into 
Access to work correctly.

As mentioned else where, the need to create a Collection over the TreeMap in 
the RowRecordAggregate and ValueRecordAggregate is a performance penalty and 
one which I will investigate ways of reducing after this patch is applied.

The testcases now work with this patch (they didnt half way through the 
development).

Thanks

Jason
Comment 13 Andy Oliver 2003-02-04 13:44:15 UTC
Glen is going to commit his zoom stuff tonight.  Then I do this.  Then we do
Danny's and any other various trinkets lying around.  Then we cut a dev release.
 Then I kiss your feet, thank you for doing this.  :-)
Comment 14 Andy Oliver 2003-02-08 16:41:10 UTC
I had trouble apply this so I did cvs update -r version against the Sheet and
Workbook, where version = the version cited in the patch.  Applied, did cvs
update -A for sheet and workbook and recompiled/tested.  this seems to have
worked for compilation. Before releasing we need to make sure these sheets work
in excel.  Also while the unit tests pass, I do not see NEW unit tests for
EXTSST/ DBCELL and INDEX.  I would really like to see those submited because I
guarantee we're going to break these like OVER and OVER again otherwise.  I
promise absolutely!

In anycase, a test failed (and boy formulas took forever):

Running org.apache.poi.hssf.usermodel.TestNamedRange
Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 0.824 sec

BUILD FAILED
/home/andy/poipatch/jakarta-poi/tools/cents/junit.cent/xbuild.xml:59: Test
org.apache.poi.hssf.usermodel.TestNamedRange failed

Total time: 5 minutes 51 seconds

formula test:

Running org.apache.poi.hssf.usermodel.TestFormulas
Tests run: 21, Failures: 0, Errors: 0, Time elapsed: 236.824 sec

I think we need some more consideration here, I'm sorry to say.  Can you look
into this further?
Comment 15 Jason Height 2003-02-09 21:52:17 UTC
Odd this patch definitly passed all tests on my machine ;-)

Never mind that though I will write some unit tests and make DAMN sure that 
they work. Maybe I will get to it today. Otherwise tonight. Give me 24 hours.

Jason
Comment 16 Andy Oliver 2003-02-09 22:05:53 UTC
its probably because of the new stuff.  
Comment 17 Jason Height 2003-03-10 02:38:34 UTC
Created attachment 5238 [details]
Unified Diff Patch. This doesnt address the concerns that people had with the patch. I will do this shortly
Comment 18 Jason Height 2003-03-19 21:50:28 UTC
*** Bug 18149 has been marked as a duplicate of this bug. ***
Comment 19 Guenther Grau 2003-05-27 17:31:33 UTC
Hi,

I downloaded the patch from 2003-03-10 (attach_id=5238) and applied it and I
still can't import generated XLS file into access 97:

External table isn't in the expected format.

Guenther
Comment 20 Guenther Grau 2003-05-27 17:33:27 UTC
Created attachment 6514 [details]
Test file generated by HSSF
Comment 21 Guenther Grau 2003-05-27 17:35:31 UTC
Created attachment 6515 [details]
HSSF generated file loaded in Excel and saved
Comment 22 Guenther Grau 2003-05-27 17:42:14 UTC
Hi,

I just added two attachments for this bug.
The first is a test file generated by HSSF.
If I import this directly into access I get an error message (table not in
expected format).
I loaded this file into Excel and saved it as test-ms.xls, which is the second
attachment I just added. The second attachment can be imported into Access
without any problems. 

I used poi-pre2.0 src with the patch from 2003-03-10 applied. (I get the same
results if the patch is not applied).

I'd be really glad if someone could look into this, or tell me where I should
start to look :-)

Thanx in advance,

  Guenther


The Java-code to create the file is very simple:
public class Test {
  public static void main(String[] args) {
  	try {
      HSSFWorkbook workbook = new HSSFWorkbook();
      HSSFSheet sheet = workbook.createSheet("Hallo");
      HSSFRow row = sheet.createRow(0);
      short cellIndex = 0;
      HSSFCell cell = row.createCell(cellIndex);
      cell.setCellValue("Hello world");
      FileOutputStream output = new FileOutputStream("c:/test.xls");
  	  workbook.write(output);
  	  output.close();
  	} catch (Exception e) {
  	  e.printStackTrace();
  	}
      }
}
Comment 23 Jason Height 2003-09-18 02:25:27 UTC
I have patched the head and as of now it should work fine.