Bug 19739

Summary: [PATCH] ContinueRecord not copied by RecordFactory
Product: POI Reporter: brett.knights
Component: HSSFAssignee: POI Developers List <dev>
Severity: blocker    
Priority: P3    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: other   
Attachments: Spreadsheet that does not copy successfully.
Allows ContinueRecords to be copied as-is when following a non-aware record type

Description brett.knights 2003-05-07 16:38:09 UTC
I have a spreadsheet with data originally from Japan. When I tried copying it 
with the source below but the size grows by 3k and I get a message 
saying "Unable to open file" when I try to open the copy.

copy code from a POI example:
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFCell;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class ReadWriteWorkbook
    public static void main(String[] args)
        throws IOException
        POIFSFileSystem fs      =
                new POIFSFileSystem(new FileInputStream(args[0]));
        HSSFWorkbook wb = new HSSFWorkbook(fs, true);
        HSSFSheet sheet = wb.getSheetAt(0);

        // Write the output to a file
        FileOutputStream fileOut = new FileOutputStream(args[1]);

Comment 1 brett.knights 2003-05-07 16:39:16 UTC
Created attachment 6260 [details]
Spreadsheet that does not copy successfully.
Comment 2 brett.knights 2003-05-09 15:40:58 UTC
I stripped down the spreadsheet until I had a very simple one that still 
wouldn't copy. 

It turns out the problem was in the handling of ContinueRecords. In four cases 
in my spreadsheet ContinueRecords followed UnknownRecords. UnknownRecord can't 
process ContinueRecords so whatever information was there wasn't being copied 
to my new spreadsheet. 

I could probably have modified UnknownRecord to suck any following 
ContinueRecords in but and recreate them on serialization but the 
UnknownRecords weren't anywhere near the record size limit so I have no idea 
why the ContinueRecords were there in the first place.

Instead I made the following changes to allow my spreadsheet to copy:
Added a method to org.apache.poi.hssf.record.Record:

     * Does this record want to process a Continue record?

    public boolean isContinueRecordAware()
        return false;

I overrode this method to return true in SSTDeserializer and SSTRecord.

Replaced line 218 of RecordFactory with:
                                  last_record = record;

Then I made a bunch of changes to ContinueRecord to allow it to serialize like 
UnknownRecord. I can diff these if you are interested in applying this as a 

I don't know if this is the best approach to handling ContinueRecords though.

One thought is to make Record ContinueRecord-aware and have it just keep a list 
of following ContinueRecords. Classes like SSTDeserializer and SSTRecord would 
expand their ContinueRecords as they do now but would also keep their 
ContinueRecord copies and would only toss their ContinueRecords(to be 
reconstitued if possible) if any of their fields were modified. 

Unmodified classes would automatically include their ContinueRecords on 
Comment 3 Avik Sengupta 2003-05-09 17:57:41 UTC
Which version are you using? I thought this was fixed in some of the later
version AFAIK. Danny, had you not had a go at this??
Comment 4 brett.knights 2003-05-09 18:08:04 UTC
This was in 1.10.0 from March 4, 2003
Comment 5 Danny Mui 2003-05-09 18:21:16 UTC
No sorry, just played with SharedFormula records recently.  Smells like a good
use for another aggregate!  Looks tacklable though.

Comment 6 brett.knights 2003-05-13 13:17:23 UTC
Created attachment 6348 [details]
Allows ContinueRecords to be copied as-is when following a non-aware record type
Comment 7 brett.knights 2003-05-13 13:20:03 UTC
I created a patch that implements my current hack for this problem. It doesn't 
break any existing test cases and isn't too much code that removing it would be 
an issue if a real fix is ever implemented.
Comment 8 brett.knights 2003-05-22 15:43:55 UTC

*** This bug has been marked as a duplicate of 17373 ***