Bug 41242 - HSSF: RecordInputStream does not read EndSubRecord
Summary: HSSF: RecordInputStream does not read EndSubRecord
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: unspecified
Hardware: Other other
: P2 critical (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
: 38607 (view as bug list)
Depends on:
Blocks:
 
Reported: 2006-12-26 02:53 UTC by Yegor Kozlov
Modified: 2007-01-12 06:46 UTC (History)
1 user (show)



Attachments
test case (2.63 KB, application/octet-stream)
2006-12-26 02:54 UTC, Yegor Kozlov
Details
the patch (5.43 KB, patch)
2007-01-05 07:09 UTC, Yegor Kozlov
Details | Diff
archive with new and modified files (5.12 KB, application/force-download)
2007-01-05 07:10 UTC, Yegor Kozlov
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Yegor Kozlov 2006-12-26 02:53:39 UTC
Symptoms: After serialization ObjRecord becomes 4 bytes shorter than the
original record.
As the result, certain spreadsheets, especially containing cell comments, can be
corrupted.

A quick investigation revealed that trailing EndSubRecord is never serialized. 

Sub-records are read in ObjRecord.fillFields:

    protected void fillFields(RecordInputStream in)
    {
        subrecords = new ArrayList();
        byte[] subRecordData = in.readRemainder();
        RecordInputStream subRecStream = new RecordInputStream(new
ByteArrayInputStream(subRecordData));
        while(subRecStream.hasNextRecord()) {
          subRecStream.nextRecord();
          Record subRecord = SubRecord.createSubRecord(subRecStream);
          subrecords.add(subRecord);
        }
 ....
    }


The condition to find next record by RecordInputStream does not work for
EndSubRecord with sid=0.
I'm not sure which is the best way to fix it. Below are my variants:

 (a) change RecordInputStream to correctly process EndSubRecord.
  RecordInputStream is a core class and changing it just to
 handle a special case is risky.

 (b) if ObjRecord ALWAYS has a trailing EndSubRecord we can  manually append it
in ObjRecord.fillFields:

    protected void fillFields(RecordInputStream in)
    {
        subrecords = new ArrayList();
        byte[] subRecordData = in.readRemainder();
        RecordInputStream subRecStream = new RecordInputStream(new
ByteArrayInputStream(subRecordData));
        while(subRecStream.hasNextRecord()) {
          subRecStream.nextRecord();
          Record subRecord = SubRecord.createSubRecord(subRecStream);
          subrecords.add(subRecord);
        }
        //EndSubRecord is a special case. Append it manually.  
        subrecords.add(new EndSubRecord());     
 ....
    }

I like (b). Any objections to fixing it this way?
Are there any pitfalls with sub-records?

Regards, 
Yegor Kozlov
Comment 1 Yegor Kozlov 2006-12-26 02:54:08 UTC
Created attachment 19305 [details]
test case
Comment 2 Yegor Kozlov 2007-01-05 07:09:36 UTC
Created attachment 19366 [details]
the patch
Comment 3 Yegor Kozlov 2007-01-05 07:10:18 UTC
Created attachment 19367 [details]
archive with new and modified files
Comment 4 Yegor Kozlov 2007-01-05 07:17:03 UTC
The problem fixed.

I decided not to change RecordInputStream. There is a simpler solution in
ObjRecord.fillFields:

Count the number of bytes read, if there are 4 unread bytes it means
EndSubRecord has been skipped and needs to be appended explicitly.

P.S. Unit tests for ObjRecord were missing. It's time to add it. 

Yegor
Comment 5 Yegor Kozlov 2007-01-08 00:12:26 UTC
Fixed.

Yegor
Comment 6 Yegor Kozlov 2007-01-12 06:46:44 UTC
*** Bug 38607 has been marked as a duplicate of this bug. ***