Bug 31906 - [PATCH] "Pull" method for handeling continue records
Summary: [PATCH] "Pull" method for handeling continue records
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 2.5-FINAL
Hardware: Other other
: P3 enhancement (vote)
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2004-10-26 22:30 UTC by Jason Height
Modified: 2005-08-21 21:13 UTC (History)
0 users

Record InputStream patch (511.21 KB, patch)
2004-10-26 22:31 UTC, Jason Height
Details | Diff
The new RecordInputStream class (8.69 KB, text/java)
2004-10-26 22:32 UTC, Jason Height
TestcaseRecordInputStream.java (1.64 KB, text/plain)
2005-07-21 05:02 UTC, Jason Height
RecordInputStream patch (537.42 KB, patch)
2005-07-26 09:30 UTC, Jason Height
Details | Diff
The new RecordInputStream class (8.70 KB, patch)
2005-07-26 09:31 UTC, Jason Height
Details | Diff
A new RefErrorPtg class (2.49 KB, patch)
2005-07-26 09:32 UTC, Jason Height
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jason Height 2004-10-26 22:30:26 UTC
Hi All,

I have been performing some experiments with automatic
handling of Continue records.

It seems to me that the current method of handling
Continue records is to detect that a Continue records
is required, store enough state to pick up after the
Continue record is Pushed down to the record by the
RecordFactory and start again. This makes for
complicated classes ie SST.

A number of records that contain strings, and
conceivably could exceed the record limits but do not
implement the current Continue mechanisms. I notice
that the recent drawing code does not implement
continue records (AbstractEscherHolderRecord),
probably because in the current implementation it is
so damned hard to do.

What I have created  a "RecordInputStream", that knows
enough about continue records so that each record can
pull data from this class. It will even automatically
transition ove a continue record boundary so that the
low level records can just keep sucking bytes so long
as there are continue records available.

This is a big patch because I have had to touch every single
HSSF Record constructor and fillFields method, and in
some cases ie SST rewrite them.

*** But look how simple SST deserialisation now is!! ***

I plan eventually to have a RecordOutputStream which
will do the serialization.

At this stage it is working great!

I would appreciate any comments on the patch before i commit it. 

Glen:/ You could consider wrapping a ddf aware input stream over the top of this
stream in AbstractEscherHolderRecord.fillFields eg

DDFInputStream inStr = new DDFInputStream(recordInputStream);


Comment 1 Jason Height 2004-10-26 22:31:23 UTC
Created attachment 13227 [details]
Record InputStream patch
Comment 2 Jason Height 2004-10-26 22:32:16 UTC
Created attachment 13228 [details]
The new RecordInputStream class
Comment 3 Glen Stampoultzis 2004-10-30 02:27:37 UTC
I'll have a closer look at this.

The big issue I have with continue records (not with this) is that they're not 
really continuous.  One possible, flawed, solution to this is to mark records 
that accept continue records and feed the later continue records to those 
records.  The trouble is if there are unknown records that can accept continue 
records then this solution will not work.  If you've got any idea's how to 
handle this I'd love to know.

Comment 4 Andy Oliver 2005-04-29 17:38:50 UTC
Jason, I thought you were a committer?  
Comment 5 Avik Sengupta 2005-04-29 18:20:15 UTC
I'm assuming he wanted feedback for a large patch. 
Comment 6 Jason Height 2005-05-05 04:22:55 UTC
Yes that is correct I wanted some feedback. This is a major change to how
records are created.

Personally i think that it is MUCH MUCH simpler (especially for SST), and has
the possibilty to be quicker if the underlying POIFS InputStream was also was
able to read byte by byte rather than loading the whole data stream into memory.

I am pretty sure that i solved all of the issues and that the unit tests all
passed. The reason that i didnt commit was due to the major changes in how this
created records, and i never got arround to creating the RecordOutputStream.

I will work further on the patch, but only if people are happy with the concepts.
Comment 7 Avik Sengupta 2005-07-06 13:50:47 UTC
The more i look at this, the more i like this!
Comment 8 Avik Sengupta 2005-07-19 13:45:08 UTC

The TestRecordinputstream.java file seems to be missing. Could you pls add it in. 

Comment 9 Jason Height 2005-07-21 05:02:30 UTC
Created attachment 15726 [details]
Comment 10 Jason Height 2005-07-21 05:04:38 UTC
Comment on attachment 15726 [details]

Additional file missing from original patch.
Comment 11 Jason Height 2005-07-21 05:07:06 UTC

Should be all here now. I just tried to compile my local repos, but i must have
done an update some time ago, because i get alot of compile errors due to cvs
not being able to merge updates with my patch.

Have you already attempted to merge this patch with the current CVS version? If
so ill hold off, now that you should have all of the files required to build and

Comment 12 Jason Height 2005-07-26 09:30:56 UTC
Created attachment 15773 [details]
RecordInputStream patch

New patch updated to the latest code base as of today.

Should apply relatively easily. All test cases pass.

I do not currently implement the escher graphics correctly, particularily the
case where there is an Obj record between two drawing group records, where the
second drawing group record is a continuation of the first. *Why is there no
test case for this anomoly??*

If can get some sort of affirmative 'hey, this looks like it shoud be
committed' then i will find a work around for the escher stuff and commit the

Comment 13 Jason Height 2005-07-26 09:31:54 UTC
Created attachment 15774 [details]
The new RecordInputStream class
Comment 14 Jason Height 2005-07-26 09:32:32 UTC
Created attachment 15775 [details]
A new RefErrorPtg class
Comment 15 Jason Height 2005-08-22 05:13:20 UTC
Patch applied to HEAD