Bug 34732 - Word documents generated by FrameMaker 6/7 throw ClassCastException
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: 3.11-dev
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Reported: 2005-05-04 01:18 UTC by Bob Dickinson
Modified: 2015-03-22 22:10 UTC

Attachments
Word 6.0/95 document created by FrameMaker (5.50 KB, application/octet-stream)
2005-05-04 01:19 UTC, Bob Dickinson

Description Bob Dickinson 2005-05-04 01:18:56 UTC
Word documents generated by FrameMaker 6/7 throw a ClassCastException:

Unexpected Exception.
java.lang.ClassCastException
 at org.apache.poi.poifs.property.PropertyTable.<init>
    (PropertyTable.java:81)
 at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>
    (POIFSFileSystem.java:97)

Our immediate fix was to patch the constructor of
org.apache.poi.poifs.property.PropertyTable to throw an IOException instead
of the ClassCastException (so at least our app would handle it and not let the
unchecked exception kill us).  We did:

    if ((_properties.size() == 0) ||
        (!(_properties.get(0) instanceof DirectoryProperty)))
    {
        throw new IOException("No root directory property");
    }

But we still can't actually read the document.  It looks like FrameMaker creates
a structured storage with no root and no directory, but just the single stream
that contains the Word doc data.  Note that Word and OpenOffice can open these
documents without any problems.
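
For callers that cannot patch POI itself, a similar effect can be had at the
call site. The following is a minimal sketch, not POI code: CompoundFileGuard
and openChecked are hypothetical names, and the only assumption about the
library is that opening a malformed file may throw an unchecked exception such
as the ClassCastException above.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class CompoundFileGuard {

    /**
     * Runs the given opener and converts unchecked failures into IOException,
     * so the caller only has to handle one checked exception type.
     */
    public static <T> T openChecked(Callable<T> opener) throws IOException {
        try {
            return opener.call();
        } catch (IOException e) {
            throw e; // already the checked type the caller expects
        } catch (RuntimeException e) {
            // e.g. the ClassCastException thrown for the FrameMaker file
            throw new IOException("Unreadable compound document: " + e, e);
        } catch (Exception e) {
            throw new IOException(e);
        }
    }
}
```

A call might look like
CompoundFileGuard.openChecked(() -> new POIFSFileSystem(inputStream)).
This only makes the failure cleaner; it does not make the document readable.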
Comment 1 Bob Dickinson 2005-05-04 01:19:56 UTC
Created attachment 14922
Word 6.0/95 document created by FrameMaker
Comment 2 Michael Zalewski 2005-05-05 14:24:02 UTC
It's a Word95 format document. I think HWPF only handles Word97.

There *is* a root property and there is a WordDocument stream. But there is no 
1Table or 0Table stream. I have no clue why your patch to PropertyTable works, 
but it seems like HWPFDocument() must check if 0Table/1Table exists.
Comment 3 Bob Dickinson 2005-05-05 19:01:59 UTC
Yeah, this bug isn't against HWPF, it's against POIFS.  We have our own Word doc
parser that does support Word 6.0 / 95 and we don't use HWPF.

I guess "fix" was not a good description of what our patch did.  All it does is
throw the checked IOException instead of the unchecked (and altogether
unexpected) ClassCastException that it used to throw (which killed our app).

So if this storage is so evil that you don't think POIFS should be able to read
it, or if you are only maintaining POIFS as required to support your own
document libraries, then at least failing in a cleaner way would be good.
Comment 4 Michael Zalewski 2005-05-06 03:00:37 UTC
Ahhh.... I did some closer checking and found out something very interesting.

The 'RootEntry' on this file is actually marked as a 'Stream' type, not
as 'Root'. The node type is stored in the directory record at offset 0x042.
Office documents should always have 0x05 in this byte for the first entry,
but the file you posted has 0x02. So POIFS interprets the entry as a stream
(a document) rather than as the root storage.

This is why there is a ClassCastException. The PropertyTable object holds all
the directory nodes, and the first one in the table should be the 'RootEntry'.
That much is true for the file you posted.

That node should be marked as the directory root by having the byte at offset
0x042 set to 0x05, but in your file it is 0x02. In POIFS, this causes
PropertyFactory to create a DocumentProperty object instead of a RootProperty.
The first element of the PropertyTable list is therefore a DocumentProperty,
but the constructor tries to cast this element to a RootProperty.
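
To check the type byte described above without going through POI, something
like the following could be used. It is a sketch under stated assumptions:
EntryTypeSketch and describeEntry are hypothetical names, directory entries are
taken to be 128 bytes each, and the type codes are assumed to be 1 for a
storage (directory), 2 for a stream (document) and 5 for the root.

```java
public class EntryTypeSketch {
    static final int ENTRY_SIZE = 128;   // size of one directory entry in bytes
    static final int TYPE_OFFSET = 0x42; // position of the type byte in an entry

    // Assumed type codes: 1 = storage (directory), 2 = stream (document), 5 = root
    static final int STORAGE = 1;
    static final int STREAM = 2;
    static final int ROOT = 5;

    /** Returns a readable name for the type byte of entry number index. */
    static String describeEntry(byte[] directoryData, int index) {
        int pos = index * ENTRY_SIZE + TYPE_OFFSET;
        if (index < 0 || pos >= directoryData.length) {
            throw new IllegalArgumentException("No directory entry at index " + index);
        }
        int type = directoryData[pos] & 0xFF;
        switch (type) {
            case STORAGE: return "storage";
            case STREAM:  return "stream";
            case ROOT:    return "root";
            default:      return "unknown(0x" + Integer.toHexString(type) + ")";
        }
    }
}
```

Here directoryData stands for the raw bytes of the directory sector(s), which
the caller would have to locate and read separately.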

A better fix would be to make a change in PropertyFactory:

  int    offset         = 0;

  for (int k = 0; k < property_count; k++)
  {

-   switch (data[ offset + PropertyConstants.PROPERTY_TYPE_OFFSET ])
+   int propertyType = data[ offset + PropertyConstants.PROPERTY_TYPE_OFFSET ];
+   if (k == 0) propertyType = PropertyConstants.ROOT_TYPE;
+   switch (propertyType)
    {

      case PropertyConstants.DIRECTORY_TYPE :


If you can grok that change, you might give it a try and report back if it 
works.
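
The idea of the change can be tried in isolation with a small stand-alone
sketch. This is not the PropertyFactory code: RootTypeOverrideSketch and
effectiveTypes are hypothetical names, and the 128-byte entry size and the
root type code 5 are assumptions. It only shows the rule "treat entry 0 as the
root regardless of its stored type byte".

```java
import java.util.ArrayList;
import java.util.List;

public class RootTypeOverrideSketch {
    static final int ENTRY_SIZE = 128;   // assumed size of one directory entry
    static final int TYPE_OFFSET = 0x42; // position of the type byte in an entry
    static final int ROOT = 5;           // assumed code for the root entry

    /** Returns the type code to use for each entry, forcing entry 0 to ROOT. */
    static List<Integer> effectiveTypes(byte[] directoryData) {
        List<Integer> types = new ArrayList<>();
        int count = directoryData.length / ENTRY_SIZE;
        for (int k = 0; k < count; k++) {
            int type = directoryData[k * ENTRY_SIZE + TYPE_OFFSET] & 0xFF;
            if (k == 0) {
                type = ROOT; // ignore whatever the writer stored for the first entry
            }
            types.add(type);
        }
        return types;
    }
}
```

The override trusts position over the stored marker, which is what would let a
file like the attached one load. Whether that is acceptable for other malformed
files is a separate question.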
Comment 5 Nick Burch 2014-08-01 13:49:43 UTC
The problem remains on trunk, even with NPOIFS:

java.lang.ClassCastException: org.apache.poi.poifs.property.DocumentProperty cannot be cast to org.apache.poi.poifs.property.DirectoryProperty
	at org.apache.poi.poifs.property.PropertyTableBase.<init>(PropertyTableBase.java:63)
	at org.apache.poi.poifs.property.NPropertyTable.<init>(NPropertyTable.java:66)
	at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.readCoreContents(NPOIFSFileSystem.java:384)
	at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:201)
	at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:183)