Bug 49139 - POIFSystem() fails to handle 4K block OLE documents.
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: POIFS
Version: unspecified
Hardware: All All
Importance: P2 blocker
Target Milestone: ---
Assignee: POI Developers List
Reported: 2010-04-16 17:47 UTC by michel.boudinot
Modified: 2010-05-05 11:10 UTC

Attachments
Zeiss ZVI Image File (4k blocks OLE document) (264 bytes, text/plain)
2010-04-16 18:09 UTC, michel.boudinot
512 byte block ole document (50.50 KB, application/octet-stream)
2010-04-28 18:07 UTC, michel.boudinot
4K block OLE document (12.34 KB, application/zip)
2010-04-28 18:21 UTC, michel.boudinot
Test case (code + data files) (407.88 KB, application/zip)
2010-05-03 16:19 UTC, michel.boudinot

Description michel.boudinot 2010-04-16 17:47:18 UTC
When invoked on an OLE document with 4K blocks, the POIFSFileSystem() constructor throws an IndexOutOfBoundsException.

.... part of my code
		stream = new FileInputStream(fileName);
		POIFSFileSystem fs = null;
		fs = new POIFSFileSystem(stream);         // <== line 101
		DirectoryEntry dir = fs.getRoot();
		// dir is an instance of DirectoryEntry ...

		directoryParse(0, dir);
......

java.lang.IndexOutOfBoundsException
	at org.apache.poi.util.IntList.get(IntList.java:351)
	at org.apache.poi.poifs.storage.BlockAllocationTableReader.fetchBlocks(BlockAllocationTableReader.java:191)
	at org.apache.poi.poifs.storage.BlockListImpl.fetchBlocks(BlockListImpl.java:130)
	at org.apache.poi.poifs.storage.SmallBlockTableReader.getSmallDocumentBlocks(SmallBlockTableReader.java:57)
	at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:176)
	at POIFSDmp.zviDocumentParse(POIFSDmp.java:101)

The same code works with 512-byte block documents.

The attached document (IP-ConvertImage-01.zvi, 22 MB) is a Zeiss ZVI OLE document with 4K blocks that reproduces the bug.
Thanks,
Michel.
Comment 1 michel.boudinot 2010-04-16 18:09:59 UTC
Created attachment 25315
Zeiss ZVI Image File (4k blocks OLE document)

The document (IP-ConvertImage-01.zvi, 22 MB) is a Zeiss ZVI OLE document with 4K blocks that reproduces the bug.
The file is too big to upload, so the attachment only contains a download link.
The link below is valid for 4 days.

http://services.cnrs-gif.fr/bigfiles/Download?key=VONAAEQJFCRJJTQS
Comment 2 Nick Burch 2010-04-23 12:30:26 UTC
This is proving to be a bigger job than expected - large swathes of POIFS are hard coded with a 512 byte block size :(

I'm part way through re-doing it to handle things properly, but it's going to take a while longer still

In the meantime, could you please upload a much smaller example file (e.g. a few hundred KB) for us to use in a test case?
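
For readers following along, here is a minimal sketch of how a reader can take the block size from the file instead of assuming 512 bytes. It relies on the OLE2 header storing a 16-bit little-endian sector shift at offset 0x1E (9 for 512-byte blocks, 12 for 4K blocks). The class and method names are hypothetical and are not POI's API; this is not how POI itself implements it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of POI.
final class OleHeaderSketch {
    // Offset of the 16-bit sector shift field in the OLE2 header.
    static final int SECTOR_SHIFT_OFFSET = 0x1E;

    /** Returns the big block size in bytes declared by the header. */
    static int bigBlockSize(byte[] header) {
        if (header.length < SECTOR_SHIFT_OFFSET + 2) {
            throw new IllegalArgumentException("header too short");
        }
        int shift = ByteBuffer.wrap(header)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getShort(SECTOR_SHIFT_OFFSET);
        // Only 512-byte (shift 9) and 4K (shift 12) blocks are expected.
        if (shift != 9 && shift != 12) {
            throw new IllegalArgumentException("unexpected sector shift: " + shift);
        }
        return 1 << shift;
    }
}
```

Every later offset or index calculation would then use the returned value rather than a constant.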
Comment 3 Nick Burch 2010-04-25 18:28:58 UTC
This should now be fixed in svn trunk

However, we do still need a smaller file with 4k blocks!
Comment 4 michel.boudinot 2010-04-28 18:07:19 UTC
Created attachment 25369
512 byte block OLE document

This OLE document is a Zeiss ZVI image with the image content removed to reduce the file size.
Comment 5 michel.boudinot 2010-04-28 18:21:03 UTC
Created attachment 25370
4K block OLE document

This OLE document is a Zeiss ZVI image with the image content removed to reduce the file size.
Comment 6 michel.boudinot 2010-04-28 18:29:19 UTC
(In reply to comment #3)
> This should now be fixed in svn trunk
> 
> However, we do still need a smaller file with 4k blocks!

Sorry for the late reply, I was on leave for a week.
I checked out svn trunk at revision 939073, and it fails on both the 512-byte and the 4K block OLE documents.
Below are the test results with poi-3.6-20091214.jar and poi-3.7-SNAPSHOT-20100428.jar on the 512-byte and 4K block documents.

poi-3.6-20091214.jar             512_BlockOLE.ole  =>   OK
poi-3.6-20091214.jar             4K_BlockOLE.ole   =>   Fails

java.io.IOException: Cannot remove block[ 178 ]; out of range[ 0 - 23 ]
	at org.apache.poi.poifs.storage.BlockListImpl.remove(BlockListImpl.java:98)
	at org.apache.poi.poifs.storage.SmallDocumentBlockList.remove(SmallDocumentBlockList.java:30)
	at org.apache.poi.poifs.storage.BlockAllocationTableReader.fetchBlocks(BlockAllocationTableReader.java:191)
	at org.apache.poi.poifs.storage.BlockListImpl.fetchBlocks(BlockListImpl.java:123)
	at org.apache.poi.poifs.storage.SmallDocumentBlockList.fetchBlocks(SmallDocumentBlockList.java:30)
	at org.apache.poi.poifs.filesystem.POIFSFileSystem.processProperties(POIFSFileSystem.java:534)
	at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:176)
	at POIFSDmp.zviDocumentParse(POIFSDmp.java:101)
       ...

poi-3.7-SNAPSHOT-20100428.jar    512_BlockOLE.ole  =>   Fails

java.lang.NullPointerException
	at POIFSDmp.dumpRootThumbnail(POIFSDmp.java:264)
	at POIFSDmp.directoryParse(POIFSDmp.java:152)
	at POIFSDmp.zviDocumentParse(POIFSDmp.java:105)
       ...


poi-3.7-SNAPSHOT-20100428.jar    4K_BlockOLE.ole   =>   Fails

java.lang.ArrayIndexOutOfBoundsException: 16
	at org.apache.poi.poifs.storage.DocumentBlock.getDataInputBlock(DocumentBlock.java:170)
	at org.apache.poi.poifs.filesystem.POIFSDocument.getDataInputBlock(POIFSDocument.java:284)
	at org.apache.poi.poifs.filesystem.DocumentInputStream.getDataInputBlock(DocumentInputStream.java:117)
	at org.apache.poi.poifs.filesystem.DocumentInputStream.readFully(DocumentInputStream.java:255)
	at org.apache.poi.poifs.filesystem.DocumentInputStream.read(DocumentInputStream.java:152)
	at org.apache.poi.poifs.filesystem.DocumentInputStream.read(DocumentInputStream.java:134)
	at POIFSDmp.directoryParse(POIFSDmp.java:126)
	at POIFSDmp.zviDocumentParse(POIFSDmp.java:105)
       ...


----Code-----
               ...
100		POIFSFileSystem fs = null;
101		fs = new POIFSFileSystem(stream);
102		DirectoryEntry dir = fs.getRoot();
103		// dir is an instance of DirectoryEntry ...
104
105		directoryParse(0, dir);
106	}
107
108	private static void directoryParse(int n , DirectoryEntry dir) throws java.io.IOException {
109		String dirName = dir.getName();
110		for (Iterator iter = dir.getEntries(); iter.hasNext(); ) {
111			Entry entry = (Entry) iter.next();
112			if (entry instanceof DirectoryEntry) {
113				// .. recurse into this directory
114				dirName = entry.getName();
115				directoryParse( n++, (DirectoryEntry) entry);
116			}
117			else if (entry instanceof DocumentEntry) {
118				// entry is a document, which you can read
119				String docName = entry.getName();
120				String path = getPath(entry);
121				if (DEBUG) System.out.print(path);
122				DocumentInputStream dis = new DocumentInputStream((DocumentNode) entry);
123				int numBytes = dis.available();
124				if (DEBUG) System.out.println ("("+numBytes+")");
125				byte[] data = new byte [numBytes];
126				dis.read(data);

--------------

The corresponding OLE test files, 512_BlockOLE.ole and 4K_BlockOLE.ole.zip, are attached; 4K_BlockOLE.ole.zip must be unzipped before use.
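
A side note on the read at line 126 of the snippet: in general, InputStream.read(byte[]) is allowed to return fewer bytes than the buffer holds. Whether DocumentInputStream ever does so is not established in this report, so the following is only a defensive sketch using a plain InputStream; the helper name is hypothetical.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of POI.
final class ReadFullySketch {
    /** Reads exactly length bytes, or throws EOFException if the stream ends early. */
    static byte[] readFully(InputStream in, int length) throws IOException {
        byte[] data = new byte[length];
        int filled = 0;
        while (filled < length) {
            // read() may deliver only part of what was requested.
            int n = in.read(data, filled, length - filled);
            if (n < 0) {
                throw new EOFException("stream ended after " + filled + " of " + length + " bytes");
            }
            filled += n;
        }
        return data;
    }
}
```

In the snippet above this would replace the allocation and the bare dis.read(data) call with a single readFully(dis, numBytes).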
Comment 7 Nick Burch 2010-05-03 13:20:21 UTC
Thanks for the files. I've committed them to svn along with a unit test

Not sure why you're getting the errors you are though - I've just tried with org.apache.poi.poifs.dev.POIFSViewer and it can load both files without error, and display the raw byte contents fine

I'd suggest a full clean of your build and re-try. Please re-open the bug however if the problem remains, and upload a failing unit test for us to work against
Comment 8 michel.boudinot 2010-05-03 16:13:07 UTC
(In reply to comment #7)
> Thanks for the files. I've committed them to svn along with a unit test
> 
> Not sure why you're getting the errors you are though - I've just tried with
> org.apache.poi.poifs.dev.POIFSViewer and it can load both files without error,
> and display the raw byte contents fine
> 
> I'd suggest a full clean of your build and re-try. Please re-open the bug
> however if the problem remains, and upload a failing unit test for us to work
> against

I am afraid the problem remains. I am attaching a test case with the corresponding test files.
The bug has two parts. The first issue is new since your fix; it did not occur with previous POI versions. The second issue is related to the 4K OLE block size.

Thanks.
Comment 9 michel.boudinot 2010-05-03 16:19:42 UTC
Created attachment 25392
Test case (code + data files)

The test case was run using poi-3.7-SNAPSHOT-20100503.jar, which is too big to attach :-(
Comment 10 Nick Burch 2010-05-03 18:00:10 UTC
Thanks for the testcase. I've been able to reproduce the bug

Alas it looks like it's going to take some digging to find out why it's breaking as it is though...

Failing test case is committed to svn in src/testcases/org/apache/poi/poifs/filesystem/TestPOIFSFileSystem.java (but disabled) if anyone else fancies taking a look first!
Comment 11 michel.boudinot 2010-05-04 07:37:41 UTC
(In reply to comment #10)
> Thanks for the testcase. I've been able to reproduce the bug
> 
> Alas it looks like it's going to take some digging to find out why it's
> breaking as it is though...
> 
> Failing test case is committed to svn in
> src/testcases/org/apache/poi/poifs/filesystem/TestPOIFSFileSystem.java (but
> disabled) if anyone else fancies taking a look first!

I looked into the first issue, and it turns out to be a bug in my own code that was exposed by the changes you made to fix the 4K block issue.

- To display the thumbnail, my code accesses a metadata hash without testing whether it is available. For some reason it is available with the previous POI library but not yet available with the latest one. It looks like you have changed the order in which OLE streams are accessed.

- The second issue is the only concern that remains.

Thanks, and sorry for the trouble caused by the false first issue.
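
The kind of guard that avoids the NullPointerException described above might look like the sketch below. The map layout, the "Thumbnail" key and the class name are assumptions made for illustration; the reporter's actual viewer code is not shown in this thread.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the viewer's thumbnail lookup.
final class ThumbnailSketch {
    /** Returns the thumbnail bytes if the metadata has been loaded and contains them. */
    static Optional<byte[]> findThumbnail(Map<String, byte[]> metadata) {
        if (metadata == null) {
            // Metadata stream not parsed yet; nothing to display.
            return Optional.empty();
        }
        return Optional.ofNullable(metadata.get("Thumbnail"));
    }
}
```

With this shape the caller does not depend on the order in which the library visits the OLE streams.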
Comment 12 Nick Burch 2010-05-05 11:10:57 UTC
DocumentBlock had 512 hard-coded in it too, but slightly more subtly (it contained 2^9 rather than the literal 512).

Fixed (r941334) to use the correct big block size, and now your 4k file can be loaded fine in your viewer.

On a different note, please do consider submitting at least some of your viewer as a ZVI metadata text extractor!
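
To illustrate the class of bug described in this comment, here is a small sketch contrasting a block index computed with a hard-coded shift of 9 against one derived from the actual block size. It is an illustration under assumed names, not the code from DocumentBlock or from r941334.

```java
// Hypothetical illustration, not POI source code.
final class BlockIndexSketch {
    /** Bug pattern: assumes 512-byte blocks via a hard-coded 2^9. */
    static int hardCodedIndex(int offset) {
        return offset >> 9;
    }

    /** Index of the block containing offset; blockSize must be a power of two. */
    static int blockIndex(int offset, int blockSize) {
        return offset >> Integer.numberOfTrailingZeros(blockSize);
    }

    /** Position of offset inside its block; blockSize must be a power of two. */
    static int offsetInBlock(int offset, int blockSize) {
        return offset & (blockSize - 1);
    }
}
```

For a document stored in 4096-byte blocks, byte offset 8192 lies in block 2, while the hard-coded variant yields 16, which overruns any block array with fewer than 17 entries.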