Bug 45877 - Null base style for paragraph results in crash
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: unspecified
Hardware: PC All
Importance: P2 major with 9 votes
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-09-24 07:11 UTC by dnapoletano
Modified: 2010-09-19 07:53 UTC



Attachments
Word document causing NullPointerException (61.00 KB, application/octet-stream)
2008-09-24 07:11 UTC, dnapoletano
Patch that modifies stylesheets. (700 bytes, patch)
2009-06-09 13:58 UTC, Chris Walter
Patches PAPX.java to avoid NPE when a paragraph's PAPX is based upon a character style (634 bytes, patch)
2009-08-02 12:07 UTC, Andrew Duffy

Description dnapoletano 2008-09-24 07:11:44 UTC
Created attachment 22629
Word document causing NullPointerException

Scanning the parts of the attached document with QuickTest results in this exception:

java.lang.NullPointerException
	at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.uncompressPAP(ParagraphSprmUncompressor.java:50)
	at org.apache.poi.hwpf.model.PAPX.getParagraphProperties(PAPX.java:135)
	at org.apache.poi.hwpf.usermodel.Range.getParagraph(Range.java:822)
	at org.apache.poi.hwpf.QuickTest.main(QuickTest.java:45)

After some debugging, I found that:

1) The istd value of the problematic paragraph style is 10.
2) In the getParagraphProperties method, baseStyle is null; then, in the uncompressPAP method, the null variable that causes the exception is "parent".
3) During execution of the StyleSheet constructor, _parahraphDescriptions[10] is not null, but _parahraphDescriptions[10].getPap() returns null.
4) In createPAP(10) (called by the second loop in the constructor), the "pap" and "papx" local variables are *both* null, so createPAP does not create the PAP for istd=10.

Stranger still, deleting or changing other parts of the document, for example removing other paragraphs, makes the crash disappear.
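
The failure mode described in points 1) to 4) can be illustrated with a minimal, self-contained sketch. Everything in it (ParaProps, NullParentDemo, lookupBaseStyle, uncompress) is a hypothetical stand-in and not the actual POI code; it only assumes, as the stack trace suggests, that the uncompress step begins by copying the parent properties.

```java
// Hypothetical stand-ins for illustration only; these are NOT the real POI classes.
class ParaProps implements Cloneable {
    int justification;

    @Override
    public ParaProps clone() {
        try {
            return (ParaProps) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

class NullParentDemo {
    // Mimics a style table lookup: the slot may hold no paragraph properties.
    static ParaProps lookupBaseStyle(ParaProps[] styles, int istd) {
        return styles[istd];
    }

    // Mimics the uncompress step: copy the parent, then apply an override.
    static ParaProps uncompress(ParaProps parent, int justificationOverride) {
        ParaProps result = parent.clone(); // throws NullPointerException if parent is null
        result.justification = justificationOverride;
        return result;
    }

    public static void main(String[] args) {
        ParaProps[] styles = new ParaProps[11];
        styles[0] = new ParaProps(); // slot 10 is left without properties
        try {
            uncompress(lookupBaseStyle(styles, 10), 1);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException for istd=10");
        }
    }
}
```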
Comment 1 Chris Walter 2009-06-09 13:58:00 UTC
Created attachment 23783
Patch that modifies stylesheets.

During the initialization of "org.apache.poi.hwpf.model.StyleSheet", POI runs through all the style descriptions and adds them to the StyleDescription array in the createPAP method. If a description has an improperly set parent (i.e. its parent is null), POI still tries to run the ParagraphSprmUncompressor. Doing so attempts to clone the parent and throws a NullPointerException. The attached patch simply skips the uncompressor if the parent is null. I've tested this: it passes all tests and fixes the previously attached document as well as some of my own.

If this fix is deemed inappropriate, please do not replace it with one that throws a runtime error. The document's contents have still been loaded correctly and should still be usable in certain contexts.
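
The kind of guard described in this comment might look roughly like the sketch below. It is not the attached patch: StyleProps, StyleEntry, and GuardedStyleTable are hypothetical names, the single indentDelta field stands in for the compressed property overrides, and the sketch assumes parent styles appear at lower indices than their children. A style whose parent cannot be resolved is skipped without throwing, so the rest of the table stays usable.

```java
// Hypothetical types for illustration only; this is NOT the attached patch or the POI API.
class StyleProps {
    int indent;

    StyleProps(int indent) {
        this.indent = indent;
    }

    StyleProps copy() {
        return new StyleProps(indent);
    }
}

class StyleEntry {
    final int baseIstd;    // index of the parent style, or -1 for a root style
    final int indentDelta; // stand-in for the compressed property overrides
    StyleProps resolved;   // stays null if the style could not be resolved

    StyleEntry(int baseIstd, int indentDelta) {
        this.baseIstd = baseIstd;
        this.indentDelta = indentDelta;
    }
}

class GuardedStyleTable {
    // Assumes parents are stored before their children.
    static void resolveAll(StyleEntry[] entries) {
        for (StyleEntry entry : entries) {
            if (entry == null) {
                continue;
            }
            if (entry.baseIstd < 0) {
                entry.resolved = new StyleProps(entry.indentDelta);
                continue;
            }
            StyleEntry parentEntry =
                    entry.baseIstd < entries.length ? entries[entry.baseIstd] : null;
            StyleProps parent = parentEntry == null ? null : parentEntry.resolved;
            if (parent == null) {
                continue; // the guard: skip instead of copying a null parent or throwing
            }
            StyleProps props = parent.copy();
            props.indent += entry.indentDelta;
            entry.resolved = props;
        }
    }
}
```

Callers then have to tolerate a null `resolved` value for the skipped styles, which is the trade-off of not failing fast.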
Comment 2 Andrew Duffy 2009-08-02 12:07:16 UTC
Created attachment 24082
Patches PAPX.java to avoid NPE when a paragraph's PAPX is based upon a character style

I've encountered this error while decoding .doc files saved by OpenOffice Writer. Some paragraphs have a PAPX whose istd refers to a character style, and PAPX.getParagraphProperties throws an NPE as a result.

I've attached a patch to PAPX.java to work around this. I don't think Chris Walter's patch to StyleSheet.java is necessary.
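
The attached patch is not reproduced here. One plausible shape for a workaround at lookup time, assuming that falling back to default paragraph properties is acceptable, is sketched below. ParagraphProps, Style, and ParagraphLookup are hypothetical names, not the PAPX.java code.

```java
// Hypothetical types for illustration only; this is NOT the PAPX.java patch.
class ParagraphProps {
    final int spacing;

    ParagraphProps(int spacing) {
        this.spacing = spacing;
    }
}

class Style {
    final ParagraphProps paragraphProps; // null models a character style

    Style(ParagraphProps paragraphProps) {
        this.paragraphProps = paragraphProps;
    }
}

class ParagraphLookup {
    // Returns the base paragraph properties for istd, or defaults when the
    // index is invalid or points at a style without paragraph properties.
    static ParagraphProps baseFor(Style[] styles, int istd) {
        Style style = (istd >= 0 && istd < styles.length) ? styles[istd] : null;
        if (style == null || style.paragraphProps == null) {
            return new ParagraphProps(0); // assumed default; never returns null
        }
        return style.paragraphProps;
    }
}
```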
Comment 3 Nick Burch 2010-09-19 07:53:03 UTC
I've just tried opening your document with POI on svn head. It loaded fine, and we could get the text without error. It looks like the bug was fixed at some point between when you reported it and today.