Bug 52420

Summary: [PATCH] WordToHtmlConverter NullPointerException in compactChildNodesR method
Product: POI
Reporter: Sachin Gorade <sachin.gorade>
Component: HWPF
Assignee: POI Developers List <dev>
Status: RESOLVED WORKSFORME
Severity: normal
CC: yanis.biziuk
Priority: P2
Keywords: PatchAvailable
Version: 3.8-dev   
Target Milestone: ---   
Hardware: Other   
OS: other   
Attachments: With this file you can reproduce exception
patch

Description Sachin Gorade 2012-01-04 11:18:11 UTC
While running Apache POI in a simple application on Android, I found that the AbstractWordUtils class throws a NullPointerException in the compactChildNodesR method.
It is caused by the following code:

 while ( child2.getChildNodes().getLength() > 0 )
  child1.appendChild( child2.getFirstChild() );
// following line causes NullPointerException
 child2.getParentNode().removeChild( child2 );
 i--;

I think there should be a null check before this child is removed. After adding the check, my simple application is able to convert .doc files to HTML on the Android platform.

This is the code change that I made:

 while ( child2.getChildNodes().getLength() > 0 )
  child1.appendChild( child2.getFirstChild() );
 if(child2.getParentNode()!=null){
  child2.getParentNode().removeChild( child2 );
  i--;
 }
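
For illustration, here is a minimal, self-contained sketch of the guarded removal using only the standard org.w3c.dom API, without POI. The class name GuardedMergeSketch and the helpers mergeInto and newDocument are hypothetical names made up for this example; they are not part of POI. The sketch only shows that getParentNode() returns null for a detached node and that the check avoids the exception. It does not explain why child2 ends up detached on Android.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical demo class, not part of POI.
public class GuardedMergeSketch {

    // Moves every child of child2 into child1, then removes child2 from its
    // parent only if it still has one. Returns true if child2 was removed.
    static boolean mergeInto(Node child1, Node child2) {
        while (child2.getFirstChild() != null) {
            // appendChild detaches the node from child2 before adding it
            child1.appendChild(child2.getFirstChild());
        }
        Node parent = child2.getParentNode();
        if (parent == null) {
            // Detached node: calling removeChild on a null parent would throw an NPE
            return false;
        }
        parent.removeChild(child2);
        return true;
    }

    // Creates an empty DOM document with the JDK's default parser.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element p = doc.createElement("p");
        doc.appendChild(p);
        Element span1 = doc.createElement("span");
        Element span2 = doc.createElement("span");
        span1.appendChild(doc.createTextNode("Hello, "));
        span2.appendChild(doc.createTextNode("world"));
        p.appendChild(span1);
        p.appendChild(span2);

        System.out.println(mergeInto(span1, span2)); // span2 is attached, so it is removed
        System.out.println(span1.getTextContent());  // merged text of both spans
        System.out.println(mergeInto(span1, span2)); // span2 is now detached, no exception
    }
}
```

The second call to mergeInto stands in for the failing case: child2 has no parent, so the unguarded line from AbstractWordUtils would dereference null there.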
Comment 1 Sergey Vladimirov 2012-11-05 15:53:14 UTC
Sachin,

Could you please provide an example file, that produces an exception?

Sergey
Comment 2 Yanis 2013-11-14 12:05:37 UTC
Created attachment 31043 [details]
With this file you can reproduce exception

11-14 13:25:53.108: WARN/System.err(8630): Caused by: java.lang.NullPointerException
11-14 13:25:53.108: WARN/System.err(8630): at org.apache.poi.hwpf.converter.AbstractWordUtils.compactChildNodesR(AbstractWordUtils.java:146)
11-14 13:25:53.108: WARN/System.err(8630): at org.apache.poi.hwpf.converter.WordToHtmlUtils.compactSpans(WordToHtmlUtils.java:238)
11-14 13:25:53.108: WARN/System.err(8630): at org.apache.poi.hwpf.converter.WordToHtmlConverter.processParagraph(WordToHtmlConverter.java:596)
11-14 13:25:53.108: WARN/System.err(8630): at org.apache.poi.hwpf.converter.AbstractWordConverter.processParagraphes(AbstractWordConverter.java:1113)
11-14 13:25:53.108: WARN/System.err(8630): at org.apache.poi.hwpf.converter.WordToHtmlConverter.processSingleSection(WordToHtmlConverter.java:617)
11-14 13:25:53.108: WARN/System.err(8630): at org.apache.poi.hwpf.converter.AbstractWordConverter.processDocument(AbstractWordConverter.java:722)

The following code resolves this error:

<code>
if ( child2.getParentNode() != null ) {
    child2.getParentNode().removeChild( child2 );
    i--;
}
</code>

However, the converted HTML does not contain all the data from the .doc file (possibly a separate bug?).
Comment 3 Yanis 2013-11-14 14:48:08 UTC
Created attachment 31044 [details]
patch

Patch to fix this error.
Comment 4 Dominik Stadler 2015-01-02 22:40:39 UTC
I have tried to reproduce the issue that you reported, but couldn't, see r1649147 for the related test-case that I added.

Can you please retry this with the latest version of POI and, if you still see the problem, provide some more information, ideally via a self-contained unit test?

Also, I could not find any text that was missing from the resulting document. Which exact part was missing for you? Maybe this was fixed by some other changes in the meantime...
Comment 5 Dominik Stadler 2016-06-18 05:55:10 UTC
Could not reproduce and no update for some time, therefore closing this as WORKSFORME.