Bug 63979

Summary: org.apache.poi.util.PngUtils is not in maven-artifact poi-scratchpad-4.1.0
Product: POI
Reporter: Andreas Joseph Krogh <andreas>
Component: HWPF
Assignee: POI Developers List <dev>
Status: RESOLVED WONTFIX    
Severity: normal    
Priority: P2    
Version: 4.1.0-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: Linux   

Description Andreas Joseph Krogh 2019-12-01 11:39:52 UTC
org.apache.poi.util.PngUtils is not in poi-scratchpad-4.1.0 which results in this stacktrace:

java.lang.NoClassDefFoundError: org/apache/poi/util/PngUtils
        at org.apache.poi.hwpf.usermodel.Picture.fillImageContent(Picture.java:176)
        at org.apache.poi.hwpf.usermodel.Picture.<init>(Picture.java:124)
        at org.apache.poi.hwpf.model.PicturesTable.extractPicture(PicturesTable.java:162)
        at org.apache.poi.hwpf.converter.AbstractWordConverter.processCharacters(AbstractWordConverter.java:488)
        at org.apache.poi.hwpf.converter.WordToTextConverter.processParagraph(WordToTextConverter.java:414)
        at org.apache.poi.hwpf.converter.AbstractWordConverter.processParagraphes(AbstractWordConverter.java:1109)
        at org.apache.poi.hwpf.converter.WordToTextConverter.processSection(WordToTextConverter.java:424)
        at org.apache.poi.hwpf.converter.AbstractWordConverter.processDocumentPart(AbstractWordConverter.java:735)
        at org.apache.poi.hwpf.converter.WordToTextConverter.processDocumentPart(WordToTextConverter.java:245)
        at org.apache.poi.hwpf.extractor.WordExtractor.getText(WordExtractor.java:269)

This makes it impossible to extract text from certain Word documents.

I see there's another approach in 4.1.1, but we cannot use that version because of this bug: #63955
Comment 1 Andreas Beeker 2019-12-01 12:04:17 UTC
1. Please check whether you have mixed POI versions in your classpath.
I refactored PngUtils in r1866809. I assume you are not depending on this internal class directly and only need the HWPF call to work.

2. Why is #63955 blocking you?
Comment 2 Andreas Joseph Krogh 2019-12-01 12:12:21 UTC
My setup is "kind of mixed": I'm using poi-4.1.1, but have to stay on poi-scratchpad-4.1.0 because of bug #63955, since the newer version fails to parse winmail.dat, which we need.

These are the jars I'm using:
poi-4.1.1.jar
poi-ooxml-4.1.1.jar
poi-ooxml-schemas-4.1.1.jar
poi-scratchpad-4.1.0.jar
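
A mix like the one in the jar list above can be spotted by comparing the version suffixes of the POI jars on the classpath. The following is only a minimal sketch and not part of POI: the class PoiVersionCheck and both helper methods are hypothetical names, and it assumes the jars follow the Maven naming pattern artifact-version.jar.

```java
import java.io.File;
import java.util.HashSet;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Apache POI.
public class PoiVersionCheck {
    // Assumes Maven-style jar names such as poi-scratchpad-4.1.0.jar.
    private static final Pattern POI_JAR =
            Pattern.compile("^(poi(?:-[a-z]+)*)-(\\d+(?:\\.\\d+)*)\\.jar$");

    /** Maps each POI artifact found in the classpath string to its version. */
    public static Map<String, String> poiVersions(String classpath, String separator) {
        Map<String, String> found = new TreeMap<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            // Only the file name matters, not the directory it lives in.
            Matcher m = POI_JAR.matcher(new File(entry).getName());
            if (m.matches()) {
                found.put(m.group(1), m.group(2));
            }
        }
        return found;
    }

    /** True if the POI artifacts do not all share one version. */
    public static boolean hasMixedVersions(Map<String, String> versions) {
        return new HashSet<>(versions.values()).size() > 1;
    }

    public static void main(String[] args) {
        Map<String, String> versions =
                poiVersions(System.getProperty("java.class.path"), File.pathSeparator);
        versions.forEach((artifact, version) -> System.out.println(artifact + " -> " + version));
        if (hasMixedVersions(versions)) {
            System.out.println("WARNING: mixed POI versions on the classpath");
        }
    }
}
```

Run against the four jars listed above, this would flag poi-scratchpad as the only artifact at 4.1.0. It only inspects file names, so repackaged or shaded jars would go unnoticed.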
Comment 3 Andreas Joseph Krogh 2019-12-01 12:15:02 UTC
But - shouldn't PngUtils be a part of poi-scratchpad-4.1.0.jar?
It *is* used, in Picture, so I find it strange that it's not in the jar.
Comment 4 Andreas Beeker 2019-12-01 12:25:20 UTC
I'm not sure what other side effects these mixed versions cause, but I would try providing the class in my own source and making sure it comes before the POI classes in the classpath.
Because it was refactored, the HWPF classes now use the new class instead, so I don't intend to add it back in.

And about 2. - this was a copy & paste error on my side, as I was looking at the wrong bug before ...

Please keep us updated, if the above works for you ...
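
When trying a workaround like the one suggested above, it can help to confirm which classpath entry a class is actually loaded from. This is a minimal sketch using only standard JDK calls; the class name ClassOrigin and its locate method are hypothetical, and the two POI class names in main are simply the ones from the stack trace in this report.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Apache POI.
public class ClassOrigin {

    /** Returns the location a class is loaded from, or a short note if it cannot be determined. */
    public static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers.
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown source)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        System.out.println(locate("org.apache.poi.util.PngUtils"));
        System.out.println(locate("org.apache.poi.hwpf.usermodel.Picture"));
    }
}
```

If the first line reports a project directory or jar of your own and the second reports poi-scratchpad-4.1.0.jar, the locally supplied class is the one being picked up.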
Comment 5 Andreas Joseph Krogh 2019-12-01 12:30:51 UTC
I think the cleanest solution is for us to downgrade to poi-4.1.0, so they're all the same version.

Any thoughts on a fix for #63955 and shipping 4.1.2? I should think #63955 ruins the day for everyone working with winmail.dat files...
Comment 6 Andreas Joseph Krogh 2019-12-01 12:35:26 UTC
I'll close this, as it's really a case of "you're using an unsupported version combination".