Bug 59834 - poi-ooxml pom.xml should include dependency on poi-scratchpad
Summary: poi-ooxml pom.xml should include dependency on poi-scratchpad
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall (show other bugs)
Version: 3.14-FINAL
Hardware: PC Mac OS X 10.1
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-07-09 13:22 UTC by PJ Fanning
Modified: 2018-10-21 14:54 UTC (History)
1 user (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description PJ Fanning 2016-07-09 13:22:06 UTC
http://central.maven.org/maven2/org/apache/poi/poi-ooxml/3.14/poi-ooxml-3.14.pom includes no reference to the compile and runtime dependency on poi-scratchpad.

https://github.com/apache/poi/blob/trunk/src/ooxml/java/org/apache/poi/extractor/ExtractorFactory.java includes imports for classes in poi-scratchpad, eg org.apache.poi.hwpf.OldWordFileFormatException.
Comment 1 Javen O'Neal 2016-07-09 18:05:25 UTC
I wonder if we could detect poi-scratchpad uses in poi-main if we clear previously built poi-scratchpad.jar artifacts, compile just poi main and require all imports to be present, then compile poi-scratchpad.

ant test probably compiles scratchpad before compiling ooxml, which might be why this went unnoticed. Unfortunately, we would have the opposite problem if we compiled ooxml before scratchpad.

Is there any way we resolve this at compile time rather through a unit test, and avoid throwing away a compilation result?
Comment 2 Nick Burch 2016-07-09 18:19:48 UTC
There aren't (and shouldn't be) any references to scratchpad from the POI main jar

There are (and due to a lack of full interfaces for SL and WP, amongst others) a handful of references from OOXML to Scratchpad

However, most uses of OOXML don't need Scratchpad. The extractor factory is one of the handful of places that does
Comment 3 Javen O'Neal 2016-07-09 18:49:11 UTC
(In reply to Nick Burch from comment #2)
> There are (and due to a lack of full interfaces for SL and WP, amongst
> others) a handful of references from OOXML to Scratchpad
> The extractor factory is
> one of the handful of places that [lack full interfaces]

I guess this bug depends on adding extractor to Common SL, WP, DG, etc
Comment 4 Nick Burch 2016-07-11 17:10:45 UTC
(In reply to Javen O'Neal from comment #3)
> I guess this bug depends on adding extractor to Common SL, WP, DG, etc

Not sure interfaces would help - the extractor factory needs to create concrete classes.

I don't think we want to add another jar for things like ExtractorFactory, as we get enough confused people already. I'm not sure we want to require the scratchpad jar, as people doing eg just XSSF don't need it.

Maybe we could have ExtractorFactory do some stuff with reflection, and give more helpful errors if the scratchpad jar is missing?

Might also be good to try running all the OOXML unit tests with no scratchpad jar, and check if any bits other than the extractors fail?
Comment 5 Dominik Stadler 2016-07-11 21:03:35 UTC
Yes, I also stumbled over this quite a while ago and in that project I actually copied WorkbookFactory and removed the access to these classes to make it work.

I have now added a test-build at https://builds.apache.org/view/POI/job/POI-no-scratchpad/ which removes the classpath to the scratchpad-classes before running tests. Currently only TestExtractorFactory seems to fail in test-ooxml.
Comment 6 Nick Burch 2016-07-11 22:59:52 UTC
OK, I've had a go at splitting the logic out, so that the OOXML stuff lives in OOXML (and remains the default), the Excel-specific stuff is in main and called from OOXML, and the Scratchpad stuff is in Scratchpad but called from main. Not too much duplication, not too much reflection...

With all 3 jars, nothing will change

With core + ooxml, excel and ooxml will work fine. However, if you don't have scratchpad but do give it a scratchpad-related file, it'll give you an exception telling you to add the jar

I think this is probably the best we can do... Further testing and sanity checking requested though!
Comment 7 Dominik Stadler 2016-07-12 15:42:14 UTC
Somehow we still link to some of the Scratchpad-Classes, TestExtractorFactory fails with the following exception, see https://builds.apache.org/view/POI/job/POI-no-scratchpad/5/testReport/(root)/%3Cinit%3E/org_apache_poi_extractor_TestExtractorFactory/

java.lang.NoClassDefFoundError: org/apache/poi/hsmf/extractor/OutlookTextExtactor
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:171)
Caused by: java.lang.ClassNotFoundException: org.apache.poi.hsmf.extractor.OutlookTextExtactor
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Comment 8 Nick Burch 2016-07-12 15:46:39 UTC
I would still expect some parts of ExtractorFactory to fail when the Scratchpad jar is missing. Should compile fine, but fail at runtime with a message that you need scratchpad

The ExtractorFactory test likely will fail with no scratchpad, not only for the above reason, but also because it checks what extractor classes come back so has references to scratchpad-only classes
Comment 9 Dominik Stadler 2016-11-08 15:55:58 UTC
BTW, if I exclude TestExtractorFactory in the Jenkins-Job at https://builds.apache.org/job/POI-no-scratchpad/, it fails in some other tests with ClassNotFoundException:

java.lang.NoClassDefFoundError: org/apache/poi/hslf/usermodel/HSLFSlideShow
	at org.apache.poi.sl.TestTable.testTextDirection(TestTable.java:108)
Caused by: java.lang.ClassNotFoundException: org.apache.poi.hslf.usermodel.HSLFSlideShow
        ...
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)


java.lang.RuntimeException: java.io.IOException: java.lang.ClassNotFoundException: org.apache.poi.hslf.usermodel.HSLFSlideShowFactory
	at org.apache.poi.sl.draw.TestDrawPictureShape.openSampleDocument(TestDrawPictureShape.java:49)
	at org.apache.poi.sl.draw.TestDrawPictureShape.testResize(TestDrawPictureShape.java:59)
Caused by: java.io.IOException: java.lang.ClassNotFoundException: org.apache.poi.hslf.usermodel.HSLFSlideShowFactory
	at org.apache.poi.sl.usermodel.SlideShowFactory.createSlideShow(SlideShowFactory.java:262)
	at org.apache.poi.sl.usermodel.SlideShowFactory.createHSLFSlideShow(SlideShowFactory.java:227)
	at org.apache.poi.sl.usermodel.SlideShowFactory.create(SlideShowFactory.java:79)
	at org.apache.poi.sl.usermodel.SlideShowFactory.create(SlideShowFactory.java:143)
	at org.apache.poi.sl.usermodel.SlideShowFactory.create(SlideShowFactory.java:107)
	at org.apache.poi.sl.draw.TestDrawPictureShape.openSampleDocument(TestDrawPictureShape.java:47)
Caused by: java.lang.ClassNotFoundException: org.apache.poi.hslf.usermodel.HSLFSlideShowFactory
        ...
	at org.apache.poi.sl.usermodel.SlideShowFactory.createSlideShow(SlideShowFactory.java:236)

So there seem to be a few more places where we unconditionally access these parts, e.g. SlideShowFactory. We should at least throw a better error message there as well.
Comment 10 Dominik Stadler 2016-11-08 15:56:59 UTC
Another interesting failure is the following one:

java.lang.NoClassDefFoundError: org/apache/poi/hslf/blip/WMF$NativeHeader
	at org.apache.poi.xslf.usermodel.XSLFPictureData.cacheProperties(XSLFPictureData.java:179)
	at org.apache.poi.xslf.usermodel.XSLFPictureData.getImageDimension(XSLFPictureData.java:145)
	at org.apache.poi.sl.draw.DrawPictureShape.resize(DrawPictureShape.java:100)
	at org.apache.poi.xslf.usermodel.XSLFSheet.createPicture(XSLFSheet.java:238)
	at org.apache.poi.xslf.TestXSLFBugs.bug59434(TestXSLFBugs.java:466)
Caused by: java.lang.ClassNotFoundException: org.apache.poi.hslf.blip.WMF$NativeHeader
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)

Seems WMF classes are in scratchpad, but XSLFPictureData accesses it, should we promote WMF into main?
Comment 11 Dominik Stadler 2018-10-21 14:54:58 UTC
I think this is mostly done now: CI is running without Scratchpad at https://builds.apache.org/view/P/view/POI/job/POI-DSL-no-scratchpad/, only a few tests are ignored (see exclude-scratchpad-test in build.xml).