http://central.maven.org/maven2/org/apache/poi/poi-ooxml/3.14/poi-ooxml-3.14.pom includes no reference to the compile and runtime dependency on poi-scratchpad. https://github.com/apache/poi/blob/trunk/src/ooxml/java/org/apache/poi/extractor/ExtractorFactory.java includes imports for classes in poi-scratchpad, eg org.apache.poi.hwpf.OldWordFileFormatException.
I wonder if we could detect poi-scratchpad uses in poi-main if we clear previously built poi-scratchpad.jar artifacts, compile just poi main and require all imports to be present, then compile poi-scratchpad. ant test probably compiles scratchpad before compiling ooxml, which might be why this went unnoticed. Unfortunately, we would have the opposite problem if we compiled ooxml before scratchpad. Is there any way we resolve this at compile time rather through a unit test, and avoid throwing away a compilation result?
There aren't (and shouldn't be) any references to scratchpad from the POI main jar There are (and due to a lack of full interfaces for SL and WP, amongst others) a handful of references from OOXML to Scratchpad However, most uses of OOXML don't need Scratchpad. The extractor factory is one of the handful of places that does
(In reply to Nick Burch from comment #2) > There are (and due to a lack of full interfaces for SL and WP, amongst > others) a handful of references from OOXML to Scratchpad > The extractor factory is > one of the handful of places that [lack full interfaces] I guess this bug depends on adding extractor to Common SL, WP, DG, etc
(In reply to Javen O'Neal from comment #3) > I guess this bug depends on adding extractor to Common SL, WP, DG, etc Not sure interfaces would help - the extractor factory needs to create concrete classes. I don't think we want to add another jar for things like ExtractorFactory, as we get enough confused people already. I'm not sure we want to require the scratchpad jar, as people doing eg just XSSF don't need it. Maybe we could have ExtractorFactory do some stuff with reflection, and give more helpful errors if the scratchpad jar is missing? Might also be good to try running all the OOXML unit tests with no scratchpad jar, and check if any bits other than the extractors fail?
Yes, I also stumbled over this quite a while ago and in that project I actually copied WorkbookFactory and removed the access to these classes to make it work. I have now added a test-build at https://builds.apache.org/view/POI/job/POI-no-scratchpad/ which removes the classpath to the scratchpad-classes before running tests. Currently only TestExtractorFactory seems to fail in test-ooxml.
OK, I've had a go at splitting the logic out, so that the OOXML stuff lives in OOXML (and remains the default), the Excel-specific stuff is in main and called from OOXML, and the Scratchpad stuff is in Scratchpad but called from main. Not too much duplication, not too much reflection... With all 3 jars, nothing will change With core + ooxml, excel and ooxml will work fine. However, if you don't have scratchpad but do give it a scratchpad-related file, it'll give you an exception telling you to add the jar I think this is probably the best we can do... Further testing and sanity checking requested though!
Somehow we still link to some of the Scratchpad-Classes, TestExtractorFactory fails with the following exception, see https://builds.apache.org/view/POI/job/POI-no-scratchpad/5/testReport/(root)/%3Cinit%3E/org_apache_poi_extractor_TestExtractorFactory/ java.lang.NoClassDefFoundError: org/apache/poi/hsmf/extractor/OutlookTextExtactor at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:171) Caused by: java.lang.ClassNotFoundException: org.apache.poi.hsmf.extractor.OutlookTextExtactor at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
I would still expect some parts of ExtractorFactory to fail when the Scratchpad jar is missing. Should compile fine, but fail at runtime with a message that you need scratchpad The ExtractorFactory test likely will fail with no scratchpad, not only for the above reason, but also because it checks what extractor classes come back so has references to scratchpad-only classes
BTW, if I exclude TestExtractorFactory in the Jenkins-Job at https://builds.apache.org/job/POI-no-scratchpad/, it fails in some other tests with ClassNotFoundException: java.lang.NoClassDefFoundError: org/apache/poi/hslf/usermodel/HSLFSlideShow at org.apache.poi.sl.TestTable.testTextDirection(TestTable.java:108) Caused by: java.lang.ClassNotFoundException: org.apache.poi.hslf.usermodel.HSLFSlideShow ... at java.lang.ClassLoader.loadClass(ClassLoader.java:247) java.lang.RuntimeException: java.io.IOException: java.lang.ClassNotFoundException: org.apache.poi.hslf.usermodel.HSLFSlideShowFactory at org.apache.poi.sl.draw.TestDrawPictureShape.openSampleDocument(TestDrawPictureShape.java:49) at org.apache.poi.sl.draw.TestDrawPictureShape.testResize(TestDrawPictureShape.java:59) Caused by: java.io.IOException: java.lang.ClassNotFoundException: org.apache.poi.hslf.usermodel.HSLFSlideShowFactory at org.apache.poi.sl.usermodel.SlideShowFactory.createSlideShow(SlideShowFactory.java:262) at org.apache.poi.sl.usermodel.SlideShowFactory.createHSLFSlideShow(SlideShowFactory.java:227) at org.apache.poi.sl.usermodel.SlideShowFactory.create(SlideShowFactory.java:79) at org.apache.poi.sl.usermodel.SlideShowFactory.create(SlideShowFactory.java:143) at org.apache.poi.sl.usermodel.SlideShowFactory.create(SlideShowFactory.java:107) at org.apache.poi.sl.draw.TestDrawPictureShape.openSampleDocument(TestDrawPictureShape.java:47) Caused by: java.lang.ClassNotFoundException: org.apache.poi.hslf.usermodel.HSLFSlideShowFactory ... at org.apache.poi.sl.usermodel.SlideShowFactory.createSlideShow(SlideShowFactory.java:236) So there seem to be a few more places where we unconditionally access these parts, e.g. SlideShowFactory. We should at least throw a better error message there as well.
Another interesting failure is the following one: java.lang.NoClassDefFoundError: org/apache/poi/hslf/blip/WMF$NativeHeader at org.apache.poi.xslf.usermodel.XSLFPictureData.cacheProperties(XSLFPictureData.java:179) at org.apache.poi.xslf.usermodel.XSLFPictureData.getImageDimension(XSLFPictureData.java:145) at org.apache.poi.sl.draw.DrawPictureShape.resize(DrawPictureShape.java:100) at org.apache.poi.xslf.usermodel.XSLFSheet.createPicture(XSLFSheet.java:238) at org.apache.poi.xslf.TestXSLFBugs.bug59434(TestXSLFBugs.java:466) Caused by: java.lang.ClassNotFoundException: org.apache.poi.hslf.blip.WMF$NativeHeader at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) Seems WMF classes are in scratchpad, but XSLFPictureData accesses it, should we promote WMF into main?
I think this is mostly done now: CI is running without Scratchpad at https://builds.apache.org/view/P/view/POI/job/POI-DSL-no-scratchpad/, only a few tests are ignored (see exclude-scratchpad-test in build.xml).