Running an embedded instance as 'java -jar <thin-jar-with-dependencies-in-manifest>' StandardJarScanner ignores jars in manifest Class-path header. The reason is system class loader does not expose these jars via getURLs() as explained in the following SO question. http://stackoverflow.com/questions/35922072/scanning-manifest-classpath-jars-in-embeeded-tomcat/36185408#36185408
Added PR in GitHub https://github.com/apache/tomcat80/pull/5
Notes: - The web application class loader will ignore any Class-Path entries in a manifest. - Any class loader above that (common, server, shared, etc.) will process Class-Path entries for any JAR on the class path. - The proposed patch requires JARs referenced by the Class-Path header must contain a manifest in order to be scanned. A solution without this limitation is preferred.
Hi Mark, this problem comes for system (application) class loader, that is responsible of handling class-path entries, not returning manifest jar urls when executed as java -jar. If there is a portable way of knowing the executable jar at runtime, the solution would be trivial: just parse its manifest class path entries. But there is not (AFAIK) The suggested change does not get jar entries from the class-path header of the manifest, instead it infers jar entries from the enumeration of manifest that it sees (and its parent don't) It is true that it requires jars referenced in the manifest of the executable jar to have a manifest in order to be scanned, but this improves the current behavior of not scanning them at all For code simplicity I added the same processing for every class loader in Tomcat, but it is true, that I adds a little overhead, so I have just created a new pull request that applies the hack only to the system class loader.
(In reply to Ignacio del Valle Alles from comment #3) > Hi Mark, this problem comes for system (application) class loader, that is > responsible of handling class-path entries, not returning manifest jar urls > when executed as java -jar. No, the problem is wider than that. See the second of my notes in comment #2 above. > If there is a portable way of knowing the executable jar at runtime, the > solution would be trivial: just parse its manifest class path entries. But > there is not (AFAIK). Not relevant. See above. > The suggested change does not get jar entries from the class-path header of > the manifest, instead it infers jar entries from the enumeration of manifest > that it sees (and its parent don't). > > It is true that it requires jars referenced in the manifest of the > executable jar to have a manifest in order to be scanned, but this improves > the current behavior of not scanning them at all It is arguable whether a flawed solution is better than no solution. My preference is to fix it properly. > For code simplicity I added the same processing for every class loader in > Tomcat, but it is true, that I adds a little overhead, so I have just > created a new pull request that applies the hack only to the system class > loader. Limiting the scan to the system class loader is as flawed as including the web application class loader.
Hint: Rather than inferring the JARs added to the class path via one or more manifests, process the manifests and determine the JARs that have been added.
(In reply to Mark Thomas from comment #4) > (In reply to Ignacio del Valle Alles from comment #3) > > Hi Mark, this problem comes for system (application) class loader, that is > > responsible of handling class-path entries, not returning manifest jar urls > > when executed as java -jar. > > No, the problem is wider than that. See the second of my notes in comment #2 > above. > > > If there is a portable way of knowing the executable jar at runtime, the > > solution would be trivial: just parse its manifest class path entries. But > > there is not (AFAIK). > > Not relevant. See above. > > > The suggested change does not get jar entries from the class-path header of > > the manifest, instead it infers jar entries from the enumeration of manifest > > that it sees (and its parent don't). > > > > It is true that it requires jars referenced in the manifest of the > > executable jar to have a manifest in order to be scanned, but this improves > > the current behavior of not scanning them at all > > It is arguable whether a flawed solution is better than no solution. My > preference is to fix it properly. > > > For code simplicity I added the same processing for every class loader in > > Tomcat, but it is true, that I adds a little overhead, so I have just > > created a new pull request that applies the hack only to the system class > > loader. > > Limiting the scan to the system class loader is as flawed as including the > web application class loader. My reasoning was based on the wrong assumption that these entries were only read by the System class loader and only when executed as "java -jar" (don't ask me why). So yes, the solution is simpler and can be implemented just by scanning the manifest. I'll work on it and send a new PR
Just added a new PR on GitHub. Basically what it does is from the jar url returned by a url class loader start a DFS traversal over the graph of related jars (the relation is transitive)
Update. I've spent a fair amount of time looking at the proposed patch to see if there is a way to avoid processing a JAR once to look at the class path and again to scan it. I think it is possible but it is going to need some refactoring. Regarding the proposed patch, it doesn't handle all of the cases that the Jar Scanner does (packed WARs, exploded JARs, etc.) which is why I'd prefer to do this in a single scan. Reviewing the code has also uncovered some minor bugs that need to be fixed.
I see. I'll think about it to too. Can you give more details on these minor bugs?
Thanks for the report and the proposed patches. While the proposed patches weren't used, it was very helpful to have an alternative point of view when exploring options for resolving this issue. The fix has been applied to: 9.0.x for 9.0.0.M5 onwards 8.5.x for 8.5.1 onwards 8.0.x for 8.0.34 onwards
Thank you
Just found an interesting problem: We're trying to migrate a bunch of webapps from tomcat 7 to 8.5.4 with the goal in mind not to change too much. Our Tomcat has got a whole bunch of 3rd party libraries in a separate folder which is loaded by the common (or shared) class loader (i.e. not in WEB-INF/lib). Since current version of StandardJarScanner now also follows the class-path in header in the JARs manifest.mf, it tries and fails to load all those additional JARS mentioned in the manifest. But we never hat these additional JARS in classpath and collecting them all is a bit of not so funny work although feasible. Don't get me wrong. I understand that following the class-path mentioned in the manifest.mf files is technically totally correct. But it somehow complicates migration of already existing apps. Wouldn't it be useful to have sort of "don't resolve class-path header" switch (with default FALSE) but giving the the option to do so? What do you guys thing? Or have I just missed an already existing workaround?
I haven't checked the code but you might be able to use jarsToSkip or a JarScanFilter to not scan them. Failing that, a new enhancement request to make the manifest processing option sounds reasonable. I'd avoid the double negative though and have processManifest (or similar) with a default of true.
I totally agree that this double negation would be confusing. Unfortunatelly jarsToSkip and <JarScanFilter> does not help in that case, because at least one of the JARs contains TLDs (Which I do want to be scanned) and additionally the unwanted class-path entry (which I don't want the StandardJarScanner to follow) in it's manifest. I can't judge how specific this problem is and if other tomcat users experience similar problems but once I open an enhancement request the experienced developers hopefully judge according to their experience.
I think I also encountered this problem after deploying in Tomcat 8.5.5 a webapp previously run in Tomcat 8.0.14. I have a lot of WARNING: file not found for JARs mentioned in useless Class-Path entries in JARs deployed in WEB-INF/lib, like jaxb-impl-2.1.13.jar (which mentions "jaxb-api.jar activation.jar jsr173_1.0_api.jar jaxb1-impl.jar") or xalan-2.7.2.jar (which mentions "xercesImpl.jar xml-apis.jar serializer.jar"). Fact is that some of those JARs (at least those I really need) are deployed with different names (with a version specifier, for instance), as resolved by tools like Gradle or Maven). So my Tomcat log on startup is now full of these warnings that look like errors a lot (they're complete with stack traces). Maybe lowering the log output to debug level and/or remove stack traces could also be much less intrusive than the current bevahiour?
See bug 59961