Bug 61810

Summary: [Proposal] Support configure the interval to keep all jars open if no jar is accessed
Product: Tomcat 7 Reporter: Huxing Zhang <huxing.zhang>
Component: CatalinaAssignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED    
Severity: enhancement    
Priority: P2    
Version: trunk   
Target Milestone: ---   
Hardware: All   
OS: All   
Attachments: Support configure the interval to keep all jars open if no jar is accessed

Description Huxing Zhang 2017-11-24 05:38:30 UTC
The Problem:

When the traffic spikes, the web application's business thread pool becomes full.

Jstack shows one of the thread is holding a lock that block most of other threads. The stack trace is as follows:

"HSFBizProcessor-DEFAULT-12-thread-332" Id=10156 RUNNABLE
        at java.util.zip.ZipFile.open(Native Method)
        at java.util.zip.ZipFile.<init>(ZipFile.java:219)
        at java.util.zip.ZipFile.<init>(ZipFile.java:149)
        at java.util.jar.JarFile.<init>(JarFile.java:166)
        at java.util.jar.JarFile.<init>(JarFile.java:130)
        at org.apache.catalina.loader.WebappClassLoaderBase.openJARs(WebappClassLoaderBase.java:3120)
        at org.apache.catalina.loader.WebappClassLoaderBase.findResourceInternal(WebappClassLoaderBase.java:3409)
        -  locked [Ljava.util.jar.JarFile;@972f6eb
        at org.apache.catalina.loader.WebappClassLoaderBase.findClassInternal(WebappClassLoaderBase.java:3152)
        at org.apache.catalina.loader.WebappClassLoaderBase.findClass(WebappClassLoaderBase.java:1373)
        at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1861)
        -  locked org.apache.catalina.loader.WebappClassLoader@726ec590


We have a web application who has ~800 jar files under WEB-INF/lib. By default tomcat will close all the JarFile objects if there is no access to the jar file after 90s, which is hard-coded.

However, if at some point, we need to load a class that is not loaded before, tomcat will have to open all the jar files before trying to load the class. What makes matter worse is that, the disk is HDD, which makes Opening ~800 jar files quite time consuming, eventually cause this operation to block all other threads.


Lessons learned:

Enable parallel class loading, so that one thread trying to load a class don't block other threads. However, if multiple threads trying to load the same class, the issue might still happen.


New Proposal:

From my point of view, the reason why Tomcat close all the jars opened is to release the file descriptors to save resources. If resource is not a problem, we can keeps all the jar opened for a fairly long time, or even keeps them always opened. 

Therefore, we propose to introduce a new attribute, called 'jarOpenInterval', in WebappClassLoaderBase, to track the interval that can keep all the jars opened if they are not accessed. The default value is 90000 milliseconds, which is the same as the current implementation. The attribute can be configured in two ways:

1. static configuration in context.xml

<Loader jarOpenInterval="90000" />

2. dynamic configuration via JMX. This value should be changed during runtime.


Any thoughts?

p.s. About the default value for jarOpenInterval, I found that Tomcat 8+ has removed the implementation of the close jar operation, which indeed will hold all the file descriptors during start up. Does that mean that holding all the file descriptor is not a issue any more? Can we keep all jars open by default?
Comment 1 Christopher Schultz 2017-11-24 23:17:31 UTC
(In reply to Huxing Zhang from comment #0)
> p.s. About the default value for jarOpenInterval, I found that Tomcat 8+ has
> removed the implementation of the close jar operation, which indeed will
> hold all the file descriptors during start up. Does that mean that holding
> all the file descriptor is not a issue any more? Can we keep all jars open
> by default?

So this is only an issue for Tomcat 7, then, correct?

I wonder if it's worth a change for such an old version of Tomcat. Tomcat 7 at this point is *very* stable.

I have no objections to this new feature; just wondering if it makes more sense to patch Tomcat 7 or encourage "the client" to upgrade to Tomcat 8+.
Comment 2 Huxing Zhang 2017-11-25 04:15:10 UTC
(In reply to Christopher Schultz from comment #1)
> (In reply to Huxing Zhang from comment #0)
> > p.s. About the default value for jarOpenInterval, I found that Tomcat 8+ has
> > removed the implementation of the close jar operation, which indeed will
> > hold all the file descriptors during start up. Does that mean that holding
> > all the file descriptor is not a issue any more? Can we keep all jars open
> > by default?
> 
> So this is only an issue for Tomcat 7, then, correct?

Yes.

> 
> I wonder if it's worth a change for such an old version of Tomcat. Tomcat 7
> at this point is *very* stable.

I don't want to change the default value for tomcat 7 by adding this new feature. But I think tomcat should support the option to turn it off.

> 
> I have no objections to this new feature; just wondering if it makes more
> sense to patch Tomcat 7 or encourage "the client" to upgrade to Tomcat 8+.

Yes, we are planning to upgrade to Tomcat 8.5, but it needs some time. We have to fix it since it does affect the business.

I have a patch for this feature, and will upload it later.
Comment 3 Huxing Zhang 2017-12-04 02:17:04 UTC
Created attachment 35581 [details]
Support configure the interval to keep all jars open if no jar is accessed
Comment 4 Huxing Zhang 2017-12-11 03:45:30 UTC
Any comments on the patch? If there is no objection, I am planning to commit the code in the next one or two days.
Comment 5 Konstantin Kolinko 2017-12-12 13:55:39 UTC
1. Overall it looks OK.
You must update documentation as well
http://tomcat.apache.org/tomcat-7.0-doc/config/loader.html

2. FYI: See also bug 52448. Its idea is to scan the jars and cache their indexes, to reopen only those jars that contain the class that is to be loaded.
Comment 6 Konstantin Kolinko 2017-12-12 14:04:06 UTC
> 2197 	synchronized (jarFiles) {
> 2198 	   if (force || (jarOpenInterval > 0 && System.currentTimeMillis()
> 2199 	                            > (lastJarAccessed + jarOpenInterval))) {

The above lines can be additionally wrapped with "if (force || (jarOpenInterval > 0))", to avoid wasting time on "synchronized (jarFiles)" when jarOpenInterval is negative (the jar closing feature is turned off).
Comment 7 Huxing Zhang 2017-12-20 06:37:53 UTC
(In reply to Konstantin Kolinko from comment #5)
> 1. Overall it looks OK.
> You must update documentation as well
> http://tomcat.apache.org/tomcat-7.0-doc/config/loader.html

Right, it is on my TODO list.

> 
> 2. FYI: See also bug 52448. Its idea is to scan the jars and cache their
> indexes, to reopen only those jars that contain the class that is to be
> loaded.


Yes, actually I would like to propose a similar enhancement after finishing this, that is instead of reopen all the jars, we could just open the specific jar if we know that where the class/resource is located.

During startup, Tomcat scans the jars for annotations, at this time, class are loaded as resource, but  not defined, therefore MANIFEST.MF are not required.

However, when a class is loaded during runtime, MANIFEST.MF is required according to JVM spec. If the jars are closed, we have to open all the jars to find the class. This logic can be found within org.apache.catalina.loader.WebappClassLoaderBase#findResourceInternal(java.lang.String, java.lang.String, boolean):

        ResourceEntry entry = resourceEntries.get(path);
        if (entry != null) {
            if (manifestRequired && entry.manifest == MANIFEST_UNKNOWN) {
                // This resource was added to the cache when a request was made
                // for the resource that did not need the manifest. Now the
                // manifest is required, the cache entry needs to be updated.
                synchronized (jarFiles) {
                    if (openJARs()) {
                        ...
                     }
                }
            }
            return entry;
        }

If we can record which jar is a class is loaded (as resource) from, we can just open that specific jar file instead of all jars.

This will be very helpful when a class is scanned before and being loaded for the first time.
Comment 8 Huxing Zhang 2017-12-20 06:43:20 UTC
(In reply to Konstantin Kolinko from comment #6)
> > 2197 	synchronized (jarFiles) {
> > 2198 	   if (force || (jarOpenInterval > 0 && System.currentTimeMillis()
> > 2199 	                            > (lastJarAccessed + jarOpenInterval))) {
> 
> The above lines can be additionally wrapped with "if (force ||
> (jarOpenInterval > 0))", to avoid wasting time on "synchronized (jarFiles)"
> when jarOpenInterval is negative (the jar closing feature is turned off).

Do mean sth. like this?

if (jarFiles.length > 0 && (force || jarOpenInterval > 0)) {
    synchronized (jarFiles) {
        if (force || (jarOpenInterval > 0 && System.currentTimeMillis()
                                  > (lastJarAccessed + jarOpenInterval))) {
            ...
        }
    }
}

If so, yes, I think it is better, I will take that.
Comment 9 Konstantin Kolinko 2017-12-20 14:23:26 UTC
(In reply to Huxing Zhang from comment #8)
> (In reply to Konstantin Kolinko from comment #6)
> 
> Do mean sth. like this?
> 
> if (jarFiles.length > 0 && (force || jarOpenInterval > 0)) {
> ...
> 
> If so, yes, I think it is better, I will take that.

Yes.
Comment 10 Huxing Zhang 2017-12-21 14:38:08 UTC
This feature will be available from 7.0.84 onwards.