Bug 57420

Summary: Wrong class names generated since URL_ENCODER in DirContextURLConnection is not thread safe
Product: Tomcat 7 Reporter: matthias.keller
Component: CatalinaAssignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED    
Severity: critical CC: marc.buetikofer
Priority: P2    
Version: 7.0.55   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: Patch
Make UEncoder a local variable and introduce a default safe charset enum

Description matthias.keller 2015-01-06 19:53:53 UTC
We've been debugging this for a long time. Sometimes when tomcat starts up, it will fail to deploy a webapp since it cannot find a certain class. That class name is always complete garbage.

The error stack trace always is:
SEVERE: Unable to process resource element [jndi:/localhost/testapp/WEB-INF/classes/ch/blabli/dNlewrP.lculgiansDisalog$3.class] for annotations
java.io.FileNotFoundException: jndi:/localhost/testapp/WEB-INF/classes/ch/blabli/dNlewrP.lculgiansDisalog$3.class
        at org.apache.naming.resources.DirContextURLConnection.getInputStream(DirContextURLConnection.java:389)

From time to time we see, that the referenced class name is actually a mix of the real class name and one or more other classes that exist in the vicinity of that class.

Finally I was able to catch this exception with the debugger.
What I see is that the problem stems from rg.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986):
- dcUrlConn (DirContextURLConnection) contains the correct entries of the WAR file
- Enumeration<String> dirs = dcUrlConn.list()  however does not. Calling list() again from the debugger yields the correct results.

Here's the contents of that dirs variable:

[lCsosn, oCrAopnpfliigcEadtiitoonr$AEpdpiltiocraMtoidoen.$c2l.acslsass, , tCioonnfi$gu4r.aticolnaLsosad, liconaftgiuroatni$o5n.LocadlLaisstsen, cCaotnifiognur$a6.clastsi, lCaonsfsig, s, cluaiss, CwoindgfetisegtEditorApplication$9.class, ConfigEditorApplication$ApplicationConfigLoadListener.class, ConfigEditorApplication$ConfigEditorToParameterHandlerIntegration.class, ConfigEditorApplication$EditorMode$1.class, ConfigEditorApplication$EditorMode$2.class, ConfigEditorApplication$EditorMode.class, ConfigEditorApplication.class, ConfigLoader$1.class, ConfigLoader.class, ConfigurationLoadListener$ConfigType.class, ConfigurationLoadListener$ConfigurationLoadedEvent.class, ConfigurationLoadListener$FailedToLoadConfigurationEvent.class, ConfigurationLoadListener.class, command, data, field, rendering, session, ui, util, widgetset]

walking into the list() command leads to collection.list("/") which in turn is encoded using an URL_ENCODER (class UEncoder). This URL_ENCODER is *not* thread safe and can result in exactly such garbage if used concurrently.

It is interesting to note that every single failure we had at this step was always caused by classes in the WEB-INF/classes directory, never in JAR files in WEB-INF/lib.

To me it appears that there are two different DirContextURLConnections with rare concurrency issues when both use the same URL_ENCODER.

Here's the full stacktrace when the error occurs:
        at org.apache.naming.resources.DirContextURLConnection.getInputStream(DirContextURLConnection.java:389)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1994)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsUrl(ContextConfig.java:1902)
        at org.apache.catalina.startup.ContextConfig.webConfig(ContextConfig.java:1298)
        at org.apache.catalina.startup.ContextConfig.configureStart(ContextConfig.java:876)
        at org.apache.catalina.startup.ContextConfig.lifecycleEvent(ContextConfig.java:374)
        at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
        at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:90)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5378)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:649)
        at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:1083)
        at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1880)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

Jan 06, 2015 3:43:44 PM org.apache.catalina.startup.ContextConfig processAnnotationsJndi
SEVERE: Unable to process resource element [jndi:/localhost/medusa-config-editor-5.48.0/WEB-INF/classes/ch/ergon/medusa/configeditor/ui/dialog/TeerstRelgeaxsDialog.class] for annotations
java.io.FileNotFoundException: jndi:/localhost/medusa-config-editor-5.48.0/WEB-INF/classes/ch/ergon/medusa/configeditor/ui/dialog/TeerstRelgeaxsDialog.class
        at org.apache.naming.resources.DirContextURLConnection.getInputStream(DirContextURLConnection.java:389)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1994)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsUrl(ContextConfig.java:1902)
        at org.apache.catalina.startup.ContextConfig.webConfig(ContextConfig.java:1298)
        at org.apache.catalina.startup.ContextConfig.configureStart(ContextConfig.java:876)
        at org.apache.catalina.startup.ContextConfig.lifecycleEvent(ContextConfig.java:374)
        at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
        at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:90)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5378)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:649)
        at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:1083)
        at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1880)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

Jan 06, 2015 3:43:44 PM org.apache.catalina.startup.ContextConfig processAnnotationsJndi
SEVERE: Unable to process resource element [jndi:/localhost/medusa-config-editor-5.48.0/WEB-INF/classes/ch/ergon/medusa/configeditor/ui/dialog/aTsesxtAreaDialog$2.class] for annotations
java.io.FileNotFoundException: jndi:/localhost/medusa-config-editor-5.48.0/WEB-INF/classes/ch/ergon/medusa/configeditor/ui/dialog/aTsesxtAreaDialog$2.class
        at org.apache.naming.resources.DirContextURLConnection.getInputStream(DirContextURLConnection.java:389)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1994)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(ContextConfig.java:1986)
        at org.apache.catalina.startup.ContextConfig.processAnnotationsUrl(ContextConfig.java:1902)
        at org.apache.catalina.startup.ContextConfig.webConfig(ContextConfig.java:1298)
        at org.apache.catalina.startup.ContextConfig.configureStart(ContextConfig.java:876)
        at org.apache.catalina.startup.ContextConfig.lifecycleEvent(ContextConfig.java:374)
        at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
        at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:90)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5378)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:649)
        at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:1083)
        at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1880)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Comment 1 Felix Schumacher 2015-01-17 13:20:49 UTC
Could you tell us, how you are starting the application?

For example: Is it inside eclipse, as a war, ...? Are you using parallel startup with the startStopThreads parameter in Host? Do you use virtualWebappLoader?
Comment 2 matthias.keller 2015-01-18 13:01:55 UTC
We're starting it from command line using "catalina run". The scenario I was able to (sometimes) reproduce the problem hat two WAR files with lots of classes inside the WEB-INF/classes directory. One thing that could also make a difference is, that the classes of the two WAR files are very similar, 95% (or more) are the same or have at least the same class names.

The parameters used are startStopThreads="0" (= parallel), we're not using a virtualWebappLoader.
Comment 3 Violeta Georgieva 2015-01-19 12:22:19 UTC
Created attachment 32378 [details]
Patch

Hi,

What do you think if we switch to UEncoder/UDecoder as regular fields of the class? (see attachment)

Regards,
Violeta
Comment 4 Konstantin Kolinko 2015-01-19 12:56:21 UTC
1) As far as I see, UDecoder does not have any fields and thus is thread-safe.

The thread safety is not documented, though.

As such, I do not mind creating an instance of UDecoder, as that feels safer.

A possible way to solve documentation issue is to add getInstance() factory methods to UEncoder, UDecoder. For decoder the method would return a singleton (thus "documenting" the thread safety), while for encoder it will always return a new instance.

2) Both encoder and decoder are usually used only once per connection.

Thus I think it would be better to use local variables, creating the objects just before use, instead of caching them in class fields.

UDecoder is used in connect(). That call happens once.

UEncoder is used in list(). While it is possible to call list() multiple times, I think that in practice it is called only once.
Comment 5 Felix Schumacher 2015-01-19 18:33:52 UTC
I came up with the same patch as Violeta and it fixes the reported problem (unpackWARs has to be set to false, if you want to reproduce it and of course it helps if some sleep statements are inserted into the encoder methods).

I agree, that it would be nicer to use the UDecoder/UEncoder as local variables. I don't think that we should introduce getInstance methods for tomcat7 only.
Comment 6 Violeta Georgieva 2015-01-20 07:49:37 UTC
(In reply to Konstantin Kolinko from comment #4)
> 1) As far as I see, UDecoder does not have any fields and thus is
> thread-safe.
> 
> The thread safety is not documented, though.
> 
> As such, I do not mind creating an instance of UDecoder, as that feels safer.
> 
> A possible way to solve documentation issue is to add getInstance() factory
> methods to UEncoder, UDecoder. For decoder the method would return a
> singleton (thus "documenting" the thread safety), while for encoder it will
> always return a new instance.
> 
> 2) Both encoder and decoder are usually used only once per connection.
> 
> Thus I think it would be better to use local variables, creating the objects
> just before use, instead of caching them in class fields.
> 
> UDecoder is used in connect(). That call happens once.

+1 for creating it as a local variable

> UEncoder is used in list(). While it is possible to call list() multiple
> times, I think that in practice it is called only once.

list() is called many times by org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(URL, WebXml, boolean)

As we add a save char to UEncoder I do not think that it will be OK to do this every time when list() is called.

Wdyt?
Comment 7 Konstantin Kolinko 2015-01-20 08:19:22 UTC
(In reply to Violeta Georgieva from comment #6)
> (In reply to Konstantin Kolinko from comment #4)
> 
> > UEncoder is used in list(). While it is possible to call list() multiple
> > times, I think that in practice it is called only once.
> 
> list() is called many times by
> org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(URL,
> WebXml, boolean)
> 
> As we add a save char to UEncoder I do not think that it will be OK to do
> this every time when list() is called.
> 
> Wdyt?

As far as I see, it opens a new DirContextURLConnection each time. While the method is called many times, a connection is not reused. Thus a class-level field provides no benefit.

It should be possible to initialize UEncoder.safeChars bitset once and share the same bitset between UEncoders.  It will save us some time and memory.

A single UEncoder.addSafeCharacter() call does not matter much, but UEncoder().initSafeChars() that adds around 60 characters does matter.
Comment 8 Felix Schumacher 2015-01-20 08:36:19 UTC
What if we add another constructor which sets the bitset to use (or add a method to set it)? Or make the initial bitset static and clone that on construction?
Comment 9 Felix Schumacher 2015-01-20 21:00:14 UTC
Created attachment 32382 [details]
Make UEncoder a local variable and introduce a default safe charset enum

This patch uses a local variable for the UEncoder in DirContextURLConnection and enables the safeChars in UEncoder to be immutable.

That way all Response and DirContextURLConnection instances can share one safeChars BitSet while having their own UEncoder instance which remains unsafe for multithreaded usage.
Comment 10 Violeta Georgieva 2015-01-21 13:29:01 UTC
(In reply to Konstantin Kolinko from comment #7)
> (In reply to Violeta Georgieva from comment #6)
> > (In reply to Konstantin Kolinko from comment #4)
> > 
> > > UEncoder is used in list(). While it is possible to call list() multiple
> > > times, I think that in practice it is called only once.
> > 
> > list() is called many times by
> > org.apache.catalina.startup.ContextConfig.processAnnotationsJndi(URL,
> > WebXml, boolean)
> > 
> > As we add a save char to UEncoder I do not think that it will be OK to do
> > this every time when list() is called.
> > 
> > Wdyt?
> 
> As far as I see, it opens a new DirContextURLConnection each time. While the
> method is called many times, a connection is not reused. Thus a class-level
> field provides no benefit.
> 
You are right - +1
Comment 11 Violeta Georgieva 2015-01-21 13:29:47 UTC
(In reply to Felix Schumacher from comment #9)
> Created attachment 32382 [details]
> Make UEncoder a local variable and introduce a default safe charset enum
> 
> This patch uses a local variable for the UEncoder in DirContextURLConnection
> and enables the safeChars in UEncoder to be immutable.
> 
> That way all Response and DirContextURLConnection instances can share one
> safeChars BitSet while having their own UEncoder instance which remains
> unsafe for multithreaded usage.

+1
Comment 12 Felix Schumacher 2015-01-22 20:57:48 UTC
This fix has been applied to tomcat7.0.x (for 7.0.58 and onwards)

Thanks for the nice debugging work.