Bug 64089 - Resource paths resolve symlinks
Summary: Resource paths resolve symlinks
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 8
Classification: Unclassified
Component: Catalina
Version: 8.5.50
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ----
Assignee: Tomcat Developers Mailing List
Reported: 2020-01-21 12:16 UTC by Marvin Fröhlich
Modified: 2020-03-03 09:17 UTC
Description Marvin Fröhlich 2020-01-21 12:16:21 UTC
Tomcat 8.5.35 introduced a behavior change that is a bug for us. It still exists in 8.5.50.

In our development environments we use symlinks for all of our webapp folders. Under the Tomcat directory (or rather catalina base), the webapps folder contains only symlinks, which point to the actual webapp directories (not WARs).

The applications' web.xml files use XML imports (external entities) like this:

######################
<!DOCTYPE web-xml [
    <!ENTITY myentity SYSTEM "../../../foo/bar/myentity.xml">
]>
######################

This relative import worked fine in 8.5.34 and earlier, but fails in 8.5.35+. Unfortunately, the error message in the log says no more than "file not found" and does not say where Tomcat was looking for the file.

In 8.5.35, the method fixDocBase() of the class org.apache.catalina.startup.ContextConfig was changed (line 655 in the 8.5.50 source) to use getCanonicalPath() to resolve the absolute path of a resource (in this case web.xml). This path is used as the base (systemId) for the WebXmlParser. Since the symlinks in that path have been resolved, while the relative import assumes it originates from a standard catalina_base structure, the imported file is not found.
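
To illustrate why the base matters, here is a minimal sketch of how a relative SYSTEM identifier resolves against two different systemIds. It uses plain java.net.URI rather than any Tomcat code, and the paths /opt/tomcat and /home/dev/projects are hypothetical stand-ins for a catalina base and a symlink target.

```java
import java.net.URI;

class SystemIdResolution {

    /** Resolves a relative SYSTEM identifier against a base systemId using standard URI rules. */
    static String resolve(String baseSystemId, String relative) {
        return URI.create(baseSystemId).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String relative = "../../../foo/bar/myentity.xml";

        // Hypothetical layout: web.xml as seen through the symlink under catalina base.
        String viaSymlink = "file:/opt/tomcat/webapps/myapp/WEB-INF/web.xml";
        // Hypothetical symlink target: the same web.xml after symlinks are resolved.
        String canonical = "file:/home/dev/projects/myapp/WEB-INF/web.xml";

        // file:/opt/tomcat/foo/bar/myentity.xml
        System.out.println(resolve(viaSymlink, relative));
        // file:/home/dev/foo/bar/myentity.xml
        System.out.println(resolve(canonical, relative));
    }
}
```

With the symlinked base the entity is looked up under catalina base; with the canonical base the same relative path points somewhere else entirely.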

Instead of getCanonicalPath() you could use something like toPath().toAbsolutePath().normalize(), which does NOT follow symlinks.
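
A minimal sketch of the difference between the two calls, assuming a file system that supports symbolic links (for example Linux). The directory names and the demoLayout() helper are made up for this illustration and are not part of Tomcat.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class CanonicalVsNormalized {

    /** Path with symlinks resolved, as returned by File.getCanonicalPath(). */
    static String canonical(File f) {
        try {
            return f.getCanonicalPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Absolute path with "." and ".." removed lexically; symlinks are left in place. */
    static String normalized(File f) {
        return f.toPath().toAbsolutePath().normalize().toString();
    }

    /** Hypothetical helper: builds base/webapps/myapp as a symlink to projects/myapp in a temp dir. */
    static File demoLayout() {
        try {
            Path tmp = Files.createTempDirectory("bug64089");
            Path webInf = Files.createDirectories(tmp.resolve("projects/myapp/WEB-INF"));
            Files.createFile(webInf.resolve("web.xml"));
            Path webapps = Files.createDirectories(tmp.resolve("base/webapps"));
            Path link = Files.createSymbolicLink(webapps.resolve("myapp"), tmp.resolve("projects/myapp"));
            return link.resolve("WEB-INF/web.xml").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File webXml = demoLayout();
        // Ends in projects/myapp/WEB-INF/web.xml: the symlink has been followed.
        System.out.println(canonical(webXml));
        // Ends in base/webapps/myapp/WEB-INF/web.xml: the symlink is kept.
        System.out.println(normalized(webXml));
    }
}
```

Note that normalize() removes ".." segments purely lexically, so it can yield a different location than the file system would if a ".." follows a symlink.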

The behavior differs on Windows, where symlinks (junctions) are not followed.

This bug is critical for us, and there is no way to work around it.
Comment 1 Mark Thomas 2020-01-22 18:32:10 UTC
I'm afraid I am going to take a different view.

The behaviour you were relying on was a bug and, now that bug has been fixed, the behaviour has changed.

For reference the change was as a result of this thread: https://tomcat.markmail.org/thread/gonkmfw5acognpy3

Further, the entity reference in the web.xml shown in the example accesses a location outside the root of the web application. That goes against the general principle that web applications are meant to be self-contained (and access additional resources via JNDI).

I'm wondering if it is possible to map these external xml files into a web application via http://tomcat.apache.org/tomcat-9.0-doc/config/resources.html

There have been some changes recently that might impact on this so I'll do some testing locally and report back.
Comment 2 Marvin Fröhlich 2020-01-23 06:59:38 UTC
Hi Mark,

thanks a lot for taking a look at this.

What a pity that this is considered correct behavior. Couldn't this be made configurable, like allowLinking? I mean, it may be correct for many, but not for all.
Comment 3 Marvin Fröhlich 2020-01-23 07:18:01 UTC
How about the difference between Linux and Windows? Shouldn't this fail on Windows, too, if it does on Linux? Or vice versa, work on both?
Comment 4 Mark Thomas 2020-01-27 19:14:04 UTC
Dealing with the platform question first: the behaviour on Linux and Windows is the same. There is no platform-specific code in Tomcat's resource handling. Junctions != symbolic links, and the JRE treats them differently.

Unfortunately the relevant XML API - EntityResolver2 - passes URLs around as Strings.

Tomcat has some special handling in the URL instances returned for resources. This is primarily so resource access via URLs is cache aware but the same behaviour would have helped here. As soon as the URLs are converted to Strings that special handling is lost.

The Tomcat 7 approach of implementing a Tomcat specific URL scheme would work but the resources refactoring in 8.5.x onwards took a deliberate decision not to do that - for a combination of complexity, performance, maintenance and ease of embedding reasons.

I have a few more ideas to try to see if I can find a work-around. Hopefully one that won't require a new configuration option but I'm not against that if that is what is required.
Comment 5 Mark Thomas 2020-01-28 17:57:02 UTC
Would it help if Tomcat added ${...} property replacement support so you could do something like:

######################
<!DOCTYPE web-xml [
    <!ENTITY myentity SYSTEM "${property.containing.correct.path}">
]>
######################

And then you could add

property.containing.correct.path=/path/relative/to/canonical/location

to catalina.properties (or one of the other ways of setting properties).

This isn't currently supported but I think I can see a way to add it in a backwards compatible way.
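
As a rough illustration of the idea, the following sketch replaces ${...} placeholders in a system identifier with values from a Properties object. It is not Tomcat's implementation; the class name, the regular expression and the choice to leave unknown placeholders untouched are assumptions made for this example.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PropertyReplacementSketch {

    // Matches ${name} and captures the property name.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} with props.getProperty(name); unknown names are left unchanged. */
    static String replace(String input, Properties props) {
        Matcher m = PLACEHOLDER.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.getProperty(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("property.containing.correct.path", "/path/relative/to/canonical/location");

        // /path/relative/to/canonical/location
        System.out.println(replace("${property.containing.correct.path}", props));
        // ${unknown.property}/file.xml
        System.out.println(replace("${unknown.property}/file.xml", props));
    }
}
```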
Comment 6 Marvin Fröhlich 2020-01-29 08:12:45 UTC
Yes, I was about to suggest that. I was thinking about using ${CATALINA_BASE} as part of the import path.

So, yes, that would help a lot.

By the way, is it possible to specify a custom entity resolver? That would enable us to solve the problem ourselves.
Comment 7 Mark Thomas 2020-01-30 17:45:21 UTC
Glad we are thinking in the same direction.

The EntityResolver isn't currently configurable. I'll consider that option as well but adding support for ${...} looks like the simplest solution at the moment.
Comment 8 Mark Thomas 2020-01-30 21:23:35 UTC
${catalina.base}/../../../myentity.txt works now.

Fixed in:
- master for 10.0.0.0-M1 onwards
- 9.0.x for 9.0.31 onwards
- 8.5.x for 8.5.51 onwards
- 7.0.x for 7.0.100 onwards
Comment 9 Marvin Fröhlich 2020-03-03 07:46:28 UTC
Wow, that was quick. Unfortunately, I missed the notification. Sorry about that.

This solution is perfect. Thanks a lot.