Bug 60963 - Optimize class loading for unpackWARs=false case
Summary: Optimize class loading for unpackWARs=false case
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 8
Classification: Unclassified
Component: Catalina (show other bugs)
Version: 8.0.43
Hardware: Other Linux
: P2 enhancement with 5 votes (vote)
Target Milestone: ----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
: 61212 (view as bug list)
Depends on:
Blocks:
 
Reported: 2017-04-07 15:58 UTC by Thomas Meyer
Modified: 2017-10-18 15:55 UTC (History)
2 users (show)



Attachments
Proposed patch (11.34 KB, patch)
2017-04-07 15:58 UTC, Thomas Meyer
Details | Diff
updated patch (4.28 KB, patch)
2017-05-05 07:57 UTC, Thomas Meyer
Details | Diff
updated patch (15.54 KB, patch)
2017-05-05 07:58 UTC, Thomas Meyer
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Meyer 2017-04-07 15:58:50 UTC
Created attachment 34902 [details]
Proposed patch

Attached patch improves the class loading time for unpackWARs=false especially when the war is build with only uncompressed entries.
Comment 1 Thomas Meyer 2017-04-08 12:43:45 UTC
Depending on the deployed WAR files loading is up to 7 times faster.
Would be nice if somebody from the tomcat developers could have a look at this.
Comment 2 Thomas Meyer 2017-04-08 19:59:41 UTC
I did als upload the patch here: http://static.217.14.99.88.clients.your-server.de/501
Comment 3 Mark Thomas 2017-04-12 14:30:58 UTC
Looking at this is on my TODO list. The main question I have is does it still avoid file locking issues? These will be more obvious Windows (you won't be able to delete the WAR) but they affect all platforms.
Comment 4 Christopher Schultz 2017-04-12 21:22:37 UTC
It looks like isInWAR in getJarInputStreamWrapper can be leaked.

Will calling getWebResourceSet().closeJarFile undo this performance optimization?

For ZipInputStreamWithPosition constructor, should you do this instead:

        super(new PushbackCountingInputStream(in, 512));

That seems a little cleaner.
Comment 5 Thomas Meyer 2017-04-14 21:49:09 UTC
Regarding file locking issue: I didn't check this. I'll try to test this.
Comment 6 Thomas Meyer 2017-04-14 21:57:05 UTC
Regarding ZipInputStreamWithPosition:
I think you found the ugly part of this patch. I guess your suggestion wouldn't work, because ZipInputStream will construct itself with a new PushBackInputstream and the provided input stream as decorated object. Passing the input stream as you suggest would result in a PushBackInputStream with an PushBackInputStreamWithPosition as decorated object, which is wrong for our intention here.
By setting the decorated object after the super constructor has finished we replace the PushbsckInputStream with our implementation.
Comment 7 Christopher Schultz 2017-04-16 21:00:35 UTC
(In reply to Thomas Meyer from comment #6)
> Regarding ZipInputStreamWithPosition:
> I think you found the ugly part of this patch. I guess your suggestion
> wouldn't work, because ZipInputStream will construct itself with a new
> PushBackInputstream and the provided input stream as decorated object.
> Passing the input stream as you suggest would result in a
> PushBackInputStream with an PushBackInputStreamWithPosition as decorated
> object, which is wrong for our intention here.
> By setting the decorated object after the super constructor has finished we
> replace the PushbsckInputStream with our implementation.

I'm sorry, I don't follow. What's the problem with the superclass's constructor using the PushbackCountingInputStream instead of handing it an undecorated InputStream, and then replacing it after the constructor has completed?

Does the superclass constructor do something with the InputStream that we don't want to happen to the PushbackCountingInputStream?

I would think that "bytesread" would be incorrect (initialized to zero) if the superclass constructor performs a read. Don't you want those reads to be counted?
Comment 8 Thomas Meyer 2017-04-18 17:31:43 UTC
(In reply to Christopher Schultz from comment #7)
> (In reply to Thomas Meyer from comment #6)
> > Regarding ZipInputStreamWithPosition:
> > I think you found the ugly part of this patch. I guess your suggestion
> > wouldn't work, because ZipInputStream will construct itself with a new
> > PushBackInputstream and the provided input stream as decorated object.
> > Passing the input stream as you suggest would result in a
> > PushBackInputStream with an PushBackInputStreamWithPosition as decorated
> > object, which is wrong for our intention here.
> > By setting the decorated object after the super constructor has finished we
> > replace the PushbsckInputStream with our implementation.
> 
> I'm sorry, I don't follow. What's the problem with the superclass's
> constructor using the PushbackCountingInputStream instead of handing it an
> undecorated InputStream, and then replacing it after the constructor has
> completed?
Yes, exactly this is what is happening:
"super(new PushbackInputStream(in, 512), new Inflater(true), 512);"
In the super constructor.

> Does the superclass constructor do something with the InputStream that we
> don't want to happen to the PushbackCountingInputStream?

Yes, it decorates the passed InputStream with a PushBackInputStream.

> 
> I would think that "bytesread" would be incorrect (initialized to zero) if
> the superclass constructor performs a read. Don't you want those reads to be
> counted?

Theoretically we want those bytes to be counted, but the ZipInputStream doesn't read anything in it's construction phase, that's why the whole "fast forward skipping works at all".
And yes this is a somewhat hackish solution but it works.

I did also look at what commons-compress provides but their implementation also doesn't have a callback for the event "new PK entry in the InputStream reached"
Comment 9 Mark Thomas 2017-04-19 08:46:10 UTC
I've spent a little time looking at this.

I can confirm that I see similar performance improvements with local testing.

I can also confirm I have not observed any file locking. I haven't looked for other leaks but if there are any, I expect them to be simple to fix.

The one thing that worries me about this patch is the degree to which it depends on the JVM internals. While this works with the current Oracle JVM, my concern is for JVMs from other vendors and future Oracle versions.

I did take a quick look at the ZIP specification [1] and it appears it should be fairly simple to read the file names and data offsets from the local file headers. I wonder if writing a parser that extracts just the info we need and skips the rest might be a better option. 

Finally, there are a few changes in the patch that aren't strictly related to fixing the problem at hand. It is generally better to put that sort of clean-up in a separate patch (no need to re-submit this patch - the comment is more for future reference).

[1] https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
Comment 10 Thomas Meyer 2017-04-19 13:06:10 UTC
(In reply to Mark Thomas from comment #9)
 
> The one thing that worries me about this patch is the degree to which it
> depends on the JVM internals. While this works with the current Oracle JVM,
> my concern is for JVMs from other vendors and future Oracle versions.

I ha exactly the same concerns! But I was sure about how much change the upstream development would accept.

> 
> I did take a quick look at the ZIP specification [1] and it appears it
> should be fairly simple to read the file names and data offsets from the
> local file headers. I wonder if writing a parser that extracts just the info
> we need and skips the rest might be a better option.

Yes, I had the same idea while figuring how to abuse the ZipInputStream for above solution. How hard could it be to parse an header entry...

> 
> Finally, there are a few changes in the patch that aren't strictly related
> to fixing the problem at hand. It is generally better to put that sort of
> clean-up in a separate patch (no need to re-submit this patch - the comment
> is more for future reference).

Okay, I always try to separate the relevant changes from unrelated stuff, but sometimes I miss something while preparing a patch.

> 
> [1] https://pkware.cachefly.net/webdocs/casestudies/

Another remark from my side:
The JarInputStream is used to parse each i.e. Jar file once per Webapp class loader. I tried to understand the verifying of the manifest and possible signed entries, but I failed. Do you have a better understanding of this topic and can you say if something did break in this area? How to check?
Comment 11 Konstantin Kolinko 2017-04-19 18:01:51 UTC
(In reply to Thomas Meyer from comment #8)
> 
> Theoretically we want those bytes to be counted, but the ZipInputStream
> doesn't read anything in it's construction phase, that's why the whole "fast
> forward skipping works at all".
> And yes this is a somewhat hackish solution but it works.
> 

1. I do not know whether this is a concern (I have not looked at the patch), but note the behaviour that was recently reported as bug 60940:

JarInputStream constructor does read MANIFEST.MF and META-INF/ directory entry that may precede it.

2. Zip* and Jar* APIs differ in their handling of i18n filenames. JAR specification uses UTF-8 for file names, Zip uses platform encoding by default.
Comment 12 Thomas Meyer 2017-04-19 20:01:28 UTC
(In reply to Konstantin Kolinko from comment #11)
> (In reply to Thomas Meyer from comment #8)
> > 
> > Theoretically we want those bytes to be counted, but the ZipInputStream
> > doesn't read anything in it's construction phase, that's why the whole "fast
> > forward skipping works at all".
> > And yes this is a somewhat hackish solution but it works.
> > 
> 
> 1. I do not know whether this is a concern (I have not looked at the patch),
> but note the behaviour that was recently reported as bug 60940:
> 
> JarInputStream constructor does read MANIFEST.MF and META-INF/ directory
> entry that may precede it.

Yes, I saw this and it's still done after the patch.

What I didn't finish to understand is the validation of the signed MANIFEST.MF entries. Is it really enough to just read through the whole JAR file with a JarInputStream with getNextJarEntry() and everything is automagically checked for correctness?

Do we care that a WAR file contains only correct signed JAR files? I never really was involved in the topic signed war files and/or containing jar files inside a war file.

> 
> 2. Zip* and Jar* APIs differ in their handling of i18n filenames. JAR
> specification uses UTF-8 for file names, Zip uses platform encoding by
> default.

That is an important hint, I think we should really go for an minimal functional ZIP "PK entry" parser to extract the offset of each "PK entry start", to fast forward in the jar file inside the war file.
Comment 13 Thomas Meyer 2017-04-22 14:28:20 UTC
Hi,

I implemented a basic ZIP file parser. An updated patch is here:
http://static.217.14.99.88.clients.your-server.de/501
Comment 14 Thomas Meyer 2017-04-28 06:34:59 UTC
Hi,

any news on this? Do you want me to attach the patch here? Anything else I can do?
Comment 15 Thomas Meyer 2017-05-05 07:57:01 UTC
Created attachment 34980 [details]
updated patch
Comment 16 Thomas Meyer 2017-05-05 07:58:23 UTC
Created attachment 34981 [details]
updated patch
Comment 17 Thomas Meyer 2017-05-13 05:57:24 UTC
The upcoming commons-compress 1.14 library will have support for fetching the offsets of the zip local headers - https://github.com/apache/commons-compress/pull/22

should we use that library in favour of my custom written zip parser?

opinions?
Comment 18 Mark Thomas 2017-05-13 22:54:33 UTC
That is probably overkill in terms of size of the library compared to the current patch.
Comment 19 Mark Thomas 2017-06-20 21:13:59 UTC
Just a quick note to say I haven't forgotten about this. It is still on my TODO list. I hope to get to it in the next week or so.
Comment 20 Thomas Maslen 2017-06-22 21:47:28 UTC
Just kibitzing, and likely way off base, but:  I've been fairly impressed by the Spring Boot Loader code for handling JARs in (uncompressed) WARs, not sure whether it could make itself useful here.
Comment 21 Mark Thomas 2017-06-23 18:59:16 UTC
*** Bug 61212 has been marked as a duplicate of this bug. ***
Comment 22 Mark Thomas 2017-07-05 21:21:37 UTC
Unfortunately, Boot depends on the outer JAR/WAR being uncompressed. Tomcat does not have the option to require that.

My current test (a WAR that just contains all the JAR files from Jira 7.3.4) takes ~3s to start unpacked, ~110s to start packed and ~95s to start packed with the latest patch. There is clear improvement but it is still a long way off the unpacked start time.

Nearly all the additional time is spent in decompression.

The fundamental problem is that we have to decompress the input stream associated with the inner JAR file before we can read any resources from it. And we have to do this for every resource we read.

The obvious answer is to unpack the JAR files to the work directory. This is what Tomcat 7 does. Start-up time there is ~9s (including the time to unpack the JARs to work). However, one of the aims of the resources re-write in 8.0.x was to avoid the complexity of file locking protection that that entailed.

Which brings me back to an old question on this topic. What is it that prevents you from running with unpackWARs="true"? It might turn out to be simpler to address whatever is preventing you from using that default config.

Still mulling over how best to handle this...
Comment 23 Thomas Meyer 2017-07-05 21:50:34 UTC
I was told that all tomcats are run with unpackWARs=false for security reasons. I really tried to convince the ops from changing the parameter to true, but no chance. I think you can run Tomcat with ro filesystem with unpackWARs=false?!

So this is why I tried to optimise this case a bit.
Comment 24 Mark Thomas 2017-07-18 20:21:47 UTC
Any chance you could elaborate on "security reasons"?

For a read-only file system, you'd be much better off deploying am unpacked WAR. And the tricks earlier Tomcat versions used (like unpacking the JARs to the work directory) wouldn't work in that case anyway.

The proposed patch essentially trades memory for performance. I'm currently thinking about extending that idea and experimenting with caching different things with a view to providing a small number of options that gives enough flexibility to meet the needs of most use cases.
Comment 25 Mark Thomas 2017-07-20 14:20:31 UTC
Having looked into this further, the only way I could see to improve things significantly was to pre-process the JAR files from a packed WAR and populate the cache with the contents of the JARs. This wasn't a great solution for several reasons:
- It would have required a fair amount of new code and/or a lot of refactoring
- The memory requirements for the cache would increase. For the solution to be useful, all the contents of all the JARs need to reside in the cache.

In the end, I opted to implement a new WebResourceRoot implementation, ExtractingRoot that returns to the 7.0.x idea of extracting the JAR files into the work directory. This will be included in 9.0.0.M25 onwards. Assuming no issues are identified, I'll back-port it to 8.5.x and 8.0.x. An open question is whether the overall refactoring of resource handling is sufficient to protect against the file locking issues.
Comment 26 Mark Thomas 2017-09-14 09:40:05 UTC
ExtractingRoot has been back-ported to 8.5.x for 8.5.22 onwards and to 8.0.x for 8.0.47 onwards.