Bug 33650 - Jasper performance for multiple files processing
Jasper performance for multiple files processing
Status: RESOLVED FIXED
Product: Tomcat 5
Classification: Unclassified
Component: Jasper
5.5.7
PC Windows XP
: P2 enhancement (vote)
: ---
Assigned To: Tomcat Developers Mailing List
:
Depends on:
Blocks:
  Show dependency tree
 
Reported: 2005-02-19 10:36 UTC by Stephane Bailliez
Modified: 2009-01-04 13:16 UTC (History)
1 user (show)



Attachments
modified Options targeted at jasper performance improvement (4.98 KB, patch)
2005-08-19 22:25 UTC, Xingbo Gao
Details | Diff
modified JspC targeted at jasper performance improvement (42.56 KB, patch)
2005-08-19 22:32 UTC, Xingbo Gao
Details | Diff
modified EmbeddedServletOptions targeted at jasper performance improvement (19.68 KB, patch)
2005-08-19 22:37 UTC, Xingbo Gao
Details | Diff
modified TagLibraryInfoImpl targeted at jasper performance improvement (28.56 KB, patch)
2005-08-19 22:38 UTC, Xingbo Gao
Details | Diff
proposed patches for Options.java, JspC.java, EmbeddedServletOptions.java and TagLibraryInfoImpl.java in cvs diff format (4.02 KB, patch)
2005-08-20 04:18 UTC, Xingbo Gao
Details | Diff
cpu profiling snapshot by class (call tree) before the change (70.15 KB, image/png)
2005-08-20 04:40 UTC, Xingbo Gao
Details
cpu profiling snapshot by class (call tree) after the change (70.06 KB, image/png)
2005-08-20 04:46 UTC, Xingbo Gao
Details
Modified patch (6.09 KB, patch)
2005-08-22 10:44 UTC, Remy Maucherat
Details | Diff
Modified patch 2 (9.58 KB, patch)
2005-08-22 11:43 UTC, Remy Maucherat
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Stephane Bailliez 2005-02-19 10:36:42 UTC
Jasper performance is abysmal when using jspc to precompiles jsps, or more 
especially to generate the java files.

I have a webapp with 450 jsps, jasper generates the java files in 12 minutes. 
In comparison, javac compilation for those 450 generated java files is 
performed in 1 minute.

Profiling and debugging the code, it appears clearly that most of the time is 
spent doing the same parsing for static include directives over and over.

Typically a page includes an init page which itself includes 2 init pages.
page.jsp -> init.jsp -> { init-a.jsp, init-b.jsp, ... }

For each and every page, the parser always go down the tree without caching 
includes and parse init.jsp, and so on. So basically the init.jsp is parsed 
450 times.

The Mark object is also pretty inefficient as it generates bazillions of 
objects which are not so lightweight (via Mark.mark()).
Comment 1 Remy Maucherat 2005-02-19 11:53:34 UTC
Ok. Does this mean you have done some optimizations already (patches) ?
Do you have test cases ? (JSPs which are particularly slow to compile)

Note: This will looked at in 5.0.x (better to have a slow source generation than
a *broken* source generation).
Comment 2 Stephane Bailliez 2005-02-19 12:09:25 UTC
I'm currently hacking the jasper code to try to understand it as i'm not very 
familiar with the code and need to understand relations between objects and 
particulary how nodes are attached to each other, how dependant files are 
added and resolved, etc... it is a bit of obscure right now.

I'm not sure yet what would be the best place to cache those include: 
ParserController, Parser, .. and I think I need to make those Node classes 
mutable to get control over the parent and reset/set it easily.

I guess that for anyone familiar with the code, the view is pretty clear, mine 
is not at the moment.

My current testcase is the liferay portal sources (www.liferay.com), it needs 
a bit of doc to setup so it's not convenient for an independant testcase.

(NB: If I reported against 5.5.7 it is a mistake and is indeed 5.0.x)
Comment 3 Remy Maucherat 2005-02-19 13:47:07 UTC
I made the "mistake". As I said, this will not be addressed (as in -1 = veto) in
5.0.x.
Comment 4 Stephane Bailliez 2005-02-19 14:00:06 UTC
Ah ok, did not really understood it that way since I consider it is 'better to 
have a slow source generation than a *broken* source generation' in any 
situation and not just in 5.0.x :). Thus the qualification as 'enhancement'.

Nonetheless, I'm still trying to understand the code. I have made little 
progress as I realized I needed to delta the dependencies that are added 
through the Compiler instance if I want to cache the includeDirective 
fragment. I still need to find how are the taglibs evaluated to generate the 
java code.

At this time it looks like I could dream sort of a 10x improvement if this 
sort of caching can be done.
Comment 5 Remy Maucherat 2005-02-19 14:20:32 UTC
My comment was indeed missing two words.

BTW, regardless of what your profiler says, I don't think you can get 10x.
However, now that the runtime performance is good, I'm interested in optimizing
compile times. One part has been done already by replacing the compiler by one
which decreses I/O (although the main benefit was to get compilation on the JRE
instead of the JDK).
Comment 6 Stephane Bailliez 2005-02-19 14:47:25 UTC
The compilation process is an external compilation process with ant javac 
task. I'm not using the internal one as for compilation within jspc it is 
often inappropriate and it is not flexible enough for file selection, 
classpath, and jvm settings.

I'm using this because many people tend to name jsp include with .jsp 
extensions and it will f* up blind attempt to compile from Jasper as they will 
be generated to java and it is invalid code by essence.

Workaround process is usually.
1. jspc (.jsp -> .java)
2. delete generated static includes based on text content selectors (looking 
for unevaluated taglibs syntax for example)
3. javac everything

I'm myself always recommending people to name their includes as jspf (or .inc) 
extensions as it is:
- visually more convenient
- Jasper does not choke on it
- JSP editors will still understand the syntax (except for .inc)

I would really like to enhance this Ant jspc task but this is in my mile long 
todo list.

As for the speed up, I cannot see any reason why it should be 10x slower than 
java compilation (2-3s for a single jsp->java is huge). but really, all I'm 
missing is the taglibs evaluation and the process apparently should be able to 
fly

NB: For I/O it should not cost much to wrap BufferedOutputStream over the 
FileOutputStream. (I did not check if there was any change on this in cvs as I 
playing with 5.0.30 sources) but really it is peanuts compared to the zillions 
of I/O from constant loading and parsing of includes and tlds.
Comment 7 Stephane Bailliez 2005-02-19 17:01:06 UTC
FYI, putting a simple hashmap to cache the TLDs TreeNode to avoid the DOM 
parsing of TLDs in TagLibraryInfoImpl for each and every JSP brings the timing 
to 8 minutes instead of 12 on my 'testcase'. A 30% improvement.
Comment 8 Remy Maucherat 2005-02-19 17:09:56 UTC
You still need to "play nice" with regular Jasper mode, when compilation occurs
on access. I would complain if there are memory leaks, and/or abusive memory
usage ;)

If code generation times go down significantly, and since JDT is used to compile
(fast and does very little I/O), I would assume it would encourage many people
to not bother with precompilation (hence the need to avoid leaks and huge memory
usage).
Comment 9 Remy Maucherat 2005-02-19 17:11:50 UTC
(In reply to comment #7)
> FYI, putting a simple hashmap to cache the TLDs TreeNode to avoid the DOM 
> parsing of TLDs in TagLibraryInfoImpl for each and every JSP brings the timing 
> to 8 minutes instead of 12 on my 'testcase'. A 30% improvement.

Nice improvement.
Comment 10 Stephane Bailliez 2005-02-19 18:04:39 UTC
> You still need to "play nice" with regular Jasper mode, when compilation
> occurs on access. I would complain if there are memory leaks, and/or abusive
> memory usage ;)

I think the cache needs to be enabled only in command line mode (batch 
compile). Otherwise it does not make much sense and it complicate things when 
you are development mode.

> I would assume it would encourage many people to not bother with
> precompilation 

It is totally different.

Pre-compiling your jsp is mostly a way to check that there is no syntax or 
configuration problems with your taglibs or jsp. You don't care much with a 
petstore webapp with 10 jsps that you can check manually at deployment time, 
but with hundred of thems, it is not something you can afford.

Assuming you have some QA dept or some automated robot to go through all your 
webpages, that's also useless to deliver something that does not even compile. 
So it's mostly a development thing and must be part of your process, and it 
should be part of the daily build as well of course. 

So if we can cut down this check from 12 to 1min, believe me I think some 
people would love it.
Comment 11 Yoav Shapira 2005-06-01 04:26:21 UTC
I've made this item a candidate for the Summer of Code:
http://wiki.apache.org/general/SummerOfCode2005.
Comment 12 Stephane Bailliez 2005-06-01 09:03:35 UTC
I was just looking at it. :) Good thing, Yoav, I unfortunately spend my time 
switching priorities due to the current nature of my work, so I hope someone 
will take over, there sure is a nice opportunity.
Comment 13 Nathan DeStefano 2005-06-02 06:27:38 UTC
(In reply to comment #11)
> I've made this item a candidate for the Summer of Code:
> http://wiki.apache.org/general/SummerOfCode2005.

I have applied to participate in this project during google summer of code.  If
I am accepted or not I am still very interested in this issue.  I am a student
intern at a company and during my time there I have been involved in development
of a very large web application involving several hundred jsps.  I have
encountered this issue during development / deployment and would enjoy learning
more about this issue and finding a clean optimized solution.  Any suggestions
on where to start?  Feel free to email me.
Comment 14 Remy Maucherat 2005-06-02 09:53:02 UTC
I have changed the summer project to a more general "optimizing JSP compilation
performance", as hopefully not all optimizations are going to be about caching
stuff.
Comment 15 Ketan Padegaonkar 2005-06-07 10:44:27 UTC
(In reply to comment #11)
> I've made this item a candidate for the Summer of Code:
> http://wiki.apache.org/general/SummerOfCode2005.


I followed this bugzilla from the SummerOfCode2005 link. This looks like an
interesting optimization. Is there any whitepaper/architecture document that I
can look at? I would surely enjoy taking up this challenge, contest or no contest.
Comment 16 Ralf Hauser 2005-07-30 11:09:39 UTC
another aspect to this: Bug 31645
Comment 17 Xingbo Gao 2005-08-19 22:25:51 UTC
Created attachment 16121 [details]
modified Options targeted at jasper performance improvement
Comment 18 Xingbo Gao 2005-08-19 22:32:29 UTC
Created attachment 16122 [details]
modified JspC targeted at jasper performance improvement
Comment 19 Xingbo Gao 2005-08-19 22:37:54 UTC
Created attachment 16123 [details]
modified EmbeddedServletOptions targeted at jasper performance improvement
Comment 20 Xingbo Gao 2005-08-19 22:38:35 UTC
Created attachment 16124 [details]
modified TagLibraryInfoImpl targeted at jasper performance improvement
Comment 21 william.barker 2005-08-19 22:55:36 UTC
(In reply to comment #17)
> Created an attachment (id=16121) [edit]
> modified Options targeted at jasper performance improvement

You should submit patches in diff format ("cvs diff").  It's too difficult to 
evaluate the patch when you just post the entire modified file.
Comment 22 Xingbo Gao 2005-08-20 04:12:20 UTC
(In reply to comment #21)
> (In reply to comment #17)
> > Created an attachment (id=16121) [edit] [edit]
> > modified Options targeted at jasper performance improvement
> 
> You should submit patches in diff format ("cvs diff").  It's too difficult to 
> evaluate the patch when you just post the entire modified file.

Thank you. I will replace the format.
Comment 23 Xingbo Gao 2005-08-20 04:18:50 UTC
Created attachment 16126 [details]
proposed patches for Options.java, JspC.java, EmbeddedServletOptions.java and TagLibraryInfoImpl.java in cvs diff format

This patch is about caching the TreeNode tld in TagLibraryInfoImpl. Because the
profiling results show the repeating parsing of the same TLD has dominated the
precompilation process, this change would make the jasper works much better on
a large Ant build :-) (I would also submit the different profiling snapshots
before and after the code changes). Here is what I basically did:

1. Options.java: add two public functions in Options interface. One is
isCacheTldXml(), and the other is Map getCachedTldXmlMap(). The parsed TLD
XML data is cached in a Map(String uri, TreeNode tld).

2. JspC.java: add switch "-cacheTldXml" and implement the two functions defined

in Options. The cacheTldXml is defaulted to true but the users can set it.

3. EmbeddedServletOptions.java: add boolean cacheTldXml and implement the two
functions. As EmbeddedServletOptions is called by run-time compilation,
cacheTldXml is set to false by default.

4. TagLibraryInfoImpl.java: add an if-else clause in parseTld to distinguish
command-line build and run-time build. If the CompilerContext is JspC, try to
get the TreeNode from cache first; otherwise call parseXMLDocument to parse the

TLD directly.
Comment 24 Xingbo Gao 2005-08-20 04:40:04 UTC
Created attachment 16127 [details]
cpu profiling snapshot by class (call tree) before the change

The testbed is the Liferay Portal Source (www.liferay.com). It is a quite big
and 464 Java source files would be created. The build machine is a typical Dell
4600 desktop with Fedora Core 4, 2.8GHz, 768M.

Before the patch, the TagLibraryInfoImpl class takes more than 50% CPU time.
Comment 25 Xingbo Gao 2005-08-20 04:46:54 UTC
Created attachment 16128 [details]
cpu profiling snapshot by class (call tree) after the change

After the patch, the TagLibraryInfoImpl would take much less CPU time, less
than 5%.

As it's inconvenient to post the profiling snapshots here (one at a time), I
would post the rest of them on my personal website to show the differences.
I'll post the URL when it's done.
Comment 26 Xingbo Gao 2005-08-21 17:27:15 UTC
(In reply to comment #25)
> Created an attachment (id=16128) [edit]
> cpu profiling snapshot by class (call tree) after the change
> After the patch, the TagLibraryInfoImpl would take much less CPU time, less
> than 5%.
> As it's inconvenient to post the profiling snapshots here (one at a time), I
> would post the rest of them on my personal website to show the differences.
> I'll post the URL when it's done.

The URL is http://www.cs.ucf.edu/~xbgao/jasper/jasper-33650.htm. It contains 
12 screenshots in total.
Comment 27 Remy Maucherat 2005-08-22 10:20:40 UTC
Ok, I've looked at the patch, and I think it looks good. Some of it could maybe
be a bit more elegant (in particular, I think the instanceof should be replaced
by the use the CacheTldXml flag).

For additional optimizations, I think there's room seeing if includes could be
cached as well (I'm fairly certain a static include happy webapp behaves as
described by Stephane Bailliez in the original report), and of course optimizing
hotspots (JspReader/Mark seems to be the biggest offender).
Comment 28 Remy Maucherat 2005-08-22 10:44:41 UTC
Created attachment 16139 [details]
Modified patch

Small tweaks to make names more generic
Comment 29 Remy Maucherat 2005-08-22 10:45:41 UTC
I do have a problem: with the patch, precompilation of the admin webapp fails,
so I cannot apply it for the time being. There must be something wrong somewhere.
Comment 30 Stephane Bailliez 2005-08-22 11:01:42 UTC
As the current code only caches the result of the xml parsing it should 
translates in roughly a 30% speedup for Liferay compilation.

Liferay being static include happy, it is a good testcase as well. The xml 
parsing was the easy part :) As Remy mentions there is load of improvement in 
there as the JSP parsing is obviously not very efficient.

As a side note, I'm nit-picking but I think it would be better to have a 
single method that returns the TreeNode. here I feel like they are 2 pieces of 
code related to 'is cache in use'.

1) getOptions().getCachedTldXmlMap() return the map or null
2) if (ctxt.getServletContext() instanceof 
org.apache.jasper.servlet.JspCServletContext) is used to figure out if it 
should look into the cache (why not checking if the map is null or not ?)

I might be missing something, but this if is confusing me.
Comment 31 Remy Maucherat 2005-08-22 11:20:23 UTC
(In reply to comment #30)
> As the current code only caches the result of the xml parsing it should 
> translates in roughly a 30% speedup for Liferay compilation.

There's weird stuff going on, though. Maybe I did something wrong when applying
the patch. Before trying to debug, I'll try to see if caching the TaglibInfoImpl
works better.

> Liferay being static include happy, it is a good testcase as well. The xml 
> parsing was the easy part :) As Remy mentions there is load of improvement in 
> there as the JSP parsing is obviously not very efficient.
> 
> As a side note, I'm nit-picking but I think it would be better to have a 
> single method that returns the TreeNode. here I feel like they are 2 pieces of 
> code related to 'is cache in use'.
> 
> 1) getOptions().getCachedTldXmlMap() return the map or null
> 2) if (ctxt.getServletContext() instanceof 
> org.apache.jasper.servlet.JspCServletContext) is used to figure out if it 
> should look into the cache (why not checking if the map is null or not ?)
> 
> I might be missing something, but this if is confusing me.

Yes, it could be better. I assume the cache map will eventually be used for more
than TaglibInfoImpl.
Comment 32 Remy Maucherat 2005-08-22 11:43:46 UTC
Created attachment 16140 [details]
Modified patch 2

This patch will cache TagLibraryInfoImpl objects rather than the XML objects.
Comment 33 Xingbo Gao 2005-08-22 16:16:39 UTC
(In reply to comment #32)
> Created an attachment (id=16140) [edit]
> Modified patch 2
> 
> This patch will cache TagLibraryInfoImpl objects rather than the XML objects.

I looked at this patch, but I am afraid that the TagLibraryInfoImpl cannot be
cached in Parser.java and JspDocumentParser.java in this simple way. In
TagLibraryInfoImpl the individual CompilerContext is used for each jsp
precompilation specifically. I guess that's why it didn't work on the admin app
precompilation.
Comment 34 Xingbo Gao 2005-08-22 16:33:42 UTC
(In reply to comment #30)
> As the current code only caches the result of the xml parsing it should 
> translates in roughly a 30% speedup for Liferay compilation.
> 

In my testcase with liferay source package (more specifically, all jsp files
under portal-web/docroot), I could get more than 30% speedup. I did a series of
experiments and the results were:

1. Dell Dimension 4600, 2.8GHz, 768M: time reduced from ~130s to ~65s.
2. one cluster node, AMD Athlon(tm) Processor, 908MHz, 768M: time reduced from
(12m + 50s) to (2m + 55s).

It depends on the machines but my cases show significant improvement.

> Liferay being static include happy, it is a good testcase as well. The xml 
> parsing was the easy part :) As Remy mentions there is load of improvement in 
> there as the JSP parsing is obviously not very efficient.
> 
> As a side note, I'm nit-picking but I think it would be better to have a 
> single method that returns the TreeNode. here I feel like they are 2 pieces of 
> code related to 'is cache in use'.
> 
> 1) getOptions().getCachedTldXmlMap() return the map or null
> 2) if (ctxt.getServletContext() instanceof 
> org.apache.jasper.servlet.JspCServletContext) is used to figure out if it 
> should look into the cache (why not checking if the map is null or not ?)
> 
> I might be missing something, but this if is confusing me.

Comment 35 Remy Maucherat 2005-08-22 16:34:23 UTC
(In reply to comment #33)
> I looked at this patch, but I am afraid that the TagLibraryInfoImpl cannot be
> cached in Parser.java and JspDocumentParser.java in this simple way. In
> TagLibraryInfoImpl the individual CompilerContext is used for each jsp
> precompilation specifically. I guess that's why it didn't work on the admin app
> precompilation.

No, the latest patch I posted doesn't fail, for some reason (the first patch I
attached before did, but it might have been a build clean issue or something, as
the problem disappeared magically).

I know there's a compiler context field which might possibly cause issues. It
could be investigated if it really does.
Comment 36 Remy Maucherat 2005-08-31 00:36:34 UTC
The TL info caching seems to work reasonably well, after review, so I'll commit
it. The Options interface changes are not final.
Comment 37 Konstantin Kolinko 2009-01-04 13:16:59 UTC
I do not see a reason to still keep this open.

All the discussed patches are already applied to the sources 3+ years ago and are in TC 5.5.12 and later, and that includes all versions of TC 6.0.

Thus I am resolving this as FIXED.

If anyone has examples that are still unacceptable slow to compile, and likes to profile the Jasper/JspC code, I think everyone will be interested.