Bug 53986 - Comment end tag preceded directly by a dash causes JSP fail compilation
Comment end tag preceded directly by a dash causes JSP fail compilation
Status: RESOLVED FIXED
Product: Tomcat 7
Classification: Unclassified
Component: Jasper
7.0.32
PC All
: P2 normal (vote)
: ---
Assigned To: Tomcat Developers Mailing List
:
: 54069 54184 (view as bug list)
Depends on:
Blocks:
  Show dependency tree
 
Reported: 2012-10-09 18:40 UTC by Steve Ratay
Modified: 2012-11-21 19:13 UTC (History)
2 users (show)



Attachments
JSP file to reproduce compilation error (293 bytes, text/plain)
2012-10-09 18:40 UTC, Steve Ratay
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Steve Ratay 2012-10-09 18:40:08 UTC
Created attachment 29466 [details]
JSP file to reproduce compilation error

I have a JSP file that contained a comment line as follows:
<%--- comment ---%>

Such a JSP compiles in Tomcat 7.0.30, but it does not compile in 7.0.32.  I’ve attached a copy of a test file, which compiles in 7.0.30, but not in 7.0.32.  The only related change I see in the release notes is https://issues.apache.org/bugzilla/show_bug.cgi?id=53713.  

From reading the JSP 2.2 spec, it appears that absent a specific comment about whitespace, the rules of XML shall apply, which effectively means whitespace is ignored.  This would seem to indicate this could be a regression introduced in 7.0.32, but since the spec is somewhat vague in the section about comments, 

ERROR 09 Oct 2012 11:15:25,190 [http-bio-8080-exec-8] [com.liferay.portal.log.Co
mmonsLogImpl.error(52)] org.apache.jasper.JasperException: /html/portal/layout/v
iew/portlet.jsp (line: 48, column: 6) Unterminated &lt;%-- tag
        at org.apache.jasper.compiler.DefaultErrorHandler.jspError(DefaultErrorH
andler.java:42)
        at org.apache.jasper.compiler.ErrorDispatcher.dispatch(ErrorDispatcher.j
ava:408)
        at org.apache.jasper.compiler.ErrorDispatcher.jspError(ErrorDispatcher.j
ava:133)
        at org.apache.jasper.compiler.Parser.parseComment(Parser.java:615)
        at org.apache.jasper.compiler.Parser.parseElements(Parser.java:1425)
        at org.apache.jasper.compiler.Parser.parse(Parser.java:138)
        at org.apache.jasper.compiler.ParserController.doParse(ParserController.
java:242)
        at org.apache.jasper.compiler.ParserController.parse(ParserController.ja
va:102)
        at org.apache.jasper.compiler.Compiler.generateJava(Compiler.java:198)
        at org.apache.jasper.compiler.Compiler.compile(Compiler.java:373)
        at org.apache.jasper.compiler.Compiler.compile(Compiler.java:353)
        at org.apache.jasper.compiler.Compiler.compile(Compiler.java:340)
        at org.apache.jasper.JspCompilationContext.compile(JspCompilationContext
.java:646)
        at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper
.java:357)
        at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:3
90)
        at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:334)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Appl
icationFilterChain.java:305)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationF
ilterChain.java:210)
        at com.liferay.filters.strip.StripFilter.doFilter(StripFilter.java:260)
Comment 1 Christopher Schultz 2012-10-09 21:19:08 UTC
Confirmed broken in 7.0.32, working in 7.0.30.

I played-around with it a bit and it appears that any odd number of - symbols before %> will cause the page to fail to compile (the original example has 7 - symbols before the final %> in the comment).
Comment 2 Christopher Schultz 2012-10-09 21:25:57 UTC
I thought there might be language in the spec to say that (like XML) the string "--" cannot appear in the comment itself but that's not the case.

From the 2.2 spec:
"
JSP.1.5.1.2 JSP Comments

A JSP comment is of the form
<%-- anything but a closing --%> ... --%>

The body of the content is ignored completely.
"

So this is both a regression and a spec violation.
Comment 3 Mark Thomas 2012-10-09 21:30:38 UTC
Possible cause: r1381417

I haven't looked into this yet but that is where I'd start.
Comment 4 Christopher Schultz 2012-10-09 22:07:13 UTC
(In reply to comment #3)
> Possible cause: r1381417

Reverting that patch resolves the issue in 7.0.x/trunk, so the regression appears to be in there. I'm not sure where, though.

Steve, a workaround might be replacing "([^-])-------%>" with "\1------%>" in your JSPs (or just stick with 7.0.30 for the time being).
Comment 5 Steve Ratay 2012-10-10 01:40:41 UTC
(In reply to comment #4)

Thanks Chris.  I actually only had a single JSP with this problem, so I already fixed the JSP and am up and running on 7.0.32.  However, since it appeared to be a regression from the previous version I thought it best to report the issue.
Comment 6 Mark Thomas 2012-10-10 14:05:47 UTC
Found the problem. It is line 429 of JspReader:

setCurrent(restart);

This makes the current Mark and the reset Mark the same object and they need to be kept separate. Looking at a fix now. It is possible that this issue will have have wider impacts that just comment tag parsing. We'll need to keeo an eye on it in case an early 7.0.33 is required.
Comment 7 Mark Thomas 2012-10-10 14:28:49 UTC
Fixed in trunk and 7.0.x and will be included in 7.0.33 onwards.
Comment 8 Konstantin Kolinko 2012-10-11 13:19:31 UTC
(In reply to comment #6)
> Found the problem. It is line 429 of JspReader:
> 
> setCurrent(restart);
> 
> This makes the current Mark and the reset Mark the same object and they need
> to be kept separate. Looking at a fix now. It is possible that this issue
> will have have wider impacts that just comment tag parsing.

I agree with the fix.

For reference: r1396615 r1396617

For reference: The JspReader.skipUntil(String limit) method performs search for the first occurrence of string "limit" and returns its position or null if none found.

As far as I see, the only place in that method where "current" and "restart" marks being the same object matters is this line:
  (#433 in 7.0.x, 427 in trunk)

   nextChar();

It is supposed to advance "current", but if the objects are the same, it advances "restart" mark as well. Thus the next search re-try will start from a wrong place.


To trigger this issue the following two steps should occur:

1. The first character of "limit" must occur somewhere in the preceding text. That is to trigger incorrect setCurrent(restart) call to start the issue.

2. It should be incorrect to skip more than 1 character when re-trying the search from a new place. This is what produces the incorrect result. This can happen only if the first character of "limit" is repeated somewhere on non-first position in the "limit" string.

The JspReader.skipUntil(..) method is invoked with the following arguments:
skipUntil("</" + tag);
skipUntil("--%>");
skipUntil("%>");
skipUntil("<");
skipUntil(">");
skipUntil("]]>");
skipUntil(":root");

An '<' cannot occur in "tag". So the only places where the issue can happen are
skipUntil("--%>");
skipUntil("]]>");

I looked for other calls of setCurrent(..) and do not see such problems elsewhere, so it is only limited to this skipUntil(..) method.

The "]]>" is used to terminate a "<![CDATA[". Noting that this affects JSP pages only. As far as I see, JSP documents are parsed differently.


The impact of this issue is that if the "limit" string is preceded by "-" or "]" correspondingly, it will not be recognized. If there is further occurrence of the sought string, it may skip there, without a compilation failure.


Example 1. "test.jsp"
<jsp:text><![CDATA[Hello world!] ]]]></jsp:text>

Expected output: "Hello world!] ]"
Actual output in 7.0.32: JasperException: Unterminated CDATA tag


Example 2.
<%-- - ---%>Hello<%-- --%> world!

Note the stray "-" to start the issue. It causes "---%" to be not recognized.

Expected output: "Hello world!"
Actual output in 7.0.32: " world!"
Comment 9 Christopher Schultz 2012-10-11 16:42:54 UTC
(In reply to comment #8)
> The JspReader.skipUntil(..) method is invoked with the following arguments:
> skipUntil("</" + tag);
> skipUntil("--%>");
> skipUntil("%>");
> skipUntil("<");
> skipUntil(">");
> skipUntil("]]>");
> skipUntil(":root");
> 
> An '<' cannot occur in "tag". So the only places where the issue can happen
> are
> skipUntil("--%>");
> skipUntil("]]>");

It may be true that a *valid* document may not include '<' within a tag but that doesn't mean that someone won't try it. As long as the resulting error message(s) make sense, I think this is an okay assumption.

> The "]]>" is used to terminate a "<![CDATA[". Noting that this affects JSP
> pages only. As far as I see, JSP documents are parsed differently.

I hadn't tested CDATA initially: I did test things like <%! ... %> and <%= ... %> because of their JSP-ness. I hadn't thought of CDATA: thanks for adding this to the list of possible triggers. Perhaps you could add this to the test case Mark recently committed.

I'm actually curious as to why Jasper cares about CDATA tags.. is that only in JSPX mode, or in vanilla JSP mode as well?

> Example 2.
> <%-- - ---%>Hello<%-- --%> world!
> 
> Note the stray "-" to start the issue. It causes "---%" to be not recognized.

That's interesting, and not something I noticed in my testing. Yet another good thing to add to the test case.
Comment 10 Konstantin Kolinko 2012-10-12 08:14:53 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > The JspReader.skipUntil(..) method is invoked with the following arguments:
> > skipUntil("</" + tag);
> > skipUntil("--%>");
> > skipUntil("%>");
> > skipUntil("<");
> > skipUntil(">");
> > skipUntil("]]>");
> > skipUntil(":root");
> > 
> > An '<' cannot occur in "tag". So the only places where the issue can happen
> > are
> > skipUntil("--%>");
> > skipUntil("]]>");
> 
> It may be true that a *valid* document may not include '<' within a tag but
> that doesn't mean that someone won't try it. As long as the resulting error
> message(s) make sense, I think this is an okay assumption.

The skipUntil("</" + tag) call is used to search for matching end-tag for a custom tag. The tag name there cannot contain "<". The tag name is parsed elsewhere and the spec has constraints on what characters are allowed there.
Comment 11 Rainer Jung 2012-10-30 08:05:48 UTC
*** Bug 54069 has been marked as a duplicate of this bug. ***
Comment 12 Mark Thomas 2012-11-21 19:13:37 UTC
*** Bug 54184 has been marked as a duplicate of this bug. ***