Bug 23319 - <c:url> generates incorrect encoding
Alias: None
Product: Taglibs
Classification: Unclassified
Component: Standard Taglib
Version: unspecified
Hardware: PC Windows XP
Importance: P3 enhancement with 1 vote
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
URL: http://schildbach.de/test-jstl-c-out.jsp
Duplicates: 28652
Depends on:
Reported: 2003-09-22 08:39 UTC by Andreas Schildbach
Modified: 2009-08-17 02:03 UTC


Description Andreas Schildbach 2003-09-22 08:39:24 UTC
Please have a look at the following JSP/JSTL snippet:

<a href="<c:url value="action.jsp">
     <c:param name="param1" value="value1"/>
     <c:param name="param2" value="value2"/>
</c:url>">action</a>

When requested by an HTTP client, the following response is returned:

<a href="action.jsp?param1=value1&param2=value2">action</a>

BUG: The ampersand between the two parameters should be escaped, for example by
using &amp;
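
As a rough illustration of the escaping being asked for, the following Java sketch escapes a URL before it is written into a double-quoted HTML attribute. The class and method names are hypothetical and are not part of JSTL or Tomcat; this is only a sketch of the idea, not the taglib's implementation.

```java
// Hypothetical helper, not part of JSTL: escapes a URL for use inside
// a double-quoted HTML attribute value.
final class HtmlAttributeEscapeSketch {

    /** Replaces the characters that are significant inside an HTML attribute. */
    static String escapeForAttribute(String url) {
        StringBuilder out = new StringBuilder(url.length() + 16);
        for (int i = 0; i < url.length(); i++) {
            char c = url.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "action.jsp?param1=value1&param2=value2";
        // prints: <a href="action.jsp?param1=value1&amp;param2=value2">action</a>
        System.out.println("<a href=\"" + escapeForAttribute(raw) + "\">action</a>");
    }
}
```

Note that this sketch escapes every ampersand it sees, so input that is already escaped would be escaped a second time.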

For testing, I created a demo under http://schildbach.de/test-jstl-c-out.jsp

Please use the W3C validator (http://validator.w3.org/) and enter my test URL,
and you will get the 6 errors I attached below.

I am using Tomcat 4.1.27, JDK 1.4.2 and JSTL 1.0.3.


> # Line 13, column 81: cannot generate system identifier for general entity "param2"
>    ...E16EDEC5EAFDA47DCBBB44?param1=value1&param2=value2">action</a>
>                                            ^
> # Line 13, column 81: general entity "param2" not defined and no default entity
>    ...E16EDEC5EAFDA47DCBBB44?param1=value1&param2=value2">action</a>
>                                            ^
> # Line 13, column 87: reference not terminated by REFC delimiter
>    ...C5EAFDA47DCBBB44?param1=value1&param2=value2">action</a>
>                                            ^
> # Line 13, column 87: reference to external entity in attribute value
>    ...C5EAFDA47DCBBB44?param1=value1&param2=value2">action</a>
>                                            ^
> # Line 13, column 87: reference to entity "param2" for which no system identifier could be generated
>    ...C5EAFDA47DCBBB44?param1=value1&param2=value2">action</a>
>                                            ^
> # Line 13, column 80: entity was defined here
>    ...2E16EDEC5EAFDA47DCBBB44?param1=value1&param2=value2">action</a>
Comment 1 Pierre Delisle 2003-09-23 00:40:24 UTC
Thanks for the report.
A fix is ready but unfortunately it breaks in Tomcat's RequestDispatcher
when "&amp;" is used instead of "&" (with <c:import>).
(all actions that make use of <c:param> share the same param generation code).
I have pinged the Tomcat team on that issue.
It is possible to do a fix for only <c:url>, but I'll wait to hear from them 
before going ahead.
Comment 2 Pierre Delisle 2003-10-07 20:18:18 UTC
OK. Cleared up the situation with Tomcat, and this is indeed a
problem that has to be fixed at the level of <c:url>.

This will however require a change to the spec.
Unfortunately, the JSTL 1.1 Maintenance Release spec is now frozen
and the fix will have to wait for the next release.

The change that the expert group has agreed on so far is to add
one new attribute to <c:url>, 'escapeAmp', to escape the ampersand character.
Default value for that attribute will be false, for backwards compatibility.
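
To make the proposed behaviour concrete, here is a minimal Java sketch of a URL builder with a flag that mirrors the 'escapeAmp' idea. Everything in it (class name, method name, parameter order) is an assumption made for illustration; the attribute was only a proposal at this point and this is not how the Standard Taglib is implemented.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the proposed 'escapeAmp' switch; not JSTL code.
final class EscapeAmpSketch {

    /** Appends URL-encoded parameters, separated by "&" or "&amp;". */
    static String buildUrl(String base, Map<String, String> params, boolean escapeAmp) {
        String separator = escapeAmp ? "&amp;" : "&";
        StringBuilder url = new StringBuilder(base);
        // The first parameter follows "?" unless the base already has a query string.
        boolean first = base.indexOf('?') < 0;
        try {
            for (Map.Entry<String, String> p : params.entrySet()) {
                url.append(first ? "?" : separator);
                first = false;
                url.append(URLEncoder.encode(p.getKey(), "UTF-8"))
                   .append('=')
                   .append(URLEncoder.encode(p.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("param1", "value1");
        params.put("param2", "value2");
        // escapeAmp = false (the proposed default): action.jsp?param1=value1&param2=value2
        System.out.println(buildUrl("action.jsp", params, false));
        // escapeAmp = true: action.jsp?param1=value1&amp;param2=value2
        System.out.println(buildUrl("action.jsp", params, true));
    }
}
```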
Comment 3 Andreas Schildbach 2003-11-04 13:31:12 UTC
I don't understand the rationale behind defaulting to generating invalid HTML
(according to W3C).

In the case of <c:url>, I don't understand the need for that escapeAmp attribute
at all, because why would I ever _not_ escape an ampersand if I get invalid HTML
otherwise?

If the reason for that behaviour is that the same routines are being shared
between <c:url> and other related code, maybe this is a hint that the code
should not be shared any longer or should be refactored to accommodate the two
different cases.

Anyway, perhaps it would be possible to apply or at least describe a workaround
for JSTL 1.0.
Comment 4 Pierre Delisle 2004-01-08 02:08:03 UTC

I understand your point, and it makes a lot of sense.
The problem though is that we have to deal with what's currently
in the spec and its impact on backwards compatibility.

If it had been clearly spelled out in the spec that the output of 
<c:url> were to be used strictly for URIs used in HTML element attribute 
values, then I'd agree with you.

The problem though is that the output of <c:url> may also be used 
in other contexts. For example (not the best example, but just
to make the point), one could have used <c:url> as follows:

  <c:url value="foo" var="myUrl">
    <c:param name="param1" value="value1"/>
    <c:param name="param2" value="value2"/>
  </c:url>

  <c:redirect url="${myUrl}"/>

In that context, we don't want the ampersands to be escaped.
It would break <c:redirect> which expects parameters to be separated
by '&' and not '&amp;'. 

Since the escaping of '&' to '&amp;' has not been clearly 
spelled out in the spec, we would not be backwards compatible.
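
The effect on a consumer that expects a plain '&' can be sketched in Java as follows. The parser below is a deliberately naive, hypothetical stand-in for whatever reads the query string on the receiving side (it does no URL decoding); it is not Tomcat's or JSTL's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified query-string parser used only to illustrate
// what happens when '&amp;' reaches code that splits parameters on '&'.
final class EscapedQuerySketch {

    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(name, value);
        }
        return params;
    }

    public static void main(String[] args) {
        // Unescaped separator: prints [param1, param2]
        System.out.println(parseQuery("param1=value1&param2=value2").keySet());
        // Escaped separator: prints [param1, amp;param2]
        System.out.println(parseQuery("param1=value1&amp;param2=value2").keySet());
    }
}
```

In the second call the receiver ends up with a parameter named "amp;param2" instead of "param2", which is the kind of breakage described above.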
Comment 5 Pierre Delisle 2004-04-28 21:20:39 UTC
*** Bug 28652 has been marked as a duplicate of this bug. ***
Comment 6 Pierre Delisle 2004-04-28 21:22:29 UTC
This is not yet resolved. Will be in the next JSTL Maintenance Release.
Comment 7 Felipe Leme 2004-05-12 00:26:56 UTC
CC'ing the taglibs-dev address to all Standard bugs. 
Comment 8 Justyna Horwat 2004-06-14 17:38:41 UTC
Changing Severity to Enhancement since this issue cannot be resolved in the implementation until the 
specification is clarified. Will be addressed by next JSTL specification maintenance release.
Comment 9 Henri Yandell 2006-12-13 14:17:43 UTC
As far as I can tell, there were no changes to the spec related to this. The
JSTL 1.2 spec for c:param and c:url is identical to the 1.1 and 1.0 specs.

So this remains an issue for the EG team and not this particular codebase.
Reading the spec, it does state for c:url that:

"If the URL contains characters that should be encoded (e.g. space), it is the
user's responsibility to encode them. "

c:param on the other hand says:

"Moreover, it has been designed such that the attributes name and value are
automatically URL encoded. "

The two sections seem to contradict each other.
Comment 10 Henri Yandell 2006-12-13 14:21:06 UTC
I've reported this to the 1.2 RI:


We won't be fixing it unless the spec changes.
Comment 11 Henri Yandell 2006-12-13 14:26:02 UTC
I'm getting my url encoding and xml escaping muddled. Ignore my quotes from the