Bug 10024 - <import> uses ContentEncoding for java character set
Alias: None
Product: Taglibs
Classification: Unclassified
Component: Standard Taglib
Version: unspecified
Hardware: All All
Importance: P3 major with 1 vote
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
Depends on:
Reported: 2002-06-19 18:29 UTC by Gael Stevens
Modified: 2004-11-16 19:05 UTC

Fixes 8bit encoding error for AbsoluteFtp.jsp (20020514) (1.82 KB, patch)
2002-06-22 01:58 UTC, Gael Stevens

Description Gael Stevens 2002-06-19 18:29:25 UTC
The problem shows up when running AbsoluteFTP.jsp.  In our environment,
 it occurs when the following bit of code is executed:

                     String responseAdvisoryEncoding = uc.getContentEncoding();
                     if (responseAdvisoryEncoding != null)
                         r = new InputStreamReader(i, responseAdvisoryEncoding);
                     else
                         r = new InputStreamReader(i, DEFAULT_ENCODING);

 Here responseAdvisoryEncoding is 8bit, which is not a legal
 character set name for the InputStreamReader, so
 a javax.servlet.jsp.JspException: 8bit is eventually thrown.

 One workaround is to wrap the call in a try/catch and fall back to the
 default encoding, as below.

                    String responseAdvisoryEncoding = uc.getContentEncoding();
                    if (responseAdvisoryEncoding != null)
                       try { // contentEncoding can be 8bit, not a java encoding
                         r = new InputStreamReader(i, responseAdvisoryEncoding);
                       } catch (java.io.UnsupportedEncodingException ex){
                         r = new InputStreamReader(i, DEFAULT_ENCODING);
                       }
                    else
                         r = new InputStreamReader(i, DEFAULT_ENCODING);

 The basic issue is that a content encoding does not necessarily map to a
 Java character encoding.  I'm using JDK 1.3.1, so the new java.nio Charset
 class is not available.
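
 For illustration only, here is a minimal self-contained sketch of the
 fallback idea above. The class name FallbackReader, the helper names open
 and readAll, and the value "ISO-8859-1" for DEFAULT_ENCODING are assumptions
 made for this example and are not taken from the taglib source. It sticks to
 APIs that exist in JDK 1.3.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UnsupportedEncodingException;

class FallbackReader {
    // Assumed default for this sketch; the real constant lives in the taglib.
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    /**
     * Opens a reader using the advisory encoding when Java accepts it,
     * and falls back to DEFAULT_ENCODING otherwise (e.g. for "8bit").
     */
    static Reader open(InputStream in, String advisoryEncoding)
            throws UnsupportedEncodingException {
        if (advisoryEncoding != null) {
            try {
                return new InputStreamReader(in, advisoryEncoding);
            } catch (UnsupportedEncodingException ex) {
                // Not a Java encoding name; fall through to the default.
            }
        }
        return new InputStreamReader(in, DEFAULT_ENCODING);
    }

    /** Reads the whole stream into a String, one character at a time. */
    static String readAll(Reader r) throws IOException {
        StringBuffer sb = new StringBuffer();
        int c;
        while ((c = r.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "caf" followed by e-acute encoded as a single ISO-8859-1 byte.
        byte[] data = { 'c', 'a', 'f', (byte) 0xE9 };
        Reader r = open(new ByteArrayInputStream(data), "8bit");
        System.out.println(readAll(r));
    }
}
```

 Swallowing the exception hides a bad header from the caller, so this is a
 workaround rather than a fix for the underlying mismatch.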
Comment 1 Gael Stevens 2002-06-22 01:58:55 UTC
Created attachment 2154
Fixes 8bit encoding error for AbsoluteFtp.jsp (20020514)
Comment 2 Gael Stevens 2002-06-22 02:07:37 UTC
It may be that the charset from the ContentType, rather than the
ContentEncoding, is what you want when creating the InputStreamReader.  If so,
the attached diff file may be of some help.  The charset attribute of the
content type (if present) provides a good mapping (earlier JDK versions had
some issues with IANA's TIS-620 vs. Java's TIS620; I don't know if that is
fixed in a later JDK).  The value of uc.getContentEncoding() really doesn't
relate to the Java encoding parameter (JDK 1.3) of the InputStreamReader
constructor.
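
As a rough sketch of what pulling the charset out of a content type could look
like (this is not the attached patch; the class name ContentTypeCharset and the
method charsetOf are hypothetical), the following parses a header value such as
"text/html; charset=UTF-8". It is deliberately simplified: it does not handle a
semicolon inside a quoted parameter value, and it uses only JDK 1.3 APIs.

```java
import java.util.StringTokenizer;

class ContentTypeCharset {
    /**
     * Returns the value of the charset parameter in a Content-Type
     * header value, or null if the header or the parameter is absent.
     */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        StringTokenizer st = new StringTokenizer(contentType, ";");
        if (st.hasMoreTokens()) {
            st.nextToken(); // skip the media type, e.g. "text/html"
        }
        while (st.hasMoreTokens()) {
            String param = st.nextToken().trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue; // not a name=value pair
            }
            String name = param.substring(0, eq).trim();
            if (!name.equalsIgnoreCase("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip optional surrounding double quotes.
            if (value.length() >= 2
                    && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            return value.length() == 0 ? null : value;
        }
        return null;
    }
}
```

The returned name would still need to go through the same try/catch as any
other encoding name, since an IANA name is not guaranteed to be one Java knows.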
Comment 3 Gael Stevens 2002-06-23 16:13:21 UTC
It may be that the problem is with the example, AbsoluteFtp.jsp. As per the
spec, 7.4 under Character Encoding : 
  Note that the charEncoding attribute should normally only be required when
  accessing absolute URL resources where the protocol is not HTTP, and where the
  encoding is not ISO-8859-1.

If so, then the example should include the charEncoding attribute.  Perhaps
a clarification of the spec is needed here.  The above section also says :

  If the response has content encoding information (e.g.
  URLConnection.getContentEncoding() has a non null value), then the
  character encoding specified is used.

If URLConnection.getContentEncoding() returns 8bit, which is of course
non-null but also not a valid Java character encoding, what should the
result be?  This case is not covered in the error section under For External
Resources, as the URLConnection class does not throw an exception.
Comment 4 Justyna Horwat 2002-06-27 22:22:27 UTC
This is an issue that needs to be resolved in the JSTL specification. Currently 
the reference implementation correctly implements section 7.4 of the spec.

I went ahead and filed your bug against the JSTL specification. Once the issue 
is addressed by the specification, it can be fixed in the RI.
Comment 5 Pierre Delisle 2003-03-31 14:01:57 UTC
JSTL 1.1 has been amended to properly handle this bug.
The advisory character encoding is now fetched from the "charset" attribute
of the "content-type" header.