Bug 29091

Summary: Non-ascii characters are not handled correctly...
Product: Tomcat 5
Reporter: Jesper S <jesper.soderlund>
Component: Webapps:Administration
Assignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED WORKSFORME
Severity: normal
CC: diegofr2, Dirk.Schwartzkopff, gruber, jesus, jfclere, ushakov
Priority: P3    
Version: 5.5.24   
Target Milestone: ---   
Hardware: Other   
OS: Windows XP   

Description Jesper S 2004-05-19 14:51:37 UTC
In the administration interface I go to "User definitions/Users" and select an
existing user, e.g. the user called "tomcat".

I enter a name in the "Full name" field that contains a non-ASCII character,
e.g. "ö" (I don't know if Bugzilla will display this correctly either), which
in Unicode has code point 0x00F6.

When the name is submitted, the characters "Ã¶" turn up instead, which is the
UTF-8 encoding of 0x00F6 (bytes 0xC3 0xB6) displayed one byte per character.

When this value is saved to the tomcat-users.xml the correct UTF-8 encoding is
saved (ö).
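
The symptom described above is consistent with UTF-8 bytes being decoded as ISO-8859-1 somewhere between the browser and the admin application. The following is a minimal sketch of that effect in plain Java, not Tomcat's actual request-handling code; the class and method names (`MojibakeDemo`, `misdecode`, `repair`) are hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Tomcat.
final class MojibakeDemo {

    // Encode the text as UTF-8, then (wrongly) decode those bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the mistake: recover the original bytes and decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String garbled = misdecode("\u00F6"); // "ö" written as an escape
        // Print code points instead of raw text so the console encoding cannot interfere.
        garbled.chars().forEach(c -> System.out.printf("U+%04X ", c)); // U+00C3 U+00B6
        System.out.println();
        System.out.println(repair(garbled).equals("\u00F6")); // true
    }
}
```

One character in, two characters out: U+00C3 and U+00B6 are exactly the two UTF-8 bytes of U+00F6, each read as its own ISO-8859-1 character.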
Comment 1 Yoav Shapira 2004-06-10 14:13:14 UTC
*** Bug 26258 has been marked as a duplicate of this bug. ***
Comment 2 Yoav Shapira 2004-07-30 14:34:51 UTC
*** Bug 28219 has been marked as a duplicate of this bug. ***
Comment 3 Yoav Shapira 2004-07-30 14:36:00 UTC
*** Bug 29836 has been marked as a duplicate of this bug. ***
Comment 4 jks 2004-08-13 03:07:44 UTC
3 dups of this bug, but it is still marked as NEW?
Comment 5 Yoav Shapira 2004-08-30 19:27:09 UTC
Yeah, it's a big deal to fix.
Comment 6 Tim Funk 2004-10-26 23:56:17 UTC
*** Bug 29823 has been marked as a duplicate of this bug. ***
Comment 7 Sergey Ushakov 2004-10-27 05:29:12 UTC
Isn't uncommenting the 'Set Character Encoding' filter and its mapping in
'web.xml' for the 'admin' app the right approach? Or is it just a shallow fix
that does not address some deeper problem? In any case it works for me... Maybe
ship the filter uncommented by default in the next release anyway? At least
some new users would not trip over this issue...
Comment 8 Yoav Shapira 2004-11-19 16:09:14 UTC
The problem is not with the web page display, it's with the persistence of 
changes to the file system.  So the Filter doesn't solve the problem.
Comment 9 Sergey Ushakov 2004-11-22 02:54:46 UTC
Well, I can't boast a deep understanding of Tomcat internals, and this filter was
probably intended for something different (for some more delicate tuning?), but
the filter trick definitely works for me in all aspects, including persistence.

With this filter enabled:
- I can enter descriptions in any combination of languages (English / Russian /
Chinese / whatever);
- descriptions survive undistorted during the engine life cycle;
- descriptions are persisted in tomcat-users.xml using correct UTF-8;
- descriptions are correctly initialized after the engine is restarted.

All in all, I have no complaints about multi-language description persistence
with this filter. All this behaves well with both IE6 and Firefox. I use
TC 5.0.28 on Windows.

The two problems in the distribution are:
- the filter is disabled by default;
- the filter class name in the admin app's web.xml is reduced to its last two
segments ('filters.SetCharacterEncodingFilter'); this works if TC is installed
from the .zip distribution, but if it is installed from the .exe distribution
the filter class name needs to be written in full -
'org.apache.webapp.admin.filters.SetCharacterEncodingFilter' - I did not manage
to investigate why...
Comment 10 Yoav Shapira 2004-11-22 15:12:07 UTC
Per Mr. Ushakov's investigation and comments, I'm closing this issue.  Thanks.
Comment 11 Sun House 2005-01-05 16:23:37 UTC
(In reply to comment #10)
> Per Mr. Ushakov's investigation and comments, I'm closing this issue.  Thanks.

Hi, 

Sorry to spoil the "party", but I am reopening this bug.
#28219 (Dolar sign in password of JNDI-Datasource disappears) was marked as a
duplicate of this bug, so #28219 was closed along with it.

Maybe configuring the character encoding solves the problem with servlet 
parameter inputs, but it doesn't solve the problem of putting non-ASCII 
characters in Tomcat 5 configuration files (I use tomcat 5.027).

To be more precise, I need to use the $ sign in server.xml in order to define a 
shared folder (i.e. \\my-machine\c$\my-app\).
To make it work on Tomcat 5 I had to use a double dollar sign 
(\\my-machine\c$$\my-app\) and also add Sun's tools.jar to common\lib. This is 
a workaround, not a real fix.

I see in forums that this problem is annoying many people, not just me. 
This problem did not occur in tomcat 4.03.

Regards 
Sun House
Comment 12 Yoav Shapira 2005-01-20 15:56:17 UTC
A few things.

One, whether it occurred in an ancient version or not doesn't make much 
difference.  The codebase is substantially different now.

Two, a workaround is just that: a solution for fringe cases that may or may 
not be addressed in the future.  It's less likely to be addressed in the 
future if you're working with a maintenance branch (active development is now 
in Tomcat 5.5).

Three, Tomcat 5.5.7 which was just released should have a fix for this.  It'd 
be great if you could test it out and let us know what you think.
Comment 13 Ingemar Allkvist 2005-09-05 15:19:06 UTC
I am using Tomcat 5.5.9, and I still have the problem described in 26258; that 
is, the non-ascii characters of the XML-files are not properly escaped.

I'm using Tomcat on Windows XP and I've chosen to use the Windows administrator 
account as the Tomcat admin, and this account has a non-ASCII character in its 
name (ö or &ouml;). The name is the default name of the admin account on a 
Swedish Windows XP ("Administratör").

In the "tomcat-users.xml" this should be written as "&#xf6;" or 
(perhaps) "&ouml;", but instead it is written as some obscure non UTF-8 
character.

The end result is that the Tomcat server won't start "out of the box" if you 
run it on a standard Swedish Windows/XP installation...
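
A minimal sketch of the numeric-character-reference form suggested above ("&#xf6;"), assuming the goal is an ASCII-only file that parses the same regardless of the encoding it is read with. The class and method names (`XmlAsciiEscaper`, `escapeNonAscii`) are hypothetical and this is not how Tomcat writes tomcat-users.xml. Note that XML predefines only five named entities (`&lt;`, `&gt;`, `&amp;`, `&apos;`, `&quot;`), so "&ouml;" would need a DTD declaration, whereas a numeric reference does not.

```java
// Hypothetical helper; not part of Tomcat.
final class XmlAsciiEscaper {

    // Replace every code point outside ASCII with a hexadecimal character reference.
    // This sketch does NOT escape markup characters such as '&', '<' or '"';
    // a real attribute writer would have to handle those as well.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#x").append(Integer.toHexString(cp)).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // "Administratör", with the ö written as a Java escape.
        System.out.println(escapeNonAscii("Administrat\u00F6r")); // Administrat&#xf6;r
    }
}
```

Writing the raw character is also valid XML as long as the bytes on disk match the encoding named in the XML declaration; the reference form simply removes that dependency.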

Comment 14 Mark Thomas 2006-09-04 23:07:20 UTC
Testing with the latest 5.5.x code I do see issues editing a user with the name
Administratör. This is caused by the default connector configuration. The
solution is to set URIEncoding="UTF-8" on the connector.

With this in place I can create, edit, view and delete the Administratör user
and all changes are correctly persisted to tomcat-users.xml
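
Why the connector setting matters can be sketched with the JDK's own `java.net.URLDecoder`: the same percent-encoded name decodes differently depending on the charset applied to the bytes. This only illustrates the principle and assumes the browser sends the name as UTF-8 percent-escapes; Tomcat uses its own URI decoder, and the class name `UriDecodingDemo` is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; Tomcat does not use URLDecoder for request URIs.
final class UriDecodingDemo {

    // Decode percent-escapes using the given charset name.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "Administratör" as a browser would send it when using UTF-8.
        String sent = "Administrat%C3%B6r";

        // Decoded as UTF-8, the two escaped bytes become one character.
        System.out.println(decode(sent, "UTF-8").length());      // 13

        // Decoded as ISO-8859-1, each escaped byte becomes its own character.
        System.out.println(decode(sent, "ISO-8859-1").length()); // 14
    }
}
```

With the ISO-8859-1 result the lookup is for a 14-character name that does not match the stored user, which fits the editing problems reported for "Administratör".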
Comment 15 hli29 2007-09-21 06:19:55 UTC
By following the instructions here, I am able to verify that create, edit and 
delete all work for a non-ASCII character username. However, I cannot log 
into the web admin app using the non-ASCII (Chinese in this case) username. I 
receive the invalid username/password error. I think there might be something 
wrong with j_security_check. This is a bigger problem because I have to 
restrict usernames to ASCII in my own app. I am using Tomcat 5.5.25.