Bug 31198

Summary: Non-ASCII Passwords Converted to UTF-8
Product: Tomcat 4
Reporter: Jeff Barnes <jeff.barnes>
Component: Webapps:Administration
Assignee: Tomcat Developers Mailing List <dev>
Severity: minor    
Priority: P3    
Version: 4.1.31   
Target Milestone: ---   
Hardware: PC   
OS: All   

Description Jeff Barnes 2004-09-13 16:10:50 UTC
Non-ASCII passwords, with both FORM and BASIC authentication, are converted to
UTF-8 bytes, and each UTF-8 byte ends up as a separate Unicode character in the
password string. We are using a custom realm, but we expect this behaviour would
be consistent across all the realms. A logical implementation would be to map
each UTF-8 byte sequence to the equivalent Unicode character before presenting
the Unicode password String in the interface.

This is similar to Tomcat 5 bug 29091.
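
The symptom described above can be reproduced outside Tomcat. The following is
only a sketch: it assumes the container decodes the submitted bytes one byte per
character (ISO-8859-1), which matches the description but is not confirmed by
the report. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class PasswordMojibake {
    // Simulates the reported symptom: the UTF-8 bytes of the password are
    // decoded one byte per character (assumed here to be ISO-8859-1).
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mis-decoding: recover the raw bytes, then decode as UTF-8.
    // This only works if the assumption about ISO-8859-1 actually holds.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String password = "p\u00e4ss"; // 4 characters, 5 bytes in UTF-8
        String garbled = misdecode(password);
        System.out.println(garbled.length());                  // 5
        System.out.println(repair(garbled).equals(password));  // true
    }
}
```

A custom realm could apply something like repair() as a stopgap, but it would
corrupt passwords that were decoded correctly in the first place.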
Comment 1 Mark Thomas 2004-09-13 19:18:31 UTC
As bug 29091 states, this is not an easy fix.
Comment 2 Mark Thomas 2005-01-05 22:58:51 UTC
Quick update:

The fix (using the filter) described in 29091 appears to resolve the issues with
the admin app (passwords are saved correctly in UTF-8) but a much trickier
problem has emerged in testing.

For BASIC auth the password is converted to bytes and base64 encoded. The
problem appears to be that different browsers (at least IE and Firefox) make
different encoding assumptions at this point (and neither seems to assume
UTF-8), because the same username and password result in different
Authorization headers. It is looking like another i18n grey area but I will do
some more work to see if there is anything that can be done to work around this
fun and games.
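
To illustrate why the encoding assumption matters, here is a rough sketch that
builds a BASIC Authorization header for the same credentials under two
charsets. The charsets are chosen purely for illustration; the thread does not
say which encodings IE or Firefox actually used. The class and method names are
hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeader {
    // Builds "Basic <base64(user:password)>" using the given charset to turn
    // the credentials into bytes before base64 encoding.
    static String header(String user, String password, Charset charset) {
        byte[] credentials = (user + ":" + password).getBytes(charset);
        return "Basic " + Base64.getEncoder().encodeToString(credentials);
    }

    public static void main(String[] args) {
        String user = "user";
        String password = "p\u00e4ss"; // contains one non-ASCII character

        // Same credentials, different byte encodings, different headers.
        System.out.println(header(user, password, StandardCharsets.ISO_8859_1));
        System.out.println(header(user, password, StandardCharsets.UTF_8));
    }
}
```

For ASCII-only credentials both calls produce the same header, which is why the
problem only shows up with non-ASCII usernames and passwords.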
Comment 3 Mark Thomas 2005-01-05 23:50:31 UTC
Yep. BASIC auth and non-ASCII passwords is a mess before it even gets to Tomcat.
Mozilla definitely (and I suspect IE as well) does a lossy conversion of
non-ASCII usernames and passwords before base64 encoding. There is no way I can
see of Tomcat supporting BASIC auth for non-ASCII usernames and passwords as
things currently stand.
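
As a sketch of what a lossy conversion means for the server, the example below
encodes characters that ISO-8859-1 cannot represent. This is an assumption for
illustration only: the exact conversion Mozilla performs is not described in
this thread. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class LossyConversion {
    // Encodes to ISO-8859-1 and decodes again. Characters outside Latin-1
    // cannot be represented, and String.getBytes(Charset) substitutes '?'.
    static String throughLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String euro = "\u20ac"; // EURO SIGN, not in ISO-8859-1
        String cjk = "\u4e2d";  // a CJK character, not in ISO-8859-1

        // Two different passwords collapse to the same value, so the original
        // cannot be recovered on the server side.
        System.out.println(throughLatin1(euro)); // ?
        System.out.println(throughLatin1(euro).equals(throughLatin1(cjk))); // true
    }
}
```

Once information is lost before base64 encoding, no server-side decoding choice
can restore the original password.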

On to FORM auth...
Comment 4 Mark Thomas 2005-01-07 11:08:10 UTC
FORM and DIGEST required a few small fixes - these have been applied to CVS for
TC4 and TC5.

Remember that for editing of non-ASCII passwords to work correctly via the admin
app, the SetCharacterEncodingFilter must be configured.
Comment 5 Paul McCulloch 2007-11-06 05:18:21 UTC
The encoding used to interpret the login request can be forced (in TC5 at least)
using the characterEncoding attribute of the
org.apache.catalina.authenticator.FormAuthenticator valve -