Bug 39895

Summary: apr_os_strerror on WinCE needs to xlate unicode->utf8
Product: APR Reporter: Curt Arnold <carnold>
Component: APRAssignee: Apache Portable Runtime bugs mailinglist <bugs>
Status: NEW ---    
Severity: enhancement Keywords: PatchAvailable
Priority: P3    
Version: HEAD   
Target Milestone: ---   
Hardware: Other   
OS: Windows CE   
Attachments: Explicitly call FormatMessageA on Win32, naive transcode on WinCE
Slight modification of previous patch
Alternative minimal patch for the issue
Previous mid contained spurious test changes
Previous min contained spurious test changes

Description Curt Arnold 2006-06-23 21:30:43 UTC
apr_os_strerror which provides message for operating system messages for Windows
does not consistently modify the lpBuffer and nSize arguments to the call to
FormatMessage which could result in overrunning the passed buffer on WinCE or
any build when _UNICODE is defined.  

The character lpBuffer argument is cast to LPTSTR (char* or wchar_t* depending
on state of _UNICODE).  The nSize argument is supposed to be the number of
TCHAR's available in the buffer, but is not adjusted for the size of TCHAR.  If
_UNICODE is set, the nSize value will be twice the proper value.

The patch addresses the issue by explicitly specifying FormatMessageA for non-CE
builds and casting the buffer to LPSTR instead of LPTSTR.  This will cause the
code to always compile to the current behavior for "normal" Windows regardless
of the state of _UNICODE.

For Windows CE, the invariant locale is specified instead of using the user's
current language which should allow the naive in-place conversion of the message
to US-ASCII to produce reasonable diagnostic messages.  Since APR messages are
not localized, it should be acceptible that OS message are not localized either.
on that platform.  A more capable conversion using WideCharToMultiByte would
require allocating a new buffer for the result and would introduce more
complexity.
Comment 1 Curt Arnold 2006-06-23 21:32:06 UTC
Created attachment 18527 [details]
Explicitly call FormatMessageA on Win32, naive transcode on WinCE
Comment 2 William A. Rowe Jr. 2006-09-19 19:54:42 UTC
Mass reassign the 44 open apr-bugs to apr bug list
Comment 3 William A. Rowe Jr. 2006-09-20 18:58:19 UTC
Of course, FormatMessageA is the wrong API when OS >= NT (and the unicode output
should be passed through the utf-8 filter, just as we do the filenames)...

so the patch needs a bit of massaging before it's committed, but I agree with
the spirit of your patch.
Comment 4 Davi Arnaut 2007-04-29 15:32:55 UTC
It compiles and builds fine on VS C++ 6.0, but it introduces C++ comments (//) 
on C code.
Comment 5 Curt Arnold 2007-05-03 08:44:58 UTC
Created attachment 20120 [details]
Slight modification of previous patch

Slight modification of previous patch, removing C++ style comment and checking
for negative TCHAR values.
Comment 6 Curt Arnold 2007-05-03 08:48:05 UTC
Created attachment 20121 [details]
Alternative minimal patch for the issue

This patch eliminates the problematic call to FormatMessage on WinCE entirely
and falls back to just output the error code.
Comment 7 Curt Arnold 2007-05-03 09:14:57 UTC
I looked hard at trying to address Will Rowe's comments on 2006-09-20, but I
could not come up with a simple solution.  Both patches will address my original
issue (only getting the first letter of any message and possible buffer overrun
on WinCE) without changing the established behavior on other platforms.  I 'd
prefer the in-place transcode over skipping the formatted message, but either is
fine.  

The issue that apr_strerror may not return a UTF-8 encoded message is not a
WinCE specific issue and I've really tried to make no change to behavior on
other platforms in the WinCE patches.  

If there were a pool handy, it would be simple to allocate memory double the
size of the destination buffer, call FormatMessage, and then
apr_conv_ucs2_to_utf8 to transcode to UTF-8.  However,  without a memory pool,
then you are constrained by the buffer available.  Even if you were able to
write a bulletproof in-place transcoder, you would still have the side effect of
reducing the maximum message that can be placed in the buffer by half.  If this
was to be addressed (maybe if APR 2 and got a pool in scope), it should be split
off into a different bug and not hold up the WinCE specific fix.

If your concern was that a caller was expecting UTF-8 and might get a message in
the current default encoding that was not valid UTF-8, you still call
FormatMessageA but walk through the message and replace any code point >= 0x80
with '?'.  However, again that is not something WinCE specific and should be
done under a different bug.

(WinCE does not have FormatMessageA or FormatMessageW, just FormatMessage which
takes LPWSTR regardless of the state of _UNICODE or _MBCS).
Comment 8 Curt Arnold 2007-05-03 09:33:49 UTC
Created attachment 20122 [details]
Previous mid contained spurious test changes
Comment 9 Curt Arnold 2007-05-03 09:34:24 UTC
Created attachment 20123 [details]
Previous min contained spurious test changes
Comment 10 William A. Rowe Jr. 2007-06-01 10:40:06 UTC
(In reply to comment #7)
> The issue that apr_strerror may not return a UTF-8 encoded message is not a
> WinCE specific issue and I've really tried to make no change to behavior on
> other platforms in the WinCE patches.  

Hold on .. why do you make that assumption?  What is the issue with returing
a UTF-8 message?  The issue is that utf8 can be reasonably rendered in an ascii
derived character set, and ONLY utf8 will represent the filename/symbol/entity
that is was substituted from the OS call with unicode args.

In any case, it won't fit, we can't rely on formatting this forwards or reverse
into utf-8 since utf-8 is variably sized smaller or longer than the original
bytestream, so my thought is that only TLS store would help us here.  Since I'm
not hacking that in today - I've deployed your 'ascii only' flavor in trunk
for the moment, which is better than nothing.

If both TLS and proper string handling is deployed, I'll be all for bumping
this back to 1.2/0.9, so note the initial patch to trunk is r543549.