Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing
Summary: | build breaks in libxml2 on Korean Windows due to special character | | |
---|---|---|---|
Product: | Build Tools | Reporter: | jeongkyu.kim |
Component: | code | Assignee: | zhang jianfang <zhangjf> |
Status: | CLOSED FIXED | QA Contact: | issues@tools <issues> |
Severity: | Trivial | | |
Priority: | P3 | CC: | eric.bachard, foral, issues, tora3, zhangjf, zhangxiaofei.ooo |
Version: | OOO300m4 | | |
Target Milestone: | 4.0.0 | | |
Hardware: | PC | | |
OS: | Windows Vista | | |
Issue Type: | DEFECT | Latest Confirmation in: | --- |
Developer Difficulty: | --- | | |
I can confirm it on Japanese Windows.

Environment:
OS: Microsoft Windows XP Professional Version 2002 Service Pack 3 (Japanese)
Cygwin: CYGWIN_NT-5.1
Compiler: Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86 (English)
Milestone: DEV300_m41

Error messages:
========================================
Building module libxml2 ...
cl.exe /nologo /D "WIN32" /D "_WINDOWS" /D "_MBCS" /W1 /MD /I.. /I..\include /I.\include /D "_REENTRANT" /D "HAVE_WIN32_THREADS" /D_CRT_SECURE_NO_DEPRECATE /D_CRT_NONSTDC_NO_DEPRECATE /D "NDEBUG" /O2 /Foint.utils.msvc\ /c ..\testapi.c
testapi.c
..\testapi.c : warning C4819: The file contains a character that cannot be represented in the current code page (932). Save the file in Unicode format to prevent data loss
..\testapi.c(294) : error C2001: newline in constant
..\testapi.c(295) : error C2143: syntax error : missing ')' before 'return'
NMAKE : fatal error U1077: 'c:\PROGRA~1\MICROS~1.0\VC\bin\cl.exe' : return code '0x2'
Stop.
dmake: Error code 2, while making './wntmsci12.pro/misc/build/so_built_so_libxml2'

ERROR: Error 65280 occurred while making /cygdrive/o/ooo/cws/DEV300_m41/libxml2
rmdir /cygdrive/c/WINDOWS/TEMP/2712
dmake: Error code 1, while making 'build_instsetoo_native'
========================================

Quick investigation:
In the error message, the "code page (932)" denotes Japanese.
$ cd $SRC_ROOT/libxml2/wntmsci12.pro/misc/build/libxml2-2.6.31
$ find * -name '*.c' | xargs perl -ne 'do { print "$ARGV\n"; close(ARGV) } if m/[\x80-\xff]/'
doc/examples/testWriter.c
entities.c
runtest.c
testapi.c
xmlschemas.c

$ cd $SRC_ROOT/libxml2/unxsoli4.pro/misc/build/libxml2-2.6.31
$ perl -ne 'next if m{\A\s*/?\*}; printf "%s:%d: %s", $ARGV,$.,$_ if m/[\x80-\xff]/; close ARGV if eof' *.c | iconv -f iso-8859-1 -t utf-8
runtest.c:2713: "urip://example.com/résumé.html",
testapi.c:294: if (no == 2) return((xmlChar) 'ø');
testapi.c:402: if (no == 2) return((xmlChar *) "nøne");

There are three lines with problematic characters encoded in ISO-8859-1.

$SRC_ROOT/libxml2/wntmsci12.pro/misc/build/libxml2-2.6.31/include/libxml/xmlstring.h
/**
 * xmlChar:
 *
 * This is a basic byte in an UTF-8 encoded string.
 * It's unsigned allowing to pinpoint case where char * are assigned
 * to xmlChar * (possibly making serialization back impossible).
 */
typedef unsigned char xmlChar;

Quick solution:
Substitute the characters with corresponding hexadecimal escape sequences.
An experimental patch file is being attached.

References:
C++ Character Constants
http://msdn.microsoft.com/en-us/library/6aw8xdf2.aspx
C++ String Literals
http://msdn.microsoft.com/en-us/library/69ze775t.aspx

Created attachment 60315 [details]
An experimental patch
Thanks for your effort, tora! The patch works fine on Korean (MS949) Windows.

@mh: No chance to apply this patch?

@tora
1) Are you an Apache OpenOffice committer?
2) If not, I could commit your fix plus the mandatory changes, but I'll need your real name to mention that you are the author.
Thanks in advance.

A small suggestion: please try to use the unified diff format ( -u ) for patches. The resulting patches are usually smaller and easier to read.

The patch here is a little out of date. It is for libxml2 2.6.31, while the latest version used in AOO 3.4 is 2.7.6. I will try to generate an updated version of the patch.
Note that the libxml2 patch only helps to build the English version in a DBCS environment. To build a DBCS version of AOO, for example the Simplified Chinese version, you still need the English build environment.

Created attachment 77696 [details]
runtest.c patch
Created attachment 77697 [details]
testapi.c patch
Created attachment 77698 [details]
makefile patch
I simply migrated tora's code to the latest AOO 3.4 code base, so its original author is still tora.
There are 3 patch files: libxml2-testapi.patch and libxml2-runtest.patch should be added to the libxml2\ directory directly, and makefile.patch should be applied to libxml2\makefile.mk.

Since several people have complained about this issue (http://markmail.org/message/4ef7qvgaurduvnlt?q=93433), I will take the bug to deliver the patch to 3.4.

Committed in revision r1344534 with log message:
Fix issue #93433: build breaks in libxml2 on Korean Windows due to special character
* /libxml2/libxml2-testapi.patch : replaced '\248' encoded in ISO-8859-1 with '\xf8'
* /libxml2/libxml2-runtest.patch : replaced 'e' encoded in ISO-8859-1 as in 'resume' with \xe9
Patch by: tora3@nichoume.com

Updated target to release that will contain the fix.
The following error occurred in libxml2 while I was building OOO300_m4.

----------------------------------------
testapi.c
..\testapi.c : warning C4819: The file contains a character that cannot be represented in the current code page (949). Save the file in Unicode format to prevent data loss
..\testapi.c(294) : error C2001: newline in constant
..\testapi.c(295) : error C2143: syntax error : missing ')' before 'return'
NMAKE : fatal error U1077: 'c:\PROGRA~1\MICROS~1.0\VC\bin\cl.exe' : return code '0x2'
Stop.
----------------------------------------

I found that the problematic line contains a special character that is not interpreted correctly on Korean Windows. I guess this applies to Chinese and Japanese Windows too.

static xmlChar gen_xmlChar(int no, int nr ATTRIBUTE_UNUSED) {
    if (no == 0) return('a');
    if (no == 1) return(' ');
    if (no == 2) return((xmlChar) 'ø');  << Here is the problematic line
    return(0);
}

A workaround for me was to convert the encoding of the file to UTF-8 using the following commands.

$ piconv -f iso-8859-1 -t utf8 ./wntmsci12.pro/misc/build/libxml2-2.6.31/testapi.c > testapi.c.utf8
$ cp testapi.c.utf8 ./wntmsci12.pro/misc/build/libxml2-2.6.31/testapi.c