The StringUtil.getFromUnicode(...) function returns incorrect results for characters with codes above 127.
Created attachment 2345: Now getting from Unicode correctly
Getting the symbols from Unicode more accurately with the lower half.
This is the patch. Sorry for the double message; I forgot to mark the previous patch as [PATCH]. Here is the code:

Index: src/java/org/apache/poi/util/StringUtil.java
===================================================================
RCS file: /home/cvspublic/jakarta-poi/src/java/org/apache/poi/util/StringUtil.java,v
retrieving revision 1.2
diff -r1.2 StringUtil.java
147,155c147,154
<         byte[] bstring = new byte[len];
<         int index = offset + 1;
<         // start with low bits.
<
<         for (int k = 0; k < len; k++) {
<             bstring[k] = string[index];
<             index += 2;
<         }
<         return new String(bstring);
---
>
>         char[] chars = new char[ len ];
>         for ( int i = 0; i < chars.length; i++ ) {
>             chars[i] = (char)( string[ offset + ( 2 * i ) ] +
>                 ( string[ offset + ( 2 * i + 1 ) ] << 8 ) );
>         }
>
>         return new String( chars );
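One thing worth checking in the patched loop: Java bytes are signed, so a byte of 0x80 or above is sign-extended to a negative int before the addition. With a low byte of 0x80 and a high byte of 0x00, for example, the unmasked expression yields U+FF80 rather than U+0080. Whether this relates to any particular test result is not established here. The following is only a minimal sketch of the same loop with both bytes masked with 0xff, assuming two-byte, low-byte-first characters as in the patch. The class and method names are hypothetical and are not POI code.

```java
public class UnicodeLESketch {

    /**
     * Decodes len two-byte characters stored low byte first,
     * starting at offset. Hypothetical helper, not the POI method.
     */
    static String getFromUnicodeLE(byte[] string, int offset, int len) {
        char[] chars = new char[len];
        for (int i = 0; i < len; i++) {
            // Mask each byte: Java bytes are signed, so 0x80..0xff
            // would otherwise be sign-extended to negative ints.
            int low = string[offset + 2 * i] & 0xff;
            int high = string[offset + 2 * i + 1] & 0xff;
            chars[i] = (char) (low | (high << 8));
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        // U+0416 (Cyrillic Zhe) and U+0080, each stored low byte first.
        byte[] data = { 0x16, 0x04, (byte) 0x80, 0x00 };
        String s = getFromUnicodeLE(data, 0, 2);
        System.out.println((int) s.charAt(0)); // 1046
        System.out.println((int) s.charAt(1)); // 128
    }
}
```

The masking only matters when a byte is 0x80 or higher; for bytes below that, the masked and unmasked expressions give the same character.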
Thank you for this patch. In the future, please remember to add yourself to the @author tags. I have applied and committed it; please cross-check. As a suggestion, it would be good to have a unit test demonstrating the failing condition that this fixes (Russian text, for instance). Look under src/testcases/org/apache/poi/hssf to see how easy it is to create such unit tests. Thanks, -Andy
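A rough idea of what such a Russian-text check could look like, written as plain Java rather than as a JUnit test case so that it runs standalone. Everything named here (RussianRoundTripSketch, Decoder, roundTrips) is hypothetical; the Decoder interface is only a hook where the method under test, such as StringUtil.getFromUnicode, would be plugged in, and the JDK's UTF-16LE charset is used as a stand-in decoder.

```java
import java.nio.charset.StandardCharsets;

public class RussianRoundTripSketch {

    /** Hypothetical hook for the method under test. */
    interface Decoder {
        String decode(byte[] data, int offset, int len);
    }

    /** Returns true if the decoder recovers the Russian sample from its two-byte, low-byte-first form. */
    static boolean roundTrips(Decoder decoder) {
        String expected = "\u041F\u0440\u0438\u0432\u0435\u0442"; // "Привет"
        byte[] data = expected.getBytes(StandardCharsets.UTF_16LE);
        String actual = decoder.decode(data, 0, expected.length());
        return expected.equals(actual);
    }

    public static void main(String[] args) {
        // Stand-in decoder: the JDK's own UTF-16LE charset (not POI code).
        Decoder jdk = (d, off, len) ->
                new String(d, off, len * 2, StandardCharsets.UTF_16LE);
        System.out.println(roundTrips(jdk)); // true

        // A decoder that keeps only one byte per character cannot
        // recover Cyrillic text, so the check reports a failure.
        Decoder oneByte = (d, off, len) -> {
            byte[] b = new byte[len];
            for (int k = 0; k < len; k++) {
                b[k] = d[off + 2 * k];
            }
            return new String(b, StandardCharsets.ISO_8859_1);
        };
        System.out.println(roundTrips(oneByte)); // false
    }
}
```

In a real test case the lambda would be replaced by a call to the POI method, and the boolean result by an assertion from the test framework in use.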
I attempted to apply this; however, it caused a unit test to fail:

Running org.apache.poi.util.TestShortList
Tests run: 18, Failures: 0, Errors: 0, Time elapsed: 0.721 sec
Running org.apache.poi.util.TestStringUtil
Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 0.631 sec

BUILD FAILED

D:\andy\homestuff\jakarta-poi2\2\jakarta-poi\tools\cents\junit.cent\xbuild.xml:59: Test org.apache.poi.util.TestStringUtil failed

Total time: 4 minutes 1 second

Please run "./build.sh clean compile test". Once you've corrected this, please resubmit it and I'll apply it. Thanks.