If we need to iterate over the characters of a String, we should use code points (ints) instead of char primitives. A Unicode code point outside the Basic Multilingual Plane is stored as a surrogate pair, which takes 2 Java chars. DrawTextParagraph.java has an example where we iterate over the chars of a String. See https://stackoverflow.com/questions/1527856/how-can-i-iterate-through-the-unicode-codepoints-of-a-java-string
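For illustration, here is a minimal sketch of code point iteration using only standard JDK calls (String.codePointAt and Character.charCount). The class and method names are hypothetical and are not taken from the POI code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of POI.
final class CodePointIteration {

    /** Collects the code points of s, treating each surrogate pair as one code point. */
    static List<Integer> codePoints(String s) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);     // reads one or two chars
            result.add(cp);
            i += Character.charCount(cp);  // advance by 1 (BMP) or 2 (supplementary)
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "a\uD83D\uDE00b"; // 'a', U+1F600 as a surrogate pair, 'b'
        System.out.println(text.length());           // prints 4 (chars)
        System.out.println(codePoints(text).size()); // prints 3 (code points)
    }
}
```

A loop over text.charAt(i) would see the two halves of the surrogate pair as separate values, which is the situation described above.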
Is there any way to add this to forbidden-apis-check to find the issues and make sure it stays fixed?
We should forbid Character.toLowerCase() and toUpperCase(), and String.toLowerCase() and toUpperCase(). We should only use String.toLowerCase(Locale) and toUpperCase(Locale).
https://svn.apache.org/viewvc/poi/trunk/src/resources/devtools/forbidden-signatures.txt?view=log
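To show why the Locale overloads matter, here is a small sketch. The helper class is hypothetical, and using Locale.ROOT for locale-independent conversion is an assumption about what callers want; code that formats user-visible text may need a specific locale instead.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of POI.
final class LocaleSafeCase {

    /** Lower-cases s with locale-independent rules. */
    static String lower(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    /** Upper-cases s with locale-independent rules. */
    static String upper(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // Under Turkish rules, 'I' lower-cases to dotless i (U+0131).
        System.out.println("TITLE".toLowerCase(turkish).equals("t\u0131tle")); // prints true
        System.out.println(lower("TITLE"));                                     // prints title
    }
}
```

The no-argument String.toLowerCase() uses the JVM default locale, so the same call can return different results on different machines.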
This seems to be mostly fixed now. Is there still anything missing?
Dominik, there are still a lot of places where the POI code iterates over chars. I suspect that it is best not to proceed with refactoring most of this code, though. The risk of introducing new bugs needs to be weighed against the likelihood that the code in question needs to process Unicode surrogates correctly.