Bug 61792 - Need to rework any code that iterates over chars
Summary: Need to rework any code that iterates over chars
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall
Version: 3.17-FINAL
Hardware: PC Mac OS X 10.1
Importance: P2 enhancement
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2017-11-20 22:48 UTC by PJ Fanning
Modified: 2021-10-18 19:57 UTC
Description PJ Fanning 2017-11-20 22:48:24 UTC
If we need to iterate over chars, we should use code points (ints) instead of char primitives. Characters outside the Basic Multilingual Plane need two Java chars (a surrogate pair) to represent one Unicode code point.
DrawTextParagraph.java has an example where we iterate over the chars of a String.
See https://stackoverflow.com/questions/1527856/how-can-i-iterate-through-the-unicode-codepoints-of-a-java-string
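For illustration, a minimal sketch of code point iteration using only standard JDK calls (String.codePointAt and Character.charCount). The class and method names here are hypothetical and are not taken from DrawTextParagraph.java or any other POI source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not POI code: walks a String by code point, not by char.
class CodePointIteration {

    /** Returns the code points of s, treating each surrogate pair as one unit. */
    static List<Integer> codePointsOf(String s) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);        // reads one or two chars
            result.add(cp);
            i += Character.charCount(cp);     // advance by 1, or by 2 for a surrogate pair
        }
        return result;
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (encoded as the surrogate pair D83D DE00), 'b'
        String s = "a\uD83D\uDE00b";
        System.out.println(s.length());              // 4 chars
        System.out.println(codePointsOf(s).size());  // 3 code points
    }
}
```

On Java 8 and later, s.codePoints() gives the same sequence as an IntStream; the explicit loop above also works on older Java versions.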
Comment 1 Javen O'Neal 2017-11-20 23:28:05 UTC
Is there any way to add this to forbidden-apis-check to find the issues and make sure it stays fixed?
Comment 2 PJ Fanning 2017-11-20 23:32:11 UTC
We should forbid:
Character.toLowerCase() and Character.toUpperCase()
String.toLowerCase() and String.toUpperCase()

We should only use String.toLowerCase(Locale) and String.toUpperCase(Locale).
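A short sketch of why the no-argument variants are risky: they use the JVM default locale, so the result can change between machines. The class and method names below are hypothetical, and the choice of Locale.ROOT for locale-independent keys is an assumption for illustration, not something stated in this bug.

```java
import java.util.Locale;

// Hypothetical example, not POI code.
class LocaleCaseDemo {

    /** Upper-cases with an explicit locale so the result does not depend on the JVM default. */
    static String normalizeKey(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Under Turkish rules, 'i' upper-cases to dotted capital I (U+0130).
        System.out.println("title".toUpperCase(turkish)); // TİTLE

        // With an explicit root locale the result is the same everywhere.
        System.out.println(normalizeKey("title"));        // TITLE
    }
}
```

"title".toUpperCase() with no argument would return the first form on a JVM whose default locale is Turkish, which is the kind of environment-dependent behaviour the explicit Locale overloads avoid.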
Comment 4 Dominik Stadler 2017-12-26 10:35:08 UTC
This seems to be mostly fixed now, is there still anything missing?
Comment 5 PJ Fanning 2017-12-26 11:13:18 UTC
Dominik, there are still a lot of places where the POI code iterates over chars. I suspect that it is best not to proceed with refactoring most of this code though. The risk of introducing new bugs needs to be weighed against the likelihood that the code in question will ever need to process Unicode surrogates correctly.