Hello, When someone whose natural language uses a DBCS (double-byte character set) works with the HSSF library, they currently have to call HSSFCell#setEncoding(HSSFCell#ENCODING_UTF_16) on every cell object. This is inconvenient. In Japan, I have sometimes seen new users who could not find this method, produced cells containing broken characters, and judged POI to be "useless". When a cell's value is serialized, I think it would be easy to check whether the value contains double-byte characters. So I suggest adding an encoding mode ENCODING_AUTOMATIC to HSSFCell; when this mode is selected, the POI library would check the characters automatically and decide whether to serialize the value as ENCODING_COMPRESSED_UNICODE or ENCODING_UTF_16. I also hope you will make ENCODING_AUTOMATIC the default encoding mode.
The latest code in SVN uses the string's characters to determine the encoding type, which does what you describe. Of course, I am not a Unicode expert, so if the latest code doesn't work, please raise a bug with an appropriate test case. Jason
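For readers following the thread, the per-string check discussed above could look roughly like the sketch below. It is not the code committed to SVN. The class name EncodingChooser, its two constants, and the 0xFF threshold are assumptions made for illustration: the sketch treats a string as safe for the compressed form only when every character fits in a single byte.

```java
// Hypothetical helper, not part of POI: picks an encoding per string value.
final class EncodingChooser {
    // Local stand-ins for illustration only; these are not POI's constants.
    static final short COMPRESSED_UNICODE = 0;
    static final short UTF_16 = 1;

    // Returns true if any character does not fit in one byte
    // (assumed threshold: code units above 0xFF need the 16-bit form).
    static boolean needsUtf16(String value) {
        for (int i = 0; i < value.length(); i++) {
            if (value.charAt(i) > 0xFF) {
                return true;
            }
        }
        return false;
    }

    // Chooses the 16-bit form only when the string actually requires it.
    static short chooseEncoding(String value) {
        return needsUtf16(value) ? UTF_16 : COMPRESSED_UNICODE;
    }

    public static void main(String[] args) {
        // A plain ASCII value and a Japanese value written with Unicode escapes.
        System.out.println(chooseEncoding("Hello"));
        System.out.println(chooseEncoding("\u65e5\u672c\u8a9e"));
    }
}
```

With a check like this applied at serialization time, a caller would not need to set the encoding on each cell by hand, which is the behaviour requested in the original post.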