|Summary:||DocumentSummary ignores codepage settings|
|Component:||HPSF||Assignee:||POI Developers List <dev>|
Description michael.gesmann 2004-05-19 13:02:50 UTC
Problem: I have an Excel input file that was generated on a German PC, i.e. the file was written with an ISO-8859-1 encoding. The file properties as well as the content (sheet name and cell content) contain German umlauts. I am reading this file in a Java VM started with -Dfile.encoding=ISO646-US, running in a debugger (CodeGuide). When I read the document's SummaryInformation with HPSF, the returned strings (Java Unicode) contain "?" instead of the umlauts. When I read the sheet name and cell content, I see the umlauts as expected.

No example output: Unfortunately, I can see this only in the debugger, and I do not know how to show it with a short example. If I set -Dfile.encoding=ISO-8859-1, I get the correct result with umlauts. If I use another encoding (in my case ISO646-US), a System.out.print() converts all umlauts into "?".

System environment: I downloaded poi-bin-2.5-final-20040302.zip from http://ftp.uni-erlangen.de/pub/mirrors/apache/jakarta/poi/release/bin, so I expect this to be version 2.5 (not in the list above). I am compiling and running everything with JDK 1.4.2_02.

Relevance: The problem occurs not only when the file.encoding property is set explicitly, but also when the file is read on a machine with a different default encoding. We are only interested in the Java Unicode String, not in any other output device.

Further info: The current HPSF sources in CVS contain a class "VariantSupport.java", which seems to implement codepage support in the SummaryInformation. This source is not contained in the downloaded 2.5 version. I can provide an example if needed, but I have no idea how to attach it here.

Best regards, Michael Gesmann
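The symptom described above can probably be reproduced without POI at all. The following is only a minimal sketch of the suspected mechanism, not the HPSF code itself: the class CodepageDemo and its decode helper are hypothetical names, the byte values are an assumed ISO-8859-1 encoding of "Müller", and StandardCharsets requires Java 7 or later (on JDK 1.4 the String constructor that takes a charset name would be used instead). It shows that decoding the same bytes with US-ASCII loses the umlaut, while decoding with the charset the bytes were written in preserves it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CodepageDemo {

    // Hypothetical helper: decode raw property bytes with an explicit charset
    // instead of relying on the platform default (file.encoding).
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Assumed sample data: "Müller" as ISO-8859-1 bytes (0xFC is 'ü').
        byte[] raw = { 'M', (byte) 0xFC, 'l', 'l', 'e', 'r' };

        // Decoding with the charset the bytes were written in keeps the umlaut.
        String latin1 = decode(raw, StandardCharsets.ISO_8859_1);

        // Decoding as US-ASCII cannot map 0xFC; Java substitutes U+FFFD,
        // which many tools then display as "?".
        String ascii = decode(raw, StandardCharsets.US_ASCII);

        System.out.println("ISO-8859-1 keeps umlaut: " + latin1.equals("M\u00FCller"));
        System.out.println("US-ASCII loses umlaut:   " + (ascii.indexOf('\uFFFD') >= 0));
    }
}
```

If a reader of the property set decodes strings with the platform default charset, the result would depend on file.encoding in just this way, which would match the behaviour reported here.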
Comment 1 Rainer Klute 2004-06-02 18:09:23 UTC
Codepage support is implemented in the CVS HEAD but not in the 2.5 release.