Bug 51572

Summary: ISO-8859-15 support in StringUtils
Product: POI
Reporter: Alejandro Torras <atec.post>
Component: POI Overall
Assignee: POI Developers List <dev>
Status: RESOLVED WORKSFORME    
Severity: major    
Priority: P2    
Version: 3.6-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Attachments: Test case

Description Alejandro Torras 2011-07-28 07:20:49 UTC
Created attachment 27326
Test case

Hi all,

I'm having a problem opening an .XLS file (saved in Excel 97-2003 format by Excel 2007) that contains euro characters.
The output shows a "?" (question mark) instead of the euro character "€".

Searching around, I found a comment by Douglas Atique at https://issues.apache.org/bugzilla/show_bug.cgi?id=30319#c10 that points to org.apache.poi.util.StringUtil.java.
Its code clearly shows that it uses ISO-8859-1 instead of the newer ISO-8859-15.

I think it would be better to use the newer encoding.

More info:
* http://en.wikipedia.org/wiki/ISO/IEC_8859-15
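
To illustrate the difference between the two encodings, here is a minimal sketch using only the standard java.nio.charset API. It is not POI code: the class name EuroEncodingDemo and the helper hex are made up for this example, and it assumes the JRE ships the ISO-8859-15 charset (only ISO-8859-1 is guaranteed by the Java specification). It shows that ISO-8859-1 has no euro sign, so encoding falls back to "?", while ISO-8859-15 maps it to byte 0xA4. Whether this is the actual cause of the "?" in the report is not established here.

```java
import java.nio.charset.Charset;

public class EuroEncodingDemo {

    // Hypothetical helper: encodes text in the named charset and
    // returns the resulting bytes as upper-case hex pairs.
    static String hex(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String euro = "\u20AC"; // the euro sign

        // ISO-8859-1 cannot represent U+20AC, so String.getBytes
        // substitutes the replacement byte 0x3F ('?').
        System.out.println("ISO-8859-1 : " + hex(euro, "ISO-8859-1"));  // 3F

        // ISO-8859-15 assigns the euro sign to byte 0xA4.
        // Assumption: this charset is available in the running JRE.
        System.out.println("ISO-8859-15: " + hex(euro, "ISO-8859-15")); // A4
    }
}
```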


Thanks,
Alejandro.
Comment 1 Alejandro Torras 2011-07-28 07:24:40 UTC
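Simple Java code to dump the contents:

	// Imports needed by the method below (POI 3.6 package names).
	import java.io.InputStream;
	import java.util.Iterator;
	import org.apache.poi.hssf.usermodel.HSSFSheet;
	import org.apache.poi.hssf.usermodel.HSSFWorkbook;
	import org.apache.poi.poifs.filesystem.POIFSFileSystem;
	import org.apache.poi.ss.usermodel.Cell;
	import org.apache.poi.ss.usermodel.Row;

	// Prints every cell of the first sheet as tab-separated text, one row per line.
	private void dumpExcel(InputStream is) throws Exception {

		final HSSFSheet st = new HSSFWorkbook(new POIFSFileSystem(is)).getSheetAt(0);
		for (final Iterator<Row> ri = st.rowIterator(); ri.hasNext();) {
			final Row r = ri.next();
			for (final Iterator<Cell> ci = r.cellIterator(); ci.hasNext();) {
				final Cell c = ci.next();
				// Force the cell to string type so getStringCellValue() works for any cell.
				c.setCellType(Cell.CELL_TYPE_STRING);
				System.out.print(c.getStringCellValue() + '\t');
			}
			System.out.println();
		}
	}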
Comment 2 Nick Burch 2011-07-28 10:50:56 UTC
It's not a question of what would be better, but what Excel itself does...

Normally a string with a euro symbol in it will get stored as a Unicode string, not an 8-bit one.

Could you try creating some files with characters that are in ISO-8859-1 but not in ISO-8859-15, and the other way around? We can then use those to see whether Excel flags in some way which of the two encodings it has decided to use.
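
For anyone building such test files, the byte values where the two encodings disagree can be listed with a short sketch like the one below. It uses only the standard java.nio.charset API; the class name Latin1VsLatin9 and the method differingBytes are hypothetical names for this example, and it assumes the JRE provides ISO-8859-15. It says nothing about what Excel actually writes, it only enumerates candidate characters.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class Latin1VsLatin9 {

    // Hypothetical helper: returns every byte value (0..255) that decodes
    // to a different character in ISO-8859-1 than in ISO-8859-15.
    static List<Integer> differingBytes() {
        Charset latin1 = Charset.forName("ISO-8859-1");
        // Assumption: ISO-8859-15 is available in the running JRE.
        Charset latin9 = Charset.forName("ISO-8859-15");
        List<Integer> result = new ArrayList<Integer>();
        for (int b = 0; b < 256; b++) {
            byte[] one = { (byte) b };
            String s1 = new String(one, latin1);
            String s9 = new String(one, latin9);
            if (!s1.equals(s9)) {
                result.add(b);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Charset latin1 = Charset.forName("ISO-8859-1");
        Charset latin9 = Charset.forName("ISO-8859-15");
        for (int b : differingBytes()) {
            byte[] one = { (byte) b };
            // Print the byte and the code point it decodes to in each charset.
            System.out.println(String.format("0x%02X  ISO-8859-1: U+%04X  ISO-8859-15: U+%04X",
                    b,
                    (int) new String(one, latin1).charAt(0),
                    (int) new String(one, latin9).charAt(0)));
        }
    }
}
```

Characters from the ISO-8859-1 column (for example U+00A4, the generic currency sign) exist only in ISO-8859-1, and those from the ISO-8859-15 column (for example U+20AC, the euro sign) exist only in ISO-8859-15.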
Comment 3 Dominik Stadler 2015-03-22 19:35:04 UTC
We have been waiting for information since 2011, so I am resolving this for now. Please reopen with some more sample files if this is still an issue for you.