Bug 49928 - formatCellValue returns incorrect value for £ formatted cells
Summary: formatCellValue returns incorrect value for £ formatted cells
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall (show other bugs)
Version: 3.6-FINAL
Hardware: PC Windows Vista
: P2 normal with 4 votes (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-09-14 10:48 UTC by Bob Smith
Modified: 2011-01-20 08:20 UTC (History)
0 users



Attachments
A cell containing £1, that is formatted by POI as $1 (7.80 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2010-09-14 10:48 UTC, Bob Smith
Details
The Excel format dialog in an English locale (60.75 KB, image/png)
2010-10-04 05:54 UTC, Bob Smith
Details
The Excel format dialog in a US locale (64.26 KB, image/png)
2010-10-04 05:56 UTC, Bob Smith
Details
The original test-case file, saved as .xls (16.50 KB, application/vnd.ms-excel)
2010-12-20 08:06 UTC, Bob Smith
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Bob Smith 2010-09-14 10:48:47 UTC
Created attachment 26025 [details]
A cell containing £1, that is formatted by POI as $1

Steps to reproduce:

1. Using a machine set to the English locale, create an xlsx file using Excel 2007.
2. Type "£1" into a cell and observe that Excel converts it to a currency type cell with the "£" symbol. (See attachment).
3. Use POI to read the cell from step 2; call [new org.apache.poi.ss.usermodel.DataFormatter.formatCellValue(Cell)] on the cell.
4. Observe that the returned value is "$1", not "£1" as expected.
Comment 1 Nick Burch 2010-09-17 10:19:56 UTC
I'm pretty sure you're being bitten by an excel "feature". Excel is probably treating your £ as a generic currency symbol, and actually storing a $ in the file

Can you please try opening your sample file on a computer set to be in the US, and see what you get in excel?

I've just created a file with formattings of both $00.00 and £00.00, and POI is able to retrieve both symbols properly. So, I think the problem is with how you're setting the format, and not with POI
Comment 2 Bob Smith 2010-10-04 05:53:39 UTC
> Can you please try opening your sample file on a computer set to be in the US,
and see what you get in excel?

I have set my locale to US, and when I open the sample file I am shown "£1". I have tried this on two different systems - running Vista and Windows 7.

For reference, in the following posts I will attach screenshots of how the formatting dialog looks when opening the sample file in an English locale, and in a US locale.
Comment 3 Bob Smith 2010-10-04 05:54:54 UTC
Created attachment 26116 [details]
The Excel format dialog in an English locale
Comment 4 Bob Smith 2010-10-04 05:56:05 UTC
Created attachment 26117 [details]
The Excel format dialog in a US locale
Comment 5 Nick Burch 2010-10-04 06:08:42 UTC
OK, this is odd

In the english locale, it shows up as the default currency symbol. This is what I'd expect, as excel is actually writing the file with a $ and not a £

In the US locale, excel seems to know that it really does need to be a £, and not be the default currency. Thus, it shows up as a custom format with a £ in it

Quite how Excel is managing to know this I've no idea...

I'll need to do some more digging when I have a minute
Comment 6 Yegor Kozlov 2010-10-08 04:42:22 UTC
Fixed in r1005726, junit added. 

The problem workbook demonstrates an unexpected behavior. Excel has a set of built-in formats referenced by an integer id in the range [0, 164]. If a custom format is applied then it is created with a new id above 164. This behavior is observed in Excel 2003 and Excel 2007, both for the .xls and .xlsx formats.

In your case a custom format was applied, but its formatId was 6, i.e. in the built-in range. As the result, DataFormatter picked up a wrong built-in format with Dollar instead of Pound. 

I fixed the order in which formats are checked, now overridden formats have higher priority than built-in ones regardless from formatId. 

Yegor
Comment 7 Bob Smith 2010-12-20 08:03:57 UTC
I am re-opening this, because although the fix works for .xlsx files, it does not work for .xls files.
Comment 8 Bob Smith 2010-12-20 08:06:18 UTC
Created attachment 26427 [details]
The original test-case file, saved as .xls

This file was created from the originally posted test-case by using Excel to save-as .xls. The original test case file does not exhibit the bug any more, but this file does.
Comment 9 Bob Smith 2011-01-05 12:22:18 UTC
> The original test case file does not exhibit the bug any more, but this file does.

To be clear, I mean that in version 3.7 the bug has been fixed for .xlsx/XSSF, but not for .xls/HSSF.
Comment 10 Nick Burch 2011-01-20 07:43:15 UTC
Looks like the .xls has overridden some standard formats as well:

Offset=0x000003DA(986) recno=44 sid=0x041E size=0x0018(24)
[FORMAT]
    .indexcode       = 0x0005
    .isUnicode       = false
    .formatstring    = "£"#,##0;\-"£"#,##0
[/FORMAT]

Offset=0x000003F6(1014) recno=45 sid=0x041E size=0x001D(29)
[FORMAT]
    .indexcode       = 0x0006
    .isUnicode       = false
    .formatstring    = "£"#,##0;[Red]\-"£"#,##0
[/FORMAT]
Comment 11 Nick Burch 2011-01-20 08:20:51 UTC
Should be fixed for HSSF too in r1061288 - needed a similar fix