Bug 59638 - org.apache.poi.ss.usermodel.DataFormatter forces a comma as a grouping separator
Summary: org.apache.poi.ss.usermodel.DataFormatter forces a comma as a grouping separator
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: SS Common (show other bugs)
Version: 3.14-FINAL
Hardware: PC All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-05-27 07:13 UTC by Toni Helenius
Modified: 2016-06-07 07:55 UTC (History)
1 user (show)



Attachments
Cell format in Excel (20.72 KB, image/png)
2016-05-27 10:57 UTC, Toni Helenius
Details
proposed fix and test cases (patch) (2.70 KB, patch)
2016-06-06 21:22 UTC, Axel Howind
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Toni Helenius 2016-05-27 07:13:13 UTC
org.apache.poi.ss.usermodel.DataFormatter forces a comma as a grouping separator. This used to work with POI 3.13-final. I'm using Finnish locale. And we have a space (" ") character as a grouping separator. I debugged this and it indeed goes to POI correctly, the grouping separator.

The code:
DataFormatter formatter = new DataFormatter();
formatter.formatCellValue(cell);

I'm not sure but is it this commit that caused it:
https://svn.apache.org/viewvc/poi/trunk/src/java/org/apache/poi/ss/usermodel/DataFormatter.java?r1=1712605&r2=1712604&pathrev=1712605

I can see a comma added there. I didn't really debug that far. Just a stab in the dark.
Comment 1 Nick Burch 2016-05-27 10:47:57 UTC
On the whole, Apache POI only works with the formats defined in Excel, which are hard coded to US-English ones. We don't currently have any support for performing the localisation that Excel does on many formats when opened in other locales

Right now, if we render to a string the same way that US-English Excel does, we consider it "working"

We would like to get the list of "localised" format strings, for those formats which support it (some formats are hard-coded as US-English, but many do change). Were that done, we'd then be able to properly support localising formats for those who wanted it. DateFormatConverter has a little bit of that, but needs generalising for all format types

Volunteer needed!
Comment 2 Toni Helenius 2016-05-27 10:57:34 UTC
Created attachment 33899 [details]
Cell format in Excel

The decimal separator comes out as Finnish. It is also comma. US-English would have a period, right?

Attachment show how the cell shows up in Excel.
Comment 3 Axel Howind 2016-06-06 21:22:50 UTC
Created attachment 33921 [details]
proposed fix and test cases (patch)
Comment 4 Javen O'Neal 2016-06-07 06:26:06 UTC
Thanks for the patch, Axel! Applied in r1747139.
Updated changelog in r1747144.

Please reopen if anything else is needed.
Comment 5 Javen O'Neal 2016-06-07 07:55:15 UTC
(In reply to Nick Burch from comment #1)
> We would like to get the list of "localised" format strings

Java's understanding of localized number formatting: http://docs.oracle.com/cd/E19455-01/806-0169/overview-9/index.html
This may differ from Excel. 
https://support.office.com/en-us/article/Change-the-character-used-to-separate-thousands-or-decimals-c093b545-71cb-4903-b205-aebb9837bd1e
We should improve the comments in DataFormatter to make a distinction between thousands separator and decimal separator.

https://support.office.com/en-us/article/show-or-hide-the-thousands-separator-b9f8aee0-ef50-42e5-8fd7-6e3ab1493876
"NOTE: By default, Microsoft Office Excel displays the system separator for thousands. You can specify a different system separator by changing the regional settings in Control Panel."
Can someone compile a list of thousands and decimal separators for each region with Excel 2007+?