Bug 60040 - CellNumberFormatter wrong formating for Locale pt_BR
Summary: CellNumberFormatter wrong formating for Locale pt_BR
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: SS Common (show other bugs)
Version: 3.15-FINAL
Hardware: PC Linux
: P2 major with 4 votes (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-08-24 14:53 UTC by Fernando Kasten Peinado
Modified: 2017-04-10 09:06 UTC (History)
2 users (show)



Attachments
Replacing some use of the chars '.' and ',' by the correct symbol (4.88 KB, text/plain)
2016-08-25 20:06 UTC, Fernando Kasten Peinado
Details
New Patch (4.36 KB, patch)
2016-08-25 20:33 UTC, Fernando Kasten Peinado
Details | Diff
Native US locale processing of formats and changing Decimal and Grouping afterwards (3.20 KB, patch)
2017-03-09 17:01 UTC, Sebastiaan Blommers
Details | Diff
ProposedSolution+UnitTest (7.20 KB, patch)
2017-04-10 09:06 UTC, samson
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Fernando Kasten Peinado 2016-08-24 14:53:20 UTC
After the commit 91e6a646 by Andreas Beeker 2015-09-07 22:34:21, the Locale used to formatting is not fixed to Locale.US anymore. 

But on Locales where DecimalFormat symbols are different from US, like in Brazil: 

'.' => ',' and ',' => '.' 

the code is formatting wrong, mainly because the code uses this symbols fixed in char format like in this code:

private void writeFractional(StringBuffer result, StringBuffer output) {
        int digit;
        int strip;
        ListIterator<Special> it;
        if (fractionalSpecials.size() > 0) {
            digit = result.indexOf(".") + 1;  // <===== HERE
            if (exponent != null)
                strip = result.indexOf("e") - 1;
            else
                strip = result.length() - 1;
            while (strip > digit && result.charAt(strip) == '0')
                strip--;
            it = fractionalSpecials.listIterator();
            while (it.hasNext()) {
                Special s = it.next();
                char resultCh = result.charAt(digit);
                if (resultCh != '0' || s.ch == '0' || digit < strip)
                    output.setCharAt(s.pos, resultCh);
                else if (s.ch == '?') {
                    // This is when we're in trailing zeros, and the format is '?'.  We still strip out remaining '#'s later
                    output.setCharAt(s.pos, ' ');
                }
                digit++;
            }
        }
    }
Comment 1 Fernando Kasten Peinado 2016-08-25 20:06:44 UTC
Created attachment 34174 [details]
Replacing some use of the chars '.' and ',' by the correct symbol
Comment 2 Fernando Kasten Peinado 2016-08-25 20:33:18 UTC
Created attachment 34175 [details]
New Patch
Comment 3 Javen O'Neal 2016-09-10 06:54:26 UTC
Thanks for the locale-specific information! It is always helpful to have Locale-specific feedback from users outside the US/UK.

I see that there is a variable CellNumberPartHandler ph that is a member of CellNumberFormatter. This class contains getDecimalPoint (but not grouping separator). Inside this class are some hard-coded '.' literals. Any chance these would cause bugs in your Locale?

I have never looked at CellNumberFormatter or CellNumberPartHandler before today, so perhaps you would be more knowledgeable here. Would it be better to use the part handler to contain the locale-specific separator characters rather than a new dfs variable?
Comment 4 Javen O'Neal 2016-09-10 07:14:45 UTC
Would you be willing to write a unit test for either o.a.p.ss.format.TestCellFormat (preferred) or o.a.p.ss.usermodel.TestDataFormatter?

It would be nice to have a test case to make sure there aren't regressions on this feature in the future.

https://svn.apache.org/viewvc/poi/trunk/src/java/org/apache/poi/ss/format/ -> https://svn.apache.org/viewvc/poi/trunk/src/testcases/org/apache/poi/ss/format/

https://svn.apache.org/viewvc/poi/trunk/src/java/org/apache/poi/ss/usermodel/DataFormatter.java -> https://svn.apache.org/viewvc/poi/trunk/src/testcases/org/apache/poi/ss/usermodel/TestDataFormatter.java
Comment 5 Sebastiaan Blommers 2017-03-09 17:00:18 UTC
I still had trouble with this patch, I reverted to the use of US locale (formats ok) and fix the DecimalFormat symbolics afterwards when all CellNumberFormatter logic has passed. Whenever CellFormat.getInstance is called that format (the object) is cached. Using a local final DecimalFormatSymbols in the CellNumberFormatter is not a good idea (especially when unit testing different locales).

I changed the corresponding POI version to 3.15 in this ticket.

Unit test with the use of locale
public class TestCellFormat {

	final static Locale NL = new Locale("nl", "NL");
	char pound = '\u00A3';
	char euro = '\u20AC';

	// Accounting -> 0 decimal places, default currency symbol
	String formatDft = "_-\"$\"* #,##0_-;\\-\"$\"* #,##0_-;_-\"$\"* \"-\"_-;_-@_-";
	// Accounting -> 0 decimal places, US currency symbol
	String formatUS = "_-[$$-409]* #,##0_ ;_-[$$-409]* -#,##0 ;_-[$$-409]* \"-\"_-;_-@_-";
	// Accounting -> 0 decimal places, UK currency symbol
	String formatUK = "_-[$" + pound + "-809]* #,##0_-;\\-[$" + pound + "-809]* #,##0_-;_-[$" + pound
			+ "-809]* \"-\"??_-;_-@_-";
	// French style accounting, euro sign comes after not before
	String formatFR = "_-#,##0* [$" + euro + "-40C]_-;\\-#,##0* [$" + euro + "-40C]_-;_-\"-\"??* [$" + euro
			+ "-40C] _-;_-@_-";
	// Dutch style accounting, euro sign comes before
	String formatNL = "_-\"€\"* #,##0.00_-;\\-\"€\"* #,##0.00_-;_-\"€\"* \"-\"_-;_-@_-";

	@Test(enabled = true)
	public void testAccountingFormatsNL() throws IOException {

		LocaleUtil.setUserLocale(NL);

		// Has +ve, -ve and zero rules
		CellFormat cfDft = CellFormat.getInstance(formatDft);
		CellFormat cfUS = CellFormat.getInstance(formatUS);
		CellFormat cfUK = CellFormat.getInstance(formatUK);
		CellFormat cfFR = CellFormat.getInstance(formatFR);
		CellFormat cfNL = CellFormat.getInstance(formatNL);

		// For +ve numbers, should be Space + currency symbol + spaces + whole
		// number with commas + space
		// (Except French, which is mostly reversed...)
		assertEquals(" $   12 ", cfDft.apply(Double.valueOf(12.33)).text);
		assertEquals(" $   12 ", cfUS.apply(Double.valueOf(12.33)).text);
		assertEquals(" " + pound + "   12 ", cfUK.apply(Double.valueOf(12.33)).text);
		assertEquals(" 12   " + euro + " ", cfFR.apply(Double.valueOf(12.33)).text);
		assertEquals(" " + euro + "   12,33 ", cfNL.apply(Double.valueOf(12.33)).text);

		assertEquals(" $   16.789 ", cfDft.apply(Double.valueOf(16789.2)).text);
		assertEquals(" $   16.789 ", cfUS.apply(Double.valueOf(16789.2)).text);
		assertEquals(" " + pound + "   16.789 ", cfUK.apply(Double.valueOf(16789.2)).text);
		assertEquals(" 16.789   " + euro + " ", cfFR.apply(Double.valueOf(16789.2)).text);
		assertEquals(" " + euro + "   16.789,20 ", cfNL.apply(Double.valueOf(16789.2)).text);

		assertEquals(" $   - ", cfDft.apply(Double.valueOf(0)).text);
		assertEquals(" " + euro + "   - ", cfNL.apply(Double.valueOf(0)).text);

		// Test in US locale
		LocaleUtil.setUserLocale(Locale.US);

		// For +ve numbers, should be Space + currency symbol + spaces + whole
		// number with commas + space
		// (Except French, which is mostly reversed...)
		assertEquals(" $   12 ", cfDft.apply(Double.valueOf(12.33)).text);
		assertEquals(" $   12 ", cfUS.apply(Double.valueOf(12.33)).text);
		assertEquals(" " + pound + "   12 ", cfUK.apply(Double.valueOf(12.33)).text);
		assertEquals(" 12   " + euro + " ", cfFR.apply(Double.valueOf(12.33)).text);
		assertEquals(" " + euro + "   12.33 ", cfNL.apply(Double.valueOf(12.33)).text);

		assertEquals(" $   16,789 ", cfDft.apply(Double.valueOf(16789.2)).text);
		assertEquals(" $   16,789 ", cfUS.apply(Double.valueOf(16789.2)).text);
		assertEquals(" " + pound + "   16,789 ", cfUK.apply(Double.valueOf(16789.2)).text);
		assertEquals(" 16,789   " + euro + " ", cfFR.apply(Double.valueOf(16789.2)).text);

		// For -ve numbers, gets a bit more complicated...
		assertEquals("-$   12 ", cfDft.apply(Double.valueOf(-12.33)).text);
		assertEquals(" $   -12 ", cfUS.apply(Double.valueOf(-12.33)).text);
		assertEquals("-" + pound + "   12 ", cfUK.apply(Double.valueOf(-12.33)).text);
		assertEquals("-12   " + euro + " ", cfFR.apply(Double.valueOf(-12.33)).text);

		assertEquals("-$   16,789 ", cfDft.apply(Double.valueOf(-16789.2)).text);
		assertEquals(" $   -16,789 ", cfUS.apply(Double.valueOf(-16789.2)).text);
		assertEquals("-" + pound + "   16,789 ", cfUK.apply(Double.valueOf(-16789.2)).text);
		assertEquals("-16,789   " + euro + " ", cfFR.apply(Double.valueOf(-16789.2)).text);
		assertEquals(" " + euro + "   16,789.20 ", cfNL.apply(Double.valueOf(16789.2)).text);

		assertEquals(" $   - ", cfDft.apply(Double.valueOf(0)).text);
		assertEquals(" " + euro + "   - ", cfNL.apply(Double.valueOf(0)).text);
	}
}
Comment 6 Sebastiaan Blommers 2017-03-09 17:01:58 UTC
Created attachment 34813 [details]
Native US locale processing of formats and changing Decimal and Grouping afterwards
Comment 7 samson 2017-04-10 09:05:11 UTC
- I would suggest a solution similar to Fernando's Solution (i.e. replacing the string literals with a correct locale based separators).
 Here I have attached a possible solution and added a unit test to TestDataFormatter.java (see the attached CustomFormatting.patch file). 
 It seems like the CellNumberPartHandler class is used to parse the format (i.e. "#,##0 ;(#,##0) ;"-""), and such formats are not locale specific.
Comment 8 samson 2017-04-10 09:06:17 UTC
Created attachment 34903 [details]
ProposedSolution+UnitTest