Bug 63211

Summary: DataFormatter incorrectly formats data formats with escaped percent character
Product: POI Reporter: spdooley
Component: SS CommonAssignee: POI Developers List <dev>
Status: NEW ---    
Severity: major    
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: Cell format & value in Excel

Description spdooley 2019-02-26 23:59:04 UTC
When a format contains an escaped percent character (i.e. \% or "%"), the escaping is ignored and the resultant formatted value is 100 times larger than it should be. Some examples to illustrate the problem:

// Should be 12.5% but is 1250.0%
new DataFormatter(Locale.US).formatRawCellContents(12.5, -1, "0.0\\%;\\-0.0\\%");
// Should be 12.5% but is 1250.0%
new DataFormatter(Locale.US).formatRawCellContents(12.5, -1, "0.0\"%\";\\-0.0\"%\"");
// Should be -12.5% but is -1250.0%
new DataFormatter(Locale.US).formatRawCellContents(-12.5, -1, "0.0\\%;\\-0.0\\%");
// Should be -12.5% but is -1250.0%
new DataFormatter(Locale.US).formatRawCellContents(-12.5, -1, "0.0\\%;\\-0.0\\%");
Comment 1 PJ Fanning 2019-02-27 13:25:58 UTC
12.5% is 0.125 in decimal. formatRawCellContents should assume you are providing a decimal.
Comment 2 spdooley 2019-02-27 15:09:24 UTC
Created attachment 36470 [details]
Cell format & value in Excel

In Excel, when the percent sign is escaped with a backslash, it does not treat the value as a percentage and merely uses the decimal value as is without any adjustment.
Comment 3 spdooley 2019-02-27 15:20:36 UTC
See attachment for details on how this is handled in Excel. It seems the problem is related to DataFormatter#cleanFormatForNumber. At line 675, there is a check for a quote in backslash and these characters are removed from the format. This does not seem correct. Instead, it seems that when there is a backslash or a quoted value, it should be converted to quoted value in the Java format (e.g. \% -> '%' and "%" -> '%').
Comment 4 spdooley 2019-02-27 15:31:23 UTC
Additionally, see the Excel documentation regarding format code guidelines: https://support.office.com/en-us/article/number-format-codes-5026bbd6-04bc-48cd-bf33-80f18b4eae68. From the text and spacing section:

To display both text and numbers in a cell, enclose the text characters in double quotation marks (" ") or precede a single character with a backslash (\). Include the characters in the appropriate section of the format codes. For example, you could type the format $0.00" Surplus";$–0.00" Shortage" to display a positive amount as "$125.74 Surplus" and a negative amount as "$–125.74 Shortage."