Bug 63211 - DataFormatter incorrectly formats data formats with escaped percent character
Summary: DataFormatter incorrectly formats data formats with escaped percent character
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: SS Common (show other bugs)
Version: unspecified
Hardware: PC All
: P2 major (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-02-26 23:59 UTC by spdooley
Modified: 2021-11-25 13:16 UTC (History)
0 users



Attachments
Cell format & value in Excel (15.10 KB, image/png)
2019-02-27 15:09 UTC, spdooley
Details

Note You need to log in before you can comment on or make changes to this bug.
Description spdooley 2019-02-26 23:59:04 UTC
When a format contains an escaped percent character (i.e. \% or "%"), the escaping is ignored and the resultant formatted value is 100 times larger than it should be. Some examples to illustrate the problem:

// Should be 12.5% but is 1250.0%
new DataFormatter(Locale.US).formatRawCellContents(12.5, -1, "0.0\\%;\\-0.0\\%");
// Should be 12.5% but is 1250.0%
new DataFormatter(Locale.US).formatRawCellContents(12.5, -1, "0.0\"%\";\\-0.0\"%\"");
// Should be -12.5% but is -1250.0%
new DataFormatter(Locale.US).formatRawCellContents(-12.5, -1, "0.0\\%;\\-0.0\\%");
// Should be -12.5% but is -1250.0%
new DataFormatter(Locale.US).formatRawCellContents(-12.5, -1, "0.0\\%;\\-0.0\\%");
Comment 1 PJ Fanning 2019-02-27 13:25:58 UTC
12.5% is 0.125 in decimal. formatRawCellContents should assume you are providing a decimal.
Comment 2 spdooley 2019-02-27 15:09:24 UTC
Created attachment 36470 [details]
Cell format & value in Excel

In Excel, when the percent sign is escaped with a backslash, it does not treat the value as a percentage and merely uses the decimal value as is without any adjustment.
Comment 3 spdooley 2019-02-27 15:20:36 UTC
See attachment for details on how this is handled in Excel. It seems the problem is related to DataFormatter#cleanFormatForNumber. At line 675, there is a check for a quote in backslash and these characters are removed from the format. This does not seem correct. Instead, it seems that when there is a backslash or a quoted value, it should be converted to quoted value in the Java format (e.g. \% -> '%' and "%" -> '%').
Comment 4 spdooley 2019-02-27 15:31:23 UTC
Additionally, see the Excel documentation regarding format code guidelines: https://support.office.com/en-us/article/number-format-codes-5026bbd6-04bc-48cd-bf33-80f18b4eae68. From the text and spacing section:

To display both text and numbers in a cell, enclose the text characters in double quotation marks (" ") or precede a single character with a backslash (\). Include the characters in the appropriate section of the format codes. For example, you could type the format $0.00" Surplus";$–0.00" Shortage" to display a positive amount as "$125.74 Surplus" and a negative amount as "$–125.74 Shortage."
Comment 5 PJ Fanning 2021-11-25 05:07:40 UTC
added r1895312