When reading values from number cells, for some type of formatted numbers, HSSFDataFormatter returns incorrect values: 1. Simple numbers: adds an underscore after the number in the formatted string, using default number format, eg. a cell with the number 4455667788 is formatted as a string like '4455667788_'. 2. Account numbers are formatted with different precision than in Excel, eg. the number 0,896665 in excel, with 2 digits precision, is shown as 0,90. HSSFDataFormatter returns 0,896665. The same number, as currency or number is formatted as it appears in Excel (number, as reported above, adds an underscore in the end of the string, it returns '0,90_'. 3. Fraction numbers always returns the string '1 ??/??'. The XLS document was generated using Excel 2007 / portuguese, in Excel 97/2003 format. Other kind of number formats work fine, like currency, date, hour, percentage, scientific and phone numbers. Brazilian zip codes (ddddd-ddd) are formatted as dddddddd- and taxpayer registry numbers (ddddddddd-dd) as ddddddddddd-
Brazilian zip codes (ddddd-ddd) and taxpayer registry numbers (ddddddddd-dd) will require custom formats which you can easily add via addFormat(String, java.text.Format). See docs and source for examples of SSN, phone-number, etc. As far as the other problems, can you attach a sample xls that demonstrates the behavior? (In reply to comment #0) > When reading values from number cells, for some type of formatted numbers, > HSSFDataFormatter returns incorrect values: > 1. Simple numbers: adds an underscore after the number in the formatted string, > using default number format, eg. a cell with the number 4455667788 is formatted > as a string like '4455667788_'. > 2. Account numbers are formatted with different precision than in Excel, eg. > the number 0,896665 in excel, with 2 digits precision, is shown as 0,90. > HSSFDataFormatter returns 0,896665. The same number, as currency or number is > formatted as it appears in Excel (number, as reported above, adds an underscore > in the end of the string, it returns '0,90_'. > 3. Fraction numbers always returns the string '1 ??/??'. > The XLS document was generated using Excel 2007 / portuguese, in Excel 97/2003 > format. > Other kind of number formats work fine, like currency, date, hour, percentage, > scientific and phone numbers. > Brazilian zip codes (ddddd-ddd) are formatted as dddddddd- and taxpayer > registry numbers (ddddddddd-dd) as ddddddddddd-
Created attachment 22825 [details] Repro xls file for issue #3 Attaching repro xls file for Issue no:3 for fractions. This is reproduced in 3.1 and 3.5 as well.
Internally this class uses java.text.DecimalFormat and java.text.SimpleDateFormat. Fraction formatting support was not included in HSSFDataFormatter because it would require a dependency on org.apache.commons.math.fraction.FractionFormat or a similar custom Format class. Does the POI development team have a recommendation for this matter?
Could we dynamically load commons-math at runtime if it's there? If it is, you get fractions, if not, you don't (but also don't need another jar as a dependency)
Nick, your suggestion to dynamically load commons-math at runtime sounds like a good compromise. Unfortunately with my current project load, I have no bandwidth to devote to this right now. I can look at this when my schedule lightens up if a patch is not submitted in the meantime.
I'd suggest we leave the bug open for now then. If someone else fancies adding this new code, they can do and attach the patch to the bug. If not, please do work on it yourself once you have a spare moment :)
I have been implementing some code in Apache Tika (part of the Lucene community) to improve our number format support as part of a JIRA Ticket. As part of this is related to fractions I would like to progress the support of fractions with POI, therefore I have made changes to support dynamic usage of Commons Math. Still have a few tweaks to make and then will attach. Cheers, Dave
Great news, we'll look forward to seeing the patch :)
I've just made some improvements to how POI handles excel formatting, in relation to bug #48872. I think this will help with a few of your problems. Fractions still remains as a todo though. I feel I've a better handle on how it all works now, which should make reviewing and applying a patch easier, so if we could get a patch for fractions that'd be great!
As of r1231144, fractions (including the one in this file) are correctly formatted by DataFormatter, as part of the merging of the code from TEXT