Bug 44996 - Text cells are treated as numeric if they happen to contain only 0-9
Summary: Text cells are treated as numeric if they happen to contain only 0-9
Status: RESOLVED INVALID
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 3.0-FINAL
Hardware: PC Windows XP
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-05-14 04:29 UTC by Michael Bjerregaard-Pedersen
Modified: 2009-02-18 12:29 UTC (History)
1 user (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Bjerregaard-Pedersen 2008-05-14 04:29:34 UTC
I'm not sure if my problem is identical to 39998, but here goes:

In the description of the class HSSFCell is says ” String cells cannot conatin numbers and numeric cells cannot contain strings (at least according to our model).”
This model causes some problems that are hard to work around when dealing with users input:

1. A string cell, that is a cell for which the format in Excel has explicitly been specified as “Text”, should be allowed to contain anything. But if the cell happens to contain only characters in the range ‘0’-‘9’ getCellType() returns NUMERIC and getRichStringCellValue() throws an exception.

2. A date cell, again explicitly formatted as such in Excel, is also treated as numeric cell when it contains a date like “01.07.2008”


If possible I would like HSSFCell to either quit the guessing and the assuming, or implement something like getRawCellType() and getRawStringCellValue() that would return the data the user entered and not a processed result.
Comment 1 Martin Schöffler 2008-05-14 06:14:58 UTC
(In reply to comment #0)
> If possible I would like HSSFCell to either quit the guessing and the assuming,
> or implement something like getRawCellType() and getRawStringCellValue() that
> would return the data the user entered and not a processed result.

As I understand it POI is already doing what you propose, as it does no guessing but only reads what excel wrote to the xls file.
Formatting a cell as text does not mean it is stored as text internally in the xls-file. Excel automatically stores text that contains only digits as a NUMERIC.
I had some success in forcing excel to store the value as a string by changing the value again after I set format to text. The workflow is as follows: entring value, changing format, changing value. 

But that is not a good workaround. It is best to retrieve the format as pointed out in bug 39998 and try to format the value accordingly.

Comment 2 Michael Bjerregaard-Pedersen 2008-05-15 05:46:35 UTC
I don't know the internals of xls-file, but I do not doubt they are tricky.

The reason I submitted this report is that the behaviour has changed from earlier POI version, leading me to think it was a bug.

I have made it work by making my program know what to expect in from each column thus making it easier to handle exceptions correct.

Comment 3 Nick Burch 2008-05-20 09:52:21 UTC
Could you please upload a little unit test showing the problem you're having?

I've just had a try (see commit r658336), and everything behaves as I'd expect it to
Comment 4 David Fisher 2009-02-18 12:29:08 UTC
Without a response to Nick's test this will be marked INVALID.