Bug 46890 - Text-formatted cells need to be recognized as CELL_TYPE_STRING - regardless from their content
Summary: Text-formatted cells need to be recognized as CELL_TYPE_STRING - regardless f...
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 3.2-FINAL
Hardware: PC Windows XP
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2009-03-23 06:55 UTC by alesinskyy
Modified: 2009-03-24 08:35 UTC (History)
1 user (show)


Note You need to log in before you can comment on or make changes to this bug.
Description alesinskyy 2009-03-23 06:55:07 UTC
It seems that if cell contains valid number then POI always consider it type as NUMERIC - even if the cell carries text format.
Comment 1 Nick Burch 2009-03-23 07:16:09 UTC
POI will give you the cell type that excel set for the cell

You should check the cell formatting rules to see how your numeric cells are formatted
Comment 2 alesinskyy 2009-03-23 07:59:17 UTC
Yes, probably it gets, but it causes a lot of problems.
BTW, older versions of POI (e.g. 2.5.1) used to return string in such cases.

This change of behavior has broken DBUnit XLSDataSet, for example. 

And even worse - I have not found any clear reference to this (rather breakin) change neither in the documentation, nor in the changelog. 

It is relatively clear, that if cell is formatted as "text", then it contents has to be treated text, not number. That means that cell content may be taken "as-is".
Comment 3 Nick Burch 2009-03-23 08:19:03 UTC
Pretty sure that POI has always returned NUMERIC for cells that excel has stored as a floating point number, and only returns STRING for cells where the additional string record is used

If a cell is really numeric, then excel stores a float. You need to fetch back that float, then apply string formatting rules to make a string. You can't fetch "a string", as there isn't one in the file, all there is is the float + a formatting string.

If you'd care to upload a unit test that shows this differently from poi 2.5 to now, we'll take a look, but I'm fairly sure this is still an INVALID
Comment 4 alesinskyy 2009-03-24 05:51:14 UTC

I was wrong - POI 2.5.1 has behave practically the same. The real problem lies in DBUnit as such, not in POI.
Comment 5 alesinskyy 2009-03-24 07:47:31 UTC
Sorry, but I want to reopen this issue once more - the problem is that Excel (2003 and, likely, other versions as well) exhibits somewhat odd behavior - if cell containing valid numeric value is formated as text Excel still recognizes it as numeric till it would be edited once more, at least with "change-nothing" editing (in case if "Number stored as text" error checking rule is activated "Ignore error" should be confirmed for each affected cell as well).

It would be very nice to allow obtain string value (exactly as it is shown by Excel) of the such cell (which has numeric type, but text formatting) - now it throws exception. I'm not sure if it is attainable or this restriction comes directly from Excel.

Comment 6 Nick Burch 2009-03-24 08:35:14 UTC
For a numeric cell, the "text as shown by excel" isn't stored anywhere. Please follow standard POI practice, get the floating point number + format string, and format it yourself