Bug 67784 - Regression in 5.2.4 -- cell type was boolean, now formula
Summary: Regression in 5.2.4 -- cell type was boolean, now formula
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF (show other bugs)
Version: unspecified
Hardware: All All
: P2 regression (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks: 67785
  Show dependency tree
 
Reported: 2023-10-17 13:48 UTC by Tim Allison
Modified: 2023-12-06 10:56 UTC (History)
1 user (show)



Attachments
file showing problem (6.07 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2023-10-17 13:48 UTC, Tim Allison
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Tim Allison 2023-10-17 13:48:27 UTC
Created attachment 39153 [details]
file showing problem

In 5.2.3, we used to get "ERROR: #VALUE!" instead of "#VALUE!" and we used to get "TRUE" or "FALSE" instead of "1" or "0".

I haven't yet tracked down the cause, but in the debugger, I can see that in XSSFSheetXMLHandler#outputCell the value of this.nextDataType is ERROR for the first and BOOLEAN for the second in 5.2.3.  However, the type is FORMULA in 5.2.4 for both, thus bypassing the correct handling for the extracted string.
Comment 1 Tim Allison 2023-10-17 13:50:28 UTC
Again, my apologies for not catching this in the regression tests before the 5.2.4 release.
Comment 2 Tim Allison 2023-10-17 14:17:44 UTC
In XSSFSheetXMLHandler in 5.2.3, there was a check before overwriting nextDataType:

                    if (this.nextDataType == XSSFSheetXMLHandler.xssfDataType.NUMBER) {
                        this.nextDataType = XSSFSheetXMLHandler.xssfDataType.FORMULA;
                    }

In 5.2.4, that check is gone, and the nextDataType is set to FORMULA and the containing type (at the cell level) is overwritten.

			<c r="A1" s="1" t="b">
				<f aca="false">(2 &gt; 5)</f>
				<v>0</v>
			</c>

This change came in here: https://svn.apache.org/viewvc/poi/trunk/poi-ooxml/src/main/java/org/apache/poi/xssf/eventusermodel/XSSFSheetXMLHandler.java?r1=1897331&r2=1905575
Comment 3 Tim Allison 2023-10-17 14:19:14 UTC
Anyone have an idea what this fixes?

commit message: 

"possible issue when streaming xssf cells with formulas"
Comment 4 PJ Fanning 2023-10-17 14:37:18 UTC
that day, all the other changes were for https://bz.apache.org/bugzilla/show_bug.cgi?id=66365 - so it probably relates to that

In the end of the day, POI has a lot of bugs and low test coverage. Can we start by working out of the 'regression' causes working code to now have a bug or if we have changed the issue because the old behaviour was a bug?

I much prefer not reverting code without having a test scenario that can be used to decide if the regression is needed and the test case can protect against future regressions.

I just added r1913064 and that seems to be the correct behaviour (to me).
Comment 5 PJ Fanning 2023-10-17 14:57:20 UTC
It really seems to me that XSSFSheetXMLHandler data type tracking is deficient.

In XSSFCell, we track the cellType and also have an extra CachedFormulaResultType so that we know can know a cell type is formula type but also know what type the cached result has.
Comment 6 PJ Fanning 2023-10-17 15:34:52 UTC
I'm trying out r1913067. No guarantee that it won't have other side effects but it reverts in Comment 2 and uses a reworked set of code to support the option to output formulas instead of cached results (for formula cells).
Comment 7 PJ Fanning 2023-10-17 16:09:27 UTC
Tim - my changes seem to work ok. Could you have a look at the latest code?

I have noticed that the non streaming XSSFExcelExtractor does not have the code for taking formula results for booleans and errors and formatting the results like the streaming version does. I'll raise an issue for that and will change XSSFExcelExtractor if there are no objections. It makes sense that both extractors work similarly.
Comment 8 Tim Allison 2023-10-18 10:10:36 UTC
PJ thank you for diving into this. I only had time yesterday to figure out the source of the change. I would never suggest mindlessly reverting mods without unit tests for both the original mod and the potential reversion or other fix.

I should have a chance to take a look later this morning.

Thank you, again!