Bug 61033

Summary: [PATCH] Add Workbook.setCellFormulaValidation to control whether formulas are validated during Cell.setCellFormula or not
Product: POI Reporter: Travis Burtrum <admin.apache>
Component: SS CommonAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: enhancement Keywords: PatchAvailable
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: 0001-Add-Workbook.setCellFormulaValidation-to-control-whe.patch
0001-Add-XSSFWorkbook.setCellFormulaValidation-to-control.patch
0002-Unit-tests-for-XSSFWorkbook.setCellFormulaValidation.patch

Description Travis Burtrum 2017-04-24 19:32:53 UTC
Created attachment 34946 [details]
0001-Add-Workbook.setCellFormulaValidation-to-control-whe.patch

Our application must set formulas, then evaluate them.  Cell.setCellFormula does expensive formula parsing, then FormulaEvaluator.evaluate() parses it again.  With this patch if we turn off formula validation our total runtime per job almost halves, dropping from 21 minutes to 11 minutes.

Let me know if anything should change.  I implemented this as part of the Workbook interface, but then realized that HSSFWorkbook and SXSSFWorkbook actually don't parse the formula or throw the ParseException anyway, they might in the future though so I guess this is still the correct place?

Thanks much!
Comment 1 Javen O'Neal 2017-04-25 03:19:03 UTC
Could you please add some unit tests covering the new functionality?
Comment 2 Travis Burtrum 2017-04-25 14:10:08 UTC
It's basically just a setter/getter for a boolean, and then not calling validation logic if false, I'm not sure of a great way to unit test that, especially since it only applies to XSSF too?

I guess maybe I could set it false and send in an invalid formula and assert I didn't get an exception?  Anything else?
Comment 3 Javen O'Neal 2017-04-25 15:52:51 UTC
From Greg Woolsey on dev@:
Test the positive case too, true + invalid formula allowed, test that the
default is true, test that flipping it from false to true with an invalid
formula in place, resulting in an invalid workbook, is somehow handled so
users don't get odd errors later. Maybe validate all formulas then?  That
would be expensive.  If we just allow an invalid state then, the javadoc
needs to clearly note that for the boolean setter and getter, and the
formula setter and getter.

It will make it easy for users to get things all mixed up but appear valid,
so the docs need to be vocal about it.
Comment 4 Javen O'Neal 2017-04-25 15:55:58 UTC
Combinations that need tested:

1. Evaluate formulas=false should not raise an exception when setting an invalid formula.
2. Evaluate formulas=false should not raise an exception when setting a valid formula.
3. Evaluate formulas=true should raise an exception when setting an invalid formula.
4. Evaluate formulas=true should not raise an exception when setting a valid formula.
Comment 5 Travis Burtrum 2017-04-26 18:16:25 UTC
Created attachment 34953 [details]
0001-Add-XSSFWorkbook.setCellFormulaValidation-to-control.patch

Alternative patch suggested by kiwiwingsā€ˇ on IRC to limit this addition to XSSFWorkbook since neither HSSF or SXSSF ever validate the formula or throw this exception anyway.
Comment 6 Travis Burtrum 2017-04-26 18:17:50 UTC
Created attachment 34954 [details]
0002-Unit-tests-for-XSSFWorkbook.setCellFormulaValidation.patch

Unit tests for either of the previous patches.
Comment 7 Travis Burtrum 2017-04-26 18:22:20 UTC
Also I've been pushing up my branches here too if it makes it easier to look at:

https://github.com/moparisthebest/poi/commits/formula_patch
https://github.com/moparisthebest/poi/commits/formula_patchv2

v2 is the patch to only XSSFWorkbook, the first patches Workbook
Comment 8 Dominik Stadler 2017-09-20 19:09:57 UTC
Applied via r1809071, thanks for providing the implementation and tests.