|Summary:||XSSFDataValidationConstraint doesn't parse static list text properly|
|Product:||POI||Reporter:||Greg Woolsey <greg.woolsey>|
|Component:||XSSF||Assignee:||POI Developers List <dev>|
patch to properly parse static list validation constraints
updates to use constants, add comments, and another unit test case
Description Greg Woolsey 2016-06-18 05:30:05 UTC
Created attachment 33959 [details] patch to properly parse static list validation constraints When you add a LIST type validation to a cell in Excel, and type a comma separated list of values to accept instead of using a range reference or named range, the value is stored in the XLSX sheet XML as a "formula1" string: "one, two, three" named ranges as list validation values are stored in formula1 as range_name note the lack of double quotes. Currently, XSSFDataValidationConstraint just splits all "formula1" strings by comma. This makes the first and last values include the double quotes, and all values contain any whitespace leading or trailing a comma. Excel automatically trims whitespace when parsing the CSV value strings. The attached patch checks for and removes the enclosing quotes, and uses a compiled pattern to split by commas with optional whitespace. If the formula isn't enclosed in double quotes, it is not parsed into the explicit values array. I can't tell from inspecting XLSX files how Excel knows whether a formula is a list of values or a range, other than the quotes, or falling back on a list if evaluating as a range fails.
Comment 1 Javen O'Neal 2016-06-19 04:45:14 UTC
Applied in r1749129. Could you write a unit test that shows a scenario where the previous code failed and the patched code works?
Comment 2 Javen O'Neal 2016-06-19 06:13:00 UTC
Comment 3 Greg Woolsey 2016-06-19 21:24:10 UTC
Created attachment 33964 [details] updates to use constants, add comments, and another unit test case My latest patch attachment uses the new QUOTE constant one more place and adds one more unit test, validating the formula1 value is set properly from the constructor that accepts an array of string literals. I think with these and your other tests and changes this is ready to go. Why the OOXML format overloads formula1 this way is baffling to me. I bet the old binary format had some similar form of overloading.
Comment 4 Javen O'Neal 2016-06-20 05:01:56 UTC
It seems like there are more entry points to XSSFDataValidation than necessary, and each of the entry points end up initializing the member variables in subtly different ways (leading left quote gets stripped, both quotes get stripped, neither quote gets stripped; whitespace is removed or not removed from the formula, etc). Additionally, even if the storage format overloads formula1, it doesn't mean we need to overload it. Since you have experience with this class, could you provide a recommendation on what constructors or methods should be deprecated to make this class both easier to maintain and fool-proof for users to use? Save that for a future bug, I guess. Applied via r1749265.