Apache OpenOffice (AOO) Bugzilla – Issue 66023
Unbalanced parentheses in expression should not trigger regexp functions
Last modified: 2017-05-20 11:11:46 UTC
SUMIF function fails to properly find fields that match criteria if the criteria contains either ( or ). Not all characters were tested. Repro Steps: - Set 'Criteria': In Cells A2:A6 String one String (two) String three) String (four String – five Data (Cells A2:B26) : IE: Test ID's Counts String one 1 String (two) 2 String three) 3 String (four 4 String one 1 String (two) 2 String three) 3 String (four 4 String one 1 String (two) 2 String three) 3 String (four 4 String one 1 String (two) 2 String three) 3 String (four 4 String – five 20 In Cells B2:B5 Enter formula (changing using A2->A6 respectively) : =SUMIF(A$10:A$25;A2;B$10:B$25) Results: Validity Values Totals Exected = 4* # String one 4 4 String (two) 0 8 String three) 0 12 String (four 0 16 String – five 20 20 formulas were entered into first field, and then dragged. Below is resultant formulas.: =SUMIF(A$10:A$26;A2;B$10:B$26) =SUMIF(A$10:A$26;A3;B$10:B$26) =SUMIF(A$10:A$26;A4;B$10:B$26) =SUMIF(A$10:A$26;A5;B$10:B$26) =SUMIF(A$10:A$26;A6;B$10:B$26)
Created attachment 36868 [details] Sample file showing defect.
Not a bug. A bracket is interpreted as regular expression if tools-options-OpenOffice.org Calc-calculate-enable regular expressions in formulas is enabled. Switch it off and you get your expected result. Frank
closed invalid
Is this a defect by design, or an unplanned consequence of two different features interacting? The reason I ask defect by design is that if someone was to evaluate OpenOffice "out of the box" as a replacement for Excel "out of the box", and one of there criteria happened to be the SUMIF function - This has just failed it. Also this prevents the use of both SUMIF & Regular Expressions in the same session. When someone is viewing the sheet, they can quite clearly see that the values match, however, as pointed out, you need to turn off an advanced feature (enable regular expressions) for it to work properly. Some pop-up of note in the formula editor should warn re: regular expressions. Also, How are these valid regular expressions: String three) String (four There are mising parenthesis and when using thess two regular expression validators (http://www.sweeting.org/mark/html/revalid.php or http://www.miningtools.net/regextester.aspx) it shows them as invalid. In which case, sumif should work for these two entries.
Hi, have a look at the Help system for regular expressions;list of Frank
I have read that section of the help file & found this section on parentheses. It does not however address the second part of this bug, invalid Regular expresions. From Help file: ---- ( ) Defines the characters inside the parentheses as a reference. You can then refer to the first reference in the current expression with "\1", to the second reference with "\2", and so on. For example, if your text contains the number 13487889 and you search using the regular expression (8)7\1\1, "8788" is found. You can also use () to group terms, for example, "a(bc)?d" finds "ad" or "abcd". ---- End help file section. In the above text, as in all regular expressions, parentheses must be balanced. In the example below, they are not, and therefore should not be considered a valid regular expression, and by that logic should be caught by the sum-if function, and probably all other functions as well. EG: String three) - Not balanced, only 1 ), no opening String (four - Not balanced, only 1 (, no closing There are missing a parenthesis. I used these two online regular expression validators (http://www.sweeting.org/mark/html/revalid.php or http://www.miningtools.net/regextester.aspx) it shows them as invalid. In which case, sumif should work for these two entries. On reflection, this may not necessarily be a bug in just the sum-if function, but in the dependant code that it uses to determine if a cell contains a regular expression. In which case it would be a larger issue, in calc in general. Re-opening because help content, as well as basic regular expressions require balanced parentheses. Example above does not contain balanced parentheses, but is recognised as a Regular Expression, as such, behaviour is incorrect, be is SUM-IF function, or CALC in general.
hi Eike, please comment on this one and proceed as needed. Frank
We could check whether unescaped opening and closing parentheses are balanced and only trigger regexp if so; same for braces {} and brackets []. Furthermore, the mere presence of parentheses only, without other meta characters being used, probably should not trigger regexp at all. By no means a P2 and also not a defect.
I wonder whether this issue could be closed now? 1. the thing is working as expected 2. it's impossible for OOo to discover faulty regex syntax, as you can't tell what the user is trying to do. Among other things, he/she might be experimenting with faulty syntax. 3. Even if you do find faulty syntax, what do you do? Certainly not silently kill the regex engine - surely that merely gives more unexpected results for the user. Anyone using regex has to take responsibilty for correct syntax. Yes, the check box to enable regex is not obvious, but that's a different issue. Hoping that's a help.
change target from 2.x to 3.x according to http://wiki.services.openoffice.org/wiki/Target_3x
Reset assigne to the default "issues@openoffice.apache.org".