Issue 66023 - Unbalanced parentheses in expression should not trigger regexp functions
Summary: Unbalanced parentheses in expression should not trigger regexp functions
Status: ACCEPTED
Alias: None
Product: Calc
Classification: Application
Component: editing (show other issues)
Version: OOo 2.0.2
Hardware: All All
: P4 Trivial (vote)
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-06-02 07:13 UTC by rbanfield
Modified: 2017-05-20 11:11 UTC (History)
3 users (show)

See Also:
Issue Type: ENHANCEMENT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Sample file showing defect. (8.12 KB, application/vnd.sun.xml.calc)
2006-06-02 07:14 UTC, rbanfield
no flags Details

Note You need to log in before you can comment on or make changes to this issue.
Description rbanfield 2006-06-02 07:13:39 UTC
SUMIF function fails to properly find fields that match criteria if the criteria
contains either ( or ). 
Not all characters were tested.

Repro Steps:
 - Set 'Criteria': In Cells A2:A6

String one
String (two)
String three)
String (four
String – five


Data (Cells A2:B26) :

IE:
Test ID's	Counts
String one	1
String (two)	2
String three)	3
String (four	4
String one	1
String (two)	2
String three)	3
String (four	4
String one	1
String (two)	2
String three)	3
String (four	4
String one	1
String (two)	2
String three)	3
String (four	4
String – five	20

 

In Cells B2:B5 Enter formula (changing using A2->A6 respectively) : 
=SUMIF(A$10:A$25;A2;B$10:B$25)


Results:
Validity Values	Totals	Exected = 4* #
String one	4	4
String (two)	0	8
String three)	0	12
String (four	0	16
String – five	20	20

formulas were entered into first field, and then dragged. Below is resultant
formulas.: 
=SUMIF(A$10:A$26;A2;B$10:B$26)
=SUMIF(A$10:A$26;A3;B$10:B$26)
=SUMIF(A$10:A$26;A4;B$10:B$26)
=SUMIF(A$10:A$26;A5;B$10:B$26)
=SUMIF(A$10:A$26;A6;B$10:B$26)
Comment 1 rbanfield 2006-06-02 07:14:31 UTC
Created attachment 36868 [details]
Sample file showing defect.
Comment 2 frank 2006-06-02 10:47:10 UTC
Not a bug.

A bracket is interpreted as regular expression if tools-options-OpenOffice.org
Calc-calculate-enable regular expressions in formulas is enabled. Switch it off
and you get your expected result.

Frank
Comment 3 frank 2006-06-02 10:47:27 UTC
closed invalid
Comment 4 rbanfield 2006-06-04 18:44:00 UTC
Is this a defect by design, or an unplanned consequence of two different
features interacting?

The reason I ask defect by design is that if someone was to evaluate OpenOffice
"out of the box" as a replacement for Excel "out of the box", and one of there
criteria happened to be the SUMIF function - This has just failed it. Also this
prevents the use of both SUMIF & Regular Expressions in the same session.

When someone is viewing the sheet, they can quite clearly see that the values
match, however, as pointed out, you need to turn off an advanced feature (enable
regular expressions) for it to work properly. Some pop-up of note in the formula
editor should warn re: regular expressions.

Also, How are these valid regular expressions:
String three)
String (four

There are mising parenthesis and when using thess two regular expression
validators (http://www.sweeting.org/mark/html/revalid.php or
http://www.miningtools.net/regextester.aspx) it shows them as invalid. 

In which case, sumif should work for these two entries.



Comment 5 frank 2006-07-10 10:04:29 UTC
Hi,

have a look at the Help system for regular expressions;list of

Frank
Comment 6 frank 2006-07-10 10:04:45 UTC
closed invalid
Comment 7 rbanfield 2006-07-10 17:59:33 UTC
I have read that section of the help file & found this section on parentheses.
It does not however address the second part of this bug, invalid Regular
expresions. 

From Help file: ----
( )
Defines the characters inside the parentheses as a reference. You can then refer
to the first reference in the current expression with "\1", to the second
reference with "\2", and so on.
For example, if your text contains the number 13487889 and you search using the
regular expression (8)7\1\1, "8788" is found.
You can also use () to group terms, for example, "a(bc)?d" finds "ad" or "abcd".
---- End help file section.

In the above text, as in all regular expressions, parentheses must be balanced.
In the example below, they are not, and therefore should not be considered a
valid regular expression, and by that logic should be caught by the sum-if
function, and probably all other functions as well. 

EG:
String three)    - Not balanced, only 1 ), no opening
String (four     - Not balanced, only 1 (, no closing

There are missing a parenthesis. I used these two online regular expression
validators (http://www.sweeting.org/mark/html/revalid.php or
http://www.miningtools.net/regextester.aspx) it shows them as invalid. 

In which case, sumif should work for these two entries.

On reflection, this may not necessarily be a bug in just the sum-if function,
but in the dependant code that it uses to determine if a cell contains a regular
expression. In which case it would be a larger issue, in calc in general.

Re-opening because help content, as well as basic regular expressions require
balanced parentheses. Example above does not contain balanced parentheses, but
is recognised as a Regular Expression, as such, behaviour is incorrect, be is
SUM-IF function, or CALC in general.
Comment 8 frank 2006-07-11 10:09:00 UTC
hi Eike,

please comment on this one and proceed as needed.

Frank
Comment 9 ooo 2006-07-17 12:56:20 UTC
We could check whether unescaped opening and closing parentheses are balanced
and only trigger regexp if so; same for braces {} and brackets []. Furthermore,
the mere presence of parentheses only, without other meta characters being used,
probably should not trigger regexp at all. By no means a P2 and also not a defect.
Comment 10 drking 2007-10-28 20:17:44 UTC
I wonder whether this issue could be closed now?

1. the thing is working as expected
2. it's impossible for OOo to discover faulty regex syntax, as you can't tell 
what the user is trying to do. Among other things, he/she might be 
experimenting with faulty syntax.
3. Even if you do find faulty syntax, what do you do? Certainly not silently 
kill the regex engine - surely that merely gives more unexpected results for 
the user.

Anyone using regex has to take responsibilty for correct syntax. Yes, the check 
box to enable regex is not obvious, but that's a different issue.

Hoping that's a help.
Comment 11 Martin Hollmichel 2007-11-09 16:52:31 UTC
change target from 2.x to 3.x according to
http://wiki.services.openoffice.org/wiki/Target_3x
Comment 12 Marcus 2017-05-20 11:11:46 UTC
Reset assigne to the default "issues@openoffice.apache.org".