Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing
|Summary:||Simple wildcards * and ? as opposed to regular expressions|
|Component:||ui||Assignee:||AOO issues mailing list <issues>|
|Status:||CONFIRMED ---||QA Contact:|
|Priority:||P3||CC:||cno, dave.wedd, doneyourself, eva-email, issues, kpalagin, lukebenes, phoenix.wanglf, rainerbielefeld_ooo_qa, sebastian.sauer, steve.yin.aoo|
|Issue Type:||FEATURE||Latest Confirmation in:||---|
|Issue Depends on:|
Description ooo 2004-07-30 12:03:35 UTC
Follow-up task of Q-PCD issue 20494. Some competitors know only simple wildcards * and ? as opposed to regular expressions. Need option how to treat wildcard searches, especially when importing files.
Comment 1 ooo 2004-07-30 12:04:10 UTC
Comment 2 ooo 2004-07-30 12:29:30 UTC
reassign to myself
Comment 3 ooo 2004-07-30 12:30:05 UTC
Comment 4 watersrw 2004-10-17 14:54:34 UTC
Should this not apply to the other major components (Writer, etc.) as well? Unless you are an expert user, wildcards are a life-saver.
Comment 5 ooo 2004-10-20 14:15:31 UTC
Watersrw, This is not about the "Search & Replace" functionality, but the wildcards that some spreadsheet functions may take as arguments, producing different results when executed in different applications. Eike
Comment 6 colm.smyth 2005-02-10 19:20:00 UTC
This issue has been proposed as a SO9 requirement. Please clarify what kinds of functions and parameters should be able to use wildcards.
Comment 7 ooo 2005-02-11 11:22:15 UTC
The following spreadsheet functions are currently able to process regular expressions in the arguments that specify search criteria for data ranges: DSUM DCOUNT DCOUNTA DAVERAGE DGET DMAX DMIN DPRODUCT DSTDEV DSTDEVP DVAR DVARP MATCH COUNTIF SUMIF LOOKUP VLOOKUP HLOOKUP SEARCH Similar to the "Enable regular expressions in formulas" option under Tools.Options.Spreadsheet.Calculate, another option would be needed like "Enable simple wildcards in formulas". These two are mutually exclusive => radio button "None / RegExp / Wildcard", where Wildcard would be enabled by the Excel filter import for interoperable formula results.
Comment 8 frank 2005-04-05 14:00:41 UTC
*** Issue 46706 has been marked as a duplicate of this issue. ***
Comment 9 frank 2005-04-23 18:21:42 UTC
*** Issue 48020 has been marked as a duplicate of this issue. ***
Comment 10 navvy 2005-04-25 10:38:21 UTC
The problem is Excel formulas that do not give the correct result in Open Office. When I open an Excel spreadsheet in Open Office, I need to either get the correct results or to get a message warning that some cells contain formulas that cannot be handled correctly. Open Office does neither. As an Excel user, I don't want to know about regular expressions - I just want Open Office to give the same answers as Excel. I also don't want to accidentally create Open Office formulas that will give the wrong results in Excel. If there is a radio button in Tools/Options to change the formula-handling method, I would expect that to be automatically set correctly to match the way that Excel works, whenever an Excel file is opened. Alternatively, when the troublesome formulas are detected, a warning box should open when the file is opened, preferably giving a quick way to make the necessary changes.
Comment 11 frank 2005-06-30 12:14:16 UTC
*** Issue 51072 has been marked as a duplicate of this issue. ***
Comment 12 drensink 2005-07-31 01:13:09 UTC
Instead of disabling regular expressions in the options when opening an excel document that uses wildcard characters or adding a wildcard option to the options, wouldn't it be easier to actually convert the ? and * wildcard characters to regular expressions on import and then convert the regular expressions back to wildcards when exporting back to excel formats, giving a warning for any regular expressions that cannot be converted exactly to wildcards.
Comment 13 navvy 2005-08-02 16:35:00 UTC
I agree with drensink. Automatic translation is what the user needs and expects. Changing options that affect all workbooks, will cause more problems than it solves. If the user sees a warning that a regular expression could not be translated to Excel format, they will learn that Open Office is more powerful than Excel. Currently the inadequate translation gives the opposite impression, if the user has never heard of 'regular expressions'.
Comment 14 ooo 2005-08-03 16:30:45 UTC
Drensink, Navvy, Please note that "automatic translation" is not feasible. It might work in simple cases of literal string arguments as parameters during import, but as soon as the search expression is concatenated from substrings or obtained via formula functions it would fail. Furthermore it wouldn't work with round-trip documents, i.e. load from Excel and save to Excel file format again. The only viable way is another wildcard mode, which would also ease use for the not so experienced user for whom regular expressions are too complicated. Eike
Comment 15 drensink 2005-08-04 04:08:49 UTC
Eike, I did not quite understand your comment at first, because the strings I was using for the DSUM criteria were a string (substring) in a cell (reference to cell being a formula). I now see that you could not just go replacing them all because there is no way to know if they are just an * or ? or actual wildcards short of scanning the whole sheet to see if each is an input into a formula that handles wildcards which would not be so good logic and performance wise. I then wondered what would happen with a spreadsheet using wildcards was imported, but the user wanted to use regular expressions on that same sheet and still have the wildcards calculated. The obvious hangup with that is the ? and * do different things depending which option you want. A good answer to that would be for the functions like DSUM to have two implementations, one that calculates with regular expressions and one that calculates with wildcards. This would create yet another problem since DSUM used in calc is not the same as DSUM in excel, but the same formula name has been used by both excel and calc users, though this might be easy to take care of on import and export by translating DSUM in excel to something like DSUMWILDCARD in calc and something similar on export. This all assumes someone would actually want to use both wildcards and regular expressions on the same sheet, but that might be a bit crazy and I understand implementing the option you propose is probably far quicker, especially if you could have it coded in 2.0 or 2.0.1 instead of later versions.
Comment 16 drensink 2005-08-06 05:10:25 UTC
I gave my last comment more thought and realized you can maintain the same behavior openoffice uses now with regular expressions, make the excel imports and exports compatible, and mix the use of both regular expressions and wildcards on the same sheet and in any cells by doing the following. 1. Provide an optional argument to any formula function that will take regular expressions in openoffice, but takes wildcards in excel. 2. If the argument is set to true, assume the function arguments use ? and * as wildcards. 3. If the argument is false or not set, assume the function arguments may contain regular expressions. 4. For documents imported from excel, add the argument set to true to the cell containing the formula. 5. For documents in openoffice assume the argument is not set by default. 6. For documents in openoffice to be exported to excel the user can set the argument to use wildcards. 7. If the user doesnt set the argument, warn that excel does not fully support the use of regular expressions. This would allow the use of the same formula names for both cases and should be fairly easy to code into the existing regular expression code, since it then becomes a simple replacement of ? and * in the arguments fed to the functions with the equivalent regular expression.
Comment 17 ooo 2005-08-15 11:13:55 UTC
Hi Drensink, > 6. For documents in openoffice to be exported to excel the user can set the > argument to use wildcards. What should be exported if that argument is the result of a formula? > This would allow the use of the same formula names for both cases and should be > fairly easy to code into the existing regular expression code, since it then > becomes a simple replacement of ? and * in the arguments fed to the functions > with the equivalent regular expression. I'd rather implement a simple wildcard search than fiddling around with the argument's string content. Bear in mind that all other content that possibly could form a regular expression would have to be escaped otherwise. Eike
Comment 18 drensink 2005-08-16 03:32:08 UTC
This gets a bit more complicated than it seems at face value, but I will continue a bit more to see if there is an optimal solution to the problem. > What should be exported if that argument is the result of a formula? Keep in mind this would not be the argument with the wildcards/regular expressions in it, but an additional, optional argument whose value would be restricted to 0 or 1, True or False. If it is set to 1, then the formula uses wildcards and you export the other argument to the function with the wildcards in it as it is written by the author of the document and strip the 1 during export since Excel does not use it because by default it uses wildcards. If it is not set or set to 0, then you know you are exporting a document that is very likely not to be compatible with Excel, but could warn and continue exporting anyway. If the additional, optional argument is a result of a formula, which should not be necessary, then that formula should result in True or False, 0 or 1 otherwise it would be in error. That still would not matter because you would just check to see if it results in 0 and 1 and strip the whole formula off just like you would strip the 1 off. Though it would be easier to just restrict that argument to only the values 0,1,True,False and anything else would result in a formula error. > I'd rather implement a simple wildcard search than fiddling around with >the argument's string content. Bear in mind that all other content that >possibly could form a regular expression would have to be escaped >otherwise. The additional, optional argument would just determine which code path is taken. If you want to do a whole new implementation of the formula using wildcards instead of messing with the string content of the argument containing wildcards/regular expressions, the same principle would still work. The biggest problem I see with the supposedly simple solution, just turn off regular expressions and turn on wildcards is that it immediately eliminates the use of more powerful regular expressions in that document, preventing someone from importing a document containing wildcards from excel for the purpose of using regular expressions. The continued incompatibility on export of formulas with the same names using regular expressions would also still have to be dealt with in the simple case. The basis of my approach is almost exactly the same as doing it for the whole document or sheet, with the exception that it is done for each cell where those formulas are used, rather than the whole document at once. You still have to decide whether its a regular expression or wildcard and fix the code for each formula to handle it if you do it for the whole document.
Comment 19 drensink 2005-08-16 03:53:26 UTC
One more note: It would be an irritatant to a user to have to add the ;1 constantly to a formula. The option for the document could be added anyway so when it is set to wildcards the ;1 is automatically added upon finishing the formula input.
Comment 20 ooo 2005-08-30 12:02:53 UTC
Hi Drensink, > > What should be exported if that argument is the result of a formula? > Keep in mind this would not be the argument with the wildcards/regular > expressions in it, but an additional, optional argument whose value would be > restricted to 0 or 1, True or False. You didn't get my point. The argument doesn't have to be a direct value, it could be a formula result instead, e.g. FALSE(), or a cell reference where the user could play with the value, or any sophisticated formula, even with side effects. In these cases stripping away the parameter simply can't be done. Eike
Comment 21 drensink 2005-09-05 03:31:57 UTC
> You didn't get my point. The argument doesn't have to be a direct value I think I did get your point and thought I had explained that in the statement below to be interpreted as (The argument DOES have to be a direct value) "Though it would be easier to just restrict that argument to only the values 0,1,True,False and anything else would result in a formula error and if it was not that would be in error." I can understand that you might want to or have to generalize from this specific case because the next complaint would be that a formula result could not be used. I can also understand that there may be no existing code in the current formula handling to force an argument to be only a direct value. Since it seems all of the above is not a realistic option, would you be willing to agree a reasonable option would be to make the setting applicable to each sheet in a workbook, rather than to the entire workbook? This would allow an import from Excel using wildcards and then adding a new sheet to the workbook that uses regular expressions within the same workbook. The user could reference cells on those sheets within the same workbook that are results of regular expressions. Thankyou for being patient in explaining your objections and the reasons for those objections in detail.
Comment 22 ooo 2007-08-13 17:06:09 UTC
Just an overdue update: the OASIS ODF TC approved the proposal to include a table:use-wildcards attribute in the file format, see "Proposal for a new calculation setting: table:use-wildcards" under http://wiki.oasis-open.org/office/OpenDocument_v1%2e2_Action_Items#head-4854119a1f3d7df4f22fc453bb50e8c6e5b0acd7
Comment 23 villeroy 2008-01-11 21:59:26 UTC
Text comparison in Calc occurs in 1. Formulas by operator =A1=B1 2. Functions like MATCH etc. 3. Filters 4. Find/Replace It is influenced by following options: - Calc>Calculation>Case sensitive (affects 1. only) - Calc>Calculation>Search criteria = and <> must apply to whole cells (affects 2. and 3.) - Enable regular expressions in formulas (affects 2. only, but not 1.) Better labeled "Enable regular expressions in function arguments"? Filters, SortDescriptors and Find/Replace have their own options for regex and case. There are too many options and implications. The behaviour is inconsistent. For instance: Did anybody notice that case-insensitive cell values obsoletes many benefits of regular expressions? My radical proposal: One option to toggle between native mode and compatibility mode. Native mode: All strings in Calc are case sensitive and support regular expressions, "=" and "<>" do not apply to whole cells unless you use them with "^" and "$". Compatibility mode: Excel files are loaded in compatibility mode: All strings are case-insensitive except for EXACT and FIND. They support wildcards. "=" and "<>" apply to whole cells.
Comment 24 ooo 2008-03-17 14:44:44 UTC
*** Issue 87106 has been marked as a duplicate of this issue. ***
Comment 25 gvsa123 2008-10-26 04:03:42 UTC
I was working with an xls file and discovered this problem. From a users perspective, wouldn't it be simpler to simply have a message stating that OOo is trying to open a file that contains regular expressions/wild cards and asks whether the user wants to enable the regular expression option? i'm not sure about more complex formulas but it seems that there is no problem with OOo understanding and translating into excel's way of handling things, so long as the option is turned on. i would think therefore that simply letting the user know that the option has to be turned on because there are regular expressions in the formula would be good enough, so that the user doesn't get shocked to see all zeros in the cells as I did. I mean if there isn't any technical problem as regards OOo translating and understanding the way other spreadsheet applications handle formulas (OOo can open and save either way), then there's really no need to replace and auto translate formulas. if all the user needs to do to work with the spreadsheet (whichever format) and to save it (in whichever format as well) is to turn the option on or off, then so be it. a simple prompt like: "The spreadsheet file you are trying to open may contain regular expression. Do you want to turn these feature on?" or something...
Comment 26 Rainer Bielefeld 2008-11-28 16:47:05 UTC
*** Issue 96557 has been marked as a duplicate of this issue. ***
Comment 27 ooo 2009-07-28 12:43:51 UTC
*** Issue 102843 has been marked as a duplicate of this issue. ***
Comment 28 ooo 2010-07-19 13:46:53 UTC
*** Issue 101712 has been marked as a duplicate of this issue. ***
Comment 29 ooo 2010-07-19 13:49:57 UTC
*** Issue 113243 has been marked as a duplicate of this issue. ***
Comment 30 Regina Henschel 2012-09-18 12:51:55 UTC
*** Issue 121015 has been marked as a duplicate of this issue. ***
Comment 31 Regina Henschel 2012-09-18 13:05:43 UTC
*** Issue 121018 has been marked as a duplicate of this issue. ***
Comment 32 Sebastian Sauer 2013-12-01 11:08:27 UTC
Related section at ODF 1.2: http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part2.html#__RefHeading__1017868_715980110 Introduced on 06/May/10: https://tools.oasis-open.org/issues/browse/OFFICE-2679?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel Related AOO.org discussion: http://mail-archives.apache.org/mod_mbox/incubator-ooo-dev/201207.mbox/%3C73f2681577a4ff873095d71260bf3d27@xaox.net%3E Related LO bugreport: http://en.libreofficeforum.org/node/6950 What's needed? AOO and LO need to implement support for HOST-USE-WILDCARDS and on XLS/XLSX import need to set it to true.
Comment 33 Sebastian Sauer 2013-12-01 15:22:38 UTC
Related bugreport in LibreOffice bugzilla: https://www.libreoffice.org/bugzilla/show_bug.cgi?id=72196
Comment 34 Rainer Bielefeld 2014-03-20 17:47:09 UTC
erack no longer active here, so back to list.