Issue 76156 - Counting number of substrings within a string
Summary: Counting number of substrings within a string
Alias: None
Product: Calc
Classification: Application
Component: code (show other issues)
Version: OOo 2.2
Hardware: All All
: P3 Trivial (vote)
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
Depends on:
Reported: 2007-04-06 17:56 UTC by discoleo
Modified: 2017-05-20 11:13 UTC (History)
1 user (show)

See Also:
Latest Confirmation in: ---
Developer Difficulty: ---

Proposed C++ implementation (2.39 KB, text/plain)
2007-04-06 17:58 UTC, discoleo
no flags Details

Note You need to log in before you can comment on or make changes to this issue.
Description discoleo 2007-04-06 17:56:18 UTC
Unfortunately, Calc lacks even basic support for string functions (see my other
issues on strings, like issue 66590). This makes the program unsuitable for any
work that relies heavily on strings.

One of the functionalities that I am missing is a function that counts the
number of occurrences of a substring within a given string.

Lets say, we have the string: "ab cd ab de", and we want to count how often "ab"
occurs inside this string.

I have included in the attached file the C++ code for a new function,
ScInterpreter::ScFindCount(), that performs this task.

This function counts the number of occurences of a given string within another
//   - IF string is NOT found, it returns 0 
//   - otherwise it returns the number of occurences of the substring
//   - parameters:
//     1. string: this is the search item
//     2. string: this will be searched for the first string
//     3. is overalpping of search results allowed:
//        - default: FALSE (NO)
//     4. START position: default = begining of string (=1)
//     5. END position: default = LEN(string)

I noticed now, that the function will probably NOT recognise wildcards, so I
will look forward to correct that. I hope that such a function will make it in
the OpenFormula specification, too.
Comment 1 discoleo 2007-04-06 17:58:00 UTC
Created attachment 44257 [details]
Proposed C++ implementation
Comment 2 discoleo 2007-04-06 19:12:26 UTC
1. I just wanted to add a real life example.

My primary work involves auditing the whole medical activity taking place in a
big county. One frequent task is to count the number of different diagnoses for
individual patients.

The string looks like this: "B18.1; G20; I20.9; I12.9". These tokens represent
individual ICD10-coded diagnoses. So, I would like to count how many diagnoses
are there (i.e. FindCount(";",'the_string') + 1).

2. I found the code for *REGEXP* searches inside the ScInterpreter::ScSearch()
function. Unfortunately, I do NOT understand much of it and I fear that there is
NO good documentation on it.

I will try to implement a ScSearchCount() function later, just by cloning the
ScSearch() function.
Comment 3 frank 2007-04-10 09:48:04 UTC
Hi Niklas,

please have a look at this patch.

Comment 4 niklas.nebel 2007-05-23 14:03:13 UTC
We generally don't want to add many new internal functions. Additional functions
can preferably be added as add-in components.

Also, issue type "patch" should be limited to complete patches, ready to apply.
Comment 5 mrprogrammer 2010-08-19 18:57:19 UTC
If A1 contains "ab cd ab de" and B1 contains "ab", we can use
which I realize is messy but it gets the job done.
Comment 6 mrprogrammer 2010-08-19 18:59:43 UTC
Oops, I meant: If A1 contains "ab cd ab de" and A2 contains "ab" …
Comment 7 Marcus 2017-05-20 11:13:50 UTC
Reset assigne to the default "".