Issue 4568

Summary: Word count (task)
Product: Writer Reporter: robpegoraro <rob>
Component: ui Assignee: bettina.haberer
Status: CLOSED FIXED QA Contact: issues@sw <issues>
Severity: Trivial    
Priority: P3 CC: issues, kpalagin, masaya.k, matthauck, openoffice, stp
Version: OOo 1.0.0 Keywords: oooqa
Target Milestone: ---   
Hardware: All   
OS: All   
Issue Type: TASK Latest Confirmation in: ---
Developer Difficulty: ---
Issue Depends on: 10356, 19692, 14050, 17964, 41454    
Issue Blocks:    
Attachments:
Description Flags
Another program's word count dialog
none
OOo macro that counts words in a selection
none
Updated word count macro
none
WordCount macro, with support for multiple selection
none
Example of faulty OOo navigation by "word" in Japanese (CTRL-Left/Right)
none

Description robpegoraro 2002-05-06 22:10:34 UTC
Anybody used to Word--or, really most other word processors--isn't going to 
think to look under "Properties" to find this option. Any chance this could be 
moved to the Tools menu? 

Also, as a guy who writes for a living, I'd find this command much more useful 
if it worked on selected text as well as the entire document. (Or, better yet, if 
it could give me a word count on both the selected text as well as the entire 
document; that way I wouldn't have to worry about having one word selected by 
mistake, which happens all the time in Word.)
Comment 1 stefan.baltzer 2002-05-13 14:11:51 UTC
Reassigned to Christian.
Comment 2 openoffice 2002-05-13 16:09:56 UTC
My 2 cents worth:

I think counting words in a selection has been asked for before. One
might consider something similar to sums in Calc: If you select cells
in a Calc table, it will automatically display the sum of the selected
cells in the lower right corner. In the same way, one might display
the word count of the current selection in Writer.

The question is if this could be implemented fast enough, and whether
it's worth the bother. Regarding speed, this only makes sense if
someone with a 500 page document presses Ctrl-A, and still gets the
result almost instantaneously. To do this, one would probably have to
implement some sort of word-count cache at the paragraph level. Which
brings up the question how much effort the feature is worth...
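
A paragraph-level cache of the kind mentioned above could look roughly like the following. This is only a minimal sketch in Python, not how Writer is implemented: the class name ParagraphCountCache and its methods are hypothetical, a "word" is simply a whitespace-separated token, and a selection is assumed to cover whole paragraphs.

```python
class ParagraphCountCache:
    """Caches one word count per paragraph; only edited paragraphs are recounted."""

    def __init__(self, paragraphs):
        self._paragraphs = list(paragraphs)
        # None marks a paragraph whose count has not been computed yet.
        self._counts = [None] * len(self._paragraphs)

    def set_paragraph(self, index, text):
        """Replace one paragraph and invalidate only its cached count."""
        self._paragraphs[index] = text
        self._counts[index] = None

    def count(self, start=0, end=None):
        """Word count for paragraphs[start:end], recounting only stale entries."""
        if end is None:
            end = len(self._paragraphs)
        total = 0
        for i in range(start, end):
            if self._counts[i] is None:
                # Assumption: a word is a whitespace-separated token.
                self._counts[i] = len(self._paragraphs[i].split())
            total += self._counts[i]
        return total


doc = ParagraphCountCache(["one two three", "four five", "six"])
print(doc.count())                      # 6 (whole document, fills the cache)
doc.set_paragraph(1, "four five and a half")
print(doc.count())                      # 9 (only paragraph 1 is recounted)
```

With such a cache, a Ctrl-A count after a small edit only re-scans the paragraphs that changed; a selection that starts or ends mid-paragraph would still need those two partial paragraphs counted directly.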
Comment 3 openoffice 2002-05-15 18:18:00 UTC
I have received a private mail from the submitter. 
Three comments:
1) He's probably right that word-count-as-selection may not be that
useful.
2) Maybe this would be the job for a macro.
3) The screenshot is attached to this bug.


Here is the mail:
---------------------------------------------------------
First of all, thanks for replying. I appreciate knowing that my
comments haven't gone down a bit bucket somewhere :)

Second of all, I think we may have different ideas of how this feature
would/could/should work. I am *not* interested in an always-on, "live"
word count; that just gets distracting. What I do want is,
essentially, the equivalent of the word count in the Mac word
processor Nisus Writer, but stripped of all the grammar metrics that I
never pay attention to anyway (see screen shot attached).

But if it's too much work to count both selected text and all text, I
would certainly be content with an equivalent of what AbiWord offers,
where it counts either the selected text or, if nothing's selected,
the whole document. (My quibble with this setup, though, is that it's
too easy to leave one character or word selected by mistake; a word
count on both selected and all text eliminates that problem).

Both of these programs, I should add, can generate a word count nearly
instantaneously--at least on the 200-to-10,000-word documents I usually
edit. If, OTOH, the standard is going to be "almost instantaneously"
in 500-page documents, it seems to me that a lot of other features
ought to be dropped from OpenOffice.

My last argument here: However slow a count of selected text might
run, it will still be faster than the current alternative--copying the
text you have in mind, pasting it into another window and then running
a word count on that.

I hope I can be persuasive on this... pls. let me know if I'm not
clear on any of these points.

Thanks,

Rob 
Comment 4 openoffice 2002-05-15 18:20:28 UTC
Created attachment 1664 [details]
Another program's word count dialog
Comment 5 davidfraser 2002-05-21 10:06:42 UTC
looks very similar (duplicate?) to bug 1793
Comment 6 illsleydc 2002-05-21 20:13:23 UTC
A very quick search came up with a macro which, if it works as advertised,
will do a word count of selected text - sorry, I don't have an OOo build
to hand at the moment.

Anyway, it could be a stop-gap for students and writers (I fall into the
former and will definitely find it useful) until something more built
in can be done. Anyway, it's at
http://www.darwinwars.com/lunatic/bugs/oo_macros.html#swc

David
Comment 7 openoffice 2002-05-22 18:15:41 UTC
Created attachment 1744 [details]
OOo macro that counts words in a selection
Comment 8 openoffice 2002-05-22 18:23:27 UTC
dvo->illsleydc: Thanks for the macro.

I've changed the macro David found. It now displays word + character
count for selection and the document. Also, the macro seems to
sometimes skip the first or last word in a selection; I fixed this too.

dvo->robpegoraro: Please evaluate whether this meets your needs. You
can add the macro to your installation through the tools->macro
dialog. Through tools->customize, you can assign it to a
key-combination, so that e.g. Ctrl-C would pop up the count dialog.
Please report whether this is usable or not; if it's usable, I'd like
to get this included in the FAQ, since this has been asked rather often.

dvo->davidfraser: Thanks for collecting the various word count bugs. I
knew this had been asked for before, but not how often... :-)


dvo->davidfraser: David, thanks
Comment 9 robpegoraro 2002-06-03 21:09:38 UTC
After trying this macro out for the past week, I've found that it 
overestimates things in longer selections. Here is one example, using 
the installation instructions at 
http://www.openoffice.org/dev_docs/instructions.html (starting 
with "The Windows version of OpenOffice.org 1.0" and ending 
with "Founder, OOoDocs"):

The entire text measures 894 words in the Properties:Statistics 
dialog (890 in MS Word, FWIW). But selecting the full text and 
running the macro gets a different result: 937 words. It also over-
counts the number of characters in the document--5,541, versus 5,417 
as reported in Properties:Statistics. 

This margin of error seems to be smaller on shorter selections. If I 
select the first paragraph from that sample text, for instance, the 
macro reports 86 words, just one more than OpenOffice reports in 
Properties:Statistics when the paragraph is pasted into another 
window.
Comment 10 openoffice 2002-06-04 09:33:40 UTC
dvo->robpegoraro: Ah, I see. Two differences I'm aware of off-hand are:
1) The statistics character count does not include paragraph end
marks, but the macro count does. 
2) I believe text in text fields is counted differently. 
I'll have another look at it when I have some time this week.
Comment 11 openoffice 2002-06-04 13:36:56 UTC
I've updated the macro, and now it counts the same for the given
example. The differences are:
1) the paragraph-end markers were previously counted as two characters
in the macro, but not at all in the document statistics
2) the word delimiters previously were space " ", braces "(" ")", and
tabs in the macro, but space, tabs, and punctuation "." "," ";" "-" in
the document statistics.
In both cases, I adapted the macro to match the document statistics.

Note 1: This is not a matter of right or wrong, but rather of what you
want. Is "www.openoffice.org" one word, or three? You can adjust this
in the document statistics, and one can also adjust this in the macro,
if desired.

Note 2: Fields are still being counted differently.

Note 3: I bet the differences between Word and OOo can be accounted
for by different what-is-a-word definitions, too.
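
To illustrate how much the delimiter set alone changes the result, here is a minimal Python sketch. It is not the macro itself: the function names and the two constants are hypothetical, the delimiter sets are taken from the description above, and paragraph-end markers are assumed to be plain newline characters.

```python
def count_words(text, delimiters):
    """Count tokens after treating every character in `delimiters` as a space.

    Whitespace (including newlines) always separates words as well.
    """
    table = str.maketrans({ch: " " for ch in delimiters})
    return len(text.translate(table).split())


def count_characters(text):
    """Character count that ignores paragraph-end markers (newlines here)."""
    return sum(1 for ch in text if ch not in "\r\n")


# Delimiter sets as described above; the constant names are illustrative.
OLD_MACRO_DELIMITERS = " \t()"
STATISTICS_DELIMITERS = " \t.,;-"

sample = "Visit www.openoffice.org (the project site)"

print(count_words("www.openoffice.org", OLD_MACRO_DELIMITERS))   # 1
print(count_words("www.openoffice.org", STATISTICS_DELIMITERS))  # 3
print(count_words(sample, OLD_MACRO_DELIMITERS))                 # 5
print(count_words(sample, STATISTICS_DELIMITERS))                # 7
print(count_characters("ab\ncd"))                                # 4
```

Neither result is "wrong"; the two delimiter sets simply encode different answers to the "one word, or three?" question.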
Comment 12 openoffice 2002-06-04 13:38:11 UTC
Created attachment 1856 [details]
Updated word count macro
Comment 13 illsleydc 2002-06-04 14:22:20 UTC
FWIW there appears to be some kind of an effort in standardising what
a word count should mean. It includes someone from Sun but doesn't
appear to be moving at all and anyway I find it hard to believe that
MS will ever change their algorithm bearing in mind the hassle caused.

So.. should we mimic them, or do something else...

For the something else, is there any kind of written-up de-facto way
of doing this? robpegoraro, what does your boss consider a word?

Also, as an aside, do i18n people use different separators etc?
Comment 14 illsleydc 2002-06-04 14:24:40 UTC
Oops, missed the URL of the standards people, sorry.

http://lisa.org/oscar/seg/
Comment 15 openoffice 2002-06-05 08:58:17 UTC
Internationalization does indeed use different separators. For all I
know, some languages (like Japanese or Chinese) don't have separators,
but rather have certain characters always constitute a word,
regardless of what precedes or follows them. We use the
com::sun::star::i18n::BreakIterator service to handle this internally.
(I.e., this determines how the cursor moves when you press
Ctrl-Left/Right-Arrow)

This mechanism is also accessible through the API, hence a more
sophisticated version of the word count macro could use it and would
then always count correctly for any language. I personally was looking
for a quick solution though. I'm happy to help anyone else who wants
to spend the time.

My changes have adapted the macro's behaviour to the OOo's statistics
dialog behaviour, which should address the points that Rob raised.
Comment 16 robpegoraro 2002-06-10 17:45:34 UTC
I can confirm that the new macro does seem to work as designed--I'm no
longer seeing discrepancies in the results I get with the macro and
the regular Properties:Statistics option. 

However, I did notice one other thing: The macro doesn't appear to
register non-contiguous selections of text. After using Ctrl-click to
highlight multiple blocks of text, the macro reports a word count of
zero for the selected text. 
Comment 17 openoffice 2002-06-10 18:01:02 UTC
Created attachment 1907 [details]
WordCount macro, with support for multiple selection
Comment 18 openoffice 2002-06-10 18:11:40 UTC
Ahh, adding support for multiple selections is easy enough, so I
included an updated version of the macro. 
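
For readers without the attachment, the idea of summing over several selection ranges while also reporting the whole-document total can be sketched as follows. This is a Python illustration only, not the attached macro: the function names are hypothetical, selections are modelled as non-overlapping (start, end) character offsets, and a word is a whitespace-separated token.

```python
def count_selection(text, ranges):
    """Sum the word counts of several (start, end) character ranges in `text`.

    Assumes the ranges do not overlap; overlapping ranges would be counted twice.
    """
    return sum(len(text[start:end].split()) for start, end in ranges)


def word_count_report(text, ranges=()):
    """Report both counts so a stray selection is easy to spot."""
    return {
        "selection": count_selection(text, ranges),
        "document": len(text.split()),
    }


text = "alpha beta gamma delta epsilon"
# Two non-contiguous selections: "alpha beta" and "delta epsilon".
print(word_count_report(text, [(0, 10), (17, 30)]))
# {'selection': 4, 'document': 5}
print(word_count_report(text))
# {'selection': 0, 'document': 5}
```

Showing both numbers side by side addresses the original complaint that a single accidentally selected word can silently replace the document count.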
Comment 19 openoffice 2002-11-26 10:00:49 UTC
*** Issue 9533 has been marked as a duplicate of this issue. ***
Comment 20 lohmaier 2003-01-12 20:04:48 UTC
*** Issue 5073 has been marked as a duplicate of this issue. ***
Comment 21 christian.jansen 2003-03-24 08:01:18 UTC
Reassigned to Bettina.
Comment 22 lohmaier 2003-05-10 21:56:48 UTC
*** Issue 11265 has been marked as a duplicate of this issue. ***
Comment 23 lohmaier 2003-05-10 21:58:33 UTC
*** Issue 12022 has been marked as a duplicate of this issue. ***
Comment 24 lohmaier 2003-05-10 22:00:16 UTC
*** Issue 13361 has been marked as a duplicate of this issue. ***
Comment 25 lohmaier 2003-05-10 22:01:42 UTC
*** Issue 13479 has been marked as a duplicate of this issue. ***
Comment 26 lohmaier 2003-05-30 13:39:45 UTC
*** Issue 14644 has been marked as a duplicate of this issue. ***
Comment 27 lohmaier 2003-05-30 13:41:44 UTC
*** Issue 12334 has been marked as a duplicate of this issue. ***
Comment 28 lohmaier 2003-05-30 13:43:31 UTC
*** Issue 9572 has been marked as a duplicate of this issue. ***
Comment 29 ingenstans 2003-05-30 14:20:55 UTC
I have unmarked 14644 as a duplicate of this issue, since it is about 
the way that the word count in document properties is inaccurate (and I 
believe that's a regression in 1.1 beta) and not about accessing or 
extending the document properties word count, which is what this 
discussion covered. 
Comment 30 lohmaier 2003-05-31 14:43:18 UTC
*** Issue 14645 has been marked as a duplicate of this issue. ***
Comment 31 lohmaier 2003-06-09 23:45:52 UTC
*** Issue 15429 has been marked as a duplicate of this issue. ***
Comment 32 lohmaier 2003-06-12 17:28:59 UTC
*** Issue 15526 has been marked as a duplicate of this issue. ***
Comment 33 lohmaier 2003-06-12 17:51:34 UTC
*** Issue 15398 has been marked as a duplicate of this issue. ***
Comment 34 rtrout 2003-06-24 05:46:38 UTC
Need to add bug 1793 and bug 3155 as duplicates of this. 
Comment 35 lohmaier 2003-07-01 22:31:07 UTC
*** Issue 14063 has been marked as a duplicate of this issue. ***
Comment 36 robpegoraro 2003-08-07 06:53:20 UTC
FYI, the macro from last June works in 1.1RC2. But... my point in 
filing this bug report was to see this feature eventually added to 
the writer application, not left as an optional add-on for users who 
can navigate the macro dialog box. 

Am I correct in assuming that this feature didn't make the cut for 
1.1? If so, any particular reason why?
Comment 37 fa 2003-08-11 19:15:24 UTC
Michael Meeks is hosting a patch against 1.1 that adds Word Count to the tools 
menu.  He hasn't to my knowledge submitted it to IZ yet for various reasons, but it is 
there and it works.

http://ooo.ximian.com/patches/RC3/word-count.diff

Dan
Comment 38 lohmaier 2003-09-28 00:53:32 UTC
*** Issue 20246 has been marked as a duplicate of this issue. ***
Comment 39 lohmaier 2003-10-02 01:13:33 UTC
Platform/OS to ALL, target-milestone: not determined...
Comment 40 guido.pinkernell 2003-10-23 18:38:04 UTC
*** Issue 5995 has been marked as a duplicate of this issue. ***
Comment 41 guido.pinkernell 2003-10-23 18:39:35 UTC
*** Issue 19128 has been marked as a duplicate of this issue. ***
Comment 42 lohmaier 2003-11-11 16:01:48 UTC
*** Issue 22313 has been marked as a duplicate of this issue. ***
Comment 43 johnathlon 2003-11-20 03:41:22 UTC
the ximian patch has moved, it is now at:
http://ooo.ximian.com/patches/OOO_1_1/word-count.diff
Thanks a lot for the macro. (I'm a novice user so it took a long time
to get it to work, but I did, and I set it to work with <ctrl>+w. Man,
cool. I know the darwinwars guys did most of the work, but it is still
cool.) And if there is any way this macro could get integrated into the
code (but not as a macro of course), then it would probably satisfy
everyone that keeps asking for a better word count. I give it an A+,
exactly what I wanted.
John
Comment 44 lohmaier 2003-12-27 23:43:25 UTC
*** Issue 23912 has been marked as a duplicate of this issue. ***
Comment 45 erikanderson3 2004-01-30 05:27:07 UTC
In response to:
> --- Additional comments from David Illsley Tue Jun 4 05:22:20 -0800 2002 
> 
> FWIW there appears to be some kind of an effort in standardising what
> a word count should mean. It includes someone from Sun but doesn't
> appear to be moving at all and anyway I find it hard to believe that
> MS will ever change their algorithm bearing in mind the hassle caused.
> 
> So.. should we mimic them, or do something else...


David --> 

I can see *all* sorts of problems trying to explain to the PHBs of the world why
OOo and Word have different counts, and none of the images in my head are
pleasant.  Like it or not, MS Word has been the de facto standard.  I don't
think word count behavior should be changed away from Word's without some very
compelling (and PHB-understandable) reason.  I make my living on word counts as
a translator, and I cannot afford to have clients accusing me of padding my
bills when the numbers don't match.  Nor would I be happy to realize I'd shorted
myself if OOo's count was low compared to Word's.  

Just my two yen.
Comment 46 erikanderson3 2004-01-30 08:39:06 UTC
In response to:
> ------- Additional comments from dvo Tue Jun 4 23:58:17 -0800 2002 -------
>
> Internationalization does indeed use different separators. For all I
> know, some languages (like Japanese or Chinese) don't have separators,
> but rather have certain characters always constitute a word,
> regardless of what precedes or follows them. We use the
> com::sun::star::i18n::BreakIterator service to handle this internally.
> (I.e., this determines how the cursor moves when you press
> Ctrl-Left/Right-Arrow)

dvo -->

I translate from Japanese to English, and have studied some Chinese and Korean.  

The whole concept of "word" simply doesn't exist as folks used to Indo-European
languages would recognize it.  Certain characters don't always constitute a
word, though certain combinations sometimes do.  Those dealing with writing as
authors, teachers, editors, translators, etc use character counts, not including
spaces.  I note that OOo only offers up character counts with spaces included
(please see issue # 10356 where I've attached a file [soon to be two] showing
some of the differences).  This causes problems for CJK languages given the way
these have traditionally been counted.  
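
The two character counts being contrasted here can be sketched in a few lines of Python. This is a rough illustration only: the function name char_counts is hypothetical, it counts Unicode code points rather than user-perceived characters, and "space" means anything Python's str.isspace() treats as whitespace (which includes the ideographic space U+3000).

```python
def char_counts(text):
    """Return (count including spaces, count excluding all whitespace).

    Paragraph-end markers (newlines) are left out of both counts.
    """
    with_spaces = sum(1 for ch in text if ch not in "\r\n")
    without_spaces = sum(1 for ch in text if not ch.isspace())
    return with_spaces, without_spaces


# Mixed Latin / Japanese sample separated by ASCII spaces.
print(char_counts("OOo は 便利"))          # (8, 6)
# The same text separated by ideographic spaces (U+3000).
print(char_counts("OOo\u3000は\u3000便利"))  # (8, 6)
```

For billing or length limits based on the traditional CJK convention, the second number is the one that matters.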

Finding some means of parsing CJK languages to use CTRL-Left and CTRL-Right
becomes problematic; other than having a complex dictionary and grammatical map
combination of some sort, I'm not sure how else it would work (but then I'm not
much of a coder ;).  Thing is, there's some disagreement in linguistic circles
as to what constitutes a "word" in Japanese due to the way it agglutinates --
for example, do particles (similar to prepositions and articles in English)
count as separate words, or as suffixes to the nouns they follow?  I'll attach a
sample of Japanese here to illustrate the problem.  
Comment 47 erikanderson3 2004-01-30 09:30:34 UTC
Created attachment 12806 [details]
Example of faulty OOo navigation by "word" in Japanese (CTRL-Left/Right)
Comment 48 chadley78 2004-02-04 20:17:27 UTC
With all these duplicate issues - don't you think that someone should 
consider putting them all together.  I mean the # of dups is greater than 
the number of votes!

Let's face it, burying the word count in the properties and then the lack of 
selection count is not good.  The current Word Count scenario stinks.

I'd really like to see Word Count have a better place and usefulness in 
OOo 2.0.
Comment 49 rblackeagle 2004-02-04 22:17:59 UTC
Since my comments have been lost among the mass of "duplicates", I'll add them here.

What is needed is a full-featured word count.  A user needs to select words (not
spaces), footnotes or not, bibliography entries or not, and a selection or
entire document.  That would solve most complaints about word count I have seen.

This is an OLD issue.  So far what we have is a user-supplied and modest word
count macro.  What we need is an integrated word count option.  I am amazed that
no one is working on this at this late stage.  After two years and an enormous
number of requests for this feature, it is still marked as "NEW".  We need a new
category "ANCIENT" to refer to filed issues with lots of requests but never
assigned to a developer.
Comment 50 lohmaier 2004-02-14 23:35:01 UTC
removing 23974 from the list of "depends on" issues since it's a duplicate of
issue 14050
Comment 51 kalessin 2004-05-18 19:12:27 UTC
When a student writes an essay, or a content-provider writes an article, they
firstly and foremostly want to know how their assigned word limit is being used
up.  This tells them where they need to trim their text. It's vital information.

In MS Word, in consequence, the selective word count is used *dozens* of times
per day to check the proportional sizes of text sections, so that they can make
the best use of their space.

The absence of an intuitive, built-in word count is the single greatest barrier
that I have personally found to OpenOffice adoption amongst people that I have
recommended it to as an MS Word replacement.

OpenOffice Writer simply *does not count words*, at least not in the way that MS
Word users need and expect it to. For a basic function, performed dozens of
times a day, they shouldn't have to install version-sensitive macros, configure
keybindings, or open new documents for the sole purpose of pasting text to count
words.  

As I said, students and other writers that I have recommended OO to, have
pointed to the absence of selective word counting as THE deciding factor in
sticking with MS Word. 

It has been years now that this bug has persisted on this forum, with numerous
worthwhile implementation suggestions, without being resolved. There have been
thirty-odd duplicate bugs reporting this omission.

If I felt my C was up to scratch I would gladly code this myself. As it is, can
I at least note that other open-source projects have implemented perfectly
functional algorithms to this end, and their code is publicly available?

Personally, I would love to see two extra columns in the navigator showing word
count by section, both absolute and in percentage terms (and Flesch readability
stats would be fabulous too, though I'd leave them in Statistics).  

But all these features that could easily surpass Word's counting system are
nowhere near as important as simply getting *some* kind of solution in place to
address this fundamental usability issue.
Comment 52 annfielding 2004-05-21 19:44:59 UTC
Please fix this. I'm in graduate school doing a humanities subject and it's 
incredibly annoying not to be able to check selections of text. It can't take 
that long to fix the thing, and it would make a huge difference to a LOT of 
your users.
Comment 53 alexbrenner 2004-05-22 16:38:12 UTC
I quite agree.  I made the jump to OO quite recently and am mostly getting along
fine with it, but I was completely astonished to discover how primitive and
inflexible the word count feature is.  As a law student working to very tight word
limits knowing how many words are in my footnotes and titles and so on is
absolutely crucial.  It's all very well to install a macro but if I have work
due in on a Monday the last thing I want to be doing is spending my time
fiddling around with macros!!  Selective word count is surely an obvious thing
to put into a later build.  FWIW it should be more obvious too; I can't see that
popping the word count or stats feature into the tools menu would cause that
much confusion for long-time users, it's as much as learning a new ALT-x-x key
combo.  But a properly working word count is the priority!
Comment 54 arthit 2004-05-23 03:09:27 UTC
related: Issue 24038
Flesch-Kincaid Grade Level readability statistics (enhancement)
Comment 55 bettina.haberer 2004-07-08 12:57:30 UTC
Counting words in a selection will be implemented in OO.o 2.0. Please have a look
at http://specs.openoffice.org/writer/wordcount/Enhanced_Wordcount.sxw.
Comment 56 lohmaier 2004-07-09 22:22:16 UTC
reopening issue.
There's more to be improved than only word-count in selection. Counting in
selection is 17964
But there are at least 10356 (esp. important for asian languages) 14050
(journalists, science) and 19692 still to be done.

I change the issuetype to task to reflect that this is not an issue with
implementation details, but an issue for collecting and referencing issues related
to word count.
Comment 57 bettina.haberer 2004-08-17 11:04:59 UTC
Ok, I reset the target to office later, as there is no resource for considering
more cases concerning word count in OO.o 2.0. 
Comment 58 pmjd 2004-08-18 12:47:54 UTC
I CANNOT believe you are going to miss this vital feature out of OO.o 2.0. All I
can say is you guys really know how to shoot yourselves in the foot. Instead of
messing about with new enhancements could you at least get the basics right?
This is one of quite a few basic features that are missing from OO.o that stop
me from using this professionally or recommending it to others in my field of work.

I really hope you reconsider because you will miss out on a lot of users at
school and university level who need word counts for essays or professional writers.

Please reconsider.
Comment 59 annfielding 2004-08-18 14:13:49 UTC
This is a disaster. Without a decent word count feature OpenOffice is useless 
for anyone working as a journalist, or as a student in any humanities subject. 
This is BY FAR the greatest flaw in the software, and should be your first 
priority. 
Comment 60 umr5174 2005-05-30 17:42:09 UTC
As far as the French and Spanish languages are concerned, what a "word" is is not a
matter of opinion or a Microsoft standard,
but is decided by the Académie française and the Real Academia Española,
respectively, in their famous dictionaries.
Some former phrases are regarded as one french word, e.g. "ad hoc".

The 8th edition of the Académie française dictionary is freely online :
http://atilf.atilf.fr/academie.htm*
the 9th edition (from A to négaton) too :
http://www.academie-francaise.fr/dictionnaire/

Web site of the Real Academia Española and their dictionary :
http://www.rae.es/
Comment 61 erikanderson3 2005-09-16 18:22:03 UTC
Regarding Asian (double-byte) "word" and character count, I just attached a
sample text to Issue 17964 comparing Word's count with OOo's count.  I used 2.0
Beta 1, and was most disappointed to find that the only changes from 1.1 were
the location (now in Tools -> Word Count) and the minor (but quite welcome)
addition that we can now count selections.  However, *none* of the remaining
word count issues have been addressed in any way visible to the user.  Asian
text is still counted incorrectly (borking any count of mixed Asian - Latin text
as well), and footnotes and endnotes are still not includable (or is that
excludable?).  I have not looked into more complex issues such as text boxes and
the like.  

Please have a look.  This issue is indeed ANCIENT and in dire need of some action.  
Comment 62 kpalagin 2007-04-08 20:49:13 UTC
Seems that issue http://www.openoffice.org/issues/show_bug.cgi?id=41454 would 
be good to include in this task.
Comment 63 bettina.haberer 2007-11-13 15:48:36 UTC
Counting the words and characters in a document and in a selection via Tools /
Word Count has been implemented. 
Comment 64 bettina.haberer 2007-11-13 15:49:16 UTC
Thus the issue is closed.
Comment 65 spectrewriter 2010-04-23 09:51:06 UTC
Someone please explain why this is STILL not a standard feature?  The problem
comes up over and over again, since 2002.

The only functional solution isn't even documented here, a script from a guy
named Yawar Amin out of Canada.  Several of the Writer's Tools are nice, but it
does NOT solve the need for a running/live word count, because it still
disappears as soon as you continue with the document.  (Amin's is floating, on
top, and I find plenty of places to put it, BUT...)

There is space all over the GUI to put something like word/char count, and even
the percentage of target.  I know, it's doing a lot, open source, etc., but come
on... 8 years, and still nothing?  PLEASE, reopen this, give it a target
milestone that's meaningful, and let OOo catch up with Microsoft's Word 2007 on
this issue?  THANKS!