Issue 15524

Summary: . (dot) should be word separator
Product: Writer Reporter: maccy <openofficeissuezilla>
Component: code Assignee: AOO issues mailing list <issues>
Status: CONFIRMED --- QA Contact:
Severity: Trivial    
Priority: P4 CC: issues, stefan.baltzer, thomas.lange
Version: OOo 1.1 Beta2   
Target Milestone: ---   
Hardware: All   
OS: All   
Issue Type: ENHANCEMENT Latest Confirmation in: ---
Developer Difficulty: ---

Description maccy 2003-06-11 18:32:01 UTC
I get lots of "words" like "z.B" ("z.B." is an abbreviation for "for example"
in German) in the user dictionaries that other users send in. OO.o only treats
". " as a word separator, but not a dot that isn't followed by a blank. This
should be changed IMHO.
Comment 1 khendricks 2003-06-16 16:12:12 UTC
Hi,  
  
No, the dictionary author should change their dictionary to include abbreviations.  
  
The whole purpose of including embedded dots in a word is to allow abbreviations to  
be spellchecked properly.  
  
This works the same way in English (i.e. or e.g.), and you can ask that the  
abbreviation be included in the dictionary so that it is properly spellchecked.  
  
Resolving this as works for me.  
  
  
Kevin  
  
  
Comment 2 maccy 2003-06-16 16:29:02 UTC
You are joking, aren't you? This is the very first spell checker/word
processor which does it this way, and it's very annoying. If
users typed typographically correctly, there would have to be
a "half space" between the parts of an abbreviation. At least under
German typesetting rules this is correct. No space is wrong (but MANY
people do it this way); a full space would be correct on a
typewriter, which has no half space; but a half space is the only
correct glyph to put at that place. That is why I will never add
abbreviations like "i.d.R." or "z.B." to the dictionary.
Comment 3 khendricks 2003-06-16 16:43:50 UTC
Hi, 
 
Sorry, abbreviations are allowed in English and in many other languages, and 
so embedded periods are possible (i.e. or e.g.). So you can either add a 
space after each letter of an abbreviation to get it to parse properly, or you can 
choose to ignore them, but either way periods in the middle of a string of letters 
cannot be a universal word separator. 
 
Perhaps you can convince someone to make it a locale-dependent separator. 
 
Either way, this issue is not a lingucomponent issue and belongs instead in 
sw.openoffice.org, since they control the breakiterator code that determines what 
gets sent to the spellchecker in the first place. 
 
So I am reassigning it to writer so that you can argue with Hamburg who are the 
breakiterator authors for a locale dependent use of period as a full break in German. 
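
A minimal sketch of what a locale-dependent separator could look like. This
is not the actual breakiterator code; the `DOT_BREAKS_WORDS` table and the
`split_words` helper are hypothetical names made up for illustration, and the
per-locale settings are assumptions.

```python
import re

# Hypothetical per-locale setting: does an embedded '.' end a word?
DOT_BREAKS_WORDS = {"de": True, "en": False}


def split_words(text: str, locale: str) -> list[str]:
    """Split text into words; treat '.' as a separator only for some locales."""
    if DOT_BREAKS_WORDS.get(locale, False):
        separator = r"[\s.]+"   # whitespace and dots both break words
    else:
        separator = r"\s+"      # only whitespace breaks words
    return [word for word in re.split(separator, text) if word]


print(split_words("z.B. heute", "de"))   # ['z', 'B', 'heute']
print(split_words("e.g. today", "en"))   # ['e.g.', 'today']
```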
 
Good Luck! 
 
Kevin 
 
 
 
 
Comment 4 h.ilter 2003-06-17 09:36:43 UTC
Reassigned to SBA.
Comment 5 stefan.baltzer 2003-06-17 13:59:41 UTC
SBA: The word count of URLs is another subject to keep in mind when
discussing what a dot should do. How many words is
"Me.and.my.f.r.i.e.n.d.s@bla.com" expected to be...? To me, a URL is
ONE word, regardless of the number of dots within. I think this is a far
more important issue than the counting of abbreviations. Texts with
abbreviations require the user to know them in order to be understood,
while URLs cannot be avoided. 

Those in the deepest need of exact word counts are those who get paid
by the word: journalists, editors and the like. There must be some kind
of rule for them about when to count +1 and when not. I am not aware of
the rules by which journalists get paid. 

Generally speaking, we don't want to end up pumping up the number of
options. There are already far too many, so we have to be very strict
on that.

Reassigned to Michael.
Comment 6 maccy 2003-06-17 15:39:56 UTC
Stefan, you are just talking about word counts, which is quite
secondary compared to the annoying spelling mistakes OO.o comes
up with. And the spelling mistakes due to wrong word counts are
what this bug is all about.
Taking your example:
The string "Me.and.my.f.r.i.e.n.d.s@bla.com" should be handed
over by OO.o to the spell checker in these chunks:
Me
and
my
f
r
...
bla
com
OO.o should not hand over the whole string at once to the
spellchecker. I don't care what OO.o does when it counts words,
but for spell checking your example string should be cut into
pieces before being sent to myspell.
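
The chunking asked for above can be sketched roughly as follows. This is only
an illustration, not OO.o or myspell code: `spellcheck_chunks` is a
hypothetical helper, and treating "@" as a separator as well is an assumption
read off the chunk list above, where "bla" appears on its own.

```python
import re


def spellcheck_chunks(token: str) -> list[str]:
    """Cut a token at every '.' and '@' and drop the empty pieces."""
    return [piece for piece in re.split(r"[.@]", token) if piece]


chunks = spellcheck_chunks("Me.and.my.f.r.i.e.n.d.s@bla.com")
print(chunks)
# ['Me', 'and', 'my', 'f', 'r', 'i', 'e', 'n', 'd', 's', 'bla', 'com']
```

Each chunk would then go to the spell checker separately; word counting could
still use the unsplit token.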
Comment 7 michael.ruess 2003-06-17 15:46:41 UTC
Unfortunately this is the only way to handle e.g. the word count and
the auto-capitalization after a full stop in a suitable manner.
Thus the behaviour will not and cannot be changed at all.

I think that making the spellchecker also recognize "z. B." from a
user dictionary would be very hard and VERY time-consuming to
implement.

MRU->TL: please give your opinion about this.
Comment 8 maccy 2003-06-17 16:00:43 UTC
Actually it is impossible to add all abbreviations into
the dictionaries because everyone uses his own abbreviations.
OpenOffice is the only word processor which claims that
x.y.z. is a spelling error.
If the dot (not followed by a space) can't be configured as a
word separator, then there has to be some kind of
preprocessing of "words" before they are sent to
the spell checker.
Comment 9 thomas.lange 2003-06-18 09:34:50 UTC
No, definitely the "." should not be a word separator.
If that were the case, it would no longer be possible to spellcheck
abbreviations like "i.e." or "Dr.".
Those abbreviations need to get passed to the spellchecker as a single
text in order to allow the spellchecker to verify them.

On the other hand, this of course requires allowing for a dot at the
end of any word, since the word might be located at the end of a sentence.
I think this is a minor drawback since not checking abbreviations at
all is far less acceptable.
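
A rough sketch of that trade-off, under stated assumptions: the token is first
looked up as-is, so known abbreviations match with their dots, and only then
with one trailing dot removed, to cover a word at the end of a sentence. The
`is_known` helper and the tiny word set are hypothetical and do not reflect
the real spellchecker interface.

```python
def is_known(token: str, dictionary: set[str]) -> bool:
    """Accept a token as-is, or with one sentence-final dot removed."""
    if token in dictionary:
        return True                       # e.g. "i.e." or "Dr." listed as such
    if token.endswith("."):
        return token[:-1] in dictionary   # e.g. "house." at a sentence end
    return False


WORDS = {"i.e.", "Dr.", "house"}          # hypothetical dictionary content

print(is_known("Dr.", WORDS))     # True
print(is_known("house.", WORDS))  # True
print(is_known("z.B.", WORDS))    # False
```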

Kevin is absolutely right in his opinion that the dictionary authors
have to include the abbreviations in their dictionaries.

Though I see the point about the half space in "z. B.", there is
still one main reason for using "z.B.":
- In SO we have two different third party spellcheckers included,
  both obtained from vendors specialized in the field of
  spellchecking. Unfortunately both of them use ISO-8859-1 or a similar
  encoding and thus do not know about the half space.
  And when presented with the choice of "z.B." and "z. B.", only the
  first one is accepted.

Given all this we will stick to the current behaviour.


About:
> Actually it is impossible to add all abbreviations into
> the dictionaries because everyone uses his own abbreviations.

Correct. And thus someone using personalized abbreviations has to take
the consequences: the word not being known.

> OpenOffice is the only word processor which claims that
> x.y.z. is a spelling error.

This of course could be changed so that words containing a "." within are
never checked at all. But this would also prevent the detection of
spelling errors in known abbreviations.

Since I also like to use personalized abbreviations e.g. "WE" for
weekend which could be excluded from spellchecking by disabling the
option "Check uppercase words" (which is the default), I propose to
introduce a new option "Check abbreviations" which defaults to true
and specifies if words containing a "." within should be checked or not.
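
A minimal sketch of how such an option might gate the check. All names here
(`SpellOptions`, `should_check`) are hypothetical, and the default of `False`
for the uppercase option is only one reading of the sentence above. Note that
"Dr." has no dot *within* the word, so this sketch would still check it.

```python
from dataclasses import dataclass


@dataclass
class SpellOptions:
    check_uppercase_words: bool = False   # assumed default (see note above)
    check_abbreviations: bool = True      # proposed option, defaults to true


def should_check(word: str, options: SpellOptions) -> bool:
    """Decide whether a word is handed to the spellchecker at all."""
    if not options.check_uppercase_words and word.isupper():
        return False                      # e.g. "WE" is skipped
    has_inner_dot = "." in word.rstrip(".")
    if not options.check_abbreviations and has_inner_dot:
        return False                      # e.g. "z.B." is skipped
    return True


print(should_check("WE", SpellOptions()))                              # False
print(should_check("z.B.", SpellOptions()))                            # True
print(should_check("z.B.", SpellOptions(check_abbreviations=False)))   # False
```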

I'll pass this on to user experience for a decision about such an option
or for finding other solutions.

TL->BH: Please take over

Since the decision about such an option can be made quite easily, I
am changing the target to OOo 2.0.
Comment 10 maccy 2003-08-31 19:52:30 UTC
Thomas, you say that you like it if the abbreviations are also
spellchecked, but there are a few major points I want to emphasize:

- Abbreviations written without a space in between are incorrect, thus
dictionary authors would have to add incorrect "words" to the
dictionary.
- Even if some authors might consider adding wrong abbreviations to
their dictionaries, the number of them is too high, and every special
vocabulary (like the legal stuff, the military stuff etc.) has its own
special abbreviations.
- The behaviour of OO.o is very uncommon and confuses the majority of
people working with OO.o, especially the ones who are evaluating OO.o
and thinking about switching from MS Office.
Comment 11 falko.tesch 2003-09-01 09:47:32 UTC
This issue is re-targeted to "Office later"
Reason:
As already stated, our German spell-checker is not capable of Unicode.
Furthermore, those in need of this feature will use
AutoCorrection (since it is way too awkward to select an em-dash from
"Special Characters" every time).
And finally, IMHO Word doesn't do it any better than us anyway.
Comment 12 bsb 2004-07-12 12:41:08 UTC
To me personally, the period should be a word separator. I have noticed that from 
time to time I fail to press the space bar hard enough to insert the space after 
a period. So I end up with words like "end.Beginning" which are treated as one 
word. The most annoying thing is that I can't use Ctrl+<Left/Right Arrow> to 
position the cursor right after the period and insert a space. The second, somewhat 
less annoying result of the period not being a separator is that this "word" is 
entered in the word completion list with the '.' in the middle. 
Comment 13 bettina.haberer 2010-05-21 14:46:05 UTC
To make the issues easier to grep via "requirements", I am moving the issues
currently owned by me to the owner "requirements".