Issue 74198 - spell checker proposes to separate words which he don't knows
Summary: spell checker proposes to separate words which he don't knows
Status: UNCONFIRMED
Alias: None
Product: General
Classification: Code
Component: spell checking
Version: 3.3.0 or older (OOo)
Hardware: All All
Importance: P3 Trivial
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords: needhelp, oooqa
Depends on:
Blocks:
 
Reported: 2007-02-05 20:57 UTC by timi_openoffice
Modified: 2014-02-24 17:29 UTC
CC List: 5 users

See Also:
Issue Type: ENHANCEMENT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments

Description timi_openoffice 2007-02-05 20:57:31 UTC
The German words "drittstärkstes" and "sozialräumliches", for example, are not
known to the spellchecker (German makes heavy use of compounds, and the
spellchecker doesn't know many of them) and are marked as wrong. So far,
no problem.

But the spellchecker suggests "dritt stärkstes" and "sozial räumliches", and
both are definitely wrong. In German it isn't always clear whether an
expression is written as one word or two, but the bad suggestions cause even
more confusion. When a suggestion consists of two words, it is almost always
a bad suggestion in German.

So combinations of two words should never be suggested when checking German
texts. I assume the situation is similar for other languages that make heavy
use of compounds.
Comment 1 nemeth.lacko 2007-03-19 23:05:53 UTC
Thanks for the report. I'm adding Bjoern Jacke (maccy at ooo), author of the
German OOo dictionary, to this issue.
Comment 2 maccy 2007-03-20 00:04:55 UTC
These special cases of "ordinal+superlative" compound words are hard to put
into a compound *rule*. On the other hand, the number of possible compounds of
this kind is very high, while most of them are rarely used. As long as I don't
see a reasonable rule to support this class of compounds, I think it's best to
manually add the most common ones that are not already known.
Laci, you can assign this to me if you want...
Comment 3 timi_openoffice 2007-03-20 08:15:24 UTC
Okay. What about the idea of no longer suggesting strings of two words? There
are some cases where this makes sense (old orthography: fahrradfahren, new
orthography: Fahrrad fahren). However, in my experience most of the
suggestions consisting of two words are bad.
Comment 4 nemeth.lacko 2007-03-20 09:56:07 UTC
MySpell/Hunspell has an affix file option (NOSPLITSUGS) to forbid this sort of
suggestion. It is used for Dutch, because the Dutch dictionary lacks compound
word support and most of its suggestions were wrong. But that is not the case
for Hungarian and German. The word-level spell checker of OOo offers two "good"
solutions: add "drittstärkstes" and "sozialräumliches" to the custom dictionary,
or split these words. Naturally, the best solution is to add these words to the
German dictionary, like most other German compound words.
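
As a rough illustration of the behaviour discussed in this comment, the
following Python sketch shows how a checker might produce two-word suggestions
by splitting an unknown word, and how a switch with an effect comparable to
NOSPLITSUGS would suppress them. It is not Hunspell's actual implementation;
the function split_suggestions, the no_split_sugs parameter and the toy word
list are hypothetical names introduced only for this example.

```python
def split_suggestions(word, dictionary, no_split_sugs=False):
    """Return two-word suggestions for an unknown word.

    Illustrative only: tries every split point and keeps the splits where
    both halves are dictionary words. `no_split_sugs` is a hypothetical
    switch that mimics the effect of the NOSPLITSUGS affix option.
    """
    # Known words need no suggestion; the switch disables split suggestions.
    if no_split_sugs or word in dictionary:
        return []
    suggestions = []
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in dictionary and right in dictionary:
            suggestions.append(f"{left} {right}")
    return suggestions


# Hypothetical toy word list, not the real German dictionary.
WORDS = {"dritt", "stärkstes", "sozial", "räumliches", "fiel", "hin"}

if __name__ == "__main__":
    for unknown in ("drittstärkstes", "sozialräumliches", "fielhin"):
        print(unknown, split_suggestions(unknown, WORDS))
        print(unknown, split_suggestions(unknown, WORDS, no_split_sugs=True))
```

With this toy list the sketch proposes "dritt stärkstes" (unwanted) as well as
"fiel hin" (helpful), and proposes nothing at all once the switch is set. A
plain split cannot tell a missing compound from a missing blank, so the switch
removes both kinds of suggestion together.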
Comment 5 Rainer Bielefeld 2007-04-17 20:29:49 UTC
I often see these suggestions and I also think that they're annoying, but on
the other hand, from time to time I "forget" a blank and am thankful that the
spellchecker does not accept my sentence "Ich fielhin und schlug mir ein Knie
auf", but suggests "fiel hin" for "fielhin".

So we should not overshoot the mark; it's correct that the spell checker
suspects that the user might have accidentally created a compound of 2 words.
An option to forbid those suggestions might be a good solution for users who
do not have that "missing blank problem".

I do not see this as a DEFECT, but as an ENHANCEMENT.
Comment 6 timi_openoffice 2007-04-19 16:14:05 UTC
>> I often see these suggestions and I also think that they're annoying, but
>> on the other hand, from time to time I "forget" a blank and am thankful
>> that the spellchecker does not accept my sentence "Ich fielhin und schlug
>> mir ein Knie auf", but suggests "fiel hin" for "fielhin".

It may be nice to have this suggestion. However, when the word "fielhin" is
underlined, it is clear to everyone where the problem is, even without a
suggestion.

On the other hand, "sozial räumlich" is a _wrong_ suggestion
for "sozialräumlich". Even if this word is added to the list, other words with
wrong suggestions will remain. And that makes a very bad impression on the
user: when I see a long word that is underlined, I normally leave it as it is,
because in most cases it is a problem of the spellchecker. I've even stopped
using the suggestions because many of them are bad. And I don't think I'm the
only user doing so. This price is too high compared with the very small
benefit.