Issue 2838

Summary: AutoCorrections does not match case of the words that AutoCorrect replaces.
Product: Writer
Reporter: royalozma <royalozma>
Component: editing
Assignee: AOO issues mailing list <issues>
Status: CLOSED FIXED
QA Contact:
Severity: Trivial
Priority: P3
CC: arnaud.versini, barta, binbjguo, charles.fmj, erikanderson3, hdu, issues, kschenk, lupton.charles, Mathias_Bauer, mike.hall, mlissner, olivier.noreply, pagalmes.lists, pedlino, rgb.mldc
Version: OOo 1.1
Keywords: rfe_eval_ok, usability
Target Milestone: ---
Hardware: PC
OS: Windows 98
Issue Type: PATCH
Latest Confirmation in: 4.1.1
Developer Difficulty: ---
Issue Depends on: 22961
Issue Blocks:
Attachments: Patch to resolve this issue (flags: none)

Description royalozma 2002-01-15 13:52:54 UTC
AutoCorrect "Incorrect spelling subsystem" does not respect case-mode (Lowercase
versus Titlecase versus Allcaps) of the AutoCorrected misspelled words.

If you type "Teh" in the middle of a sentence, it is transformed into "the"
instead of "The", thus failing to respect the case of the misspelled word.

Also, I have basically figured out how to "abuse" the AutoCorrect "misspelling
dictionary" so that it can be used as a simple "transliterator".
  Eg: I store all the transliterations of Esperanto words (as the incorrect
  spelling) and the corresponding Esperanto accented forms (as the correct
  spelling)

 * I add in "regxo" (in the AutoCorrect Dictionary) to be replaced with the
   word "reg^o" (`g^' is actually a `g' with a `^'. UCS code is 011D)
 * When I type "Regxo" in the middle of a sentence, it gets replaced with
   "reg^o" instead of "Reg^o".
 * Also, "REGXO" gets replaced with "reg^o" instead of "REG^O".

Also: I cannot add multiple entries for the same word in different case
presentations to the AutoCorrect "misspelled word list" and have a different
replacement depending on the case presentation.

 * theparaoh will be AutoCorrected as "the paraoh" (the default, if none of the
   case presentations below match)
 * TheparaoH could be AutoCorrected as "Tutankhamun" (case specific presentation
   override)
 * ThepaRaoH could be AutoCorrected as "Nefertari" (another case specific
   presentation override)

[The above are only examples]
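
The case-specific override idea above can be sketched as a two-step lookup. This is only an illustration in Python, not how OpenOffice.org stores or searches its replacement table; the function name `lookup_replacement` and the dictionary layout are assumptions made for the example.

```python
def lookup_replacement(typed, table):
    """Return the replacement for `typed`, or None if no entry applies.

    `table` is a hypothetical replacement list with case-sensitive keys.
    An all-lowercase key doubles as the default for every other case
    presentation of the same word.
    """
    # 1. An entry whose case presentation matches exactly wins (override).
    if typed in table:
        return table[typed]
    # 2. Otherwise fall back to the lowercase (default) entry, if any.
    return table.get(typed.lower())


# The example entries from the description above.
table = {
    "theparaoh": "the paraoh",
    "TheparaoH": "Tutankhamun",
    "ThepaRaoH": "Nefertari",
}

print(lookup_replacement("TheparaoH", table))
print(lookup_replacement("THEPARAOH", table))
```

With this layout, any case presentation that has no entry of its own falls through to the lowercase default.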
Comment 1 royalozma 2002-01-17 14:15:04 UTC
INTRODUCTION

I once tried using the exception dictionary feature to make
StarOffice orthographically verify (spell check) text in a language
that StarOffice does not support. I had limited success with that
stunt. That language is Esperanto, and it is only implicitly supported
in OpenOffice.org via its ISO 10646 feature.

I prefer and expect the Orthographical Verifier (Spell Checker) to
remove words in the exception dictionary from the recommendations. I
have to double-check the exception dictionary every time I
make a vocabulary adjustment, regardless of whether the language is
English, Esperanto or another language.



MY NOTES ABOUT THE SYSTEM:

Here is my analysis of the Orthographical Verifier (aka Spell
Checker):

Subsystem that detects incorrections:
 * Check if the word exists in the user's exception dictionaries.
    > If so, mark the word incorrect, otherwise go to next step.
 * Check if the word exists in the user's additional words.
    > If so, mark the word correct, otherwise go to next step.
 * Check if the word exists in MySpell.
    > If so, mark the word correct, otherwise mark the word incorrect.

Subsystem that handles recommendations:
 * Check if the incorrect word exists in the user's exception dictionaries.
    > If so, recommend the user's recommendation for the user's
      exception.
 * Upload user dictionaries to MySpell.
    > I know that this feature is unimplemented.
    > Technical issues such as dictionary caches will not be discussed
      here.
 * Call MySpell for recommendations for the incorrect spelling.
 * Filter out all user's exceptions from MySpell's recommendations.
    > OpenOffice.org lacks this step!
    > If MySpell does not support uploaded dictionaries,
      OpenOffice.org's recommender should filter out user exceptions
      from MySpell's recommendations.
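
The two subsystems described above can be sketched roughly as follows. This is a simplified Python model of the steps as listed in this comment, not the real OpenOffice.org or MySpell code: the names `is_correct` and `recommend`, the plain sets and dict used for the dictionaries, and the `suggest` callable standing in for the MySpell suggestion call are all assumptions. The "upload user dictionaries" step is left out because the comment notes it is unimplemented.

```python
def is_correct(word, exceptions, additions, main_dictionary):
    """Detection: exceptions are checked first, then user additions,
    then the main (MySpell) word list."""
    if word in exceptions:
        return False
    if word in additions:
        return True
    return word in main_dictionary


def recommend(word, exceptions, suggest):
    """Recommendation: `exceptions` maps an excluded word to the user's
    preferred replacement; `suggest` is a hypothetical stand-in for the
    MySpell suggestion call and returns a list of candidate words."""
    recommendations = []
    # The user's own recommendation for an exception entry comes first.
    if exceptions.get(word):
        recommendations.append(exceptions[word])
    for candidate in suggest(word):
        # The filter step this comment asks for: never offer a word that
        # the user has put in the exception dictionary.
        if candidate not in exceptions and candidate not in recommendations:
            recommendations.append(candidate)
    return recommendations


exceptions = {"colour": "color"}
additions = {"regxo"}
main_dictionary = {"color", "colour", "cooler"}
```

For example, with a `suggest` that returns `["colour", "color", "cooler"]` for the typo "colur", the exception "colour" is dropped from the list offered to the user.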

I hope that you can understand what I am expecting from the
OpenOffice.org recommendation subsystem.



Someone has mentioned that I can create my very own MySpell
dictionaries. Where can I get these tools? Can I use them without
having the entire OpenOffice.org build on my system? I want to create
my own customized English and Esperanto(*) vocabularies for use with
OpenOffice.org.

(*) Until OpenOffice.org supports this language, I have to masquerade
    this language as another, supported language.
Comment 2 royalozma 2002-01-17 14:47:58 UTC
Ignore my last submission to this issue!
That submission about the Spell-Checker was mistakenly submitted into
the wrong issue.

It should appear under issue #2836.

This correction notice does not affect the status about the
Auto-Correction error which I have discovered!
Comment 3 thorsten.martens 2002-01-21 12:33:42 UTC
Seems to be more a wordprocessor- than a framework-issue to me. So please have a look!
Comment 4 stefan.baltzer 2002-05-16 16:51:51 UTC
Reassigned to Michael.
Comment 5 michael.ruess 2002-05-17 11:58:31 UTC
MRU->TL: It is the same behaviour as in SO 5.2. Do we want to
implement such an enhancement? Or is it really a bug?
Comment 6 royalozma 2002-05-17 12:16:38 UTC
Reassigned as an Enhancement!
Comment 7 thomas.lange 2002-05-22 09:09:40 UTC
TL: Since it is about auto-correction I pass this one on to you.
Comment 8 Oliver Specht 2002-06-03 08:17:55 UTC
OS->FT: As an enhancement I think it's best to put it on your list. 
Comment 9 falko.tesch 2003-09-29 10:39:02 UTC
We should respect case-mode when using AutoCorrection.
If the user wants to replace "Thses" with "thesis", then we shall not alter
this replacement any further (e.g. replacing it with "Thesis" just
because it is the first word of a sentence).
Furthermore, we must not replace "thses" with "thesis" when "Thses"
(with a CAPITAL T) is defined.
Comment 10 falko.tesch 2003-09-29 10:39:26 UTC
started
Comment 11 royalozma 2003-09-29 12:52:11 UTC
This issue is about when the user keys in a misspelled word that
starts with an uppercase letter (eg: Teater as in Meet me at the
Civic Teater), but the stored entry (eg: teater) and its replacement
(eg: theater) begin with a lowercase letter. AutoCorrect replaces Teater
(as in Meet me at the Civic Teater) with theater (as in Meet me
at the Civic theater) no matter what.

I am not implying that a word whose correct form has a mandatory
initial uppercase (eg: Cleopatra), when misspelled with an initial
lowercase (eg: clepatra), should wrongfully be corrected with an initial
lowercase (eg: cleopatra instead of Cleopatra).

I explicitly require misspellings that begin with uppercase letters
to be replaced with words that begin with uppercase letters, but do not
force initial lowercase on words whose replacements have mandatory uppercase.

-- Examples: --
Keyed in:  HAIL QUEEN CLEPATRA!
Expected:  HAIL QUEEN CLEOPATRA!
Result:    HAIL QUEEN Cleopatra!
(Replacement for "clepatra" is "Cleopatra", but not "CLEOPATRA".)

Keyed in:  I will see you at the Civic Teater.
Expected:  I will see you at the Civic Theater.
Result:    I will see you at the Civic theater.
(Replacement for "teater" is "theater", but not "Theater")

-- I am not saying that when I keyed in: --
Have you met clepatra?
-- To appear as: --
Have you met cleopatra?
-- I actually meant: --
Have you met Cleopatra?
(Replacement for "clepatra" is "Cleopatra", but not "cleopatra" nor
"Clepatra")


Comment 12 lonnco 2004-01-14 13:11:02 UTC
I can't correct the orthography automatically. When I try to, it says that it's
done, but it isn't.
Comment 13 erikanderson3 2004-04-14 08:36:58 UTC
This is definitely a troublesome issue.  Case comes close in importance to
spelling in terms of producing professional documents.  I cannot easily use OOo
in a professional context due in part to this shortcoming.  

Of additional frustration is the inability to set the AutoCorrect replace list
to respect different cases.  Royalozma's examples are apt.  If I assign ':a' to
'ä' (lower-case umlaut a), I can neither use ':A' to get an 'Ä' (upper-case
umlaut A), nor can I set ':A' to 'Ä' in the auto-replace list without
overwriting the lower-case pair.  

I look forward to seeing this bug squashed.  Do we have any idea of a target
milestone for this one?
Comment 14 royalozma 2004-04-14 16:38:22 UTC
Yes, we want a natural keyboard entry flow when entering those
whizzbang words & symbols, all of which use non-keyboard symbols, by exploiting the
Autocorrect feature. The replacement doesn't follow the case pattern of the
original keyed-in word.

Yes, I have even programmed Auto-correct to replace :-) with the smiley face symbol
at U+263A.

Although anyone with X-windows can override keyboard mappings with the "xmodmap"
utility, not everyone has this know-how.

This problem as explained above prevents me from using the transliteration
"regxo" to help me to type "reg^o", "Reg^o" or "REG^O" according to the case
pattern which I use to type in "regxo". [Note! The "g^" represents the letter G
with the circumflex.]


Another related problem arises when I add entries for "OOPS! I forgot the
initial uppercase" problems to the Autocorrect dictionary.

For example: When I add "wicca" (word to be replaced) with "Wicca" (the
replacement) to the Autocorrect dictionary with the intention of
transparently fixing all-lowercase entry of mandatory-uppercase words, this
Autocorrect problem prevents the ALL CAPS entry of "WICCA" by incorrectly
replacing it with "Wicca".

Sure, I can press CTRL+Z to reverse Auto-Correct when it is not intended
(eg: all caps "WICCA" has been miscorrected as "Wicca"), but that restricts
natural keyboard entry flow.


Either include a "case-insensitive" mode in the Auto-Correct dictionary, or
include a mode where the case-form of the "replacement" is based on the first
two letters(*) of the actually typed "word-to-be-replaced".

Note! If the replacement begins with an uppercase, the replacement must begin
with an uppercase even if the word-to-be-replaced was keyed with an initial
lowercase.

(*) Skip all the non-letter characters, so the :a :o :u trick will work. (eg:
With "M:archenland" -> "Märchenland", the letters "ma" will be used for analysis)
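
The "first two letters" mode proposed above could look roughly like this. It is a Python sketch of the rule as worded in this comment, not the patch that was eventually applied; `case_form` and `apply_case` are hypothetical names, and treating a single uppercase letter as title case is an assumption of the sketch.

```python
def case_form(typed):
    """Classify the typed word by its first two letters, skipping all
    non-letter characters so that tricks like ':a' still work."""
    letters = [ch for ch in typed if ch.isalpha()][:2]
    if not letters or not letters[0].isupper():
        return "lower"
    if len(letters) == 2 and letters[1].isupper():
        return "upper"
    # One uppercase letter only, or uppercase followed by lowercase.
    return "title"


def apply_case(typed, replacement):
    """Shape `replacement` after the case form of `typed`."""
    form = case_form(typed)
    if form == "upper":
        return replacement.upper()
    if form == "title":
        # Uppercase only the first character; leave the rest as stored.
        return replacement[:1].upper() + replacement[1:]
    # "lower": return the stored replacement untouched, so a replacement
    # with a mandatory uppercase (eg: "Wicca") keeps it.
    return replacement


print(apply_case("M:archenland", "Märchenland"))
print(apply_case("WICCA", "Wicca"))
print(apply_case("wicca", "Wicca"))
```

Because the lowercase branch never lowercases anything, the note about mandatory uppercase replacements is satisfied without a separate check.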
Comment 15 sgautier.ooo 2004-09-07 15:31:07 UTC
reassigning & adding keywords according to new RFE process - Sophie
Comment 16 ace_dent 2006-09-21 07:34:28 UTC
Some common ground with Issue 22961 :
'AutoComplete unable to differentiate between Capital Letters and Small Letters.'
Comment 17 kreationz 2008-02-23 22:02:38 UTC
This is still a problem in version 2.3.1; I can confirm that. I'm just getting
started in programming, but I'm thinking of downloading the source specifically for
this and the auto-complete issue.
Comment 18 patboland 2008-08-22 01:16:29 UTC
It seems this is still an issue in version 2.4.1.  In the situation where the
Autocorrect replacement table replaces "abbrevn" with "abbreviation", we want:

"abbrevn" replaced with "abbreviation";
"Abbrevn" replaced with "Abbreviation";
"ABBREVN" replaced with "ABBREVIATION";

So, this is my idea of what is required when the autocorrect action is triggered:

1. If "lower case of (typed string)" = "upper case of (typed string)"
then the typed string is probably not alpha; 
replace with "(replacement string)"

2. If "(typed string)" = "upper case of (typed string)"
then the typed string is upper case; 
replace with "upper case of (replacement string)"

3. If "(typed string)" = "lower case of (typed string)"
then the typed string is lower case; 
replace with "(replacement string)"

4. If none of rules 1-3 are satisfied,
then the typed string is mixed case, most likely title case;
replace with "title case of (replacement string)"
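
Rules 1-4 above translate almost line for line into code. The following Python sketch is only an illustration of the proposal, not the attached patch; `autocorrect_case` is a hypothetical name, and "title case" is interpreted here as uppercasing the first character of the replacement, which is an assumption.

```python
def autocorrect_case(typed, replacement):
    """Pick the case of `replacement` from the case of `typed`,
    following rules 1-4 of this comment in order."""
    lower, upper = typed.lower(), typed.upper()
    # Rule 1: lowercase and uppercase forms agree, so there are no
    # cased letters (eg: ":-)"); use the replacement as stored.
    if lower == upper:
        return replacement
    # Rule 2: typed entirely in uppercase.
    if typed == upper:
        return replacement.upper()
    # Rule 3: typed entirely in lowercase.
    if typed == lower:
        return replacement
    # Rule 4: mixed case, most likely title case (assumed here to mean
    # "first character uppercase").
    return replacement[:1].upper() + replacement[1:]


for typed in ("abbrevn", "Abbrevn", "ABBREVN"):
    print(typed, "->", autocorrect_case(typed, "abbreviation"))
```

One consequence of rule 3 as written is that an entry such as "wicca" -> "Wicca" still yields "Wicca" for lowercase input, while rule 2 yields "WICCA" for all-caps input.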
Comment 19 mike_hall 2008-12-14 21:10:13 UTC
This seems a fundamental requirement of any reasonable autocorrect facility and
it's probably not hard to do. OOo Later does not seem right. Any chance of
bringing it forward to 3.2?

cc'd self and mlissner
Comment 20 arnaud_versini 2010-07-06 10:29:00 UTC
Created attachment 70414 [details]
Patch to resolve this issue
Comment 21 Mathias_Bauer 2010-07-06 11:50:04 UTC
Thanks for the patch, I hope to get it reviewed in the next few days.
Comment 22 cedric.bosdonnat.ooo 2010-08-20 14:58:09 UTC
it turns out to be a patch now
Comment 23 tommy27 2011-03-03 06:43:45 UTC
this issue is fixed in LibreOffice 3.3.1

see release notes: http://www.libreoffice.org/download/new-features-and-fixes/
Comment 24 hdu@apache.org 2012-02-13 08:25:43 UTC
Applied as revision 1243429. Thanks Arnaud!
Comment 25 rgb 2012-04-27 22:42:55 UTC
It is not working on rev 1327774. I remember that it worked on a dev build a couple of months ago, but the problem came back in the latest builds.
Comment 26 hdu@apache.org 2012-05-04 15:29:42 UTC
It seems the applied patch only fixed recognizing of non-case-matched text and that the other part would be fixed by issue 22961's patch when it has been extended for handling unicode letter cases.
Comment 27 binguo 2012-06-20 07:06:19 UTC
Verified it on Aoo_Trunk_20120616.1800.1350879 and it still reproduces. Autocorrect does not work well, so I am reopening it.

The detailed info about one unexpected scenario is as below:

 * I add in "regxo" (in the AutoCorrect Dictionary) to be replaced with the
   word "reg^o" (`g^' is actually a `g' with a `^'. UCS code is 011D)
 * When I type "Regxo" in the middle of a sentence, it gets replaced with
   "Regxo" instead of "Reg^o".
 * Also "REGXO" gets replaced with "REGXO" instead of "REG^O".
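
The scenario above depends on case conversion that works beyond ASCII: the replacement contains U+011D, whose uppercase partner is U+011C. As a small illustration only (Python's built-in string methods are used as a stand-in and say nothing about how the AOO code converts case), the expected results for the three typed forms can be written down like this:

```python
# "reg^o" from the scenario above, with g-circumflex written as U+011D.
replacement = "re\u011do"

# Expected replacement for each case presentation of the typed word.
expected = {
    "regxo": replacement,               # lowercase: as stored
    "Regxo": replacement.capitalize(),  # title case: first letter raised
    "REGXO": replacement.upper(),       # all caps: U+011D becomes U+011C
}

for typed, result in expected.items():
    print(typed, "->", result)
```

A case conversion limited to the letters a-z would leave U+011D unchanged in the all-caps form.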
Comment 28 Rob Weir 2013-03-11 15:00:53 UTC
I'm adding this comment to all open issues with Issue Type == PATCH.  We have 220 such issues, many of them quite old.  I apologize for that.  

We need your help in prioritizing which patches should be integrated into our next release, Apache OpenOffice 4.0.

If you have submitted a patch and think it is applicable for AOO 4.0, please respond with a comment to let us know.

On the other hand, if the patch is no longer relevant, please let us know that as well.

If you have any general questions or want to discuss this further, please send a note to our dev mailing list:  dev@openoffice.apache.org

Thanks!

-Rob
Comment 29 Pedro 2014-12-04 14:31:42 UTC
This bug is fixed in AOO411m6(Build:9775)  -  Rev. 1617669

Tested with sentences mentioned in Comment 11

HAIL QUEEN CLEPATRA!
I will see you at the Civic Teater.
Have you met clepatra?

(Notice that to check the first sentence you need to enable "Check uppercase words" which is NOT enabled by default and requires AOO to be restarted in order to work - shouldn't it work without restarting?)
Comment 30 Kay 2014-12-22 22:28:42 UTC
Closing based on Comment #29. Please reopen if needed.