Apache OpenOffice (AOO) Bugzilla – Issue 32169
Spell checking slow when multiple languages installed
Last modified: 2013-08-07 14:41:36 UTC
A user who has dictionaries for 25 languages installed complains that when a misspelled word in OOo Writer has a red wavy underline and he right-clicks to correct it, the program takes 2-3 minutes to bring up the menu with the corrections. After about 3 misspelled words have been corrected, it runs much faster, almost as if it had finally loaded the dictionary into cache. This is probably because OOo builds hashtables on start for all the languages registered for spellcheck, and 25 languages would consume an immense amount of memory for the respective hashtables. Admittedly, 25 installed dictionaries is unusually many, but merely having multiple dictionaries installed should not bog the system down when only one or two are in use. Could this be changed so that hashtables are only created 'on demand', i.e. for the language(s) that are in use?
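The 'on demand' idea requested above could look roughly like the following. This is only a minimal sketch in Python, not OOo code: the class name, the loader callables and the tiny word lists are all hypothetical and stand in for whatever the real spellchecker does when it builds a hashtable.

```python
class LazyDictionaryRegistry:
    """Builds the word table for a language only when that language is first used."""

    def __init__(self, loaders):
        # loaders: language code -> zero-argument function returning the words
        # (hypothetical stand-in for reading a dictionary file from disk)
        self._loaders = dict(loaders)
        self._tables = {}      # language code -> set of words, filled on demand
        self.load_count = 0    # how many dictionaries were actually built

    def is_correct(self, word, lang):
        table = self._tables.get(lang)
        if table is None:
            # First use of this language: build its table now and keep it.
            table = set(self._loaders[lang]())
            self._tables[lang] = table
            self.load_count += 1
        return word in table

    def loaded_languages(self):
        return sorted(self._tables)


registry = LazyDictionaryRegistry({
    "en-US": lambda: ["color", "house"],
    "de-DE": lambda: ["Farbe", "Haus"],
    "it-IT": lambda: ["colore", "casa"],
})
```

With three dictionaries registered, nothing is built until `is_correct` is called, and only the language that was asked for ends up in memory.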
reassigned to SBA.
Add me to CC:. This is a real problem, yes.
I cannot really confirm this, but that could be due to a fast PC and lots of memory. I only have to wait 15 seconds for the spellchecking dialog. All available dictionaries are installed on this system.
Can the issue type be changed to "Defect"? In GNOME applications like Gaim, if I type something in, it takes a maximum of three seconds for a right-click to give me a list of spelling suggestions. If I do the same thing in OpenOffice, it freezes the application for more than a minute and I'm afraid of right-clicking on something again. Or I'm afraid that OO will crash during the freeze and I'll lose what I've written up to that point. Even after I turn off a lot of the dictionaries, the performance of that right-click still isn't great.
SBA->Ftack: Please verify your findings on OOo 1.1.4 and comment here, thank you. Please provide more data (see JW's comment: PC power counts, too):
- Operating system, CPU speed, RAM
- the number and size of currently opened documents
- where the dicts are stored (local hard drive, network...)
- the number and size of dictionaries
- what happens if you use fewer dicts (one, two, five, ten, twenty, all)
I know this is a lot if you didn't experience this yourself (you wrote "A user who has...").
SBA->Pjanik: If you have the problem too, please join and add YOUR data, best if you can compare OOo 1.1.4 with the latest build (680m74).
sba: unfortunately I can't add data for SRC680_* now, because I have yet to add our dictionaries to my build system, but you can see e.g. http://www.openoffice.org/issues/show_bug.cgi?id=41466 which was just filed. There is a piece of good info there.
*** Issue 41466 has been marked as a duplicate of this issue. ***
SBA: I talked to TL about this. When a misspelled word is not found in the applied language, the context menu tries to find a matching language (for the context menu entries "Word is [LANGUAGE]" and "Paragraph is ..."). Therefore dicts get loaded until a matching one is found. Worst case: none found => all loaded. This can be addressed by reducing the number of languages to search. How this should be done must be discussed by the I-team, and the spec changed accordingly: http://specs.openoffice.org/appwide/linguistic/Spellcheckdialog.sxw
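To make the worst case concrete, here is a small Python sketch of a search that loads dictionaries one after another until the word is found. It only illustrates the behavior described above and is not the actual Writer code: `guess_language`, `load_dictionary` and the `WORDS` data are made-up names for this example.

```python
# Hypothetical toy dictionaries standing in for the installed dicts.
WORDS = {
    "en-US": {"house", "color"},
    "de-DE": {"Haus", "Farbe"},
    "it-IT": {"casa", "colore"},
    "hu-HU": {"ház", "szín"},
}

load_log = []  # records every (expensive) dictionary load


def load_dictionary(lang):
    load_log.append(lang)
    return WORDS[lang]


def guess_language(word, candidate_langs, cache):
    """Return the first candidate language whose dictionary contains the word."""
    for lang in candidate_langs:
        if lang not in cache:
            cache[lang] = load_dictionary(lang)  # loaded only when reached
        if word in cache[lang]:
            return lang  # stop at the first match
    return None  # worst case: every candidate dictionary was loaded


cache = {}
all_langs = ["en-US", "de-DE", "it-IT", "hu-HU"]
found = guess_language("casa", all_langs, cache)
missing = guess_language("xyzzy", all_langs, cache)
```

A word that matches early only pulls in the dictionaries before the match, while a word found nowhere forces every candidate to be loaded, which is why shortening the candidate list helps.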
What about loading one dictionary for the context menu and looking up the word? If it does not match, unload that dictionary, then load the next... Of course, the dictionary that matches the language of the document and/or all dictionaries that are used in the document should remain in memory...
TL->kami: Implementing it that way will result in numerous dictionaries being loaded/unloaded **each** time the context menu is opened. That way the memory footprint will indeed be improved, but the time for opening the menu will increase badly. Currently the time is bad only for the first opening of the menu. If implemented as you suggested, it will be that bad for every opening of the menu. The only reasonable way seems to be to limit the number of dictionaries being searched but allow them to stay in memory for the next use. A fair set of languages to use for this might be only those languages that are already used in the document. A normal document is quite unlikely to use more than, let's say, 3-5 languages.
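The trade-off TL describes can be put into numbers with a rough Python sketch. It counts dictionary loads in the worst case (no searched language matches the word) and deliberately ignores the dictionaries kami wanted to keep resident; the function name and the figures of 10 menu openings and 4 document languages are assumptions for illustration only.

```python
def loads_needed(menu_openings, languages_searched, keep_loaded):
    """Worst-case number of dictionary loads when no searched language matches."""
    loaded = set()
    loads = 0
    for _ in range(menu_openings):
        for lang in languages_searched:
            if lang not in loaded:
                loads += 1
                loaded.add(lang)
        if not keep_loaded:
            loaded.clear()  # unload everything once the menu has been built
    return loads


all_installed = ["lang%d" % i for i in range(25)]  # 25 installed dictionaries
used_in_document = all_installed[:4]               # assumed: 4 languages in use

unload_each_time = loads_needed(10, all_installed, keep_loaded=False)
current_behavior = loads_needed(10, all_installed, keep_loaded=True)
limited_and_kept = loads_needed(10, used_in_document, keep_loaded=True)
```

Under these assumptions, unloading after every menu opening costs 250 loads, the current behavior costs 25 (all on the first opening), and a limited set that stays in memory costs 4.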
TL->FT: Please note that the context menu of Writer is a different implementation from the one in Calc/Draw/Impress (which use the EditEngine). I'm not sure whether it is a problem for those applications to have a list of all used languages as well.
I didn't look under the hood of spellchecking, but what if Writer loads the dictionary selected by the document's name, plus the dictionaries for all languages used in the document? It would use 1 to 5 dictionaries. This solution has a smaller footprint, and it loads an additional dictionary only when it is really needed...
Any brainstorming?
Is there any option in the settings to turn off the "context menu tries to find a matching language" behavior?
Please change the current behavior as described below. This is an interim solution until the new context menu, defined in the following spec, has finally been implemented: http://specs.openoffice.org/appwide/linguistic/Spellcheckdialog.sxw
New interim behavior: If the context menu for a misspelled word is called, only the following wordbooks are searched for this word, and the search ends immediately after the first match:
1. The operating system's language
2. All languages already used in the current document
3. English as the default wordbook
As discussed today. To keep it simple:
1. The OOo document default language from Tools-Options - Western (not the current document language)
2. OOo user interface language (from Tools-Options - User interface language)
3. OOo locale setting (from Tools-Options)
4. English (US) as the default wordbook
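The simplified four-step order above could be assembled like this. It is a hedged Python sketch, not the code that went into the fix: the function name, parameter names and the language codes used for the example are assumptions, and duplicates are dropped on the assumption that searching the same wordbook twice is pointless.

```python
def candidate_languages(doc_default, ui_language, locale_setting, fallback="en-US"):
    """Return the wordbooks to search, in priority order, without duplicates."""
    ordered = []
    for lang in (doc_default, ui_language, locale_setting, fallback):
        # Skip unset values and languages that are already in the list.
        if lang and lang not in ordered:
            ordered.append(lang)
    return ordered


# Example: German default document language, Italian UI, German locale.
search_order = candidate_languages("de-DE", "it-IT", "de-DE")
```

Note that languages used only inside the document are not in this list, so text formatted in such a language would get no "Word is ..." proposal unless it matches one of the four sources.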
.
Fixed in CWS contextmenu. Files changed: svx: - editview.cxx sw: - olmenu.cxx
. re-open issue and reassign to sba@openoffice.org
reassign to sba@openoffice.org
reset resolution to FIXED
Does the final behavior exclude this: 2. all languages already used in current document?
Not for the final (the time is too short), but as mentioned by FL this should be part of the spec to be implemented later.
SBA: Verified in CWS contextmenu. The performance gain (speed and RAM use) is well worth it. I found a tiny problem that needs to be discussed...
My scenario:
- A document with texts in many different languages. All characters are formatted as en_US.
- I have German set as default document language => I get proposals "Word is German"/"Paragraph is German" in the German text part. Good.
- I get no proposals "Word is" for Italian -> Good
NEXT...
- I switch UI to Italian -> I get proposals "Word is" for Italian -> Good
NEXT...
- I switch UI to German, and switch system locale (Linux SUSE 9.2) to Italian -> I get NO proposals "Word is" for Italian -> BAD
- I load a document that is partly formatted as Italian -> I get NO proposals "Word is" for Italian -> BAD
This must be sorted out later because the performance thingie is fixed. Set to verified.
SBA: Small correction: System locale does work (bringing up the proposal "Word is Italian" for e.g. "stramato"). BUT... having some text formatted as Italian does not work (no proposal "Word is Italian" for Italian words formatted as English in the same document).
The old and new behavior could be selectable from the options. It would also be nice if you added a language selector that is independent of spellchecking. If you click on a bad word, you can't select the language of that word and/or paragraph; you get only one chance (the guess) to change the language. We should add a submenu with a list of supported languages, where we can change the language of the word/paragraph.
SBA->Kami: You write [...] "...may be selectable from options" [...] "...it would be nice if..." [...] "...should add a submenu with..." [...] These quotes tell me that you intend to write three feature requests. Feel free to do so. Please note that the only "guarantee" for any implementation is to have a willing and able community developer who wants to build it in. Especially that "yet-another-option" thing was overdone in the past decade, leading to a jungle of "power user options" that overwhelm and "shock off" the average user. Therefore we try hard to reduce options, well aware of the fact that every user prefers a different office to meet his/her needs, which is impossible to achieve with any default. Thus "The Office" was, is and will always be a compromise that (hopefully) meets the needs of the majority of the users (...not only the few ones who "communicate most and loudest", sorry :-) The performance thingie is OK in 680m106. Closed.
I really don't know how useful my ideas are. I just write down my ideas, requests, etc. and check others' opinions. I think this is the kind of work you asked for in the readme file... You wanted to build a community for this great software. I want to help in any way I can. What's wrong with that? I do not want to be the loudest, but I want to help your work. Check my issues. Okay, not all of them are perfect or good, but I try to help you with ideas and bug reports. I am doing a lot for the Hungarian edition... That is all I can do after my workdays. About options: sometimes less is more, I know. But what about building two levels of options? One for normal users and one for advanced users. Both could be UI driven, or the first could use the UI and the second could use Mozilla-like about:config options... But I think this may be part of another thread...