Issue 74850

Summary: Extend word completion to multiple words
Product: Writer Reporter: mrosin <mattr>
Component: editing Assignee: AOO issues mailing list <issues>
Status: CONFIRMED --- QA Contact:
Severity: Trivial    
Priority: P3 CC: issues
Version: OOo 2.1   
Target Milestone: ---   
Hardware: All   
OS: All   
Issue Type: ENHANCEMENT Latest Confirmation in: ---
Developer Difficulty: ---
Attachments:
Description Flags
Sample use of a glossary none
Sample glossary perl script glossing.pl none

Description mrosin 2007-02-25 13:55:06 UTC
Automatic word completion is noted by many journalists as a key benefit of OO 
Writer and a difference from MS Word. I particularly use it to reduce strain 
on my hands.

Currently it only supports single words, I believe. However, individuals often 
use specific phrases that they like or that a client requires, e.g. to stay 
consistent with other documents. So a number of multi-word phrases are 
likely to appear many times in a large document (especially technical docs).

I would like to propose that this function be extended to include phrases as 
well. For example if it remembered "from client to server" then when I 
type "client t" it will see a potential multiword phrase, and suggest "client 
to server" as a completion.

To keep things sane, I believe this would require the document to be indexed in 
real time and a running count of phrase frequency to be maintained for the 
current session. A simple heuristic would note phrases that appear often 
and add them to the word completion file (perhaps this is autocorr).
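
A minimal sketch of the counting heuristic described above, in Python. It is
only an illustration, not how OOo stores its completion list: the function
names, the max_words limit, and the min_count threshold are all assumptions
made for the example.

```python
from collections import Counter

def count_phrases(text, max_words=4):
    """Count every run of 2..max_words consecutive words in the text."""
    words = text.split()
    counts = Counter()
    for n in range(2, max_words + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts

def suggest(counts, typed, min_count=2):
    """Return phrases seen at least min_count times that extend `typed`.

    Results are ordered by descending frequency, then alphabetically.
    """
    hits = [
        (phrase, count)
        for phrase, count in counts.items()
        if count >= min_count and phrase.startswith(typed) and phrase != typed
    ]
    hits.sort(key=lambda pc: (-pc[1], pc[0]))
    return [phrase for phrase, _ in hits]
```

With a text that contains "from client to server" twice, calling
suggest(counts, "client t") would offer "client to" and "client to server".
A real implementation would also need to handle punctuation and update the
counts incrementally as the user types.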

The suggestions would become even more useful if, for example, there are two 
commonly used long phrases that are similar up to a certain point, and the 
word completion function could offer a choice between them, in a way similar to 
Visual Studio style function suggestions. But this is really icing on the cake.

Since the autocomplete function is so evidently useful, and is I think a major 
advantage of OOo over its competitors, I believe it should be improved.

Therefore I have submitted this recommendation, and also have at the same time 
submitted a recommendation for the ability to save the autocor file on a 
remote ftp server account and import it on a per-session basis into other OOo 
installations. Thank you for your consideration and enthusiasm! Hooray OOo 
Team!
Comment 1 michael.ruess 2007-02-26 08:36:49 UTC
Reassigned to SBA.
Comment 2 kpalagin 2007-02-26 13:14:41 UTC
In the meantime you can use the AutoText feature, which requires some setup but 
is already available.
In fact it would make sense to combine the two in some clever way.
Comment 3 mrosin 2007-02-27 07:24:05 UTC
Created attachment 43404 [details]
Sample use of a glossary
Comment 4 mrosin 2007-02-27 07:24:30 UTC
Created attachment 43405 [details]
Sample glossary perl script glossing.pl
Comment 5 mrosin 2007-02-27 07:55:52 UTC
Hi, I just uploaded a Perl program I wrote for a glossary-like function that 
might help you in thinking about the proposal. 

The same file got uploaded twice. Funny, there is no link to delete one!

You can read the perldoc in the file to get the idea. When I translate 
documents from Japanese to English I often get word lists that can be very 
long. They are generally two-column Excel files, Japanese in the left column 
and English in the right. If you export that Excel file to a tab-separated 
file, my program will eat it up and globally replace every instance of each 
left-column term with its right-column counterpart.
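
To illustrate the idea for readers who do not open the attachment, here is a
rough Python sketch of the same kind of glossary replacement. It is not the
attached glossing.pl, whose contents are not reproduced in this issue; the
function names and the longest-term-first ordering are assumptions of the
sketch, and the two glossary entries in the usage note are made up.

```python
import csv
import io

def load_glossary(tsv_text):
    """Parse two-column tab-separated text into (source, target) pairs."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    # Skip rows that lack a second column or have an empty source term.
    return [(row[0], row[1]) for row in reader if len(row) >= 2 and row[0]]

def apply_glossary(text, pairs):
    """Replace every source term with its target, longest source terms first.

    Longest-first ordering keeps a short term from clobbering part of a
    longer term that contains it.
    """
    for source, target in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
        text = text.replace(source, target)
    return text
```

For example, loading the two lines "同期発電機<TAB>Synchronous Generator" and
"発電機<TAB>Generator" and applying them to a document replaces the longer
term before the shorter one. Because the replacements run one after another,
a target that happens to contain a later source term would be replaced again;
a more careful version would do a single pass.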

A semiautomated version of this is apparently a key component of a very 
popular professional application called Trados for technical translation. It 
is quite complicated and I guess expensive though.

Anyway, I have not used AutoText, but here is an additional request: be 
able to do what I do in this program, and use OO Calc to manage word lists 
(generally specific to either an industry or client), with bonus points for 
allowing a remote database to also be used. I suppose this is something I 
could build into a product, but why not make it a part of OOo? It would 
certainly be a good reason to install OOo in translation companies and might 
be useful for translating shorthand, or even in software localization. I doubt 
you'd use it, but if you want I'm willing to license this code under the GPL. 
I have a few test dictionaries and a sample file if you want them.

In short, I would like you to think about automatic word completion as a very 
important function that is a subset of automatic phrase completion, and that 
the database used by this function should be manageable, drawing from and 
saving to remote sources, and that a very related and useful feature might be 
enabled by adding another column to that database for automatic global phrase 
replacement i.e. glossing, if this is the word best used. In this model a new 
table of the database could be made for an industry, client or project, and 
could be shared within a company or through a public website or database. A 
way to diff and merge two completion files would be another plus. Thanks for 
your consideration.

In case you are interested, the examples I mentioned are Trados 
(http://www.trados.com/) and WordFast (http://www.wordfast.net/). Both are 
commercial and are integrated with MS Office.
Comment 6 pmike 2007-08-20 13:50:58 UTC
It's possible to cycle through the variants by pressing Ctrl+Tab.
Should this RFE be changed to a UI improvement?
For the best UX, OOo should display a list of possible variants (limited to about
8-10 entries, followed by "..." if more exist) and should highlight the selected
entry in bold.
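
A small Python sketch of the list behaviour suggested above, as plain text
only. The function name is hypothetical, and a "> " marker stands in for the
bold highlighting, which a text list cannot show.

```python
def render_candidates(candidates, selected=0, limit=8):
    """Return display lines for at most `limit` candidates.

    The selected entry is prefixed with "> " (standing in for bold), and a
    final "..." line is added when more candidates exist than are shown.
    """
    shown = candidates[:limit]
    lines = [("> " if i == selected else "  ") + c for i, c in enumerate(shown)]
    if len(candidates) > limit:
        lines.append("  ...")
    return lines
```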
Comment 7 mrosin 2007-09-26 10:30:15 UTC
Hello,

I'm posting an update since this issue has been in limbo for a month and I have
three specific usage scenarios. I suppose this is a complex feature. (Also,
thanks for the note about AutoText, but it is really unusable for anything beyond
once or twice per document. We need something that works inline and quickly.)

Currently I'm translating factory layout schematics. For example there is
something called a "Synchronous Generator". 
Currently I type Syn<RETURN><SPACE>Generato<RETURN>
(it catches on General and Generated).
1. Since I have typed the same sequence a number of times in this document, it
should know I want Generator, not General or Generated.
2. I would like it to just insert the whole two-word sequence "Synchronous
Generator" when I type Syn<RETURN>.

Now here is the second scenario. I am using a few similar phrases. In the
factory there are Lubrication Oil Unit and Lubrication Oil Sump Tank. There are
lots of similar examples of related phrases.
Currently I type Lub<RETURN><SPACE>Oil<SPACE>Unit
But I would like to type Lub and see the two candidate phrases, and if they are
both of similar frequency (should be) then they should be shown in alphabetical
order so they aren't constantly jumping up and down in the list as the frequency
changes slightly. Then half the time <RETURN> would suffice, and half the time
<DOWN ARROW><RETURN> would suffice.

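One possible way to get the stable ordering described in this second scenario,
sketched in Python: group the counts into coarse buckets and sort
alphabetically inside each bucket. The function name and the bucket_size value
are assumptions for illustration, and counts that sit right at a bucket
boundary can still swap places.

```python
def rank_phrases(counts, typed, bucket_size=5):
    """Rank phrases that start with `typed`.

    Phrases are ordered by coarse frequency bucket (highest first) and
    alphabetically within a bucket, so small changes in the counts do not
    reshuffle candidates of similar frequency.
    """
    prefix = typed.lower()
    matches = [p for p in counts if p.lower().startswith(prefix)]
    return sorted(matches, key=lambda p: (-(counts[p] // bucket_size), p))
```

With counts of 7 for "Lubrication Oil Unit" and 6 for "Lubrication Oil Sump
Tank", typing "Lub" lists "Lubrication Oil Sump Tank" first, and the order
stays the same if the two counts trade places.
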
Third scenario, perhaps mentioned earlier but it came up again here. I find
whenever I type the word "Unit" OOo wants to make it "Units" and I have to
escape by typing a <SPACE>. In the above second scenario, one of the candidates
includes the word "Unit" so it should not try to autocomplete to "Units".
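
A sketch of one rule that would cover this third scenario, in Python. The
rule itself (drop single-word completions when the typed text is already a
whole word of a known phrase) and both function names are assumptions made
for the example, not existing OOo behaviour.

```python
def is_whole_word_of_known_phrase(typed, known_phrases):
    """True if `typed` matches a complete word in any known phrase."""
    typed = typed.lower()
    return any(typed in phrase.lower().split() for phrase in known_phrases)

def filter_completions(typed, completions, known_phrases):
    """Suppress completions when `typed` is already a known whole word."""
    if is_whole_word_of_known_phrase(typed, known_phrases):
        return []
    return completions
```

Given the known phrase "Lubrication Oil Unit", typing "Unit" would no longer
be completed to "Units", while typing "Uni" still would.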

I'm not going to go too much further into this now, but this kind of facility could
in the future also be linked with a semiautomatic tool that uses the same
database to help you unify the usages of words in a big document, in other words
an editor's tool. For example this weekend I have been asked to unify a 200 page
corporate document put together by 10 people. I have to make sure all the words
used are unified. So in the above example I checked and actually due to working
on the same document over two days I had both "Oil Lubrication Unit" and
"Lubrication Oil Unit". Java-based natural language analysis engines are
currently very good at picking noun phrases out of text, and phrases that are
identical except for word order, or that have the same meaning but use
different words, could be flagged. Anyway, I would like the autocompletion
feature enhanced sooner but this is food for thought for the future.
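
The reordered-phrase case does not need a full language analysis engine. As a
minimal Python sketch, assuming the candidate phrases have already been
extracted, phrases can be grouped by their sorted set of words. The function
name is hypothetical, and this only catches changes in word order, not
synonyms.

```python
from collections import defaultdict

def find_reordered_variants(phrases):
    """Group phrases that contain the same words in a different order.

    Returns a list of groups (each sorted alphabetically); only groups with
    more than one distinct phrase are reported.
    """
    groups = defaultdict(set)
    for phrase in phrases:
        key = tuple(sorted(phrase.lower().split()))
        groups[key].add(phrase)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Run on "Oil Lubrication Unit", "Lubrication Oil Unit" and "Lubrication Oil
Sump Tank", it flags the first two as variants of each other.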

Matt Rosin telebody at gmail dotcom
Comment 8 stefan.baltzer 2007-09-26 10:52:17 UTC
SBA: Confirming issue.
Reassigned to requirements.