Issue 21808

Summary: ellipses not recognized correctly
Product: Infrastructure
Reporter: Unknown <non-migrated>
Component: Website general issues
Assignee: karl.hong
Status: CLOSED NOT_AN_OOO_ISSUE
QA Contact: issues@lingucomponent <issues>
Severity: Trivial
Priority: P3
CC: issues, karl.hong, thomas.lange
Version: current
Keywords: oooqa
Target Milestone: ---
Hardware: PC
OS: Windows XP
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---

Description Unknown 2003-10-28 10:20:08 UTC
While spell checking, ellipses joining two words are treated as one big word,
and so are flagged as incorrect. This is true for both the ellipsis character
and three periods in a row ("…" or "...").

Comment 1 khendricks 2003-10-29 20:36:50 UTC

According to my grammar guide the ellipsis should be between two spaces like the 
following to be correct "word1 ... word2".

If you do that and use either 3 dots or the ellipsis char all will work fine.  I will check 
other grammar books to see if I can find a definitive answer as to whether an ellipsis can 
be used to connect two distinct words without any spaces.

Comment 2 khendricks 2003-10-30 14:09:50 UTC

Adding as CC to this just in case this is a break iterator issue.

Comment 3 khendricks 2003-10-30 14:12:23 UTC
Hi Thomas,

Should the single-character ellipsis "…" (U+2026) be a break character in breakiterator?

I can't think of any way to handle this issue from the lingucomponent side.

If so, then I think this issue belongs with sw and not lingucomponent.

What do you think?


Comment 4 thomas.lange 2003-11-03 09:47:26 UTC
Well the problem at hand is we have three cases:
a) regular word with ellipsis
b1) abbreviation like e.g. with ellipses
b2) abbreviation like Dr. with ellipses

And I wonder if there is a language which has an abbreviation with a
dot inside but none at the end like
That would yield another case

As far as I know there are always three dots to be used, regardless of
whether the last word is an abbreviation and features a dot at the end on
its own.

Thus you can see in the cases a, b1, b2 that it will not be that easy
for the breakiterator to decide what part belongs to the ellipsis and
what to the possible(!) abbreviation.
And as can be seen in b1, at least a single dot must not break the word.

I do not see how this can properly be done without something like a database
of abbreviations to be queried. And since the breakiterator should be
a simple, fast thing to use I wonder if that is what one would like to do.
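
To illustrate why an abbreviation lookup would be needed, here is a minimal
sketch in Python. It is not the breakiterator code: the function name
split_trailing_dots and the small ABBREVIATIONS set are hypothetical, and a
real list would have to be maintained per language.

```python
# Hypothetical abbreviation list, for illustration only.
ABBREVIATIONS = {"e.g.", "i.e.", "Dr.", "etc."}


def split_trailing_dots(token):
    """Split a token into (word, remaining_dots).

    One trailing dot is kept with the word only if the word plus that dot
    is a known abbreviation; all other trailing dots are returned separately.
    """
    stripped = token.rstrip(".")
    dots = len(token) - len(stripped)
    if dots >= 1 and stripped + "." in ABBREVIATIONS:
        return stripped + ".", "." * (dots - 1)
    return stripped, "." * dots


if __name__ == "__main__":
    for sample in ("word...", "e.g....", "Dr...."):
        print(sample, "->", split_trailing_dots(sample))
```

Without the lookup there is no way to tell whether the first dot after "Dr"
belongs to the abbreviation or to the ellipsis, which is the ambiguity in
cases b1 and b2.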

However if it can be done I think it should be done by the breakiterator.

Thus I'm adding Karl as CC and pass this one on to him for some
comments / ideas.

->Karl: Please have a look.
Comment 5 khendricks 2003-11-03 12:36:25 UTC
I am not sure abbreviations get merged into the ellipsis.  I will double check that just
to make sure.
Either way, I do think at least the single ellipsis character U+2026 should be made a
break point just like most general punctuation is.
Comment 6 karl.hong 2003-11-04 00:24:31 UTC
We currently treat '.' as part of word in dictionary mode. It is difficult to 
detect how many dots and treat them differently. 

Since U+2026 is a single character, it does break the word, like other 
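
The behaviour described in this comment can be pictured with a rough Python
sketch. It only assumes that '.' counts as a word character and that every
character outside the word class, including U+2026, ends the word. The
pattern and the function name words are made up for this illustration and
are not taken from the breakiterator.

```python
import re

# Assumption: ASCII letters, digits, apostrophes and '.' are word characters.
# Anything else, including U+2026, acts as a word break.
WORD_RE = re.compile(r"[A-Za-z0-9'.]+")


def words(text):
    """Return the tokens that would be handed to the spell checker."""
    tokens = WORD_RE.findall(text)
    # Drop tokens made only of punctuation, such as a free-standing "...".
    return [t for t in tokens if any(c.isalnum() for c in t)]


if __name__ == "__main__":
    print(words("word1...word2"))        # three periods, no spaces
    print(words("word1\u2026word2"))     # single ellipsis character
    print(words("word1 ... word2"))      # three periods between spaces
```

Under these assumptions, three periods without spaces glue the two words
into one token, while the single ellipsis character or surrounding spaces
separate them.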
Comment 7 khendricks 2003-11-04 01:10:46 UTC
After what Karl said, I tried connecting two words with the single ellipsis character
and sure enough, it worked just fine.
Your original bug report said this did not work.
Please try using "Insert Special Character" and insert an ellipsis (U+2026).
This works just fine.
Perhaps we should close this as WorksForMe?
Comment 8 karl.hong 2003-11-12 23:29:58 UTC
Closing as invalid, since it works as expected with the ellipsis character.
Comment 9 ace_dent 2008-05-17 21:04:08 UTC
The Issue you raised has been marked as 'Resolved' and not updated within the
last 1 year+. I am therefore setting this issue to 'Verified' as the first step
towards Closing it. If you feel this is incorrect, please re-open the issue and
add any comments.

Many thanks,
Cleaning-up and Closing old Issues
~ The Grand Bug Squash, pre v3 ~
Comment 10 ace_dent 2008-05-17 23:06:11 UTC
As per previous posting: Verified -> Closed.
A Closed Issue is a Happy Issue (TM).

Comment 11 ace_dent 2008-05-19 11:27:08 UTC
Related problems with spell check and Ellipsis:
Issue 4297 - Spell Check Includes Punctuation (such as ellipses) as Part of Words
Issue 21808 - Ellipses not recognized correctly
Issue 29420 - Ellipsis are not recognized as a punctuation mark in spellcheck
Issue 60810 - Spellcheck dialog has problem with word followed by an ellipsis