Issue 13494 - Character processing has changed dramatically from 1.0.x
Alias: None
Product: Writer
Classification: Application
Component: code
Version: OOo 1.1 Beta
Hardware: PC All
Importance: P4 Trivial
Target Milestone: ---
Assignee: stefan.baltzer
QA Contact: issues@sw
Keywords: oooqa
Depends on:
Reported: 2003-04-16 07:54 UTC by nirendram
Modified: 2013-08-07 14:41 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Description nirendram 2003-04-16 07:54:51 UTC
My Original Problem
Ctrl-LeftArrow in 1.1beta appears to treat each underscore as a word, and skips
one underscore each time.

The Revised Problem
Character processing appears to have changed quite a bit in 1.1beta as compared
to OOo 1.0.x.

The Systems Tested
OOo 1.0.2 and 1.1beta on Win98SE.

The Test
Test data: the line  xxAAxxBBxxCCxx
with the 'x' characters replaced by the characters to be tested. I performed a
Ctrl-Left and Ctrl-Right from end to end on the above line for each and every
non-alphanumeric character on my keyboard.
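
As a minimal sketch of how the test lines described above could be generated, assuming the template is exactly xxAAxxBBxxCCxx and that each 'x' is replaced by the same character (the helper name make_test_line and the three sample characters are illustrative only, not part of the original report):

```python
def make_test_line(ch: str) -> str:
    """Build the xxAAxxBBxxCCxx test line with every 'x' replaced by ch."""
    template = "xxAAxxBBxxCCxx"
    return template.replace("x", ch)


# Hypothetical sample only; the full list of tested characters is not shown here.
for ch in ["_", "@", "$"]:
    print(make_test_line(ch))
```

Each generated line can then be pasted into a Writer document and traversed end to end with Ctrl-Left and Ctrl-Right.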

The characters I tested were:

The Results
I divided the characters into 4 categories:
1. Characters which, when entered successively,
   are treated as one word when traversing left
   and right with Ctrl-Left / Right.

2. Characters which, when entered successively,
   are treated as one word when traversing right
   but as separate words (each character = one word)
   when traversing left.

3. Characters which are each treated as separate
   words regardless of the direction of traversal.

4. Characters which are treated as part of a word.
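
The four categories above can be expressed as a small classifier. This is only a sketch under stated assumptions: the cursor stops are recorded by hand as character offsets (0 to 14) in the 14-character test line, Ctrl-Right is assumed to stop at the start of the next word, and the function name classify is hypothetical.

```python
# Offsets strictly inside a two-character run of the tested character.
INSIDE_RUN = {1, 5, 9, 13}
# Offsets where a run of the tested character meets a run of letters.
RUN_BOUNDARIES = {2, 4, 6, 8, 10, 12}


def classify(right_stops, left_stops):
    """Return the category (1-4) for one tested character, or None.

    right_stops / left_stops are the cursor offsets observed while
    traversing the test line with Ctrl-Right and Ctrl-Left.
    """
    right, left = set(right_stops), set(left_stops)
    if not (RUN_BOUNDARIES & (right | left)):
        return 4  # no stop between runs: character is part of the word
    split_right = bool(INSIDE_RUN & right)
    split_left = bool(INSIDE_RUN & left)
    if not split_right and not split_left:
        return 1  # runs are one word in both directions
    if not split_right and split_left:
        return 2  # one word going right, one word per character going left
    if split_right and split_left:
        return 3  # one word per character in both directions
    return None  # split only when going right: not a category listed above


# Example: stops as they would look for a category 2 character.
print(classify([2, 4, 6, 8, 10, 12, 14],
               [13, 12, 10, 9, 8, 6, 5, 4, 2, 1, 0]))
```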

In OOo 1.0.2, all characters except the '@' sign fell into category 1. The '@'
sign fell into category 4.

In OOo 1.1beta, the situation was as follows:
. Category 1 - ; ' , .
. Category 2 - ! @ # % & * ( ) _ -
               { } [ ] " / \ ?
. Category 3 - ~ $ + ^ = < > |
. Category 4 - None

It appears as if there is a vast difference in the way these characters are
treated in these two versions of OOo.

My results have been confirmed by at least one person (on 644m7 / WinXP). Please
see the thread 'Character processing changes in 1.1beta' in the users list.

Surely these changes cannot be by design, as they make things quite inconvenient.
Comment 1 prgmgr 2003-07-10 20:35:43 UTC
Thank you for using and supporting OOo.

Verified in 1.1 Beta 2.

PM->HI:  Defect or enhancement?  Is this related to the changes
         in the break iterator?
Comment 2 h.ilter 2003-07-14 13:46:04 UTC
HI->FME: I've prepared a doc for this. Please load ../fme/13494.sxw
Comment 3 frank.meies 2003-07-14 13:53:44 UTC
FME->KHONG: Looks like a breakiterator issue.
Comment 4 karl.hong 2003-07-14 23:09:07 UTC
We made a big change from OOo 1.0 to OOo 1.1. In OOo 1.0.*, we wrote a
simple word breakiterator ourselves. In OOo 1.1, we switched to ICU
breakiterators for word/line/sentence breaks. We cannot say ICU is
perfect, but it does take a lot of things into account.

We also made some changes to the ICU breakiterator to meet our needs. I
will investigate which parts of the test results in this bug come from
ICU and which come from our patch. Some obvious problems, like those in
category 2, need to be fixed.

Comment 5 karl.hong 2003-08-08 23:56:46 UTC
The problems for categories 2 and 3 are fixed.

All punctuation marks and signs entered successively are treated as a
single word, as described in category 1.

Only a single apostrophe and a period are treated as part of a word, as
described in category 4.
Comment 6 karl.hong 2003-08-09 00:14:20 UTC
As another bug points out, the period (full stop) should not be part of
a word. I have removed it. Now only the apostrophe is treated as part of
a word.
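
The resulting behaviour can be approximated with a rough sketch. It is not the ICU rule set or the actual patch: it is an ASCII-only regular expression that assumes letters, digits and the apostrophe form words, that any other run of non-space characters forms a single word, and that the cursor stops at the start of each word. The names TOKEN and word_starts are hypothetical.

```python
import re

# Word = run of letters/digits/apostrophes, or run of other non-space characters.
# Simplification: any apostrophe counts as a word character here.
TOKEN = re.compile(r"[A-Za-z0-9']+|[^A-Za-z0-9'\s]+")


def word_starts(text: str) -> list:
    """Return the offsets where a word-wise cursor would stop."""
    return [m.start() for m in TOKEN.finditer(text)]


print(word_starts("__AA__BB__CC__"))    # each run of underscores is one word
print(word_starts("don't stop...now"))  # apostrophe stays inside "don't"
```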
Comment 7 karl.hong 2003-09-09 23:51:13 UTC
Verified in CWS i18n08.
Comment 8 oc 2003-09-22 15:45:34 UTC
Adjusting owner
Comment 9 oc 2003-09-22 15:45:56 UTC
adjusting resolution
Comment 10 stefan.baltzer 2003-11-06 11:49:24 UTC
SBA: Changed OS to "All". Verified in CWS i18n08.
Comment 11 jack.warchold 2004-08-06 15:03:09 UTC
seen good in 680_m49-4