Issue 17624 - Arabic Tashkeel (a.k.a. Diacritic) bug: ZWSP characters are given spaces in some fonts!
Status: CLOSED IRREPRODUCIBLE
Alias: None
Product: gsl
Classification: Code
Component: code
Version: OOo 1.1 RC
Hardware: PC Linux, all
Importance: P3 Trivial with 9 votes
Target Milestone: OOo 3.1
Assignee: hdu@apache.org
QA Contact: issues@gsl
URL:
Keywords:
Depends on:
Blocks: 79434 21821
Reported: 2003-07-30 11:44 UTC by kefah
Modified: 2008-12-06 20:21 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Contains screenshots and an sxw file showing the bug. (93.89 KB, application/zip)
2003-07-30 11:49 UTC, kefah

Description kefah 2003-07-30 11:44:39 UTC
Hello,

I'm using the OpenOffice RC1 release and found some issues that I want to
report to you.

Tashkeel must be Zero Width Spacing (ZWSP) characters according to Unicode.
OO RC1 does not honor that and adds a space, breaking the Arabic word (Arabic
letters with tashkeel should remain continuous). See the attached
screenshot and the annotated screenshot.

Steps to reproduce:

a. Enable CTL.
b. Open the attached sxw Arabic text file. (You can download Arabic fonts
from ArabEyes.org; those fonts work fine with other software.)
c. You will notice, as shown in the annotated screenshot, that there is a
space caused by the tashkeel. [In Arabic, many letters are
connected together unless there is a space.]

Thanks to all the developers who made OpenOffice possible.

Best,
Kefah and Hamzeh from freesoft.jo
Comment 1 kefah 2003-07-30 11:49:33 UTC
Created attachment 8138
Contains screenshots and an sxw file showing the bug.
Comment 2 christof.pintaske 2003-07-30 11:52:36 UTC
cp->hdu: looks like something for you
Comment 3 kefah 2003-07-30 12:28:17 UTC
elaboration:

Tashkeel characters (diacritics: combining and format marks) are
"transparent characters" according to the Unicode 3.0 reference, chapter
8.2, Cursive Joining, rule R1: "Transparent characters do not affect the
joining behavior of base (spacing) characters." The chapter also gives an
example, and Table 8-2 in the same chapter defines tashkeel as
Transparent (class T).

The fix is to give those characters zero width and lay
them on the previous letter.

- Kefah.
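
A minimal sketch of the rule described above, in Python using only the standard `unicodedata` module. It assumes a toy layout model in which every base letter has the same advance width; `advance_widths`, `is_nonspacing_mark` and `BASE_ADVANCE` are hypothetical names for illustration and are not part of OpenOffice.org or any layout engine. In Unicode terms, tashkeel have general category `Mn` (non-spacing mark), so the sketch gives them an advance of zero and leaves them on the preceding letter.

```python
import unicodedata

BASE_ADVANCE = 10  # hypothetical advance width used for every base letter


def is_nonspacing_mark(ch):
    """True for non-spacing combining marks such as Arabic tashkeel."""
    return unicodedata.category(ch) == "Mn"


def advance_widths(text):
    """Give marks a zero advance so they sit on the preceding base letter."""
    return [0 if is_nonspacing_mark(ch) else BASE_ADVANCE for ch in text]


# Meem + shadda + kasra: one base letter carrying two tashkeel marks.
sample = "\u0645\u0651\u0650"
widths = advance_widths(sample)
print(widths)  # [10, 0, 0]
```

A real layout engine would also position each mark relative to its base glyph using the font's anchor data; the sketch only shows the "no extra horizontal space" part.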
Comment 4 hdu@apache.org 2003-07-30 14:00:19 UTC
Confirmed and setting target to 11pp1 
 
For newer builds the problem is gone on win32 platforms, but on unx platforms it is still
there. The balancing of what to feed the external layout engines and how to merge
the results back still needs some work.

I also noted a regression when multicolored words are used in CTL text.
Comment 5 hdu@apache.org 2003-08-28 17:00:45 UTC
In CWS vcl7pp1r2 there are now many fixes for this problem. When the  
view in the "Online Layout" view mode looks good, everything is fine.  
  
Unfortunately this is not always the case, since there still seem to be some
problems when the text + OpenType font fed to ICU does not result in properly
laid-out glyph vectors. It seems in some cases a Latin layout is performed
instead of the Arabic one even though an ArabicLayoutEngine is used.
Example: U+0639 U+0635 U+0628 U+064A U+0629 in KacstLetter 1.4.  
Still analyzing... 
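
To make the example sequence above easier to read, the following sketch decodes the `U+XXXX` notation and looks up the Unicode character names with Python's standard `unicodedata` module. `parse_codepoints` is a hypothetical helper written for this illustration only.

```python
import unicodedata


def parse_codepoints(notation):
    """Turn 'U+0639 U+0635' style notation into the corresponding string."""
    return "".join(chr(int(token[2:], 16)) for token in notation.split())


word = parse_codepoints("U+0639 U+0635 U+0628 U+064A U+0629")
names = [unicodedata.name(ch) for ch in word]
for ch, name in zip(word, names):
    print(f"U+{ord(ch):04X} {name}")
```

All five code points are base letters; the sequence contains no combining marks.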
Comment 6 hdu@apache.org 2003-10-16 14:31:40 UTC
Retargeting.
Comment 7 thorsten.ziehm 2004-03-12 13:27:08 UTC
Because of limited resources for OOo1.1.2 we decided to shift this task to a later
release.
Comment 8 bayazidi 2004-04-20 13:30:18 UTC
It seems that this issue will keep jumping to later releases. What can be done
to have it included soon? Can an OOo developer point to the cause of this issue,
so that maybe a patch for it will come soon?
Comment 9 munzirtaha 2004-08-11 15:43:52 UTC
I tested with 1.1.3 and noticed that with some fonts this bug disappears. For
example, with the Lucida family fonts "LucidaBright", "Lucidasans", and
"Lucidatypewriter", and also with Tahoma, everything is OK!
Comment 10 arnonoss 2004-12-28 19:20:01 UTC
Looks like this issue is a duplicate of
http://qa.openoffice.org/issues/show_bug.cgi?id=14069 (the Hebrew diacritics
taking their own letter space instead of overlapping the previous letter).
Comment 11 hurkanos 2005-02-12 19:16:27 UTC
Problems:
1. At the end of a line, vocalization is not shown.
2. Vocalization is not displayed after changing the color of a letter.
3. Before a period, comma, etc., vocalization is not shown.
4. Problems 1 and 2 occur in previous versions as well.
5. The error persists with any alignment chosen.
6. I encountered the same problem with end-of-line vocalization in Hebrew.

Kind regards
Shimshon
Comment 12 Joost Andrae 2008-07-09 11:36:02 UTC
retarget to 3.1
Comment 13 hdu@apache.org 2008-12-02 16:01:38 UTC
Works in OOo3.0 (tested in CWS kashidafix).

The only remaining issue when loading the bugdoc is that changing an attribute (like font color, text
background, font size) of partial words is currently not supported, especially in CTL scripts. E.g. the
bugdoc contains the word
U+062D U+0645 U+0651 U+0650 U+0634
The U+062D U+0645 (Hah + Meem) should be combined into one glyph, but after the U+062D there is an
attribute change of the background color and the text color, and also after the U+0645 the text color
changes. So what to do when the text color changes inside one glyph?

@fme: I suggest that the WriterEngine/EditEngine should prevent attribute changes inside CTL words, 
because they don't make much sense. Should I write a followup issue for this enhancement?

Other than that, the most appropriate status of this issue is "not reproducible anymore".
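
The attribute problem described in this comment can be illustrated with a small Python sketch. It assumes attribute changes are given as character offsets into the word, and it only detects changes that separate a base letter from its combining marks; whether two base letters are drawn as one ligature glyph depends on the font and cannot be derived from `unicodedata`. `boundaries_splitting_marks` is a hypothetical helper, not an OpenOffice.org API.

```python
import unicodedata


def boundaries_splitting_marks(text, boundaries):
    """Return the offsets where an attribute change cuts a mark off its base."""
    return [
        b for b in boundaries
        if 0 < b < len(text) and unicodedata.category(text[b]) == "Mn"
    ]


# The word from the bugdoc: U+062D U+0645 U+0651 U+0650 U+0634.
word = "\u062D\u0645\u0651\u0650\u0634"
# Attribute changes after U+062D (offset 1) and after U+0645 (offset 2).
split = boundaries_splitting_marks(word, [1, 2])
print(split)  # [2]
```

Offset 2 is reported because the character that follows it (U+0651) is a combining mark belonging to the preceding U+0645. A UI-level check along these lines could warn about such changes, although a loaded document may still contain them.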
Comment 14 frank.meies 2008-12-03 06:57:20 UTC
[...] @fme: I suggest that the WriterEngine/EditEngine should prevent attribute
changes inside CTL words, because they don't make much sense. Should I write a
followup issue for this enhancement? [...]

Well, you can make some efforts to handle this case in the UI, but you certainly
cannot suppress attribute changes on the file format level.
Comment 15 Mechtilde 2008-12-06 20:21:05 UTC
worksforme -> closed