Issue 28203

Summary: Make kashida justification depend on script type, not language attribute
Product: Internationalization
Reporter: bayazidi <isam>
Component: BiDi
Assignee: stefan.baltzer
Status: CLOSED FIXED
QA Contact: issues@l10n <issues>
Severity: Trivial
Priority: P3
CC: farzaneh, frank.meies, hennerd, issues, kefah, munzirtaha, pavel, stefan.baltzer
Version: OOo 1.1.1
Keywords: oooqa
Target Milestone: ---
Hardware: All
OS: All
Issue Type: ENHANCEMENT
Latest Confirmation in: ---
Developer Difficulty: ---
Issue Depends on:
Issue Blocks: 79434
Attachments:
Description | Flags
sample file | none
Justification problem in arabic words. | none
PDF output of justifying bug in arabic words. | none
My langauge configuration | none
Fixing justify problems by changing "fa" to "ar"(farsi to arabic) | none
Fixing justify problems by changing "fa" to "ar"(farsi to arabic) | none
Right justifying, done via TeX(actually via unicode version of that, Omega) | none
Persian specific characters do not join correctly to others | none
pdf file created from jam.odt directly with OpenOffice, that has additional problems | none
Postscript file created from jam.ps that has fewer problems from jam.pdf | none
No kashida is added and some space is created between characters | none

Description bayazidi 2004-04-22 08:43:33 UTC
In Arabic, the Kashida (Arabic Tatweel, U+0640) is used to extend
an Arabic word in order to justify text.
Currently, the Kashida is added in the wrong places: it is added in
the spaces between words, while it should appear inside the words, between
letters that can link to it.
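To make the expected behaviour concrete, here is a minimal Python sketch of where a tatweel may legitimately go: only between two letters that are joined inside a word, never next to a space. The function names are hypothetical, and the joining data is a simplified assumption that covers only the basic Arabic letters U+0620 to U+064A (no diacritics, no Persian-specific letters, no font-level rules); a real layout engine relies on the full Unicode joining data and on the font.

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL (kashida)
HAMZA = "\u0621"    # joins to neither neighbour

# Letters that join only to the preceding letter, so nothing can be
# stretched after them (assumed, simplified subset: alef forms, waw,
# teh marbuta, dal, thal, reh, zain).
RIGHT_JOINING = frozenset(
    "\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648"
)


def is_basic_letter(ch):
    """True for a basic Arabic letter, excluding the tatweel itself."""
    return "\u0620" <= ch <= "\u064A" and ch != TATWEEL


def joins_forward(ch):
    """True if the letter connects to the letter that follows it."""
    return is_basic_letter(ch) and ch != HAMZA and ch not in RIGHT_JOINING


def joins_backward(ch):
    """True if the letter connects to the letter that precedes it."""
    return is_basic_letter(ch) and ch != HAMZA


def kashida_positions(word):
    """Indices i where a tatweel may go between word[i] and word[i + 1]."""
    return [
        i
        for i in range(len(word) - 1)
        if joins_forward(word[i]) and joins_backward(word[i + 1])
    ]


def insert_kashida(word, index, count=1):
    """Return the word with `count` tatweels inserted after word[index]."""
    return word[: index + 1] + TATWEEL * count + word[index + 1 :]
```

For the word kaf, teh, alef, beh ("\u0643\u062A\u0627\u0628"), the sketch allows a kashida after the kaf and after the teh, but not after the alef, and it never reports a position adjacent to a space.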
Comment 1 sforbes 2004-06-24 14:23:47 UTC
Confirmed with OOo 1.1.2 tech preview 4 on OS X.
Comment 2 sforbes 2004-06-24 14:26:05 UTC
Created attachment 16117 [details]
sample file
Comment 3 Dieter.Loeschky 2004-07-05 11:06:42 UTC
DL->US: Could you please handle this?
Comment 4 ulf.stroehler 2004-07-05 11:22:44 UTC
It seems we've lost the ability to lay out a justified paragraph in OO.o 1.1.2.
It seems to work flawlessly in 680_m45.
Comment 5 frank.meies 2004-07-05 11:31:12 UTC
FME->US: The language character attribute is set to 'Hebrew'. If you set it to
'Arabic' everything should work fine.
Comment 6 ulf.stroehler 2004-07-05 11:39:01 UTC
US->FME: unfortunately not.
Comment 7 frank.meies 2004-07-05 13:49:19 UTC
FME->US: For win32, setting the correct language resolves the problem. 

FME->HDU: There seem to be some more Linux-specific problems remaining. Could
you please have a look?
Comment 8 hdu@apache.org 2004-08-06 15:57:32 UTC
Yes, when the text is set to a language that does not support kashida
justification, Writer uses a non-kashida justification algorithm. Since
the layout engine knows that Arabic is involved, it puts the kashidas at the
incorrectly calculated places where Writer leaves room for them.

The best fix would be to choose the justification method based on the
Unicode code points involved. This means always doing kashida justification
when Arabic code points are involved.
Comment 9 ulf.stroehler 2004-08-06 19:13:29 UTC
I apologize!
The inability to lay out justified text is *not* reproducible.
And the kashidas appear after setting the document locale to Arabic.

I concur with HDU's suggestion to automatically detect the need for kashida
justification. Leaving it up to the user to correctly set the document locale is too
error-prone (especially in multilingual documents).
Comment 10 hdu@apache.org 2004-08-09 13:49:11 UTC
HDU->FME: See the request above to enable kashida justification depending on the
Unicode code points used, not on the language. Alternatives are to select the language
based on the code points or on the input method language. For the question of whether
to use kashida or not, the code point method would be sufficient though.

So I'm changing the issue type to enhancement.

HDU->US: please set an appropriate target and reassign to FME.
Comment 11 ulf.stroehler 2004-08-09 16:18:56 UTC
Transferring to FME.
Comment 12 frank.meies 2004-08-10 06:35:12 UTC
Changed issue type to 'enhancement'
Comment 13 frank.meies 2005-08-09 13:51:15 UTC
Removed keyword 'regression'
Comment 14 hossein.ir 2005-11-20 07:38:51 UTC
I am having this problem, even when setting the language to Arabic in the OpenOffice
settings. Should I set the system-wide locale to Arabic?
I am having this problem in all versions from 1.1 to 2.0. This is really annoying.
Comment 15 hossein.ir 2005-11-21 08:48:01 UTC
Now I am using OpenOffice.org version 1.1.3 on Debian stable(sarge). Here's more
information about the version:
$ apt-cache show openoffice.org-bin
Package: openoffice.org-bin
Priority: optional
Section: editors
Installed-Size: 126976
Maintainer: Debian OpenOffice Team <debian-openoffice@lists.debian.org>
Architecture: i386
Source: openoffice.org
Version: 1.1.3-9
Replaces: openoffice.org1.1-bin, openoffice.org-gnome
Provides: openoffice.org1.1-bin
Depends: libart-2.0-2 (>= 2.3.16), libaudio2, libc6 (>= 2.3.2.ds1-4), libcurl3
(>= 7.13.1-1), libdb4.2++, libfreetype6 (>= 2.1.5-1), libgcc1 (>= 1:3.4.1-3),
libice6 | xlibs (>> 4.1.0), libmyspell3, libneon23 (>= 0.23.9.dfsg.3), libsm6 |
xlibs (>> 4.1.0), libstartup-notification0 (>= 0.8-1), libstdc++5 (>=
1:3.3.4-1), libstlport4.6, libx11-6 | xlibs (>> 4.1.0), libxaw7 (>> 4.1.0),
libxext6 | xlibs (>> 4.1.0), libxt6 | xlibs (>> 4.1.0), zlib1g (>= 1:1.2.1),
libfontconfig1, debconf (>= 1.2.0) | debconf-2.0, openoffice.org (>> 1.1.2+1.1.3)
Conflicts: openoffice.org1.1-bin
Filename: pool/main/o/openoffice.org/openoffice.org-bin_1.1.3-9_i386.deb


I am having the same problem on my Windows XP Home Edition, using OpenOffice.org
version 1.9.122, and on SuSE 10 Professional. I don't
know the actual version there, but as far as I know, it was released on 2005-08-31 and
it is a pre-release: OpenOffice.org 2.0-pre.
Whenever a new version comes out, I check, and the problem is still
there. Unfortunately I couldn't test the OpenOffice.org 2.0 release, but I think
it is no different.
I will attach a .sxw document and the PDF output that shows this problem.
Another problem is visible in this file: a line is drawn
in place of the ZWNJ (zero width non-joiner) character. I think this problem is fixed
in newer versions. (The line is visible no matter what your settings are: if you
choose not to display ZWNJ in the settings, it is still shown, and it also appears
in the PDF output.)
Comment 16 hossein.ir 2005-11-21 08:50:06 UTC
Created attachment 31654 [details]
Justification problem in arabic words.
Comment 17 hossein.ir 2005-11-21 08:52:09 UTC
Created attachment 31655 [details]
PDF output of justifying bug in arabic words.
Comment 18 frank.meies 2005-11-21 09:00:12 UTC
FME->h15n: I had a look at your document. The language attribute for the CTL
font was set to "Farsi". Changing the language attribute (Tools -> Options ->
Language Settings -> Languages -> Enable CTL support and Format -> Character ->
CTL Font -> Language) to "Arabic" should resolve the problem. Since this is not
very user friendly, this issue is about detecting Arabic language automatically
by analysing the Unicode string.
Comment 19 hossein.ir 2005-11-25 10:32:22 UTC
No, this is not my problem. Could you please take a look at my configuration? I
am aware of what you're talking about, and I know why this issue is marked as an
"enhancement", but I think this is a different problem.
My language is "Persian", which uses the Arabic script but has some additional
characters, and I don't know whether that causes the problem or not. For now, we don't
have Persian-specific locale settings in OpenOffice (2 months after my request
to join the Persian team, I was added as an observer, and I found NOTHING
Persian-specific on the project page!), so I was forced to use Arabic
settings, hoping to have no problems.
These are my settings:

Language of:
Locale setting:  Arabic(Egypt)
Default currency: EGP Arabic(Egypt)

Default Language for Documents:
Western: English(USA)
Asian: Disabled
CTL: Arabic(Egypt)
For the current document only: not selected (this means settings are global)

I make an attachment of the config page.
Comment 20 hossein.ir 2005-11-25 10:34:02 UTC
Created attachment 31793 [details]
My langauge configuration
Comment 21 hossein.ir 2005-11-25 10:36:52 UTC
I should mention that "Farsi" is another name for the language "Persian", and
both names are used in documents everywhere. It uses the Arabic script.
Comment 22 frank.meies 2005-11-25 10:55:56 UTC
FME->h15n: Currently we do the Kashida justification only if the language
attribute is set to Arabic. The correct solution would be to use Kashida
justification if the used script is Arabic, not depending on the language
attribute. That's what this issue is about. In the meantime we could add Farsi
to the languages for which Kashida justification is applied, provided that Farsi
is always used with an Arabic script. Would that help?
Comment 23 hossein.ir 2005-11-25 18:56:53 UTC
I hope it will solve the problem, if you think there is no such problem when
using Arabic ("ar"). If style:language-complex="fa" causes the problem, your change will
fix that. Could you please check this with my previously attached file? Thanks.
Comment 24 hossein.ir 2005-11-27 14:41:16 UTC
I've unzipped the .sxw file and changed style:language-complex="fa" to
style:language-complex="ar", and this fixed the problem. But I am not happy with
the justified text (I know the code should be changed as you mentioned, but I
wanted to experiment to see if this is the problem), because it is not truly
justified. The lines look as if they are left-aligned. Could you please look at the
result? And also look at the result of typesetting the same thing with
LaTeX (actually Omega, the Unicode version) and compare the results to understand
what I am talking about? Thanks.
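For anyone who wants to repeat the experiment described at the start of this comment without editing the package by hand, here is a minimal Python sketch. It assumes the attribute lives in content.xml and styles.xml and that those files are UTF-8; the function name `retag_complex_language` is hypothetical. It is a workaround for testing only, not the fix this issue asks for.

```python
import zipfile

# Package members assumed to carry the language attribute.
XML_MEMBERS = ("content.xml", "styles.xml")


def retag_complex_language(src, dst, old="fa", new="ar"):
    """Copy the document package `src` to `dst` (paths or binary file
    objects), replacing style:language-complex="old" with "new".
    Returns the number of attributes that were changed."""
    old_attr = f'style:language-complex="{old}"'.encode("utf-8")
    new_attr = f'style:language-complex="{new}"'.encode("utf-8")
    changed = 0
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        # Keep the original member order and per-member compression,
        # so the "mimetype" entry stays first in the package.
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename in XML_MEMBERS:
                changed += data.count(old_attr)
                data = data.replace(old_attr, new_attr)
            zout.writestr(info, data)
    return changed
```

A call such as `retag_complex_language("kashida.sxw", "kashida-ar.sxw")` writes a modified copy and leaves the original file untouched; the output file name here is only an example.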
Comment 25 hossein.ir 2005-11-27 14:50:47 UTC
Created attachment 31833 [details]
Fixing justify problems by changing "fa" to "ar"(farsi to arabic)
Comment 26 hossein.ir 2005-11-27 14:54:08 UTC
Created attachment 31834 [details]
Fixing justify problems by changing "fa" to "ar"(farsi to arabic)
Comment 27 hossein.ir 2005-11-27 14:56:14 UTC
Created attachment 31835 [details]
Right justifying, done via TeX(actually via unicode version of that, Omega)
Comment 28 frank.meies 2005-11-28 08:32:47 UTC
FME->h15n: You should also set the style:country-complex attribute to a valid
value, e.g., EG for Egypt. 
Comment 29 aliganjei 2006-07-11 12:18:50 UTC
Is there any way to disable Kashida and use white-space justification for Arabic
script?
Comment 30 frank.meies 2006-07-11 12:42:02 UTC
fme->aliganjei: You can set the language attribute to any non-Arabic language.
This will disable kashida justification in the text formatting engine. However,
the underlying character layout engine will most likely insert kashidas again
automatically, in front of the blanks. So my answer is "no".
Comment 31 hossein.ir 2006-10-04 08:59:47 UTC
I'm still having problems when using the latest version (2.0.3). Some characters
do not join correctly, and a little space can be seen between them. These
characters seem to be Persian characters that do not exist in the Arabic
language (peh, zheh, geh, cheh, Persian keh, Persian yeh). This happens when I
choose the Arabic language for the characters. When I choose the Farsi (Persian)
language, this problem is reduced, but I face additional problems: some
characters get printed on top of others.
Comment 32 hossein.ir 2006-10-04 09:06:44 UTC
Created attachment 39550 [details]
Persian specific characters do not join correctly to others
Comment 33 hossein.ir 2006-10-04 09:07:53 UTC
Created attachment 39551 [details]
pdf file created from jam.odt directly with OpenOffice, that has additional problems
Comment 34 hossein.ir 2006-10-04 09:09:52 UTC
Created attachment 39552 [details]
Postscript file created from jam.ps that has fewer problems from jam.pdf
Comment 35 farzaneh 2006-10-07 13:31:36 UTC
This issue reminds me of issue #49343 which offers a workaround for the problem
and is now closed. I wonder why no one noticed the duplication then.
I'm CCing SBA here as well as myself. Ping?
Comment 36 hossein.ir 2006-10-08 10:22:53 UTC
If you look closer, that one (#49343) is the duplicate, as it was opened 1 year
later. The problem is still there, even in v2.0.3, and even when the Arabic or
Persian language is selected for characters. See my last comment and the samples.
Comment 37 hdu@apache.org 2007-07-03 09:38:35 UTC
*** Issue 34141 has been marked as a duplicate of this issue. ***
Comment 38 frank.meies 2007-07-03 11:46:53 UTC
Changed title.
Comment 39 hdu@apache.org 2007-08-14 10:35:45 UTC
*** Issue 34141 has been marked as a duplicate of this issue. ***
Comment 40 hossein.ir 2007-09-19 12:37:03 UTC
I suggest renaming it to "Error in justified Arabic text" again. I'm having the
problem in OpenOffice.org 2.3.
Some characters still do not join correctly, and a little space can be seen
between them. Previously, the Kashida was added in the wrong places:
it was added in the spaces between words. Now, no Kashida is added at all. I don't
know why. Even setting everything (locale, character language attribute, etc.)
does not help. Please take a look at the first attachment (kashida.sxw) and
tell me whether you see the problem too. I've created a screenshot of the
problem as I see it.
The good news is that in this version, I no longer have the problem of
characters being placed in the wrong positions, which happened in the Windows builds of previous
versions of OpenOffice when using direct PDF output.
Comment 41 hossein.ir 2007-09-19 12:40:56 UTC
Created attachment 48324 [details]
No kashida is added and some space is created between characters
Comment 42 frank.meies 2007-10-02 14:35:49 UTC
Just to clarify things: This issue is about the kashidas painted between two
Arabic words in justified alignment. There might be two reasons for this bug:

1. Normal Western justified alignment is applied to the text because the
language attribute of the CTL text is not set to an Arabic language. This is a
very common problem, since many users do not set the language attribute
correctly. 

2. A problem in the underlying layout engine. 

This issue is for 1. The justification mode (Western vs. Kashida) should be made
independent of the language attribute by evaluating the Unicode code points. See issue
60594 for 2.
Comment 43 hdu@apache.org 2008-01-09 09:50:27 UTC
*** Issue 85074 has been marked as a duplicate of this issue. ***
Comment 44 Joost Andrae 2008-07-09 11:09:40 UTC
Retarget issue to 3.1
Comment 45 frank.meies 2008-09-17 15:24:59 UTC
fme->sba: This needs some thorough testing. We used to check the language
attribute to determine whether kashida justification has to be applied to a text
portion. I added some code that tries to identify whether a text portion is 'Arabic'
by looking at the Unicode code points of the respective text.
Comment 46 frank.meies 2008-09-29 13:29:17 UTC
.
Comment 47 munzirtaha 2008-09-29 19:54:33 UTC
This bug is marked as fixed, but I cannot see any reference to the fix. Can you please
clarify and point to a Linux build I can download to test the fix?
Comment 48 frank.meies 2008-09-30 07:45:13 UTC
@munzirtaha: This issue has been fixed in cws swarab02. There are currently no
public builds available for this, since this cws has not been integrated into
the main code line yet, but of course you can check out this cws and build it on
your own.
Comment 49 frank.meies 2008-11-27 10:46:34 UTC
Ready for QA.
Comment 50 stefan.baltzer 2008-12-10 09:02:34 UTC
Verified in CWS kashidafix.
Comment 51 hossenabad 2009-01-21 06:20:57 UTC
Another problem is with U+200C [ZERO WIDTH NON-JOINER]. I use OOo 3.0.1 on Ubuntu
Linux. In jom.odt, which h15n attached, a ZERO WIDTH NON-JOINER
between Arabic sin or shin and another letter is rendered as a Kashida. However,
this happens only at some font sizes; below a
specific size it does not occur.
Comment 52 hdu@apache.org 2009-01-23 16:00:48 UTC
@hossenabad: based on your problem description I filed the separate issue 98410. The changes for this 
issue 28203 are already integrated in DEV300_m39 and the issue is resolved.
Comment 53 stefan.baltzer 2009-04-27 12:36:24 UTC
OK in OOO310_m10. Closed.