Issue 80557 - sl_SI Slovenian: alphanumerical sorting CE characters Đ
Status: CLOSED FIXED
Alias: None
Product: Internationalization
Classification: Code
Component: i18npool
Version: current
Hardware: All All
Importance: P3 Trivial with 2 votes
Target Milestone: ---
Assignee: stefan.baltzer
QA Contact: issues@l10n
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-08-10 08:25 UTC by markomb
Modified: 2013-08-07 15:01 UTC
CC List: 3 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Description markomb 2007-08-10 08:25:48 UTC
Alphanumeric sorting in all applications has an issue with the Central European
letter Đ (in the Special Character dialog these are the characters U+0110 and
U+0111).

When I sort, the letters D and Đ are treated as the same letter.
Example:

I have column:
DA
DG
ĐA
ĐB
ĐR
DS

After alphanumerical sorting the result is:
DA
ĐA
ĐB
DG
ĐR
DS

The correct order would be:
DA
DG
DS
ĐA
ĐB
ĐR
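
The reported order can be reproduced with a minimal sketch, assuming the sort compares Đ as if it were D. The `FOLD` table and the function name are hypothetical and only illustrate the symptom; they are not how OOo or ICU implement collation.

```python
# Hypothetical illustration: compare Đ/đ as if they were D/d.
FOLD = str.maketrans({"Đ": "D", "đ": "d"})


def sort_as_same_letter(words):
    """Sort words while treating Đ and D as the same letter."""
    # sorted() is stable, so words whose folded keys are equal
    # (for example "DA" and "ĐA") keep their input order.
    return sorted(words, key=lambda w: w.translate(FOLD))


column = ["DA", "DG", "ĐA", "ĐB", "ĐR", "DS"]
print(sort_as_same_letter(column))
# → ['DA', 'ĐA', 'ĐB', 'DG', 'ĐR', 'DS']
```

With Đ folded into D, only the second letter decides the order, which gives exactly the interleaved list shown under "After alphanumerical sorting".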
Comment 1 r 2007-08-17 09:19:55 UTC
Thank you, I'll take a look at it. I guess it is defined in the sl.xml file, but
I have to ask on the l10n-dev mailing list.
Comment 2 clytie 2007-08-18 07:18:09 UTC
The collation order (sorting) is indeed specified in the locale. My language also uses the crossed D (đ, 
Đ), which comes after the ordinary D in our collation order.

We have an open collation bug [78054] posted against Calc, but it's about collating words with some 
combinations of diacritics. OpenOffice.org follows the collation order in our locale for the crossed D. It 
should do the same for other languages.
Comment 3 r 2007-08-18 11:07:17 UTC
Eike, I reassigned this to you. As you told me last time, I am sending only the
diff (the last line is the right/corrected one) of sl_SI.xml:

$ diff sl_SI.orig.xml sl_SI.xml
191c191
<     <IndexKey phonetic="false" default="true" unoid="alphanumeric">A-C Č D-S Š T-Z Ž</IndexKey>
---
>     <IndexKey phonetic="false" default="true" unoid="alphanumeric">A-C Č Ć D Đ-S Š T-Z Ž</IndexKey>
Comment 4 r 2007-08-18 11:09:25 UTC
... and changing the Component
Comment 5 r 2007-08-18 11:11:34 UTC
ah, component should be l10n, sorry
Comment 6 r 2007-08-18 11:12:04 UTC
and subcomponent localedata
Comment 7 ooo 2007-08-18 12:39:06 UTC
Is this issue really about Slovenian (sl_SI)? Because for Slovakian (sk_SK) it
works as requested.

For collation the locale data files are not involved. Tailored collations are
done in i18npool/source/collator/data/ files. The locale data file's IndexKey
sequence is for the Alphabetical Index in Writer.

Anyway, when setting the language to Slovakian in the sort dialog options, or
when left as Default and setting the OOo locale to Slovakian under
Tools.Options.LanguageSettings.Languages "Language of Locale Setting", the sort
result is

DA
DG
DS
ĐA
ĐB
ĐR

so OOo does sort that correctly for Slovakian. The same should be true if the
system locale is set to sk_SK and OOo locale left at Default. However, the
IndexKey could be adapted.

Are you sure this is about Slovenian? See also the COMMON column in
http://www.unicode.org/cldr/data/charts/collation/sk_SK.html
http://www.unicode.org/cldr/data/charts/collation/sl_SI.html
Slovenian does not have a special treatment for đ and Đ.
Comment 8 ooo 2007-08-24 21:11:33 UTC
Could someone please clarify? Is this issue about Slovenian or Slovakian? See
previous comment.

Thanks
  Eike
Comment 9 r 2007-08-24 21:46:05 UTC
Eike, this issue is about Slovenian. We don't have letters "đ" and "Đ" in the
"COMMON column in http://www.unicode.org/cldr/data/charts/collation/sl_SI.html"
because, I guess, these two letters are not in our (Slovenian) alphabet. But we
still use them as we were part of Yugoslavia once, where they still use them.

If I set the language to Slovenian in the sort dialog options, the result is not
OK (of course).

What is best to do? To correct IndexKey in sl_SI.xml or add these 2 letters in
http://www.unicode.org/cldr/data/charts/collation/sl_SI.html?
Thanks
Comment 10 ooo 2007-08-27 11:47:10 UTC
Bobe,

As stated in #desc8 please note that collation and the OOo IndexKey are
different things. If you want future versions of ICU (and thus other
applications as well) to support the Slovenian collation including Đ you'll also
have to contact the CLDR. For OOo we may create a tailored collation in
i18npool/source/collator/data/

  Eike
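
As a rough sketch of what a tailored collation would achieve, the example below sorts with an explicit letter order in which Đ is a separate letter directly after D. The `ALPHABET` string is an assumption expanded from the IndexKey proposed in Comment 3; it is not the content of any file under i18npool/source/collator/data/, and the function names are hypothetical. In ICU rule syntax a comparable tailoring would presumably look something like `&D < đ <<< Đ`, which is likewise only an illustration.

```python
# Assumed letter order, expanded from "A-C Č Ć D Đ-S Š T-Z Ž" (Comment 3).
# This is an illustration, not the actual OOo collator data.
ALPHABET = "ABCČĆDĐEFGHIJKLMNOPQRSŠTUVWXYZŽ"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}


def tailored_key(word):
    """Build a sort key in which every letter of ALPHABET has its own rank."""
    key = []
    for ch in word.upper():
        if ch in RANK:
            key.append((RANK[ch], 0))
        else:
            # Characters outside ALPHABET sort after all known letters,
            # ordered among themselves by code point.
            key.append((len(ALPHABET), ord(ch)))
    return key


def sort_tailored(words):
    """Sort words so that Đ is a distinct letter following D."""
    return sorted(words, key=tailored_key)


print(sort_tailored(["DA", "DG", "ĐA", "ĐB", "ĐR", "DS"]))
# → ['DA', 'DG', 'DS', 'ĐA', 'ĐB', 'ĐR']
```

This sketch ignores case and accent levels that a real collator distinguishes; it only shows the primary-level effect of giving Đ its own position.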
Comment 11 ooo 2007-12-14 11:49:49 UTC
Karl, could you please take over and add collation and IndexKey in an i18n cws?
Thanks
  Eike
Comment 12 markomb 2008-01-07 12:26:53 UTC
Is there any progress in solving this issue? It is quite annoying to sort
these letters manually.

Thank you,

Marko
Comment 13 karl.hong 2008-01-08 04:58:36 UTC
Add tailoring data and index key characters as suggested.
Comment 14 markomb 2008-01-08 07:35:53 UTC
I am sorry to bother you again, but I searched everywhere and could not find any
locales or sl_SI.xml.

I installed the new version 2.3.1 over 2.3 (funnily, I now have both) and still
found no locales. Can you tell me where to find the locales?
I am using Vista.
Thanks!
Comment 15 ooo 2008-01-08 11:28:13 UTC
@markomb: The locale data and collation files we're talking about here are in
the OOo source code build tree, not in the installation. They get compiled into
binary libraries that are used at runtime.
Comment 16 markomb 2008-01-08 11:41:40 UTC
So I cannot use it until new version?
Comment 17 karl.hong 2008-01-08 19:18:39 UTC
ready for QA.
Comment 18 r 2008-01-13 10:34:58 UTC
khong, where can I find the newest version of sl_SI.xml? I need to post it to
Pavel, too, so he can include it in the new builds. Thanks.
Comment 19 ooo 2008-01-14 11:16:20 UTC
@bobe: No need to provide Pavel with single files. He builds the master
milestones, so once CWS i18n39, to which the fix for this issue was added, is
integrated, everything will be fine. For monitoring the CWS see
http://eis.services.openoffice.org/EIS2/cws.ShowCWS?Path=SRC680%2Fi18n39
Comment 20 stefan.baltzer 2008-01-16 15:54:29 UTC
Verified in CWS i18n39.
Comment 21 stefan.baltzer 2009-12-07 11:49:06 UTC
OK in OOO320_m7. Closed.