Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing |
Summary: | Change of Microsoft LCID for Dzongkha | | |
---|---|---|---|
Product: | Internationalization | Reporter: | lists |
Component: | code | Assignee: | ivo.hinkelmann |
Status: | CLOSED FIXED | QA Contact: | issues@l10n <issues> |
Severity: | Trivial | | |
Priority: | P3 | CC: | issues, khirano, pema.geyleg |
Version: | current | | |
Target Milestone: | --- | | |
Hardware: | All | | |
OS: | All | | |
Issue Type: | PATCH | Latest Confirmation in: | --- |
Developer Difficulty: | --- | | |
Description
lists
2005-08-18 14:15:14 UTC
Created attachment 28884: patch to isolang.cxx
Created attachment 28885: patch to langtab.src
Created attachment 28886: patch to lang.hxx
Created attachment 28887: patch to dz_BT locale
We validated the modified locale file dz_BT.xml as specified in locale.dtd before doing the patch. If the locale should be in a separate issue, please tell me. I included it here because it is part of the modification of the Microsoft LCID number.

Sigh.. these MS people make me sick.. that is yet another place where they hide LCID assignments. After having dug around a bit I doubt that the information provided there is correct. It seems they just renamed Tibetan_Bhutan to Dzongkha_Bhutan in only that document. No other MS page talks about Dzongkha, but some do about Tibetan. Also, as Tibetan and Dzongkha are two distinct languages, the numbering scheme does not fit what MS otherwise does. The value 0x0851 that was always used for Tibetan_Bhutan has a primary language ID of 0x51 that is also used in 0x0451, which is Tibetan_Tibet (or Tibetan_People's-Republic-of-China, as they call it, politically in/correct). So 0x51 is clearly Tibetan. Using the same 0x51 for a distinct language just doesn't make sense.

Maybe someone realized that there are only 4000 people in Bhutan who speak Tibetan, changed it to Dzongkha but had no clue about LCIDs, I don't know. I think they simply got it wrong.

Do you happen to know of any document that was created using MS-Word or Excel, contains Dzongkha and where the text has some distinct LCID value assigned? Is Dzongkha even selectable in any version? I'm hesitant to change this right now as long as this is not clarified.

For the locale data change, yes, please create a separate issue.

Thanks
Eike

Eike,

> Maybe someone realized that there are only 4000 people in Bhutan who speak Tibetan, changed it to Dzongkha but had no clue about LCIDs, I don't know. I think they simply got it wrong.

Here there is a controversy regarding this. The Dzongkha language uses the Tibetan script, and there is no Dzongkha script assigned in Unicode.
The work of supporting the Dzongkha language in Microsoft was carried out in Bhutan by the Dzongkha Development Authority with the help of the Orient Foundation over a period of 3 years. On the other hand, no work for supporting Tibetan in Microsoft has been carried out from scratch. They just worked on what was already created for Dzongkha support. Microsoft has made the mistake of calling Dzongkha Tibetan-Bhutan. That's where they have made a mess, and we are in the process of asking them to change it to Dzongkha.

> Do you happen to know of any document that was created using MS-Word or Excel, contains Dzongkha and where the text has some distinct LCID value assigned? Is there even Dzongkha selectable in any version?

Nope. Up to Windows XP, there is no support. However, we can do Dzongkha computing using Microsoft Office 2003.

The number of Tibetan speakers in Bhutan is 1000 according to the research conducted by George Van Driem in "the Languages of the greater himalayan region". In fact there are more Tibetan speakers in New York than in our country.

Thanks
Pema Geyleg

Eike,

MS seems to be confused or unwilling to change. The Government of Bhutan is quite unhappy about having Tibetan-Bhutan and not Dzongkha. Dzongkha, as well as modern Tibetan, derives from Old Tibetan, but they are quite different spoken languages. Tibetan-Bhutan will never be used for the Tibetan language in Bhutan. Tibetan speakers here are a minority of people who ran from Tibet after the Chinese invasion of Tibet, and who are now integrating with the Dzongkha speakers.

Meanwhile, if we maintain an OOo private number, compatibility with MS will not be possible, and migration to OOo will be difficult. In the URL above the identifier refers to Dzongkha, but in MS Vista we have only been able to find a keyboard for Tibetan (Bhutan), together with the fonts that were developed here in Bhutan. We have not been able to figure out the language number.
The Government of Bhutan is contacting MS to try to have them change the denomination. We have submitted a separate issue with the locale, in which the number stays as 628. The change would involve changing the number also in the locale to 851.

What a mess.. ok, I will address this for OOo2.0. Just for the record, some comments about the implications the change may have: anyone who tests the LCID value against only its primary ID, being Tibetan, and acts accordingly on the result, for example by invoking a spell checker or dictionary, will miss that in fact the language is not Tibetan. This currently should be no problem in OOo as this scenario is merely hypothetical, but it might occur in other products that act on a file that was written in an MS file format using LCIDs.

On branch cws_src680_oool10n20

tools/inc/lang.hxx 1.12.88.3
tools/source/intntl/isolang.cxx 1.19.12.5
svtools/source/config/languageoptions.cxx 1.12.122.1
svx/source/dialog/langtab.src 1.60.28.4
i18npool/source/localedata/data/dz_BT.xml 1.2.8.1

Note that the patches omitted languageoptions.cxx, and would have left two entries for Dzongkha in langtab.src. For lang.hxx I didn't apply the patch but did it somewhat differently to document the change. I also didn't apply the dz_BT.xml patch but only changed -628 to -851 in order to keep the change for this CWS as simple as possible. Other changes of locale data will be addressed with issue 53550.

Status fixed.

Eike,

We verified the first 4 files, but could not find revision 1.2.8.1 of dz_BT.xml in webcvs. The present version in cws_scr680_localedata6 (1.2.4.1) has the old number (628).

Hi Javier,

it's clearly in cvs:

> cvs log -N -rcws_src680_oool10n20 dz_BT.xml
RCS file: /cvs/l10n/i18npool/source/localedata/data/dz_BT.xml,v
[...]
total revisions: 6; selected revisions: 1
description:
----------------------------
revision 1.2.8.1
date: 2005/08/22 14:47:40; author: er; state: Exp; lines: +1 -1
#i53497# Dzongkha is MS's erroneous Tibetan_Bhutan
=============================================================================

And also http://l10n.openoffice.org/source/browse/l10n/i18npool/source/localedata/data/dz_BT.xml lists that revision, currently the second from top.

Eike

Javier, the dz_BT.xml file has been added to branch oool10n20 -> verified

verified.

closed
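
The point made earlier in the thread about 0x0851 and 0x0451 sharing the primary language ID 0x51 can be checked with a little bit arithmetic. The sketch below assumes the usual Windows language identifier layout (primary language ID in the low 10 bits, sublanguage ID in the upper 6 bits of the 16-bit value). The function names are hypothetical and are not taken from lang.hxx or isolang.cxx. It also illustrates the hazard described in the fix comment: a check that looks only at the primary ID would treat Dzongkha as Tibetan.

```python
# Assumed layout of a 16-bit Windows language identifier:
# bits 0-9 = primary language ID, bits 10-15 = sublanguage ID.
PRIMARY_MASK = 0x3FF
SUBLANG_SHIFT = 10

# Values quoted in this issue.
TIBETAN_PRC = 0x0451      # Tibetan_Tibet / Tibetan_PRC
DZONGKHA_BHUTAN = 0x0851  # formerly labelled Tibetan_Bhutan
PRIMARY_TIBETAN = 0x51


def primary_lang_id(langid: int) -> int:
    """Return the primary language ID (low 10 bits)."""
    return langid & PRIMARY_MASK


def sub_lang_id(langid: int) -> int:
    """Return the sublanguage ID (upper 6 bits of the 16-bit value)."""
    return (langid & 0xFFFF) >> SUBLANG_SHIFT


def naive_is_tibetan(langid: int) -> bool:
    """Hypothetical check on the primary ID only; misclassifies Dzongkha."""
    return primary_lang_id(langid) == PRIMARY_TIBETAN


def is_tibetan(langid: int) -> bool:
    """Hypothetical check that special-cases the full Dzongkha value first."""
    if langid == DZONGKHA_BHUTAN:
        return False
    return primary_lang_id(langid) == PRIMARY_TIBETAN


if __name__ == "__main__":
    for name, value in (("TIBETAN_PRC", TIBETAN_PRC),
                        ("DZONGKHA_BHUTAN", DZONGKHA_BHUTAN)):
        print(f"{name}: primary=0x{primary_lang_id(value):02X} "
              f"sub=0x{sub_lang_id(value):02X} "
              f"naive_is_tibetan={naive_is_tibetan(value)} "
              f"is_tibetan={is_tibetan(value)}")
```

Both values decompose to the same primary ID and differ only in the sublanguage ID, which is why any code that dispatches on the primary ID alone needs to compare the full value for 0x0851 first.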