Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing |
Summary: | xslt-parser must support xml:lang attributes other than ISO 639-1 / ISO 3166-1 (ISO 639 code for Northern Sotho is wrong in UI localization) | ||
---|---|---|---|
Product: | Internationalization | Reporter: | Andrea Pescetti <pescetti> |
Component: | code | Assignee: | AOO issues mailing list <issues> |
Status: | CONFIRMED --- | QA Contact: | |
Severity: | Trivial | ||
Priority: | P3 | CC: | andre.schnabel, andreas, dwayne, issues, joerg.barfurth, khirano, maho.nakata, murrayc, ooo, stephan.bergmann.secondary |
Version: | OOo 2.0.4 | ||
Target Milestone: | --- | ||
Hardware: | All | ||
OS: | All | ||
Issue Type: | DEFECT | Latest Confirmation in: | --- |
Developer Difficulty: | --- | ||
Issue Depends on: | |||
Issue Blocks: | 74420 |
Description
Andrea Pescetti
2006-10-07 16:45:53 UTC
Did you test your builds with that language merged as "nso"? If I remember right, there was an issue with xml:lang (xml readme / officecfg): old XML parsers only supported two-letter codes ... maybe it is solved already, I will check.

I think the correct person to give input to this is er. If I recall, although we moved to ISO codes from the numeric codes, not everything could handle 3-letter codes. Looking at the list, I don't see any 3-letter languages. I'm not sure if this problem still exists.

Well, actually I can't say much about any changes in the XML parser used for configuration files, or whether it is now capable of understanding ISO 639-2 alpha-3 codes (let alone valid language tags complying with RFC 3066 or RFC 4646). I'm Cc'ing Joerg for this. Anyway, AFAIK the language codes used in postset.mk don't affect the configuration or other .xml files using xml:lang, but are related to UI resource files instead, or am I wrong on this? Ivo? If this is to be changed, the codes in all localize.sdf files would also have to be changed, IMHO. Eike

While I am not really responsible for this

[Sorry for the premature submission of the previous comment. Retrying.] While I'm not really responsible for this any more, I try to track what happens in the configuration area. I am not aware of a change of parser for configuration processing. I noticed that a new xslt-processor/parser was introduced into the build environment, but I don't even know if the parser introduced there is able to handle ISO639-2. Neither do I know whether xml_apis.jar and xerces.jar from $SOLARBINDIR (that is what the configuration build uses) were affected by that change (i.e. the introduction of Xalan). In any case, it has to be ascertained that not only the vanilla build environment but also all variations possible through 'configure' use a parser that accepts ISO639-2. Otherwise the build may be broken even for people who don't build for ISO639-2 languages.
(There are a few settings that are localized, but not translated through localize.sdf)

@erack: the xcu files *are* a form of UI resource file. AFAIK they use the same ISO language codes as resource files. And yes: to change the language codes used there, all the localize.sdf files need to be changed (or the extraction tools hacked to transform affected codes on the fly).

To the original submitter: the 'temporary hack' is not related to a particular product revision, and can't really be fixed by product development. It is contingent on general availability AND integration into the build environment of XML processing tools that accept ISO639-2 language codes. It has to stay (in all product source code lines) until the build environment(s) has evolved to this point. Then it could be fixed in any product branch.

Note that it is necessary to support not only ISO 639-2 alpha-3 and ISO 3166-1 alpha-2 codes, but also language tags according to RFC 4646 (http://tools.ietf.org/html/rfc4646):

    language ["-" script] ["-" region] *("-" variant) *("-" extension) ["-" privateuse]

At least language-script-region-variant must be supported.

Raising priority to P3, as more and more localizations suffer from this. Indeed, a P2 might be justified if the new parser still doesn't handle an entirely valid ISO 639-2 alpha-3 code and breaks the build.

Could someone save me some time by giving me a hint at roughly what code would need to be changed to fix this? I might attempt it.

We are already using some three-letter ISO codes like brx, dbo, mai, mni ... I saw the usage in officecfg. I think this issue is fixed then; someone updated the libxml2 parser. @ihi

(In reply to comment #9) While the original three-letter-code problem might have been solved in the meantime, the problem will reappear once BCP47 language tags have been implemented (issue 109846).

Reset assignee to the default "issues@openoffice.apache.org".
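
For anyone picking this up, the minimum requirement stated above (language-script-region-variant from RFC 4646) can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the OpenOffice build: the names `LANGTAG`, `parse_langtag` and `accepted_by_two_letter_parser` are hypothetical, the grammar is a simplified subset (no extlang, extension, private-use or grandfathered tags), and the "old parser" check merely mimics the two-letter restriction described in this issue.

```python
import re

# Simplified subset of the RFC 4646 langtag grammar (an assumption for
# illustration): language ["-" script] ["-" region] *("-" variant).
# Extlang, extension, private-use and grandfathered tags are not handled.
LANGTAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"            # ISO 639-1 or ISO 639-2/3 code
    r"(?:-(?P<script>[A-Za-z]{4}))?"           # ISO 15924 script, e.g. Latn
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"  # ISO 3166-1 alpha-2 or UN M.49
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)$"
)


def parse_langtag(tag):
    """Split a tag into its subtags; return None if it does not match."""
    match = LANGTAG.match(tag)
    if match is None:
        return None
    script = match.group("script")
    region = match.group("region")
    variants = [v.lower() for v in match.group("variants").split("-") if v]
    return {
        "language": match.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
        "variants": variants,
    }


def accepted_by_two_letter_parser(tag):
    """Mimic a parser that only accepts 'll' or 'll-CC' (hypothetical)."""
    return re.fullmatch(r"[A-Za-z]{2}(?:-[A-Za-z]{2})?", tag) is not None
```

With this sketch, `parse_langtag("nso")` and `parse_langtag("sr-Latn-RS")` both succeed, while `accepted_by_two_letter_parser("nso")` is false, which is the mismatch this issue is about.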