Apache OpenOffice (AOO) Bugzilla – Issue 10976
Errors in Norwegian locales
Last modified: 2013-08-07 15:02:39 UTC
The Norwegian locale files, found in i18npool/source/localedata_ascii/, are full of errors. There are three Norwegian files: nb_NO for Norwegian Bokmål, nn_NO for Norwegian Nynorsk, and no_NO (I don't know if this last one is used, it should be the same as for Bokmål). On behalf of the Norwegian localization team, I have corrected the locale files according to Norwegian standards. Among the things corrected are date formats, time formats, quoting characters and abbreviations. I attach the corrected locale files.
Created attachment 4440 [details] Corrected Norwegian Bokmål locale file
Created attachment 4441 [details] Corrected Norwegian Nynorsk locale file
Created attachment 4442 [details] Corrected Norwegian locale file (The same as nb_NO. I don't know if this one is ever used.)
DL->Gaute: Thanks for your patch. To accept it we need your signed copyright assignment. Could you please send us the copyright assignment asap? Please find more info at: http://www.openoffice.org/contributing.html
*** Issue 7418 has been marked as a duplicate of this issue. ***
*** Issue 7905 has been marked as a duplicate of this issue. ***
The JCA is now faxed, and I'm mailing it as well later today. I ask that you do not yet commit the corrected files. I will test them with 1.1 (beta2) first, and I suspect they need some minor revision as well.
Created attachment 6765 [details] Patch for all three Norwegian locale files, updated to 1.1 beta 2
The patch norwegian_locales.diff will correct the current Norwegian locale files in 1.1 beta 2. Please consider this patch for inclusion in the next 1.1 build.
What's this? This issue has been hanging around as UNCONFIRMED for months. I'm grabbing it. Gaute, thanks for filling out the JCA form. I won't accept this particular patch, for the following reasons:
1. The patch does not reflect the latest revisions of the data files as found in i18npool/source/localedata/data/. This is indicated, for example, by the value of the <Collator> entity, which still has the locale name included, or the absence (removal) of the <quarter#Abbreviation> entities, or the reintroduction of meaningless <DefaultName> entities with value "DIN 5008" and similar. Please take a look at the latest revisions available.
2. It unnecessarily duplicates a lot of data by replacing ref="en_US" and ref="no_NO" entries with the actual data. This applies to the entire content of nb_NO.xml, which is in fact only an alias of no_NO that differs solely in the assignment of the MS-LANGID. These ref="..." replacements also show up in other details.
3. The currency format codes are broken, since all of them now use [$kr-414], which has to be [$kr-814] for nn_NO and [$kr-14] for no_NO (this is the MS-LANGID thing mentioned above). Again the patch replaces the ref="...", replaceFrom="..." and replaceTo="..." mechanism with actual data.
4. Some date format codes also appear to be broken. I didn't take a detailed look, and I did not run tests in the non-product version to verify them, because of the other errors described above.
Please follow the guidelines commented in the latest revision of i18npool/source/localedata/data/locale.dtd. Please rework and submit a clean patch, thank you. Btw: are you sure about the <QuotationStart> and <QuotationEnd> typographic quotation values? They look strange to me. Please bear in mind that all values have to be UTF-8 encoded.
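The MS-LANGID mismatch described in point 3 is the kind of thing a small script could catch before a patch is submitted. The following is a minimal sketch, not part of the OOo build: it assumes a currency token has the shape `[$kr-414]` (symbol, hyphen, hexadecimal language ID), the function name `wrong_langids` is made up for illustration, and the sample format codes are hypothetical.

```python
import re

# Expected hexadecimal language IDs per locale. 814 and 14 are the values named
# in the comment above; pairing 414 with nb_NO is an assumption of this sketch.
EXPECTED_LANGID = {"nb_NO": "414", "nn_NO": "814", "no_NO": "14"}

# Matches a currency token such as [$kr-414]: symbol, hyphen, hex language ID.
CURRENCY_TOKEN = re.compile(r"\[\$(?P<symbol>[^\]-]*)-(?P<langid>[0-9A-Fa-f]+)\]")


def wrong_langids(locale, format_code):
    """Return the language IDs in format_code that do not belong to locale."""
    expected = EXPECTED_LANGID[locale].lower()
    return [
        match.group("langid")
        for match in CURRENCY_TOKEN.finditer(format_code)
        if match.group("langid").lower() != expected
    ]


if __name__ == "__main__":
    # Hypothetical format code carrying the nb_NO ID inside the nn_NO file.
    print(wrong_langids("nn_NO", "[$kr-414] #,##0.00"))
```

An empty list means every currency token in the format code carries the ID expected for that locale.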
OK, here's a new patch which is more carefully based on the new locale files (1.1b2). This patch makes heavy use of the ref="" mechanism, with no_NO.xml as the "template" file. The only major difference between the nb and nn locales is the names of the weekdays (the months have the same names). Because of this, I was able to make nn_NO.xml considerably shorter. <QuotationStart/End> does look strange, but the values did work in 1.0.x. Anyway, the proper single quote characters do not exist in ISO-8859-1. I don't like it, but we may have to resort to the plain ASCII apostrophe (') instead. (The Windows character set contains the necessary quote characters, but on other operating systems you may get empty boxes or question marks (which is *really* bad) instead.) Most Norwegians will use the double quotes anyway, which are «these». Thanks for your help and patience!
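To illustrate why the ref="" mechanism shortens nn_NO.xml, here is a rough sketch of how a reference to a template locale could be resolved. The XML snippets are deliberately simplified and hypothetical (element names such as `Months` and `Days` are placeholders, not the names required by locale.dtd), and `resolve` is an illustrative helper, not the loader OOo actually uses.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical locale snippets; the real files follow locale.dtd.
LOCALES = {
    "no_NO": "<Locale><Months><Month>januar</Month></Months>"
             "<Days><Day>mandag</Day></Days></Locale>",
    # nn_NO reuses the month names from no_NO and overrides only the weekdays.
    "nn_NO": "<Locale><Months ref='no_NO'/>"
             "<Days><Day>måndag</Day></Days></Locale>",
}


def resolve(locale, section):
    """Return the section element for locale, following ref="..." links."""
    seen = set()
    while True:
        if locale in seen:
            raise ValueError("circular ref involving " + locale)
        seen.add(locale)
        node = ET.fromstring(LOCALES[locale]).find(section)
        ref = node.get("ref")
        if ref is None:
            return node  # this locale carries the data itself
        locale = ref     # otherwise look the section up in the referenced locale


if __name__ == "__main__":
    print(resolve("nn_NO", "Months").find("Month").text)
    print(resolve("nn_NO", "Days").find("Day").text)
```

Only the sections that differ need to be spelled out in the derived locale; everything else is a one-line reference.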
Created attachment 6799 [details] New patch, following suggestions from Eike
Thanks for the data. I doubt that this will make it into 1.1RC since I won't start on it before next week due to other tasks I have to complete. I therefore set the target to 1.1 final but will try to get it into RC if time permits and release schedule allows.
Forgot to set issue to accepted.
Gaute, As for the <QuotationStart> issue: internally OOo uses Unicode, and UTF-8 in XML file format, so specifying the real quotation marks should be no problem, we do it with all other locales too. For example, the German typographic double quotes aren't present in iso-8859-1 either but only in Windows-1252. When exported to an iso-8859-1 encoding they get replaced by simple ASCII quotes. Please specify the real quotation marks.
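The character-set point can be checked directly. A small sketch using Python's standard codecs (Python is used here only for illustration and is not part of OOo): the German typographic double quotes U+201E and U+201C are encodable in Windows-1252 and UTF-8 but not in ISO-8859-1. The helper name `encodable` is made up for this example.

```python
# U+201E and U+201C are the German typographic double quotes.
GERMAN_QUOTES = "\u201e\u201c"


def encodable(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for name in ("iso-8859-1", "cp1252", "utf-8"):
        print(name, encodable(GERMAN_QUOTES, name))
```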
Hmm, I have bad experiences with quotes and Windows-1250. I've seen documents with Windows-1250 quotes appearing as question marks on Linux systems. That's really, really bad. (The old Norwegian locale files did exactly that.) If I understand correctly, I can specify the real Unicode characters (not Windows-1250). Then, the quotes will always degrade gracefully, and never appear as question marks or boxes, right?
As said before, internally OOo uses UCS2 Unicode, and UTF-8 in the XML file format. There is no such thing as "Windows-1250 encoding in an OOo document" unless it was imported from an 8-bit file format and a wrong encoding was chosen. The question mark issue isn't a problem of encoding but a problem of inadequate fonts. With proper fonts the right symbols are displayed. Yes, please specify the real Unicode symbols in UTF-8 encoding. At runtime the user can specify (under menu Tools.AutoCorrect.CustomQuotes) if and how the ASCII quotes should be replaced during typing. If for any reason the font you use doesn't contain the proper symbols, you can always disable replacement (and thus use plain ASCII only) or choose another character.
Sorry for the delay, here's a patch with single quotes in Unicode. Both start and end quotes should be the same, which is a raised comma just like the single end quote in en_US. And of course, if OpenOffice.org can't display a proper fallback character, that's not the locale's fault. Issue #11018 shows what happens in 1.0.2 with certain fonts on my Linux computer.
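As a rough illustration of the behaviour discussed above (real quotes stored in UTF-8, plain ASCII quotes when exporting to ISO-8859-1), here is a sketch. It assumes the "raised comma" is U+2019 RIGHT SINGLE QUOTATION MARK, and the fallback table and the `export_latin1` helper are illustrative only; this is not how OOo's export filters are implemented.

```python
# Assumption: the "raised comma" single quote is U+2019.
SINGLE_QUOTE = "\u2019"

# Illustrative fallback table: typographic quotes mapped to plain ASCII quotes.
FALLBACK = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"', "\u201e": '"',
})


def export_latin1(text):
    """Replace typographic quotes with ASCII ones, then encode as ISO-8859-1."""
    return text.translate(FALLBACK).encode("iso-8859-1")


if __name__ == "__main__":
    print(SINGLE_QUOTE.encode("utf-8"))  # the bytes as stored in a UTF-8 XML file
    print(export_latin1("\u2019sitat\u2019 «dobbelt»"))
```

The guillemets « and » need no fallback because they already exist in ISO-8859-1.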
Created attachment 6955 [details] New patch, now with proper Unicode single quotes
Thanks. Note, however, that I applied the following changes to comply with the requirements outlined in i18npool/source/localedata/data/locale.dtd: Changed <Time100SecSeparator> from '.' (dot) to ',' (comma), as it must match the separator used in the format codes (formatindex="44" and formatindex="45" in this case). Changed the date format code formatindex="20" from >YYYYMMDD< to >DD.MM.YY< so that it has a meaning similar to the same format index in other locales. Was >YYYYMMDD< really intended? Committed to branch cws_srx645_ooo11rc: i18npool/source/localedata/data/nb_NO.xml 1.5.70.1 i18npool/source/localedata/data/nn_NO.xml 1.7.28.1 i18npool/source/localedata/data/no_NO.xml 1.6.28.1
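The consistency rule for <Time100SecSeparator> can be expressed as a simple check. This is a sketch under assumptions: the two format codes below are hypothetical stand-ins for the codes at formatindex 44 and 45 (the actual codes are not quoted in this issue), and both helper functions are made up for illustration.

```python
def fraction_separator(format_code):
    """Return the character right before the last '00' placeholder, or None."""
    idx = format_code.rfind("00")
    if idx <= 0:
        return None
    return format_code[idx - 1]


def separator_matches(time100sec_separator, format_codes):
    """True if every format code uses the given 1/100-second separator."""
    return all(fraction_separator(code) == time100sec_separator
               for code in format_codes)


# Hypothetical format codes with hundredths of a second.
CODES = ["MM:SS,00", "[HH]:MM:SS,00"]

if __name__ == "__main__":
    print(separator_matches(",", CODES))
    print(separator_matches(".", CODES))
```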
Using a comma as <Time100SecSeparator> is indeed correct. I must have overlooked it when revising the patch. YYYYMMDD was a "filler" format, as nothing else was really appropriate. Since similarity with other locales is a concern, I believe DD.MM.YY is a better choice. However, we chose DD.MM.YY as DateFormatskey1, so it's now being used in two places. Switching to D.M.YY as DateFormatskey1 would solve this, and I believe it would be more consistent with other locales as well. Please consider changing DateFormatskey1 to D.M.YY, which I think would be the final tweak needed.
Duplicated format codes don't really matter, they just may not look nice in the listbox ;-) In fact all the formatindex="..." matching has historical roots, please refer to i18npool/source/localedata/data/locale.dtd and offapi/com/sun/star/i18n/NumberFormatIndex.idl for further information.
Present in ooo11rc CWS build, reassign to QA.
Reset resolution to fixed
Verified in internal build OOo11RC
Closed because fixed in OOo1.1.RC