Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing |
| Summary: | Unable to open file if name contains some national characters | | |
|---|---|---|---|
| Product: | General | Reporter: | wszenajch <wojciech.szenajch> |
| Component: | code | Assignee: | thorsten.martens |
| Status: | CLOSED FIXED | QA Contact: | issues@framework <issues> |
| Severity: | Trivial | | |
| Priority: | P2 | CC: | aleks.ehrlich, issues, kpalagin, lars_o_hansen, nesshof, tml, www.openoffice.org, zemiak |
| Version: | OOo 2.0 | Keywords: | oooqa |
| Target Milestone: | OOo 2.2.1 | | |
| Hardware: | PC | | |
| OS: | Windows XP | | |
| Issue Type: | DEFECT | Latest Confirmation in: | --- |
| Developer Difficulty: | --- | | |
| Issue Depends on: | | | |
| Issue Blocks: | 51233 | | |
| Attachments: | | | |

Description
wszenajch
2005-12-12 15:56:44 UTC
Confirmed on Windows XP Pro SP2 German with OOo 2.0.1 RC4 English.

SBA: Worksforme on Windows 2000 German. I tried these 13 directory names:

01_Western_Europe_ÄÖÜäöüßéôâëçøæå
02_Eastern_Europe_ČĢĴŅŇŘŸŜŁłżšýáíóńśąę
03_Russian_господин_Чени
04_Chinese_CN_民主党已经到手的总统宝座最
05_Chinese_TW_組建自己的律師團隊
06_Japanese_Katakanaドイツエイゴ
07_Japanese_Kanji_Hiragana_斎橋のな記録とはいえ
08_Korean_설립하여 선진기술을
09_Hebrew_יעךלויגכעןויגכע
10_Greek_ΣΑΡΤΙΣΗελλασ
11_Thai_ไมโครซิสเต็มส์
12_Hindi_जालपृष्ठवेसब
13_Arabic_ويسكنسن

SBA: Windows needs the respective language support before it can deal with such characters. To enable it on WinXP (English): Start - Control Panel - Regional Options. On tab page "Advanced", check the "code page conversion tables" of the language/characters you want to toy with. Please comment.

Please do not resolve a problem reported for MS Windows XP SP2 after testing with Windows 2000! You have separate 'OS' values for other Windows versions. Microsoft has NEVER been fully compatible with its own products. The report concerned Windows XP SP2 EN and PL, and was confirmed for DE. So please test it on the XP SP2 platform. Windows XP SP2 is now the main platform for Office applications. Windows 2000 should be used for compatibility testing only, not for development or main tests. If OOo 2.0 was developed and mostly tested on MS Windows 2000, I am not surprised that it has this kind of bug.

Comment to SBA: "Windows needs the respective Language support before it can deal with such characters". Greek is included in my configuration. I gave a doc file as an example because MS Word CAN OPEN such a file without any problems (with this 'Σ' character). The same goes for Notepad and txt files. So language support is not the issue. Probably Microsoft changed something in the fopen/open procedures in XP or XP SP2.

SBA: Worksforme on Windows XP English SP2. There are different "Sigma" characters.
I found three (I had a look via Insert - Special Character and selected the font "Arial Unicode MS", which contains almost all characters from this planet):

Ʃ = x01A9 (font subset "Latin Extended-B")
Σ = x03A3 (font subset "Basic Greek")
∑ = x2211 (font subset "Mathematical Operators")

"Greek is included in my configuration" is a little vague... On my WinXP, the list in "Regional and Language Options" offers these:

- 1006 (MAC - Greek I)
- 1253 (ANSI - Greek)
- 20423 (IBM EBCDIC Greek)
- 28597 (ISO 8859-7 Greek)
- 737 (OEM - Greek 437G)
- 869 (OEM - Modern Greek)
- 875 (IBM EBCDIC Modern Greek)

I'd recommend always checking "more charsets than needed" (i.e. UTF-8). Please comment.

SBA: Component set to "Framework". Prio set to P3. There is no crash or data loss, plus there is a workaround (remove uncool characters from filename).

Comment to: "I'd recommend always checking "more charsets than needed" (i.e. UTF-8)". I added all additional conversion tables (even including EBCDIC and A5) except the Asian ones. There are small changes in OO 2.0 on XP SP2 PL: it now behaves the same way as reported for XP SP2 EN with OO 2.0. But Polish characters on XP SP2 PL are also not accepted by OO 2.0. Maybe they were never accepted; see the additional results below.

There are TWO DIFFERENT open-file procedures used by OO 2.0:

1. When you start OO, use File-->Open, and choose the file name, this WORKS OK for ALL characters I reported.
2. When you choose the file in Windows Explorer, right-click it, and select Open with-->OO 2.0, OO 2.0 FAILS with an error message. The same happens if you assign the "DOC" extension to OO 2.0 and double-click the file.

MS Word 2002 SP2 always opens this file correctly using both methods, which means that nothing needs to be changed in the Windows language configuration. I made the tests with XP SP2 PL. If the behaviour of OO with XP SP2 EN is different, I will update this info tomorrow.
I remember that I experienced such a problem in the old days with different software. It was caused by the fact that there may be (or must be) two different procedures for opening a file, depending on how you do it (stupid, but this is the Windows world). I do not remember the details (this could even have been with Windows 3.11). Maybe, when double-clicking a file in Windows Explorer, its name is passed as a parameter to soffice.exe? The name reported in the error window looks the same as the one shown by the dir command in a cmd window. But Word somehow manages to find the name correctly.

After switching between several locales and returning to Polish, I managed to restore successful loading of a file with Polish characters in its name on my XP SP2 PL. My cmd window started to show Polish letters in filenames again.

I also observed the following when OO is called from the command prompt with a filename as parameter. I created the directory c:\xxxxx and the file wsą.doc inside it. There is no wsx.doc file. In the cmd window:

cd C:\Program Files\OpenOffice.org 2.0\program
soffice.exe c:\xxxxx\wsą.doc - opens file correctly
soffice.exe c:\xxxxx\wsx.doc - returns error message
soffice.exe \xxxxx\wsx.doc - DOES NOT return error message although I am on drive c: (soffice is not even present in Task Manager)
soffice.exe wsx.doc - returns error message, adding C:\Program Files\OpenOffice.org 2.0\program to the name of the file
soffice.exe c:\xxxxx\ws*.doc - DOES NOT open the file (in spite of matching the template name) and does not return error message

Now Word 2002 SP2 for comparison:

cd C:\Program Files\Microsoft Office\Office10
winword.exe c:\xxxxx\wsą.doc - opens file correctly
winword.exe c:\xxxxx\wsx.doc - returns error msg. (invalid document name)
winword.exe \xxxxx\wsx.doc - returns error msg. (invalid document name)
winword.exe wsx.doc - returns error msg.
(invalid document name)
winword.exe c:\xxxxx\ws*.doc - opens file correctly using template
winword.exe \xxxxx\ws*.doc - opens file correctly using template

I renamed wsą.doc to wsąΣ.doc in the c:\xxxxx directory. The dir command displayed it as wsą?.doc. Entering 'Σ' by pasting it into the cmd window gave a '?' character, so I used:

winword.exe \xxxxx\ws*.doc - opens file correctly using template

Summary:

1. OO 2.0 behaves differently from MS Word 2002 SP2 when called from the cmd command line on XP SP2 PL. MS Word always behaves as expected. OO 2.0 fails in three cases.
2. It was not possible for me to enter 'Σ' into the cmd window, which is not Unicode, I suppose. Maybe OO 2.0 also uses a single-byte character set (the default for the XP language version) when obtaining the file name passed by Windows Explorer?

Reassigned to owner of subcomponent.

*** Issue 62531 has been marked as a duplicate of this issue. ***

*** Issue 59118 has been marked as a duplicate of this issue. ***

@hro: I can confirm that I can't open files with certain national characters (tried with Korean characters). But I also failed to open PDF files with Acrobat Reader. So this may be related to the system configuration and not to OOo?

I confirm that Acrobat Reader 7.0.7 is unable to open a PDF file whose name contains the 'Σ' character, on the same testing configuration as described above in the problem report. But MS Word 2002 is able to open it WITHOUT any changes to the Windows XP SP2 configuration. The same is true even for Notepad, which is able to open a TXT file with the 'Σ' character in the file name. So it can be done. Maybe there is a new API extension for doing this on XP?

<quote> SBA: Windows needs the respective language support before it can deal with such characters. To enable it on WinXP (English): Start - Control Panel - Regional Options. On tab page "Advanced", check the "code page conversion tables" of the language/characters you want to toy with. </quote> Does not help on WinXP Pro SP2.
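The reporter's guess above, that the file name passes through a single-byte character set somewhere between Windows Explorer and OO, can be illustrated with a small sketch. This is Python, not OOo code, and it makes several assumptions: `to_code_page` is a hypothetical helper, cp1250 and cp1252 stand in for the system code pages of a Polish and an English Windows, and Python's `replace` error handler substitutes '?' where the real Windows conversion might choose a different substitute.

```python
def to_code_page(name: str, code_page: str) -> str:
    """Round-trip a Unicode file name through a single-byte code page.

    Characters the code page cannot represent come back as '?',
    which approximates what a non-Unicode program would receive.
    """
    raw = name.encode(code_page, errors="replace")
    return raw.decode(code_page)


if __name__ == "__main__":
    # File names taken from the command-line tests in this thread.
    for name in ("wsą.doc", "wsąΣ.doc"):
        # Assumed stand-ins for a Polish and an English system code page.
        for code_page in ("cp1250", "cp1252"):
            seen = to_code_page(name, code_page)
            status = "intact" if seen == name else "damaged"
            print(f"{code_page}: {name!r} -> {seen!r} ({status})")
```

Under these assumptions the Polish name survives cp1250 but not cp1252, and the name containing 'Σ' is damaged in both. A program that then tries to open the damaged name would report that the file does not exist.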
I confirm the issue: I cannot open a file _via_double-clicking_ on WinXP Pro SP2 with "language to match the language version of the non-Unicode programs" set to Russian and file names containing characters outside ASCII127+Cyrillic (e.g. ä, a-umlaut). There is no problem opening such files from the File|Open menu of OOo. I also confirm OOo 1.x worked OK with such filenames.

Issue 51233 is a duplicate of this issue. Issue 60181 is a duplicate of this issue.

To clarify: it has nothing to do with the language of OOo but with the native language of the Windows system. That means an English Windows XP always uses CP 1252 as the system code page (regardless of installed code pages or languages). There are some code fragments that use the system code page to translate Unicode file names.

@sba: Please verify whether double-clicking on such a file (containing characters in the file path that are not available in the system code page) always fails (that would mean we have a generic problem in the framework component) or whether the problem only occurs if an soffice process is already running, e.g. Quickstarter (that would mean the pipe communication passing the parameters to a running process is broken).

I am not using Quickstarter and the problem exists. There is a workaround possible for a limited set of characters: use Start - Control Panel - Regional Options, tab page "Advanced", "Select a language to match the language version of the non-Unicode programs you want to use:" to set your most often used character set. I made English US XP SP2 work with code page 1250 instead of 1252. This did not help in the case of Greek characters but fixed the problem with Polish characters. It seems that double-clicking on a file makes the soffice/Windows XP SP2 pair use non-Unicode mode for filenames. MS Word works correctly without such settings, with Polish and Greek characters together in filenames.

*** Issue 60181 has been marked as a duplicate of this issue. ***

Target set.
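The system code page explanation and the partial workaround described in the comments above can be checked with a quick sketch. It is only an illustration in Python: the sample names, the `CODE_PAGES` selection, and the helpers `representable` and `usable_code_pages` are all assumptions made for this example, and Python's codecs are used as an approximation of the Windows code pages with the same numbers.

```python
# Sample file names modelled on the ones reported in this thread.
NAMES = {
    "Polish": "wsą.doc",
    "Greek": "Σ.doc",
    "Polish + Greek": "wsąΣ.doc",
    "German umlaut": "ä.doc",
}

# Candidate values for the "language for non-Unicode programs" setting.
CODE_PAGES = {
    "cp1250": "Central European",
    "cp1251": "Cyrillic",
    "cp1252": "Western",
    "cp1253": "Greek",
}


def representable(name: str, code_page: str) -> bool:
    """Return True if every character of name exists in code_page."""
    try:
        name.encode(code_page)
    except UnicodeEncodeError:
        return False
    return True


def usable_code_pages(name: str) -> list[str]:
    """List the code pages under which name would survive unchanged."""
    return [cp for cp in CODE_PAGES if representable(name, cp)]


if __name__ == "__main__":
    for label, name in NAMES.items():
        pages = usable_code_pages(name) or ["none"]
        print(f"{label:15} {name!r}: {', '.join(pages)}")
```

With these sample names, switching to cp1250 rescues the Polish name but not the Greek one, the a-umlaut name is lost under the Cyrillic page, and no single code page covers Polish and Greek characters together. That is consistent with the reports here that only Unicode handling of the file name can fix every case.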
Was broken by the fix for issue 35209 that disabled all Unicode command line support. Increase Prio.

I'll fix this by adjusting the CRT_ENTRY macros and exporting a new osl_setCommandLineArgsW().

Why has the target milestone changed? This is a showstopper in a lot of countries. I can't migrate a lot of my customers due to this bug!

The target milestone change basically means that there are not enough votes, probably ;-). Technically it means that customers who have something to do with multi-codepage environments have to use either 1.1.5 or MS Office...

But I don't understand why this worked perfectly in 1.1.5 and is now broken in OOo 2.0. 2.0 has existed for almost a year now and this still hasn't been fixed. Just because there are not enough votes doesn't mean there are no users. Most try it, find it won't work, and uninstall. They don't even know about Bugzilla, or don't have sufficient English to post here. I want them to finally move from 1.1.5 to 2.0.x someday.

There's a PARTIAL WORKAROUND for this bug that helps in some cases (tested with ZH-TW). In Win XP, change the 'language for non-unicode programs' to match the character set of the filename you want to open. Restart. This setting can be found in Control Panel -> Regional & Language Settings -> Advanced tab -> choose the appropriate language from the first list box.

BTW, that dialog box also lets you choose which code page conversion tables to load. I don't know what effect those options have on this issue.

WARNING: changing the 'language for non-unicode programs' could cause other applications to stop working properly (most likely old apps, not mainstream stuff). If that happens, you can simply change it back to the original setting.

I agree with sgfan that this bug is a show stopper for some, and is probably under-reported. It's really harmful for new users' impressions of OO in certain environments.
OO appears to trash the very first file they try to save (actually they just can't re-open it by double-clicking on the icon). For inexperienced users, the bug makes it look like OO simply doesn't work.

*This bug was first reported almost one and a half years ago (as issue 52240). It's been reported separately at least six times - see the apparent duplicate, Issue 64764, which has another 6 votes to add to the 14 this issue has now.

This issue was also confirmed with a file name containing Slovak national characters ctšl on the following setups:

- WinXP Pro SP2 English + OpenOffice 2.04 English
- WinXP Home SP2 Slovak + OpenOffice 2.04 Slovak
- WinXP Home SP2 Slovak + OpenOffice 2.04 English

For others annoyed by this bug: the PARTIAL WORKAROUND proposed by taipeitech worked in my case; however, I would like to see OpenOffice fixed to support Unicode file names. Thank you.

OOo 2.1 WinXP SP2 WFM

OOo 2.1 WinXP SP2: doesn't work here with Russian chars.

Confirmed: the issue exists on version 2.1 on a Windows XP SP2 machine. However, an older build was also used to check the same bug, and it was not reproduced on build 2.0.3 on Windows XP. I tried different combinations of names in the following languages: English, Arabic, Urdu. It seems that the later version (2.1) on Windows XP crashes, whereas the earlier version (2.0.3) on Windows XP works with the workaround mentioned in the comments (i.e. enabling the language in Control Panel).

Changing target.

Hennes, are we really on target for 2.2.1? If necessary, I am ready to provide remote access to the system that shows the problematic behavior, so that a developer can do some diagnostics and investigation. Just drop me a message at kpalagin@openoffice.org and I will supply the necessary information. Thanks a lot for your attention.

I hope this will get fixed really soon. I still don't understand why this wasn't noticed during OOo 2.0 development, because the bug has existed since then.
On my machine this bug is way more dangerous than you describe: when I try to open a document with a Korean filename by double-clicking in Windows Explorer, it isn't just Open Office that crashes; WINDOWS ITSELF does the same. It freezes and I can't even summon Task Manager or shut Windows down. Twice, I could only get it back to work by removing the battery from my laptop. I'm too frightened to experiment any further.

My system information:

Windows XP SP2 Media Center Edition 2005 English
Default system and user language both set to Catalan
Language for non-Unicode programs set to Catalan
Open Office 2.0.3 Catalan

I don't want to switch the non-Unicode language to Korean because then all programs get downloaded and installed in Korean, whether they are written in Unicode or not. I don't want to use programs in Korean (my Korean is not that good); I just happen to need to handle DOCUMENTS in Korean.

This is very depressing because I have the same problem with Firefox 2.0.1 and Thunderbird 1.5.0.7 (https://bugzilla.mozilla.org/show_bug.cgi?id=235385). My friends' Korean documents lose their names, which become underscores, and I can't e-mail them my own Korean documents. Since these people don't know English, changing the name is not always an acceptable option.

Since I haven't lost faith in open source software, my workaround will be to speed up my migration to Linux. However, after this experience an average Asian user would just give up, thinking that all these programs are merely a hobby for Western geeks, especially given that even Notepad can handle foreign characters in my configuration. No wonder that Windows/IE hold a 99.8% market share in Korea! My own impression is that English is way too pervasive in the open source community, as if everybody used their mother tongue only for themselves and English for international communication. The needs of multilingual people like me are hardly considered.
And in spite of national localizations, when you're out on the Internet looking for help, help is available only in English.

*** Issue 75671 has been marked as a duplicate of this issue. ***

*** Issue 76039 has been marked as a duplicate of this issue. ***

On OO 2.2 I don't even get an error, just an hourglass and then nothing. My wife is _again_ clamoring for MS Word :-( I attached a trigger file that refuses to open (it's an empty OO document). I had to store it with 7-zip because the old version of Winzip I have on this machine refuses to understand the directory too!

Created attachment 44438 [details]
7-zip archive with ODT file that can't be opened in OO2.2
Created attachment 44485 [details]
Suggested patch, works for me
Added to CWS 2.2.1

The patch will work but will override osl_setCommandArgs functionality. But osl_setCommandArgs is deprecated and should not be used as an API. Will change the documentation.

Hoping to see this fixed in 2.2.1.

Just wanted to remind everyone that this bug has already existed for 1,5 years. Additionally, this bug did NOT exist in 1.1.x!

Changed in CWS hro15 to be integrated in 2.2.1.

*** Issue 69973 has been marked as a duplicate of this issue. ***

Assigned to QA.

Fixed but failed in cws hro15 -> splashscreen appears for a short moment but office doesn't start.

TM->HRO: as talked about, back to you.

Implemented Unicode functionality for system integration wrappers. Hand over to QA.

Checked and verified in cws hro15 -> OK !

*** Issue 51233 has been marked as a duplicate of this issue. ***

YESSS! Finally I've started migration of "my" companies from 1.1.5.

aehrlich, absolutely. After 1,5 years I will migrate my customers too. Took way too long!

Tested with 3.0 Beta on Windows XP. Seems to be OK. Closing issue.

This happens in OS X as well. Can't open file test_ç.doc.

OOo 3.0.1/3.1.0, MacOSX 10.4, 10.5. OOo cannot open files with national characters in the name. Example: Názov.doc

OOo 3.0.1, MacOSX 10.4: Save: Object does not exist, insufficient access rights to an object. Open: The file does not exist
OOo 3.1, MacOSX 10.4: Save, Open: The file does not exist
OOo 3.1, MacOSX 10.5: Save: Saved file has %XX codes instead of national chars
OOo 3.1, MacOSX 10.5: Open: after renaming file to Názov.doc, The file does not exist

This is a showstopper for me, as I need to work with national filenames, so I had to install a slow Neooffice yesterday :( I am able to run more tests (at least on MacOSX 10.4, which is my primary OS).

zemiak, this issue seems to describe the problem on Windows. Your problem is described by http://www.openoffice.org/issues/show_bug.cgi?id=69973. You may want to vote and post comments there.

Created attachment 73576