Issue 59251

Summary: Unable to open file if name contains some national characters
Product: General
Reporter: wszenajch <wojciech.szenajch>
Component: code
Assignee: thorsten.martens
Status: CLOSED FIXED
QA Contact: issues@framework <issues>
Severity: Trivial
Priority: P2
CC: aleks.ehrlich, issues, kpalagin, lars_o_hansen, nesshof, tml, www.openoffice.org, zemiak
Version: OOo 2.0
Keywords: oooqa
Target Milestone: OOo 2.2.1
Hardware: PC
OS: Windows XP
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---
Issue Depends on:
Issue Blocks: 51233
Attachments (Description, Flags):
7-zip archive with ODT file that can't be opened in OO2.2 (flags: none)
Suggested patch, works for me (flags: none)

Description wszenajch 2005-12-12 15:56:44 UTC
OO 2.0 English is unable to open a file (odt, doc tested) if its name contains
national characters other than those of the native MS Windows language version.
Tested with Windows XP SP2 EN and Polish versions.

To reproduce:
1. Create any Word-readable file (e.g. doc).
2. On an English version of MS Windows with the English version of OO 2.0, rename
this file, adding to the file name any Greek character except those present in
CP 1252 (e.g. use the 'Σ' character).
3. Try to open the file using OO 2.0. On my English Windows, OO displays an error
message saying that it is unable to find the file (the name shown contains a wrong
character in place of 'Σ'). On Polish Windows, OO 2.0 hung, visible only in
Task Manager.

Remarks:
1. The same effect appears if Greek character is in directory name.
2. OO 1.1 EN worked correctly on English XP SP2 with Polish characters in file
names.
3. OO 2.0 installed on English Windows failed to open names with Polish
characters, but OO 2.0 English installed on Polish version of Windows opened
such files correctly - this is why I have chosen Greek for testing.
4. I did not test OO 2.0 language versions other than English.
5. Linux version of OO 2.0 (included with SuSE 10 distro) opened correctly files
with names containing Polish and Greek characters (not all of them tested).
6. Reassign this issue if the component (l10n) and subcomponent are not correct.
Comment 1 lars 2005-12-12 18:55:24 UTC
confirmed on Windows XP Pro SP2 german with OOo 2.0.1 RC4 english
Comment 2 stefan.baltzer 2005-12-13 14:07:31 UTC
SBA: Worksforme on Windows2000 German. 

I tried these 13 directory names:

01_Western_Europe_ÄÖÜäöüßéôâëçøæå
02_Eastern_Europe_ČĢĴŅŇŘŸŜŁłżšýáíóńśąę
03_Russian_господин_Чени
04_Chinese_CN_民主党已经到手的总统宝座最
05_Chinese_TW_組建自己的律師團隊
06_Japanese_Katakanaドイツエイゴ
07_Japanese_Kanji_Hiragana_斎橋のな記録とはいえ
08_Korean_설립하여 선진기술을
09_Hebrew_יעךלויגכעןויגכע
10_Greek_ΣΑΡΤΙΣΗελλασ
11_Thai_ไมโครซิสเต็มส์
12_Hindi_जालपृष्ठवेसब
13_Arabic_ويسكنسن
Comment 3 stefan.baltzer 2005-12-13 14:24:45 UTC
SBA: Windows needs the respective Language support before it can deal with such
characters. 
To do so on WinXP (English):
Start - Control Panel - Regional Options
On tab page "Advanced", check the "code page conversion tables" of the
language/characters you want to toy with.

Please comment.
Comment 4 wszenajch 2005-12-13 14:25:39 UTC
Please do not resolve problem reported for MS Windows XP SP2 after testing with
Windows 2000! You have separate 'OS' values for other Windows versions.
Microsoft has NEVER been fully compatible with its own products.

Report concerned Windows XP SP2 EN, PL and was confirmed for DE. So, please test
it with XP SP2 platform.

Windows XP SP2 is the main platform now for Office applications. Windows 2000
should be used for compatibility testing only, not for development or main tests.

If OOo 2.0 was developed and mostly tested on MS Windows 2000 I am not surprised
that it has this kind of bug.
Comment 5 wszenajch 2005-12-13 14:32:56 UTC
Comment to BA: "Windows needs the respective Language support before it can deal
with such characters".

Greek is included in my configuration.

I gave a doc file as an example, because MS Word CAN OPEN such a file without any
problems (with this 'Σ' character). The same goes for Notepad and txt files.
So, language support is not the issue.
Probably Microsoft changed something in fopen/open procedures in XP or XP SP2.
Comment 6 stefan.baltzer 2005-12-14 17:34:16 UTC
SBA: Worksforme on WindowsXP English SP2.

There are different "Sigma" characters. I found three (had a look via
Insert - Special Character and selected the font "Arial Unicode MS", which contains
almost all characters from this planet):
Ʃ = x01A9 (Font subset "Latin Extended-B")
Σ = x03A3 (Font subset "Basic Greek")
∑ = x2211 (Font subset "Mathematical Operators")
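
The three look-alikes can be told apart programmatically. The following is a minimal Python sketch, not part of OOo; the helper names `encodable` and `describe` are hypothetical, and Python's `cp1252`/`cp1253` codecs are used as stand-ins for the Windows ANSI code pages, on the assumption that they match closely enough for this purpose.

```python
import unicodedata

# The three "Sigma" look-alikes listed above, written as escapes.
SIGMAS = ["\u01A9", "\u03A3", "\u2211"]

# Python codec names standing in for the Windows ANSI code pages
# (1252 = Western European, 1253 = Greek).
CODE_PAGES = ["cp1252", "cp1253"]


def encodable(char: str, codec: str) -> bool:
    """Return True if `char` has a byte representation in `codec`."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def describe(char: str) -> dict:
    """Collect code point, Unicode name and code page coverage for one character."""
    return {
        "code_point": f"U+{ord(char):04X}",
        "name": unicodedata.name(char),
        "code_pages": [cp for cp in CODE_PAGES if encodable(char, cp)],
    }


for sigma in SIGMAS:
    # Only ASCII is printed, so this also works on consoles without Unicode support.
    print(describe(sigma))
```

In this sketch only U+03A3 (the Greek capital sigma) is covered by `cp1253`, and none of the three is covered by `cp1252`.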

"Greek is included in my configuration" is a little vague...
On my WinXP, the list in "Regional and Language Options" offers these:
 - 1006 (MAC - Greek I)
 - 1253 (ANSI - Greek)
 - 20423 (IBM EBCDIC Greek)
 - 28597 (ISO 8859-7 Greek)
 - 737 (OEM - Greek 437G)
 - 869 (OEM - Modern Greek)
 - 875 (IBM EBCDIC Modern Greek)

I'd recommend to always check "more charsets than needed" (i.e. UTF-8)
Please comment.
Comment 7 stefan.baltzer 2005-12-14 17:38:56 UTC
SBA: Component set to "Framework". Prio set to P3. There is no crash or data
loss plus a workaround (remove uncool characters from filename).
Comment 8 wszenajch 2005-12-15 09:01:48 UTC
Comment to: "I'd recommend to always check "more charsets than needed" (i.e. UTF-8)"
I added all additional conversion tables (even including EBCDIC and A5) except
Asian ones. There are small changes in OO 2.0 on XP SP2 PL - it behaves now in
the same way as reported for XP SP2 EN with OO 2.0. But also Polish characters
on XP SP2 PL are not accepted by OO 2.0. Maybe they were never accepted, see
additional results below:

There are TWO DIFFERENT open file procedures used by OO 2.0:

1. When you start OO, use File-->Open, and choose the file name - this WORKS OK
for ALL characters I reported.

2. When you select the file in Windows Explorer, right-click it,
and select Open with -->OO 2.0 - OO 2.0 FAILS with an error message. The same
happens if you assign the "DOC" extension to OO 2.0 and double-click the file.

MS Word 2002 SP2 always opens this file correctly using both methods - this
means that there is nothing to be changed in the Windows configuration concerning
languages.

I made the tests with XP SP2 PL. If the behaviour of OO with XP SP2 EN is different
I will update this info tomorrow.

I remember that I experienced such problem in the old times with the different
software - caused by the fact that there may be (or must be) two different
procedures of opening file depending on the way you are doing it (stupid - but
this is Windows world). I do not remember details (this could be even with
Windows 3.11).
Comment 9 wszenajch 2005-12-15 09:18:05 UTC
Maybe, in case of double clicking on file in Windows Explorer, its name is
passed as parameter to soffice.exe? Name reported in Error window looks the same
as shown by dir command in cmd window. But Word manages somehow to find the name
correctly.
Comment 10 wszenajch 2005-12-15 13:07:26 UTC
After switching between several locales and returning back to Polish I managed
to restore successful loading of a file with Polish characters in its name on my XP
SP2 PL. My cmd window started to show Polish letters in filenames again.

I also observed the following when OO is called from command prompt with
filename as parameter:

I created c:\xxxxx directory and file wsą.doc inside. There is no wsx.doc file.

In cmd window:
cd C:\Program Files\OpenOffice.org 2.0\program
soffice.exe c:\xxxxx\wsą.doc      - opens file correctly
soffice.exe c:\xxxxx\wsx.doc      - returns error message
soffice.exe \xxxxx\wsx.doc        - DOES NOT return error message although I am
on drive c: (soffice is not even present in task manager).
soffice.exe wsx.doc               - returns error message adding C:\Program
Files\OpenOffice.org 2.0\program to the name of file.
soffice.exe c:\xxxxx\ws*.doc        - DOES NOT open the file (in spite of
matching template name) and does not return error message 

Now the Word 2002-SP2 for comparison:
cd C:\Program Files\Microsoft Office\Office10
winword.exe c:\xxxxx\wsą.doc       - opens file correctly
winword.exe c:\xxxxx\wsx.doc       - returns error msg. (invalid document name)
winword.exe \xxxxx\wsx.doc         - returns error msg. (invalid document name)
winword.exe wsx.doc                - returns error msg. (invalid document name)
winword.exe c:\xxxxx\ws*.doc       - opens file correctly using template
winword.exe \xxxxx\ws*.doc         - opens file correctly using template

I renamed wsą.doc to wsąΣ.doc in c:\xxxxx directory. Command dir displayed it as
wsą?.doc. Entering 'Σ' by pasting it into the cmd window gave a '?' character.
So, I used:
winword.exe \xxxxx\ws*.doc         - opens file correctly using template

Summary:
1. OO 2.0 behaves differently than MS Word 2002 SP2 when called from cmd
command line on XP SP2 PL. MS Word always behaves as expected. OO 2.0 fails in
three cases.
2. It was not possible for me to enter 'Σ' in the cmd window, which is not Unicode I
suppose. Maybe OO 2.0 also uses a single-byte character set (default for the XP
language version) when obtaining the file name passed by Windows Explorer?
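
On the wildcard cases above: cmd.exe hands a pattern such as ws*.doc to the program unexpanded, so a program that wants to open the matching files has to expand the pattern itself. The following Python sketch shows one way this could be done; `expand_arguments` is a hypothetical helper and says nothing about how Word or soffice actually handle their arguments.

```python
import glob
import os
import tempfile


def expand_arguments(args):
    """Expand wildcard patterns in `args`.

    Arguments that match no existing file are passed through unchanged,
    so the caller can still report "file not found" for them.
    """
    expanded = []
    for arg in args:
        matches = sorted(glob.glob(arg))
        expanded.extend(matches if matches else [arg])
    return expanded


with tempfile.TemporaryDirectory() as folder:
    # Same file name as in the test above: "ws" + a-ogonek + Sigma + ".doc".
    target = os.path.join(folder, "ws\u0105\u03A3.doc")
    open(target, "w").close()

    # Escape the directory part so only "ws*.doc" is treated as a pattern.
    pattern = os.path.join(glob.escape(folder), "ws*.doc")
    result = expand_arguments([pattern])
    print(result == [target])
```

Because the pattern itself is plain ASCII, this route sidesteps the problem of typing 'Σ' into a console that cannot represent it.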
Comment 11 jack.warchold 2005-12-22 13:55:00 UTC
Reassigned to owner of subcomponent
Comment 12 Olaf Felka 2006-03-16 10:37:16 UTC
*** Issue 62531 has been marked as a duplicate of this issue. ***
Comment 13 Olaf Felka 2006-03-16 10:39:33 UTC
*** Issue 59118 has been marked as a duplicate of this issue. ***
Comment 14 Olaf Felka 2006-03-17 10:32:38 UTC
@hro: I can confirm that I can't open files with certain national characters
(tried with Korean characters). But I've also failed to open pdf files with
Acrobat Reader. So this may be related to the system configuration and not to OOo?
Comment 15 wszenajch 2006-03-17 10:57:01 UTC
I confirm that Acrobat Reader 7.0.7 is unable to open a PDF file whose name
contains the 'Σ' character, on the same testing configuration as described above
in the problem report.

But MS Word 2002 is able to open it WITHOUT any changes of Windows XP SP2
configuration. The same is true even for notepad which is able to open TXT file
with 'Σ' character in file name. So, it can be done. Maybe there is new API
extension for doing this on XP?
Comment 16 aehrlich 2006-03-21 16:15:47 UTC
<quote>
SBA: Windows needs the respective Language support before it can deal with such
characters. 
To do so on WinXP (English):
Start - Control Panel - Regional Options
On tab page "Advanced", check the "code page conversion tables" of the
language/characters you want to toy with.
</quote>
Does not help on WinXP Pro SP2.

I confirm the issue -- cannot open file _via_double-clicking_ on WinXP Pro SP2
with "language to match the language version of the non-Unicode programs" set to
Russian and file names containing characters out of ASCII127+Cyrillic (e.g. ä --
a-umlaut).
There is no problem to open such files from File|Open menu of OOo.
I also confirm OOo 1.x worked OK with such filenames.

Issue 51233 is a duplicate of this issue.

Issue 60181 is a duplicate of this issue.
Comment 17 hennes.rohling 2006-04-18 17:26:08 UTC
To clarify: It has nothing to do with the language of OOo but with the native
language of the Windows system. That means an English Windows XP always
uses CP 1252 as the system code page (regardless of installed code pages or
languages). There are some code fragments that use the system code page to
translate Unicode file names.
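
The effect of such a translation can be illustrated with a small Python sketch. It is not OOo code: `through_system_code_page` and `survives` are hypothetical helpers, and Python's `cp1252` codec with the `replace` error handler is only an approximation of what Windows does (Windows may substitute a different character than '?').

```python
SYSTEM_CODE_PAGE = "cp1252"  # assumption: English Windows XP, as stated above


def through_system_code_page(path: str, codec: str = SYSTEM_CODE_PAGE) -> str:
    """Simulate converting a Unicode path to the system code page and back.

    Characters missing from the code page come back as '?', which is the
    behaviour of Python's 'replace' error handler.
    """
    return path.encode(codec, errors="replace").decode(codec)


def survives(path: str, codec: str = SYSTEM_CODE_PAGE) -> bool:
    """Return True if the path is unchanged after the round trip."""
    return through_system_code_page(path, codec) == path


# Path from the earlier test: "ws" + a-ogonek + Sigma + ".doc".
original = "c:\\xxxxx\\ws\u0105\u03A3.doc"

# ascii() keeps the output printable on any console.
print(ascii(through_system_code_page(original)))
print(survives(original))
```

Once the name has lost characters in this way, it no longer matches the file on disk, which is consistent with the "unable to find the file" error reported above.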

@sba: Please verify whether double-clicking on such a file (containing characters
in the file path that are not available in the system code page) always fails (that
would mean we have a generic problem in the framework component) or whether the
problem only occurs if an soffice process is already running, e.g. Quickstarter
(that would mean the pipe communication passing the parameters to a running
process is broken).
Comment 18 wszenajch 2006-04-19 08:45:34 UTC
I am not using Quickstarter and problem exists.

There is workaround possible for limited set of characters:
Use: Start - Control Panel - Regional Options
Tab page "Advanced" - "Select a language to match the language version of the
non-Unicode programs you want to use:" - to set your most often used characterset.
I made English US XP SP2 working with codepage 1250 instead of 1252. This did
not help in case of Greek characters but fixed problem of Polish characters.

It seems that double-clicking on a file makes the soffice-Windows XP SP2 pair use
non-Unicode mode for filenames. MS Word works correctly without such settings with
Polish and Greek characters together in filenames.
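
Why the workaround only helps for a limited set of characters can be shown with a short Python sketch. The helper `covering_code_pages` is hypothetical, and the Python codecs are assumed to be close enough to the Windows code pages of the same number.

```python
# Candidate ANSI code pages, keyed by Python codec name.
CANDIDATES = {
    "cp1250": "Central European",
    "cp1251": "Cyrillic",
    "cp1252": "Western European",
    "cp1253": "Greek",
}


def covering_code_pages(filename: str, candidates=CANDIDATES) -> list:
    """Return the code pages that can represent every character of `filename`."""
    covering = []
    for codec in candidates:
        try:
            filename.encode(codec)
        except UnicodeEncodeError:
            continue  # at least one character is missing from this code page
        covering.append(codec)
    return covering


print(covering_code_pages("ws\u0105.doc"))        # Polish a-ogonek
print(covering_code_pages("ws\u03A3.doc"))        # Greek Sigma
print(covering_code_pages("ws\u0105\u03A3.doc"))  # both together
```

In this sketch the Polish name is covered only by `cp1250`, the Greek name only by `cp1253`, and the name mixing both by none of the candidates, so no choice of "language for non-Unicode programs" can help for the mixed case.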
Comment 19 hennes.rohling 2006-04-19 11:54:15 UTC
*** Issue 60181 has been marked as a duplicate of this issue. ***
Comment 20 hennes.rohling 2006-04-19 12:03:40 UTC
Target set.
Comment 21 hennes.rohling 2006-04-27 16:02:14 UTC
Was broken by the fix for issue 35209 that disabled all unicode command line
support.

Increase Prio.

Comment 22 hennes.rohling 2006-04-27 16:20:08 UTC
.
Comment 23 hennes.rohling 2006-04-27 16:33:02 UTC
I'll fix this by adjusting the CRT_ENTRY macros and export a new
osl_setCommandLineArgsW().
Comment 24 hennes.rohling 2006-05-19 15:36:21 UTC
.
Comment 25 hennes.rohling 2006-08-07 10:36:36 UTC
.
Comment 26 Michael Osipov 2006-08-07 12:33:01 UTC
Why has the target milestone changed?
this is a showstopper in a lot of countries.
I can't migrate a lot of my customers due to this bug!
Comment 27 aehrlich 2006-08-07 15:59:00 UTC
The target milestone change does basically mean that there are not enough votes,
probably ;-).
Technically it means that customers that have something to do with
multi-codepage environments have to either use 1.1.5 or MS Office...
Comment 28 Michael Osipov 2006-08-07 16:20:19 UTC
But I don't understand why this worked perfectly in 1.1.5 and is now
broken in OOo 2.0.

2.0 has existed for almost a year now and this still hasn't been fixed.
Just because there are not enough votes doesn't mean there are no users.
Most try it, it won't work, they uninstall. They don't even know about bugzilla or
have sufficient English to post here.

I want them to move finally from 1.1.5 to 2.0.x someday.
Comment 29 taipeitech 2006-12-01 06:16:16 UTC
There's a PARTIAL WORKAROUND for this bug that helps in some cases (tested with
ZH-TW). 

In Win XP, change the 'language for non-unicode programs' to match the character
set of the filename you want to open. Restart.

This setting can be found in Control Panel -> Regional & Language Settings ->
Advanced tab -> choose the appropriate language from the first list box

BTW, that dialog box also lets you choose which code page conversion tables to
load. I don't know what effect those options have on this issue.


WARNING: changing the 'language for non-unicode programs' could cause other
applications to stop working properly (most likely old apps, not mainstream
stuff). If that happens, you can simply change it back to the original setting.



I agree with sgfan that this bug is a show stopper for some, and is probably
under-reported. It's really harmful for new users' impressions of OO in certain
environments. OO appears to trash the very first file they try to save (actually
they just can't re-open it by double-clicking on the icon). For inexperienced
users, the bug makes it look like OO simply doesn't work.

*This bug was first reported almost one and a half years ago (as issue 52240).
It's been reported separately at least six times - see the apparent duplicate,
Issue 64764, which has another 6 votes to add to the 14 this issue has now.


Comment 30 etko 2006-12-08 10:03:50 UTC
This issue was also confirmed with file names containing Slovak national characters
ctšl on the following setups:

- WinXP Pro SP2 English + OpenOffice 2.04 English 
- WinXP Home SP2 Slovak + OpenOffice 2.04 Slovak 
- WinXP Home SP2 Slovak + OpenOffice 2.04 English
Comment 31 etko 2006-12-08 10:12:07 UTC
For others being annoyed by this bug, PARTIAL WORKAROUND proposed by taipeitech
worked in my case, however I would like to see OpenOffice fixed to support
Unicode file names. Thank you.
Comment 32 ivanii 2007-01-23 16:37:49 UTC
OOo 2.1 WinXP SP2
WFM
Comment 33 Michael Osipov 2007-01-23 16:42:54 UTC
OOo 2.1 WinXP SP2
Doesn't work here w/ russian chars
Comment 34 haleem 2007-02-02 18:53:13 UTC
Confirmed issue exists on version 2.1 using a Windows XP SP2 machine. 

However, an older build was also used to check the same bug, and it was not 
reproduced on build 2.0.3 using Windows XP.
Comment 35 haleem 2007-02-02 18:55:48 UTC
I tried different combinations of names for the following languages: English,
Arabic, Urdu.

It seems that the later version (2.1) on Windows XP crashes, however the
earlier version (2.0.3) on Windows XP works with the workaround mentioned in
the comments (i.e. enabling the language in Control Panel).

Comment 36 hennes.rohling 2007-02-05 13:56:07 UTC
Changing target.
Comment 37 kpalagin 2007-02-10 11:15:32 UTC
Hennes,
are we really on target for 2.2.1?
If necessary, I am ready to provide remote access to the system that shows the
problematic behavior, so that a developer can do some diagnostics and
investigation. Just drop me a message at kpalagin@openoffice.org and I will
supply the necessary information.
Thanks a lot for your attention.
Comment 38 Michael Osipov 2007-02-10 14:07:00 UTC
I hope this will get fixed really soon. I still don't understand why this wasn't
noticed during OOo 2.0 development, because the bug has existed since then.
Comment 39 seogwang 2007-02-21 14:22:26 UTC
On my machine this bug is way more dangerous than you describe: when I try to
open some document with a Korean filename by double-clicking in Windows Explorer,
it isn't just Open Office that crashes, WINDOWS ITSELF does the same; it freezes
and I can't even summon Task Manager or shut Windows down. Twice I had to
remove the battery from my laptop in order to get it back to work. I'm
too frightened to make any further experiments.
My system information:
Windows XP SP2 Media Center Edition 2005 English
Default system and user language both set to Catalan
Language for non-Unicode programs set to Catalan
Open Office 2.0.3 Catalan
I don't want to switch the non-Unicode language to Korean because then all
programs get downloaded and installed in Korean, be they written in Unicode or
not. I don't want to use programs in Korean (my Korean is not that good), I just
happen to need handling DOCUMENTS in Korean.
This is very depressing because I have the same problem on Firefox 2.0.1 and
Thunderbird 1.5.0.7. (https://bugzilla.mozilla.org/show_bug.cgi?id=235385). My
friends' Korean documents lose their names and become underscores, and I can't
e-mail them my own Korean documents. Since these people don't know English,
changing the name is not always an acceptable option.
Since I haven't lost faith in open source software, my workaround will be
speeding up my migration to Linux. However, after this experience an average
Asian would just give up thinking that all these programs are but a hobby for
Western geeks, especially given that even Notepad can handle foreign characters
in my configuration. No wonder now that Windows/IE hold a 99.8% market share in
Korea!
My own impression is that English is way too pervasive in the open source
community, as if everybody was only using their mother tongue for themselves and
English for international communication. The needs of multilingual people like
me are hardly considered. And in spite of national localizations, when you're
out in the Internet looking for help, help is available only in English.
Comment 40 andreschnabel 2007-03-23 21:37:35 UTC
*** Issue 75671 has been marked as a duplicate of this issue. ***
Comment 41 kpalagin 2007-04-07 07:49:41 UTC
*** Issue 76039 has been marked as a duplicate of this issue. ***
Comment 42 filipg 2007-04-16 01:06:36 UTC
On OO 2.2 I don't even get an error, just an hourglass and then nothing. My wife
is _again_ clamoring for MS Word :-( I attached a trigger file that refuses to
open (it's an empty OO document). I had to store it with 7-zip because the old
version of Winzip I have on this machine refuses to understand the directory too!
Comment 43 filipg 2007-04-16 01:16:45 UTC
Created attachment 44438
7-zip archive with ODT file that can't be opened in OO2.2
Comment 44 tml 2007-04-17 15:52:36 UTC
Created attachment 44485
Suggested patch, works for me
Comment 45 hennes.rohling 2007-04-24 08:17:54 UTC
Added to CWS 2.2.1

Patch will work but will override osl_setCommandArgs functionality. But
osl_setCommandArgs is deprecated and should not be used as an API.

Will change the documentation.
Comment 46 Michael Osipov 2007-04-24 17:25:05 UTC
Hoping to see this fixed in 2.2.1.
Just wanted to remind you that this bug has already existed for 1.5 years.
Additionally, this bug did NOT exist in 1.1.x!
Comment 47 hennes.rohling 2007-04-26 12:18:39 UTC
Changed in CWS hro15 to be integrated in 2.2.1
Comment 48 kpalagin 2007-04-26 19:27:57 UTC
*** Issue 69973 has been marked as a duplicate of this issue. ***
Comment 49 hennes.rohling 2007-04-27 12:03:29 UTC
Assigned to QA
Comment 50 thorsten.martens 2007-05-03 13:36:49 UTC
Fixed but failed in cws hro15 -> splashscreen appears for a short moment but
office doesn't start.
Comment 51 thorsten.martens 2007-05-03 13:38:25 UTC
TM->HRO: as talked about, back to you.
Comment 52 hennes.rohling 2007-05-07 11:18:28 UTC
Implemented Unicode functionality for system integration wrappers.
Comment 53 hennes.rohling 2007-05-07 11:21:00 UTC
Hand over to QA
Comment 54 thorsten.martens 2007-05-07 11:24:30 UTC
Checked and verified in cws hro15 -> OK !
Comment 55 hennes.rohling 2007-05-07 13:12:28 UTC
*** Issue 51233 has been marked as a duplicate of this issue. ***
Comment 56 aehrlich 2007-07-03 18:14:40 UTC
YESSS! Finally I've started migration of "my" companies from 1.1.5.
Comment 57 Michael Osipov 2007-07-03 18:24:52 UTC
aehrlich,

absolutely. after 1,5 years I will migrate my customers too.
took way too long!
Comment 58 Raphael Bircher 2008-06-18 09:30:10 UTC
Tested with 3.0 Beta on Windows XP. Seems to be OK.

Closing issue
Comment 59 pedroestarque 2009-01-25 22:38:18 UTC
This happens in OS X as well
Can't open file test_ç.doc
Comment 60 zemiak 2009-06-02 07:10:38 UTC
OOo 3.0.1/3.1.0, MacOSX 10.4, 10.5. OOo cannot open files with national characters
in their names. Example: Názov.doc

OOo 3.0.1, MacOSX 10.4: Save: Object does not exist, insufficient access rights 
to an object. Open: The file does not exist
OOo 3.1, MacOSX 10.4: Save, Open: The file does not exist
OOo 3.1, MacOSX 10.5: Save: Saved file has %XX codes instead of national chars
OOo 3.1, MacOSX 10.5: Open: after renaming file to Názov.doc, The file does not 
exist
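
The "%XX codes" look like percent-encoding of the UTF-8 bytes of the name, as used in URLs. Whether that is what actually happened here is an assumption; the report does not say. A minimal Python sketch of that encoding, with hypothetical helper names:

```python
from urllib.parse import quote, unquote


def to_percent_encoded(filename: str) -> str:
    """Percent-encode a file name the way a URL path segment would be encoded."""
    # quote() encodes to UTF-8 by default; safe="" means '/' is encoded too.
    return quote(filename, safe="")


def from_percent_encoded(text: str) -> str:
    """Reverse the encoding, decoding the bytes as UTF-8."""
    return unquote(text)


encoded = to_percent_encoded("N\u00e1zov.doc")  # the example name from above
print(encoded)  # N%C3%A1zov.doc
print(from_percent_encoded(encoded) == "N\u00e1zov.doc")
```

If a name is saved in the encoded form without being decoded again, the file on disk ends up with the %XX sequences in place of the national characters.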

This is a showstopper for me, as I need to work with national filenames, so I 
had to install a slow Neooffice yesterday :(

I am able to run further tests (at least on MacOSX 10.4, which is my primary OS)
Comment 61 kpalagin 2009-06-02 07:23:38 UTC
zemiak,
this issue seems to describe problem on Windows.
Your problem is described by
http://www.openoffice.org/issues/show_bug.cgi?id=69973.
You may want to vote and post comments in there.
Comment 62 grehtietalders 2010-11-10 17:45:12 UTC
Created attachment 73576