Issue 85764 - Filenames containing accented characters do not open
Summary: Filenames containing accented characters do not open
Status: CONFIRMED
Alias: None
Product: General
Classification: Code
Component: code
Version: OOH680m5
Hardware: All All
Importance: P3 Trivial
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-01-31 10:56 UTC by kendy
Modified: 2017-05-20 11:35 UTC
CC List: 2 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
The test files. (4.52 KB, application/octet-stream)
2008-01-31 10:57 UTC, kendy

Description kendy 2008-01-31 10:56:19 UTC
When you copy a file whose name contains accented characters from Windows (e.g.
cp-1251) to a Linux system that uses a UTF-8 locale, and the filename is not
converted, OOo refuses to open the file.  This happens quite easily, e.g. when
you receive a zip archive from a Windows user, or scp the file from a Windows
machine to Linux.  Other applications handle it OK (even though the file name
does not display correctly); OOo has problems because it internally converts
the paths to URLs.

Any ideas how to solve this, please?  I'll attach a test ZIP file containing
the problematic file.  If you are on a Linux system with a UTF-8 locale, unzip
it, and you'll see that you are unable to open the file named "XP encoded file
ààà.doc".
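
For reference, a minimal Python sketch of why such a name is a problem under a UTF-8 locale. It assumes each accented character is stored on disk as the single byte 0xE0, as a Windows single-byte code page would write it; the bytes in the attached ZIP were not inspected here, and `is_valid_utf8` is a helper made up for this illustration.

```python
# Assumed on-disk bytes of the file name: each accented character is the
# single byte 0xE0, as written under a Windows single-byte code page.
raw_name = b"XP encoded file \xe0\xe0\xe0.doc"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


name_is_valid = is_valid_utf8(raw_name)           # the unconverted name
ascii_is_valid = is_valid_utf8(b"plain name.doc")  # a pure ASCII name
print(name_is_valid, ascii_is_valid)
```

Any code that insists on interpreting the name in the locale encoding has nothing sensible to produce for these bytes, while code that passes the bytes through untouched is unaffected.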
Comment 1 kendy 2008-01-31 10:57:31 UTC
Created attachment 51287
The test files.
Comment 2 Mathias_Bauer 2010-05-17 14:20:48 UTC
This issue slipped my attention; sorry for that.

This is a general problem. Our API (UNO as well as C++ in sal) assumes that file
names are given as Strings. A String can be either a URL (always encoded in
UTF-8) or a system file path (always assumed to have the system encoding).

Though this assumption is very limiting, as your example shows, it could work if
we didn't touch file names. But we convert between URLs and system file paths
back and forth in several places, and the same assumptions about encodings are
applied each time.
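
The effect of such a round trip can be sketched in Python. This is not the sal code: `path_to_url` and `url_to_path` are hypothetical stand-ins, the path is made up, and replacing undecodable bytes with U+FFFD is an assumption (a strict conversion would fail outright instead of producing a wrong name).

```python
from urllib.parse import quote, unquote_to_bytes

# Hypothetical path; the 0xE0 bytes are not valid UTF-8.
raw_path = b"/tmp/XP encoded file \xe0\xe0\xe0.doc"


def path_to_url(path: bytes, system_encoding: str = "utf-8") -> str:
    # Assume the system encoding; undecodable bytes become U+FFFD.
    text = path.decode(system_encoding, errors="replace")
    # URLs carry the UTF-8 form of the text, percent-encoded.
    return "file://" + quote(text)


def url_to_path(url: str, system_encoding: str = "utf-8") -> bytes:
    # Undo the percent-encoding, read the result as UTF-8 text,
    # then re-encode that text in the system encoding.
    text = unquote_to_bytes(url[len("file://"):]).decode("utf-8")
    return text.encode(system_encoding)


round_tripped = url_to_path(path_to_url(raw_path))
survives = round_tripped == raw_path

# A name that really is UTF-8 goes through the same functions unharmed.
good_path = "/tmp/ààà.doc".encode("utf-8")
good_survives = url_to_path(path_to_url(good_path)) == good_path
print(survives, good_survives)
```

After one round trip the bytes no longer match any file on disk, so the open fails even though the file is there.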

A fix would require changing the sal file API to work with types that somehow
preserve the original file system path notation, and also rethinking our
handling of URLs and file paths in general.
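
One possible shape for such a type, sketched in Python rather than C++. `RawPath` and its methods are hypothetical names invented for this illustration, not part of sal or UNO. The idea is to store the file system bytes untouched, percent-encode them directly when a URL is needed, and decode them only for display.

```python
from dataclasses import dataclass
from urllib.parse import quote_from_bytes, unquote_to_bytes


@dataclass(frozen=True)
class RawPath:
    """Hypothetical path type that keeps the file system bytes as they are."""

    raw: bytes

    def to_url(self) -> str:
        # Percent-encode the bytes directly; no text encoding is assumed.
        return "file://" + quote_from_bytes(self.raw)

    @classmethod
    def from_url(cls, url: str) -> "RawPath":
        # Reverse of to_url: percent-decoding yields the original bytes.
        return cls(unquote_to_bytes(url[len("file://"):]))

    def display(self) -> str:
        # For the UI only: undecodable bytes show as U+FFFD, while the
        # stored bytes stay unchanged.
        return self.raw.decode("utf-8", errors="replace")


original = RawPath(b"/tmp/XP encoded file \xe0\xe0\xe0.doc")
url = original.to_url()
restored = RawPath.from_url(url)
print(url, restored == original)
```

The trade-off is that a URL built this way no longer percent-decodes to valid UTF-8, so every consumer that assumes UTF-8 URLs would have to be reviewed as well.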

There already is an issue for this: issue 66973

If you agree that your problem is as I described it, we can close this as a
duplicate.
Comment 3 Marcus 2017-05-20 11:35:05 UTC
Reset assignee to the default "issues@openoffice.apache.org".