Issue 51525 - Support legacy encodings when opening Microsoft Office documents (Ability to select code page on import of non-Unicode text/document)
Status: CONFIRMED
Alias: None
Product: General
Classification: Code
Component: code
Version: 680m109
Hardware: PC All
Importance: P3 Trivial
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords: ms_interoperability, oooqa, rfe_eval_ok
Depends on:
Blocks:
 
Reported: 2005-07-04 10:02 UTC by pmike
Modified: 2013-08-07 15:31 UTC
CC List: 1 user

See Also:
Issue Type: FEATURE
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Excel 5.0 doc with CP-1251 chars (132.00 KB, application/vnd.ms-excel)
2005-07-07 13:58 UTC, pmike
OO Calc doc with _expected_ result (45.20 KB, application/vnd.oasis.opendocument.spreadsheet)
2005-07-07 13:59 UTC, pmike

Description pmike 2005-07-04 10:02:28 UTC
When importing text in a "native" encoding with no information about the valid
code page, OO defaults to the ANSI code page. Plain text is a good example. This
leads to corruption of any non-ANSI chars.
A dialog for manual selection of the code page on import would help a lot.
Also, some "invalid" MSWord docs might be Unicode but contain embedded objects
with non-Unicode text. So, recoding ANSI->Unicode->Native for a selected
text range or selected objects would also be very useful.
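
A minimal Python sketch of the corruption and of the requested recoding, assuming the importer picked CP-1252 as the "ANSI" code page while the bytes were really CP-1251. The sample string and the helper name `recode` are illustrative only and are not taken from OOo code.

```python
def recode(text: str, assumed: str, actual: str) -> str:
    """Undo a wrong decode: re-encode with the code page that was
    wrongly assumed, then decode with the code page actually used."""
    return text.encode(assumed).decode(actual)


original = "Прайс-лист"            # Russian for "price list" (sample data)
raw = original.encode("cp1251")    # bytes as stored in the legacy file
garbled = raw.decode("cp1252")     # what a wrong ANSI default produces

# Recoding the selected text restores the intended characters.
repaired = recode(garbled, assumed="cp1252", actual="cp1251")
assert repaired == original
```

The round trip is lossless here because every byte of the sample is defined in both code pages; that is not guaranteed for arbitrary input.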
Comment 1 mci 2005-07-05 13:20:26 UTC
Hi pmike,

thanks for using and supporting OpenOffice.org...

I think this would be a nice feature...

reassigned to requirements
Comment 2 lohmaier 2005-07-05 21:43:32 UTC
already present since the very beginning.

Use the filetype "Text (encoded)" and you will be prompted with a dialog to
choose the charset.

(And it is wrong that OOo defaults to ANSI)

Regarding the "invalid doc files": File a separate issue and attach one of those.
Comment 3 lohmaier 2005-07-05 21:44:00 UTC
closing issue.
Comment 4 pmike 2005-07-07 12:51:59 UTC
Well, text files can be imported with required code page (albeit manually).
However, the main problem is lack of ability to specify code page for
non-unicode documents like MSWord 5.0/6.0/7.0 docs as well as old Excel worksheets.
So, I think this issue was closed too early, and I am reopening it.

Regarding import of plain text files. I suppose it's much better to get rid of
"Text (encoded)" option and always offer to select code page. The only case to
skip the encoding dialog is when the text file has only 7-bit ASCII chars (all
codes at or below 0x7f, very easy to check).
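
The ASCII check mentioned above could look like the following Python sketch. The function name `needs_encoding_prompt` is hypothetical, and the sketch assumes a single-byte legacy encoding; it does not try to recognize UTF-16 or other multi-byte formats.

```python
def needs_encoding_prompt(data: bytes) -> bool:
    """Return True if any byte lies outside 7-bit ASCII (0x80 or above),
    meaning the code page matters and the user should be asked."""
    return any(byte >= 0x80 for byte in data)


ascii_only = b"Price list, 2005"
cyrillic = "цена".encode("cp1251")   # sample non-ASCII content

assert not needs_encoding_prompt(ascii_only)   # dialog can be skipped
assert needs_encoding_prompt(cyrillic)         # dialog should be shown
```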
Comment 5 lohmaier 2005-07-07 13:21:50 UTC
> Well, text files can be imported with required code page (albeit manually).

There is no way other than to manually choose the codepage if the one from the
locale doesn't match.

> However, the main problem is lack of ability to specify code page for
> non-unicode documents like MSWord 5.0/6.0/7.0 docs as well as old Excel 
> worksheets.

Please provide sample documents.

> Regarding import of plain text files. I suppose it's much better to get rid of
> "Text (encoded)" option and always offer to select code page. [...]

No. This will pi** off users that use plaintext a lot and are not English speakers. 
Comment 6 pmike 2005-07-07 13:57:35 UTC
> There is no way other than to manually choose the codepage if the one from the
> locale doesn't match.
> No. This will pi** off users that use plaintext a lot and are not English
> speakers. 

MS Office is (sometimes) able to choose the code page correctly, but it still
offers a dialog to confirm its automatic selection. Might the active user locale
help?
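
One way to picture the locale idea is the Python sketch below: preselect a code page from the user's locale and still let the user confirm it. The table `LEGACY_CODE_PAGES` and the function `guess_code_page` are hypothetical and cover only a few languages; they do not describe how MS Office or OOo actually choose.

```python
# Hypothetical lookup: language code -> usual Windows legacy code page.
LEGACY_CODE_PAGES = {
    "ru": "cp1251",  # Cyrillic
    "pl": "cp1250",  # Central European
    "el": "cp1253",  # Greek
    "tr": "cp1254",  # Turkish
}


def guess_code_page(locale_name: str, default: str = "cp1252") -> str:
    """Suggest a code page for a locale name such as 'ru_RU' or 'pl-PL'.
    The result is only a preselection for a confirmation dialog."""
    language = locale_name.replace("-", "_").split("_")[0].lower()
    return LEGACY_CODE_PAGES.get(language, default)


suggestion = guess_code_page("ru_RU")
```

With a Russian locale, the suggestion matches the code page of the attached sample, but a user opening a foreign document would still need to override it.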

> Please provide sample documents.

File 'pricelist.xls' is an old Excel document (v5.0, I suppose). OO can't import
Russian (CP-1251) chars from this file.
File 'pricelist.ods' is the required result (corrected).
Comment 7 pmike 2005-07-07 13:58:33 UTC
Created attachment 27764
Excel 5.0 doc with CP-1251 chars
Comment 8 pmike 2005-07-07 13:59:20 UTC
Created attachment 27765
OO Calc doc with _expected_ result
Comment 9 lohmaier 2005-08-03 20:47:21 UTC
I mistook "text document" for plain text...
Sorry.
Comment 10 ace_dent 2008-05-16 00:53:07 UTC
OpenOffice.org Issue Tracker - Feedback Request.

The Issue you raised is currently assigned to 'Requirements' pending review, but
has not been updated within the last 3 years. Please consider re-testing with
one of the latest versions of OOo, as the problem(s) may have already been
addressed. Either use the recent stable version:
http://download.openoffice.org/index.html
or consider trying the new OOo 3 BETA (still in testing):
http://download.openoffice.org/3.0beta/
 
Please report back the outcome so this Issue may be Closed or Progressed as
necessary - otherwise it may be Resolved as Invalid in the future. You may also
wish to search for (and note) any duplicates of this Issue that may have
advanced further by checking the Issue Tracker:
http://www.openoffice.org/issues/query.cgi
 
Many thanks,
Andrew
 
Cleaning-up and Closing old Issues as part of:
~ The Grand Bug Squash, pre v3 ~
http://marketing.openoffice.org/3.0/announcementbeta.html