Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing
|Summary:||Save as HTML causes charset problems with Russian|
|Component:||code||Assignee:||AOO issues mailing list <issues>|
|Status:||CONFIRMED ---||QA Contact:|
|Priority:||P3||CC:||bluedzins, issues, xslf|
|Issue Type:||ENHANCEMENT||Latest Confirmation in:||---|
Description issues@www 2001-05-29 05:43:32 UTC
If I save a document containing Russian text as HTML, the result is totally unreadable, even in OO itself. First, the charset is set to iso8859-1. Second, the body of the HTML is in a strange encoding. Attempts to set charset=utf-8 or something similar do not succeed.
Comment 2 issues@www 2001-05-29 05:47:19 UTC
Created attachment 264 [details] Wrong encoded HTML with Russian
Comment 3 stefan.baltzer 2001-05-29 12:36:24 UTC
Reassigned to Eric.
Comment 4 eric.savary 2001-05-29 15:06:59 UTC
- HTML: currently, OOo only supports UTF-8 encoding. Set this under "Tools - Options - Load/Save - HTML compatibility - Character set"
- Writer 6.0: I can see the Cyrillic text ([etot text na russkom]...). So I assume you don't have Unicode fonts installed on your system to *display* Cyrillic in OOo.
Comment 5 issues@www 2001-05-29 17:47:47 UTC
I set UTF-8 in Filters/HTML and now can see Russian. Thank you! But it should set charset=utf-8 by default in the HTML header, since OO uses only Unicode.
Comment 6 issues@www 2001-06-02 20:57:36 UTC
The problem described here is the same as in issue #471, which was marked as "resolved" after introducing the "charset" option for the HTML export filter. In my opinion, three things still have to be changed:

1. What happens to characters in the document that are not part of the character set selected in the HTML export options (e.g., I have a document containing both German and Russian text, and I try to save it as iso-8859-1)? I think those characters have to be saved in the HTML file as numeric HTML entities (e.g. &#1234;).

2. Right now, when I save the German-Russian document mentioned above as HTML and the HTML export charset is not set to utf-8, the document is saved quietly without any error message. Only after reopening the saved document do I notice that all non-Latin1 characters are destroyed. I think there should be at least a warning displayed to the user, saying that the selected export charset cannot represent all characters in the document.

3. In my opinion, "Tools - Options - Load/Save - HTML compatibility - Character set" is not the right place for the character set option of the HTML export filter, because if I have documents in several languages I have to change the setting for every document. I think the charset setting should be possible directly in the "Save as.." dialog as it is for the file type "Text (encoded)". The default value should be set to a charset that best matches the characters in the actual document (e.g. iso8859-1 if there are only Western characters in the document, KOI8-R if there are only Russian characters, UTF-8 if there are characters from more than one 8-bit charset and so on...)
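The three points above can be illustrated with a minimal Python sketch. It is not OOo code: the function names `encode_for_html` and `pick_charset` and the candidate list are hypothetical, chosen only to show the behaviour being requested (numeric entities as fallback, a list of characters to warn about, and a best-match default).

```python
def encode_for_html(text, charset):
    """Encode text for an HTML file in the given charset.

    Characters the charset cannot represent are written as numeric
    character references (point 1) and also returned, so that a caller
    could show a warning before saving (point 2).
    """
    unencodable = set()
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            unencodable.add(ch)
    # "xmlcharrefreplace" is a standard Python error handler that emits
    # &#NNNN; for every character missing from the target charset.
    data = text.encode(charset, errors="xmlcharrefreplace")
    return data, sorted(unencodable)


def pick_charset(text, candidates=("iso-8859-1", "koi8-r")):
    """Return the first candidate that can represent the whole text,
    falling back to UTF-8 (point 3). The candidate order is an assumption."""
    for charset in candidates:
        try:
            text.encode(charset)
            return charset
        except UnicodeEncodeError:
            continue
    return "utf-8"


data, missing = encode_for_html("Grüße, мир", "iso-8859-1")
print(data)     # Latin-1 bytes; the Cyrillic letters become &#...; references
print(missing)  # the characters a warning dialog could list
print(pick_charset("Grüße, мир"))
```

A mixed German-Russian text fits neither 8-bit candidate, so the sketch falls back to UTF-8 for it, while purely Western or purely Russian text gets the matching 8-bit charset.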
Comment 7 issues@www 2001-06-03 06:15:26 UTC
> I think the charset setting should be possible directly in the "Save as.." dialog as it is for the file type "Text (encoded)". The default value should be set to a charset that best matches the characters in the actual document (e.g. iso8859-1 if there are only Western characters in the document, KOI8-R if there are only Russian characters, UTF-8 if there are characters from more than one 8-bit charset and so on...)

The trouble is the default charset. For Russian, KOI8-R is used on UNIX, WIN1251 on Windows, and sometimes ISO8859-5 on commercial Unices. If OO is cross-platform, what is the default charset? IMHO default charset should be UTF-8. Since OO now uses it and only it (see previous comments), the only problem is to properly set the charset=utf-8 tag and maybe to disallow encoding selection so as not to confuse users. But recoding to 8-bit charsets is a nice feature...
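As an illustration of the request to "properly set charset=utf-8", here is a minimal Python sketch of an exporter that declares the same charset in the HTML header that it uses for the bytes it writes. The function name `build_html` and the page skeleton are hypothetical and do not reflect how the OOo export filter is implemented.

```python
import html


def build_html(body_text, charset="utf-8"):
    """Return an HTML document as bytes, with a charset declaration
    that matches the encoding actually used for the output."""
    meta = (
        '<meta http-equiv="Content-Type" '
        f'content="text/html; charset={charset}">'
    )
    doc = (
        f"<html><head>{meta}</head>"
        f"<body><p>{html.escape(body_text)}</p></body></html>"
    )
    # The declared charset and the byte encoding come from the same
    # variable, so header and body cannot disagree.
    return doc.encode(charset, errors="xmlcharrefreplace")


page = build_html("Привет, мир")
print(page.decode("utf-8"))
```

With UTF-8 as the default, the declaration and the body always agree; passing an 8-bit charset instead would still produce a consistent file, with unsupported characters written as numeric references.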
Comment 8 issues@www 2001-06-05 12:41:30 UTC
> IMHO default charset should be UTF-8

I agree with you.
Comment 9 eric.savary 2001-06-05 14:13:03 UTC
OK, so let's rewrite it the way Dimitry does: IMHO default charset should be UTF-8, since OO now uses it and only it. The only problem is to properly set the charset=utf-8 tag and maybe to disallow encoding selection so as not to confuse users.
Comment 10 lutz.hoeger 2001-06-05 15:07:14 UTC
Falko, please take care of this one. Are there any compatibility issues with old StarOffice versions?
Comment 11 falko.tesch 2001-06-13 11:03:20 UTC
If this is true, this is a bug, not an RFE.
Comment 12 stefan.baltzer 2001-06-18 16:09:12 UTC
The circle closes... Reassigned to Eric.
Comment 13 eric.savary 2001-06-19 09:45:10 UTC
And the circle reopens! ;-) Falko: it is an RFE because OOo doesn't save to UTF-8 by default, as that hasn't been *planned*. So it doesn't work because it did not have to be implemented :).
Comment 14 falko.tesch 2001-07-02 07:57:55 UTC
Will be fixed in 6.0 final
Comment 15 eric.savary 2001-07-09 09:14:36 UTC
Falko: which *OOo* build do you mean? Good morning! >;-) For this task we could add the comments of Christoph (#471 - ------- Additional Comments From firstname.lastname@example.org 2001-06-06 02:29 -------). What do you think about this?
Comment 16 eric.savary 2001-07-09 09:20:50 UTC
*** Issue 471 has been marked as a duplicate of this issue. ***
Comment 17 Unknown 2001-11-08 23:11:48 UTC
changing QA contact from bugs@ to issues@
Comment 18 eric.savary 2003-06-27 22:59:24 UTC
set to OOo 2.0
Comment 19 tamblyne 2003-08-12 04:10:18 UTC
*** Issue 18140 has been marked as a duplicate of this issue. ***
Comment 20 tamblyne 2003-08-19 05:09:21 UTC
*** Issue 17923 has been marked as a duplicate of this issue. ***
Comment 21 falko.tesch 2003-09-11 16:24:08 UTC
We will address this problem in 2.0. But since I have no issue yet I re-assign this issue to Bettina to be set to duplicate once the PCD issue is opened.
Comment 22 bettina.haberer 2003-11-11 15:52:39 UTC
Hello Dmitry, this issue is already covered by an internal issue. It will be implemented in OO.o 2.0. For technical reasons it is not possible to mark this issue as a duplicate of an issue in another issue-tracking system. Please check the implementation in the upcoming version OO.o 2.0. Thank you.
Comment 23 stx123 2004-03-22 08:54:36 UTC
Reassign issue to owner of selected subcomponent
Comment 24 michael.ruess 2004-03-22 12:16:43 UTC
re-assigned to ES.
Comment 25 eric.savary 2004-04-15 16:58:09 UTC
ES->BH: I couldn't find any duplicate of this, neither in Bt+ nor in iBIS. Please find out which task is a duplicate of this one and make a child of it. Thanks
Comment 26 Martin Hollmichel 2004-08-09 14:02:40 UTC
according to http://www.openoffice.org/servlets/ReadMsg?list=releases&msgNo=7690 this issue will be set to OOoLater
Comment 27 eric.savary 2006-03-02 10:38:51 UTC
*** Issue 62704 has been marked as a duplicate of this issue. ***
Comment 28 bettina.haberer 2010-05-21 14:46:47 UTC
To make the issues easier to grep via "requirements", I am moving the issues currently assigned to me to the owner "requirements".