Issue 2144 - Export of non latin1 characters in HTML format broken
Summary: Export of non latin1 characters in HTML format broken
Status: CLOSED FIXED
Alias: None
Product: Internationalization
Classification: Code
Component: ui
Version: 638
Hardware: All All
Importance: P3 Trivial
Target Milestone: ---
Assignee: Unknown
QA Contact: issues@l10n
URL:
Keywords: oooqa
Duplicates: 1810
Depends on:
Blocks:
 
Reported: 2001-11-07 14:03 UTC by Unknown
Modified: 2008-05-18 00:00 UTC
CC List: 2 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Description Unknown 2001-11-07 14:03:14 UTC
So, the problem is that in the WYSIWYG part of the HTML editor all the words
typed in the Latvian language are displayed correctly, but when I save the HTML
file and then look at the source, the special characters like â,?,? are
translated by the program into something like &d5r2,&wr32 (just an example)
instead of normal iso8859-13 encoded characters. So the resulting text of the
HTML document in my native language (Latvian) also doesn't look like it should.

Bye
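
For illustration only, a minimal Python sketch of the kind of difference the report describes. The sample letters are an assumption (the characters in the report were lost), and this is not the exporter's actual code: it only contrasts encoding text with a charset that contains the letters against falling back to numeric character references.

```python
# Hypothetical Latvian sample letters; they exist in ISO-8859-13
# but not in ISO-8859-1.
text = "ā č š"

# Encoding with the matching charset yields one byte per letter.
as_latin7 = text.encode("iso8859_13")

# Encoding with a charset that lacks the letters, falling back to
# numeric character references, yields entity-like sequences instead.
as_latin1 = text.encode("iso8859_1", errors="xmlcharrefreplace")

print(as_latin7)
print(as_latin1)
```

The second form is still valid HTML, but the source is no longer readable as Latvian text, which matches the complaint above.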
Comment 1 stefan.baltzer 2001-11-12 12:20:40 UTC
Reassigned to Éric.
Comment 2 eric.savary 2001-11-12 16:28:34 UTC
*** Issue 1810 has been marked as a duplicate of this issue. ***
Comment 3 eric.savary 2001-11-12 17:20:05 UTC
ES->FLO: please evaluate the effort and targeted milestone.
Note: we have maybe 2 solutions to display these characters:
- The human-readable one: display the character as is in the code, the
way a UTF-8 export allows it.
- The HTML-compliant one: work with named entities (possible?). In
this case we could have, instead of a '&#226', a '&acirc;' (which
corresponds in iso-8859-1 to an a with a circumflex accent), which the
encoding tag "charset=iso-8859-13" would transform to a "latin small
letter a with macron".
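
A small Python sketch, offered as an aside rather than as part of the proposal, of how the two readings differ. It assumes Python's standard `html.unescape` and its `iso8859_1` / `iso8859_13` codecs as stand-ins for browser behaviour: a raw byte is interpreted through the page charset, while a character reference names a Unicode code point.

```python
import html

# The raw byte 0xE2 means different letters under different charsets.
raw = bytes([0xE2])
in_latin1 = raw.decode("iso8859_1")    # a with circumflex, U+00E2
in_latin7 = raw.decode("iso8859_13")   # a with macron, U+0101

# Character references resolve to Unicode code points, not to bytes
# in the page charset, so both forms name U+00E2.
from_named = html.unescape("&acirc;")
from_numeric = html.unescape("&#226;")

print(in_latin1, in_latin7, from_named, from_numeric)
```

Under these assumptions the charset declaration changes how the raw byte is read but not what the references mean, which is the objection raised in comment 4.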
Comment 4 t8m 2001-12-10 13:54:11 UTC
Currently the HTML export is completely unusable for non Latin-1 charsets.
Please try to fix it. I think that the correct way should be to export
it in the utf-8 encoding without these stupid named entities and that
thing you've specified (to use named entities from Latin-1 with the
same character code) wouldn't work anyway.
If someone wants to have his pages in a different encoding he can use
some external tool to convert utf-8 to his preferred encoding.
And I think that the correct utf-8 export should be very easy to do.
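
As a rough sketch of the external conversion mentioned here (the function name `transcode_html` and the sample markup are hypothetical, and no real tool is implied), a UTF-8 page could be re-encoded along these lines in Python:

```python
import re

def transcode_html(data: bytes, target: str) -> bytes:
    """Re-encode UTF-8 HTML bytes into `target`, updating the charset label."""
    text = data.decode("utf-8")
    # Rewrite a charset=utf-8 declaration so the label matches the new bytes.
    text = re.sub(r"charset=utf-8", "charset=" + target, text,
                  flags=re.IGNORECASE)
    # Characters missing from the target charset become numeric references.
    return text.encode(target, errors="xmlcharrefreplace")

sample = (
    '<meta http-equiv="Content-Type" '
    'content="text/html; charset=utf-8"><p>ā €</p>'
).encode("utf-8")

converted = transcode_html(sample, "iso-8859-13")
print(converted)
```

Only characters that the target charset cannot represent fall back to numeric references; everything else is written as plain bytes.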

Comment 5 ooo 2001-12-10 17:41:11 UTC
If you want default UTF-8 export just say so under menu
/Tools/Options/LoadSave/HTML_Compatibility and select character set
Unicode (UTF-8). If you want a different encoding just select it, you
don't even need external tools for that.
Comment 6 t8m 2001-12-11 08:54:19 UTC
OK, I didn't know about the setting. But it could still be better:
it should export all characters as UTF-8, yet characters that are the
same in latin-1 and latin-2 (or perhaps all those that have named
entities) are exported as named entities (aacute, iacute), which is
completely unnecessary. A browser that supports the UTF-8 encoding
will display them fine as plain UTF-8 characters rather than named
entities, and the source would then be fully readable instead of only
partially.

But the export to other charsets (OK I've tried only the latin-2 and
windows-1250) is broken completely.
Comment 7 frank.loehmann 2002-11-29 08:58:17 UTC
FL: Please see latest comment from Tomas Mraz "But the export to 
other charsets (OK I've tried only the latin-2 and windows-1250) is 
broken completely." and clarify issue. (UTF-8 and named entity are 
not the problem)
Comment 8 eric.savary 2002-12-11 14:51:46 UTC
ES->Artis & Tomas: please check if you still have problems with a
current build.
If yes:
- describe step by step what you do (which settings, which encoding, how
you save, etc.). Note any error message you see
- attach a sample file (!!! but zipped if it is an HTML file because
IssueZilla destroys HTML docs !!! ) or provide a URL to this file.
- Reassign to me

If not: close the issue
Comment 9 Unknown 2002-12-11 18:12:38 UTC
Thanks a lot! Good work!
Comment 10 ace_dent 2008-05-17 21:55:17 UTC
The Issue you raised has been marked as 'Resolved' and not updated within the
last 1 year+. I am therefore setting this issue to 'Verified' as the first step
towards Closing it. If you feel this is incorrect, please re-open the issue and
add any comments.

Many thanks,
Andrew
 
Cleaning-up and Closing old Issues
~ The Grand Bug Squash, pre v3 ~
http://marketing.openoffice.org/3.0/announcementbeta.html
Comment 11 ace_dent 2008-05-18 00:00:33 UTC
As per previous posting: Verified -> Closed.
A Closed Issue is a Happy Issue (TM).

Regards,
Andrew