Apache OpenOffice (AOO) Bugzilla – Issue 16900
XHTML Export using wrong encoding for Hebrew
Last modified: 2013-08-07 15:00:08 UTC
Hi, I just tried out OOo for the first time and am quite impressed. I am using 1.1 RC on Win2k. Hebrew support worked well, but without the documentation at http://www.openoffice.org.il/arabic_hebrew_howto_OS_643.txt I would have been lost; this belongs in the regular help files. Back to the issue at hand: I typed a small Hebrew document and tried saving it in the native and .doc formats. All was well. The same went for saving it as HTML. Exporting to PDF worked, but exporting to XHTML did not produce legible HTML. Any ideas?
Created attachment 7715 [details] Original file
Created attachment 7716 [details] xhtml file (renamed to .html)
After additional testing I noticed that it does work in Mozilla Firebird 0.6 (I assume it works in other Mozilla builds as well). It does not display properly in IE (or NS 4.x). Maybe it's an IE "feature", but the output of the regular Save As HTML does display properly in IE. The problem seems to be that the export does not add <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-8-I"> to the HTML. Adding it fixed the problem (as does adding <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=windows-1252">, which the regular Save As HTML uses).
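The manual workaround described above (adding a Content-Type meta tag to the exported file by hand) could be scripted roughly as follows. This is only a sketch: the function name `add_content_type_meta` is hypothetical, it is not part of OOo, and it patches the text with a regular expression instead of parsing the document. The charset to pass is whatever the file is actually encoded in.

```python
import re


def add_content_type_meta(html: str, charset: str) -> str:
    """Insert a Content-Type meta tag right after the opening <head> tag.

    Hypothetical helper illustrating the manual fix from this comment.
    """
    # Leave the document alone if it already declares a Content-Type.
    if re.search(r'<meta[^>]*http-equiv\s*=\s*["\']?content-type',
                 html, re.IGNORECASE):
        return html

    meta = ('<meta http-equiv="Content-Type" '
            f'content="text/html; charset={charset}" />')

    # Match <head> or <head ...attributes>, but not <header>.
    new_html, count = re.subn(
        r'(<head(?:\s[^>]*)?>)',
        lambda m: m.group(1) + meta,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("no <head> element found")
    return new_html
```

Calling the function a second time on its own output returns the text unchanged, so it is safe to run over files that were already fixed.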
How exactly are you saving the file? And what do you have under preferences --> save/load --> html --> character set? I got three different results depending on how I saved the file as XHTML. The closest I got was saving it as .xhtml instead of .html, which made both Mozilla and IE display the content inline. After renaming the file, it worked fine in both IE and Mozilla, and the file was correctly encoded as Unicode, as XHTML should be. Tested with RC2 on my Windows 2003.
Hi, under Options > Load/Save > HTML Compatibility I have "Western Europe Windows 1252". Since the document properties were set to Hebrew, I would expect that exporting the document to XHTML (I don't think there is a Save As XHTML) would set the XHTML encoding to Hebrew (as the Save As HTML option does), even though my default HTML preference is the default Windows character set. Does the attached HTML document display properly in your IE? That aside, is there any reason there is no Save As XHTML? Why is it in the "Export" menu?
DL->MIB: Would you please take over?
Svante has already fixed this bug, but has not committed the fix to any release. I think we should consider including the bugfix in OOo 1.1.1.
.
accepted
XHTML export is an export filter based on an XSL transformation (Tools/XML Filter Settings...), while HTML export/saving is implemented as an internal filter that accesses the internal document representation, not the XML representation. Settings in the Load/Save options have no effect on this XHTML export. Although XHTML does not require a meta tag to select an encoding (which is UTF-8 in this case), IE seems to need it, since it apparently does not use the encoding given in the XML declaration (<?xml version="1.0" encoding="UTF-8"?>) at the start of the XHTML file.
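To see which of the two encoding hints an exported file actually carries, something like the following sketch can help. The function name `declared_encodings` is hypothetical, and it only scans the first 2 KB of the file with regular expressions instead of parsing it properly.

```python
import re
from typing import Optional, Tuple


def declared_encodings(data: bytes) -> Tuple[Optional[str], Optional[str]]:
    """Return (encoding from the XML declaration, charset from a meta tag).

    Either value is None when the corresponding hint is missing.
    """
    # Both hints are plain ASCII and sit near the top of the file, so a
    # lossy ASCII decode of the first bytes is enough for this check.
    head = data[:2048].decode("ascii", errors="replace")

    xml_decl = re.search(
        r'<\?xml[^>]*encoding\s*=\s*["\']([^"\']+)["\']', head)
    meta = re.search(
        r'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)',
        head, re.IGNORECASE)

    return (xml_decl.group(1) if xml_decl else None,
            meta.group(1) if meta else None)
```

A file of the kind described here, with an XML declaration but no meta tag, would come back with a value in the first slot and None in the second, which matches the case IE reportedly mishandles.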
Will be integrated with other XHTML enhancements in 1.1.2.
Lars, this bug will be fixed with my update of the XHTML XSLT stylesheets (I will add a link for a beta download ASAP).
fixed by adding meta tag
In a patch, only changes to the existing documents will be committed. In our case the whole filter (the stylesheets) has been refactored and reworked, so I changed the target to OOo 2.0.
Maybe, aside from the explicit UTF-8 encoding via the meta tag, a new 'dir' (direction) attribute has to be added. Due to my lack of knowledge about Hebrew I am not able to validate this issue, so I am going to add the stylesheets, which the submitter might test. If it fails, we should file a follow-up task, as some enhancements concerning this issue have already been implemented. A gzip file attachment containing the stylesheets will follow...
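For anyone who wants to try the 'dir' idea on an exported file before the stylesheets are available, here is a rough sketch. It is an assumption about what such an attribute could look like (dir="rtl" on the body element), not what the new stylesheets actually emit, and `set_body_direction` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def set_body_direction(xhtml: bytes, direction: str = "rtl") -> str:
    """Return the document as text with dir=<direction> set on <body>."""
    # Serialize XHTML elements without an ns0: prefix.
    ET.register_namespace("", XHTML_NS)

    root = ET.fromstring(xhtml)
    body = root.find(f"{{{XHTML_NS}}}body")
    if body is None:
        raise ValueError("no XHTML <body> element found")

    body.set("dir", direction)
    return ET.tostring(root, encoding="unicode")
```

The input is taken as bytes so that the parser can honour the encoding named in the XML declaration; the result is a Python string that still has to be written back out as UTF-8.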
SUS:Reassigned to the QA
I cannot validate it either, but we have fixed the META tags, so it should work now.
blind verified.
Short of updating OOo, is there any way you can attach the latest XHTML output based on the original file (why.swx) that I attached? Thanks.
Sorry, the XHTML filter is based on more than one file, and that behaviour is not specified in the 'XML Filter Settings' feature, so I cannot create a .jar file for you to test. I will note here the milestone in which it can be tested, okay?
Should be okay in SRC680m48, but I cannot test Hebrew.