Apache OpenOffice (AOO) Bugzilla – Issue 25943
greek letters from imported doc files fail to export to HTML
Last modified: 2013-08-07 14:38:26 UTC
Found with OOo 1.1.0 French version, on Windows XP French.

When I import a doc file created with Word 97 or 2k (but not XP) and then save it as HTML, the Greek letters in my formulas come out as garbage. This also happens if I first save my file as sxw and then as HTML.

This seems to be related to the fact that Word inserts Greek special characters with the Symbol font. Indeed, if I manually (in Word) change the font of all these chars to Arial, then import into OOo and export to HTML, it works fine. However, Symbol font chars created in a native sxw file are correctly translated to HTML, so the problem isn't simply a failure by OOo to translate the Symbol font to HTML.

The only workarounds I have for the time being:
- in OOo, use the replace menu item for each kind of Greek letter, replacing it with the corresponding Arial font Greek letter (time consuming), OR
- in Word, change the font of each Greek char to Arial in the original doc file (even worse)

Thanks for looking into this issue.
Reassigned to mru: can you please take a look at this issue? To the issue reporter: please attach a bugdoc.
I do not have the problem. My imported formulas can be exported properly to HTML format. Please attach the offending document to this issue, so that we can reproduce and fix the problem here. You can also send the document directly (mru@openoffice.org) in case it contains confidential data. Feel free to re-open the issue when you're done. Thanks for supporting us!
Created attachment 13532 [details] bug document
I have added a document allowing the bug to be reproduced:
- open the doc file with OOo
- save it as HTML
- view it in your browser
Ah, the Greek letters are not inside a formula object; they are plain text. This also happens when a document was created in OOo without using import. MRU->ES: please have a look.
ES->MIB: actually, the characters are exported correctly and there is no encoding problem. It's just that those characters use the "Symbol" font, and the paragraph does not export this hard formatting; it remains as "Times New Roman;serif".
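Since the characters only look Greek while the Symbol font is applied, one way to picture the manual replace workaround is as a conversion of Symbol-font text into real Unicode Greek letters, which no longer depend on the font. The following is only a rough sketch, not what OOo does: the function name `symbol_to_unicode` and the table `SYMBOL_TO_GREEK` are hypothetical, the table covers just a few letters of the Symbol layout, and it is an assumption that the text is stored either as the plain Latin letters or as private-use code points in the U+F020 to U+F0FF range.

```python
# Partial Symbol-font layout: the Latin letter typed -> the Greek glyph shown.
# Hypothetical table for illustration; only a handful of letters are listed.
SYMBOL_TO_GREEK = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε",
    "l": "λ", "m": "μ", "p": "π", "s": "σ", "w": "ω",
    "D": "Δ", "S": "Σ", "W": "Ω",
}


def symbol_to_unicode(text):
    """Convert text known to be formatted in the Symbol font to Unicode Greek.

    Assumption: a Symbol character is stored either as its Latin letter or as
    a private-use code point (U+F020..U+F0FF). Unmapped characters are kept.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        # Fold a private-use code point back to its single-byte position.
        key = chr(cp - 0xF000) if 0xF020 <= cp <= 0xF0FF else ch
        out.append(SYMBOL_TO_GREEK.get(key, ch))
    return "".join(out)


if __name__ == "__main__":
    print(symbol_to_unicode("abg"))
```

The function must only be applied to runs that are actually formatted in Symbol; applied to ordinary text it would turn Latin letters into Greek ones.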