Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing
Description
samphan
2005-02-16 11:54:28 UTC
Created attachment 22701 [details]
The original Writer document for line-breaking test
Created attachment 22702 [details]
The same document converted to MS Word XP format
This looks similar to issue #23784, which sba concluded was a feature request for more text alignment options, but I don't see what it has to do with text alignment. (From a user's perspective, it's a serious bug, not a missing feature.)
Confirmed.
Raised Priority to P2
- data loss (language information)
- basic functionality is not working properly (export document)
Set Target milestone to OOo 2.0, please change this if you find it inappropriate.
comment from james_clark "issue 42909 : this hasn't been analyzed yet; it makes the OOo functionality of exporting to .doc format effectively non-functional for Thai users. It affects other languages as well, but the effects are much more serious for Thai: text is not properly tagged with its language, and the language of text cannot be changed in Word; *** this is critical for Thai because line-breaking does not work in Word if text is not properly tagged as Thai. *** There's no known workaround."
FT: Andreas please check. I consider this a serious issue, too.
Yes, I agree. It's a serious issue. At least for OOo2.0.1 we have to find a solution. BTW: if you use RTF format instead of .doc, the language information is recognized by Word.
If within Word you save the document (e.g. LineBreakTest.doc) as XML, close the document, and then open the XML version of the document, the text is tagged as Thai. This is in Word 2003, Thai edition.
AFAIK FME and you have already done some investigation into this issue.
The fix to issue #46087 will fix this issue as well. See that issue for more details. Runs are in fact correctly tagged with the language. The problem is that runs are not marked as being complex script: Word evidently doesn't allow something that's not complex script to be tagged with a complex script language.
We will investigate to find a solution for OOo2.0
Created attachment 24528 [details]
bugdoc with correct language settings - but still not working
flr: The problem is *not* the language setting. I have attached a .DOC file, generated with a modified Writer, whose language is correctly set to Thai. However, WW does not break the lines correctly. I suspect there is a Unicode export problem. The .DOC format has a strange "chp.idctHint" flag...
Solved with patch from james_clark for #i46087#.
Solved with patch from james_clark for #46087#. Fixed in dvoqbfix2.
*** This issue has been marked as a duplicate of 46087 ***
flr: The patch from james leads to correct language attributes. However, my version of Word still does *not* perform the line break. Can you try it with your Word version? Perhaps my settings for complex scripts are incorrect. The patch from james is applied in dvoqbfix2.
Created attachment 24531 [details]
Bugdoc exported with patch from james - attributes set correctly; however, the line break is not performed in my version of Word
I can confirm that Word 2003 (Thai edition) does not perform correct line-breaking on LineBreakTest_expored_with_patch_from_james.doc. Some possible clues:
a) saving this to XML in Word 2003 and reopening solves the problem; if the file is saved again as .doc, then when the .doc file is reopened it still works correctly
b) if in Word you change the keyboard layout to Thai, then type a space (with the cursor still before the first character), Word performs correct line-breaking; if you then do backspace (or Ctrl-Z), the correct line-breaking remains
c) if you do b), but with US keyboard layout, Word doesn't do correct line-breaking
If after b) and c) (using backspace rather than Ctrl-Z), you then resave the file as .doc, you get two very similar .doc files: Word does correct line-breaking for one of them and not for the other. Maybe analyzing the difference between these files will tell us what the problem is. Unfortunately wv2 debug dumps show no difference.
Created attachment 24534 [details]
Bugdoc saved from Word, after space/backspace with US keyboard, with bad breaks
Created attachment 24535 [details]
Bugdoc saved from Word, after space/backspace with TH keyboard, with good breaks
I think I've figured it out. The problem is a missing document property. If you go to the Compatibility tab of the Options dialog, there should be an option called something like "Apply breaking rules" (I've only got the Thai language version, so I'm not sure what it's called in English).
The problem is that OOo isn't setting this property, which causes Word not to apply Thai breaking rules. Word is smart enough to set this property automatically when you enter Thai text or open an XML file containing Thai, but it doesn't set it when you open a .doc file with Thai.
In the Word XML format this corresponds to the <w:applyBreakingRules/> element. In the .doc format, it's towards the end of the DOP structure, specifically bit 0x20 in the byte immediately after the 0x04 from fDontUseHTMLAutoSpacing.
Created attachment 24537 [details]
Manually hacked version of flr's bugdoc with the applyBreakingRules flag set
Created attachment 24538 [details]
Untested patch to unconditionally set the applyBreakingRules flag on export
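For readers following the analysis above, here is a minimal Python sketch of what "setting bit 0x20 in a byte of the DOP" amounts to. It is not the attached patch and not OOo's export code. The function names and the `compat_byte_offset` parameter are hypothetical, and the two-byte buffer in the usage lines is a toy stand-in, not a real DOP layout; the actual byte position has to be taken from the .doc format description.

```python
# Bit value named in the comment above for the "apply breaking rules" option.
APPLY_BREAKING_RULES = 0x20


def set_apply_breaking_rules(dop: bytes, compat_byte_offset: int) -> bytes:
    """Return a copy of `dop` with bit 0x20 set in the byte at `compat_byte_offset`.

    `compat_byte_offset` is a hypothetical, caller-supplied position: the byte
    immediately after the one that carries fDontUseHTMLAutoSpacing (0x04).
    """
    if not 0 <= compat_byte_offset < len(dop):
        raise ValueError("offset outside the DOP buffer")
    patched = bytearray(dop)
    # OR-ing sets the flag and leaves every other bit in the byte untouched.
    patched[compat_byte_offset] |= APPLY_BREAKING_RULES
    return bytes(patched)


def has_apply_breaking_rules(dop: bytes, compat_byte_offset: int) -> bool:
    """Report whether bit 0x20 is set in the byte at `compat_byte_offset`."""
    return bool(dop[compat_byte_offset] & APPLY_BREAKING_RULES)


# Toy buffer: byte 0 has 0x04 set, byte 1 is the following byte with no flags.
toy = bytes([0x04, 0x00])
patched = set_apply_breaking_rules(toy, 1)
print(has_apply_breaking_rules(toy, 1), has_apply_breaking_rules(patched, 1))
```

Setting the flag with a bitwise OR keeps the operation idempotent, so applying it to a document that already has the property leaves the byte unchanged.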
flr: duplicate to #i46732#. fixed in fr8fix1 (with appropriate language tests...)
*** This issue has been marked as a duplicate of 46732 ***
*** Issue 23784 has been marked as a duplicate of this issue. ***
closing