Issue 92341 - WW8: CTL/Thai font convert incorrectly when import from MS Office 2003
Summary: WW8: CTL/Thai font convert incorrectly when import from MS Office 2003
Status: RESOLVED FIXED
Alias: None
Product: Writer
Classification: Application
Component: open-import
Version: OOo 2.4.1
Hardware: PC Windows XP
Importance: P3 Trivial
Target Milestone: ---
Assignee: openoffice
QA Contact: issues@sw
URL:
Keywords:
Depends on:
Blocks: 41707 92549
Reported: 2008-07-31 16:10 UTC by samphan
Modified: 2013-08-07 14:44 UTC
CC List: 10 users

See Also:
Issue Type: PATCH
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Word .doc with Angsana New as default font (23.50 KB, application/msword)
2008-07-31 16:15 UTC, samphan
This probably works (5.81 KB, patch)
2010-11-11 15:19 UTC, caolanm

Description samphan 2008-07-31 16:10:58 UTC
We found that a certain type of conversion error occurs noticeably often in Thai
documents imported from some versions of MS Word. So we did some investigation and
found a specific pattern that reproduces the error (thanks to feedback from
the OSS department, EGAT).

1) In MS Word 2003 (they say that this happens in all versions except MS Word XP,
but I haven't checked that)
2) Set the default font to "Angsana New" for both Thai and Latin. The size
doesn't matter. I use 17 pt.
3) Create a blank document. Input some mixed Thai/English text, e.g.
		Hello สวัสดี Again อีกครั้ง
	All the text will be in Angsana New, 17 pt.
4) Save the document as .doc, close MS Word
	I've attached the bugdoc that was made this way
5) Open the .doc file in Writer
6) Notice the font of the English segments vs. the Thai segments
	- the English text and formatting are converted correctly, in Angsana New, 17 pt.
	- the Thai text is converted correctly, but its font is changed to Times New
Roman, 17 pt.

Since Times New Roman doesn't have Thai glyphs, Writer picks the glyphs from
a fallback font, Tahoma. The result is unbalanced-looking type, since Thai glyphs in
Tahoma are big but Latin glyphs in Angsana New are small.
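
To make the symptom concrete, here is a minimal sketch (Python, not OOo code) of how a per-script font choice could produce the result above. It assumes each text run carries separate Western and CTL font slots and that a CTL font lacking Thai glyphs is replaced by Tahoma. RunFonts, rendered_font and the coverage table are hypothetical names for illustration only; the real import and fallback logic is more involved.

```python
from dataclasses import dataclass

# Hypothetical coverage table, reduced to the three fonts named in this report.
THAI_CAPABLE = {"Angsana New", "Tahoma"}
FALLBACK_THAI_FONT = "Tahoma"  # the font the Thai text ended up in for the bugdoc


@dataclass
class RunFonts:
    latin: str  # font slot used for Western text
    ctl: str    # font slot used for complex-script text such as Thai


def is_thai(ch: str) -> bool:
    # Thai Unicode block: U+0E00..U+0E7F
    return "\u0E00" <= ch <= "\u0E7F"


def rendered_font(ch: str, fonts: RunFonts) -> str:
    """Pick the font a character would be drawn in under this simplified model."""
    if not is_thai(ch):
        return fonts.latin
    if fonts.ctl in THAI_CAPABLE:
        return fonts.ctl
    return FALLBACK_THAI_FONT


# What the document author intended vs. what the import produced.
expected = RunFonts(latin="Angsana New", ctl="Angsana New")
observed = RunFonts(latin="Angsana New", ctl="Times New Roman")

for label, fonts in (("expected", expected), ("observed", observed)):
    print(label, rendered_font("H", fonts), "/", rendered_font("ส", fonts))
```

Under these assumptions only the CTL slot differs between the two cases, which matches the observation that the Latin text keeps Angsana New while the Thai text does not.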

I've tried other fonts and it seems that this happens with "Angsana New" only.
It seems like there is some code that treats the font specially. Angsana New is
the default font for Thai in Writer, anyway.

Since Angsana New is one of the most popular Thai fonts on Windows, the
combination of this with other types of conversion errors means that
fewer Thai documents convert correctly than Western documents. Fixing the bug would
bring the usability of Writer for Thai/CTL on par with that for Western
documents. So I set the priority to 2.

Please help investigate the bug and if you need more information/test cases,
I'll provide them.
Comment 1 samphan 2008-07-31 16:15:03 UTC
Created attachment 55481
Word .doc with Angsana New as default font
Comment 2 samphan 2008-07-31 16:21:27 UTC
I've tested the bugdoc with OpenOffice.org 3.0 beta 2 and the bug still happens.
Comment 3 michael.ruess 2008-08-01 14:57:05 UTC
Is the same problem as issue 48064.

*** This issue has been marked as a duplicate of 48064 ***
Comment 4 michael.ruess 2008-08-01 14:57:52 UTC
Closing duplicate.
Comment 5 samphan 2010-11-08 08:18:12 UTC
I still believe that the resolved-duplicate issue 92341 (CTL/Thai font convert
incorrectly when import from MS Office 2003) is not actually a duplicate of
issue 48064, because it happens in a different version of MS Office.

We (me & tantai_thanakanok) are starting to work on fixing this bug.

Comment 6 michael.ruess 2010-11-09 16:23:08 UTC
MRU->samphan: OK, you are working on a fix. For now we can keep this issue
open. Once you have finished the fix, please attach the patch proposal to this
issue, so that our developers can evaluate it and integrate it in one of our CWSs.
We will then change the issue type to "Patch" and move the target to a much
earlier release than "OOo later". Thanks in advance for your work on this.
Comment 7 tantai 2010-11-11 08:28:10 UTC
I will start to fix this bug. Can anyone tell me where I should start?
Comment 8 thomas.lange 2010-11-11 12:35:30 UTC
I wonder if this is a bug at all, because when I open the attached document with
MS Word 2007 then it actually does list 'Times New Roman' as 'Asian text font'
in use for the Thai text... Also Angsana New is listed in the 'Font' listbox
(that is among the Western fonts) but not in the listbox for the Asian fonts.
Thus I wonder if maybe Word 2003 has a problem when saving the file or
displaying the font in the font dialog.

Anyway, ww8 import/export is done in sw/source/filter/ww8, and I was
told that the file ww8par2.cxx in particular might be of interest. The text attribute
for the Asian font is RES_CHRATR_CTL_FONT since Thai is a CTL language, so any
place where that is used in the ww8 directory is of interest as well.
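
One way to list those places is sketched below in Python. It assumes the script is run from the top of a source checkout, so that the relative path sw/source/filter/ww8 resolves; find_uses is a hypothetical helper name, not part of the OOo tree.

```python
from pathlib import Path


def find_uses(root, needle, suffixes=(".cxx", ".hxx")):
    """Return (path, line number, stripped line) for every source line containing needle."""
    root = Path(root)
    if not root.is_dir():
        return []  # nothing to search, e.g. when run outside a checkout
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            for lineno, line in enumerate(text.splitlines(), start=1):
                if needle in line:
                    hits.append((str(path), lineno, line.strip()))
    return hits


# Assumed checkout-relative path, taken from the comment above.
for path, lineno, line in find_uses("sw/source/filter/ww8", "RES_CHRATR_CTL_FONT"):
    print(f"{path}:{lineno}: {line}")
```

The same helper can be pointed at any other identifier by changing the second argument.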


Comment 9 caolanm 2010-11-11 15:19:10 UTC
Created attachment 74646
This probably works
Comment 10 thomas.lange 2010-11-12 08:16:06 UTC
tl->cmc: Thanks for the fish... err patch! ^_-
The patch seems to work fine. One question: did you create this patch on an
OOO330-based build? I'm just curious because I applied it to DEV300_m88 and had
to replace some additional occurrences of
    ftcStandardChpStsh;     
    ftcStandardChpCJKStsh;  
    ftcStandardChpCTLStsh;  
with your new names for them.
Comment 11 thomas.lange 2010-11-12 09:53:10 UTC
Changing issue type to PATCH and target to OOo 3.4.
Comment 12 samphan 2011-01-05 04:59:34 UTC
The patch works. Is this bug finished?
Comment 13 Martin Hollmichel 2011-03-16 11:42:14 UTC
Set target to 3.x since this is not relevant for the 3.4 release.
Comment 14 Pedro Giffuni 2011-10-28 20:37:44 UTC
FWIW,

This patch is broken on AOOo 3.4:

patching file sw/source/filter/ww8/ww8par2.cxx
Hunk #1 FAILED at 3920.
1 out of 1 hunk FAILED -- saving rejects to file sw/source/filter/ww8/ww8par2.cxx.rej
patching file sw/source/filter/ww8/ww8par6.cxx
Hunk #1 succeeded at 3675 (offset -15 lines).
Hunk #2 succeeded at 5867 (offset -47 lines).
patching file sw/source/filter/ww8/ww8scan.cxx
Hunk #1 succeeded at 5988 (offset -7 lines).
Hunk #2 succeeded at 6033 (offset -7 lines).
patching file sw/source/filter/ww8/ww8scan.hxx
Hunk #1 FAILED at 1464.
1 out of 1 hunk FAILED -- saving rejects to file sw/source/filter/ww8/ww8scan.hxx.rej
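
For a quick summary of which files need manual fixing, a log like the one above can be reduced to its failed hunks. The following is a small Python sketch; failed_hunks is a hypothetical helper, and it assumes the log uses exactly the "patching file ..." and "Hunk #N FAILED at M." line shapes pasted here.

```python
import re


def failed_hunks(patch_output):
    """Map each patched file to the hunk numbers reported as FAILED."""
    failures = {}
    current = None
    for line in patch_output.splitlines():
        m = re.match(r"patching file (\S+)", line)
        if m:
            current = m.group(1)
            failures.setdefault(current, [])
            continue
        m = re.match(r"Hunk #(\d+) FAILED at (\d+)\.", line)
        if m and current is not None:
            failures[current].append(int(m.group(1)))
    # Keep only files that have at least one failed hunk.
    return {name: hunks for name, hunks in failures.items() if hunks}


log = """\
patching file sw/source/filter/ww8/ww8par2.cxx
Hunk #1 FAILED at 3920.
1 out of 1 hunk FAILED -- saving rejects to file sw/source/filter/ww8/ww8par2.cxx.rej
patching file sw/source/filter/ww8/ww8par6.cxx
Hunk #1 succeeded at 3675 (offset -15 lines).
Hunk #2 succeeded at 5867 (offset -47 lines).
patching file sw/source/filter/ww8/ww8scan.cxx
Hunk #1 succeeded at 5988 (offset -7 lines).
Hunk #2 succeeded at 6033 (offset -7 lines).
patching file sw/source/filter/ww8/ww8scan.hxx
Hunk #1 FAILED at 1464.
1 out of 1 hunk FAILED -- saving rejects to file sw/source/filter/ww8/ww8scan.hxx.rej
"""

for name, hunks in failed_hunks(log).items():
    print(name, "failed hunks:", hunks)
```

For this log the summary names ww8par2.cxx and ww8scan.hxx, each with hunk 1 rejected.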
Comment 15 Pedro Giffuni 2011-10-29 19:05:07 UTC
Committed with minor fixes to the patch:

Sending        sw/source/filter/ww8/ww8par2.cxx
Sending        sw/source/filter/ww8/ww8par6.cxx
Sending        sw/source/filter/ww8/ww8scan.cxx
Sending        sw/source/filter/ww8/ww8scan.hxx
Transmitting file data ....
Committed revision 1194975.

Thanks!