Issue 96567 - I cannot convert an MS Word document to HTML and back to doc.
Summary: I cannot convert an MS Word document to HTML and back to doc.
Status: CLOSED FIXED
Alias: None
Product: Writer
Classification: Application
Component: save-export
Version: OOo 3.0
Hardware: All All
Importance: P2 Trivial with 3 votes
Target Milestone: ---
Assignee: michael.ruess
QA Contact: issues@sw
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2008-11-25 08:04 UTC by nassaja
Modified: 2013-08-07 14:44 UTC
CC List: 5 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
This original file, which I convert to html (35.50 KB, application/msword)
2008-11-25 08:06 UTC, nassaja
This is output html file, which I can't convert back to doc file (17.26 KB, text/html)
2008-11-25 08:07 UTC, nassaja
This is linux console log (2.87 KB, text/plain)
2008-11-25 08:11 UTC, nassaja

Description nassaja 2008-11-25 08:04:59 UTC
I use a popular Java application, JODConverter
(http://www.artofsolving.com/opensource/jodconverter), to convert between
document types.

I can successfully convert doc to html, but I cannot convert this html back to
doc. OO hangs or exits during the conversion. I have tried JODConverter, Docvert
and pyooconverter, all without success.

However, I can convert this HTML to odt, and then from odt to doc.
Comment 1 nassaja 2008-11-25 08:06:26 UTC
Created attachment 58275
This original file, which I convert to html
Comment 2 nassaja 2008-11-25 08:07:19 UTC
Created attachment 58276
This is output html file, which I can't convert back to doc file
Comment 3 nassaja 2008-11-25 08:11:46 UTC
Created attachment 58277
This is linux console log
Comment 4 michael.ruess 2008-11-25 13:56:54 UTC
It is possible. When an HTML document is opened without a preselected filter, it
is opened as a web page. You have to open it via the File - Open dialog. There,
set "HTML document (OpenOffice Writer)" as the filter. The HTML will then be
opened as a Writer document, and it will be possible to save it in MS Word format.
Comment 5 michael.ruess 2008-11-25 13:59:27 UTC
closed.
Comment 6 nassaja 2008-11-25 21:33:53 UTC
You misunderstood me.

I do not use the GUI; I use headless mode with JODConverter. This is on Ubuntu
Linux, with OpenOffice 3.0 from the Ubuntu repo.

In headless mode I try to convert this file following a doc->html->doc scheme.
It works only for doc->html; on html->doc my headless OpenOffice hangs or exits
with an error.

I think this is because the document contains a table. When I remove the table
from the html file by hand, it converts normally.

Sorry for my English.
Comment 7 nassaja 2008-11-25 21:36:03 UTC
A small addition: in GUI mode everything works fine. In headless mode OO hangs
or exits abnormally when it tries to convert this html document to doc. But it
can convert this html to odt.
Comment 8 eric.savary 2008-11-26 15:52:12 UTC
JSK: Please have a look
Comment 9 nassaja 2008-11-27 21:42:36 UTC
And in general, headless mode is far from stable.

In headless mode OO hangs about once a day for me. I have to work around this
with various props: daemontools for automatic restarts, a Perl script that
checks the OO service every 10 minutes, and a cron job that restarts OO once a
day.

Sometimes strange things happen. For example, today OO was running and answered
network queries, but would not convert documents. JODConverter produced a
strange error. After I restarted OO manually, everything worked fine.
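
A periodic check like the one described above could be sketched as follows. This is only a minimal illustration, not the actual Perl script: the function name `office_port_is_open` is hypothetical, and the default host and port are assumptions taken from the `-accept` string used elsewhere in this issue. Note that it only detects a listener that is gone or not accepting connections; it would not catch the case above where OO answers on the network but refuses to convert, which would need a small test conversion instead.

```python
import socket


def office_port_is_open(host="127.0.0.1", port=8100, timeout=5.0):
    """Return True if something accepts TCP connections on host:port.

    Assumption: soffice was started with an -accept string that makes it
    listen on this host and port.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A cron job or supervisor could call this function and restart soffice whenever it returns False.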
Comment 10 joerg.skottke 2008-11-28 09:21:36 UTC
-> PL.

TM suggested that you could be the right one to have a look at this. Neither TM
nor I think that this issue is our business. However, I'd like to be sure.

Comment 11 philipp.lohmann 2008-11-28 09:43:29 UTC
No idea at all, however since I'm not even remotely involved in doc or html
filters nor in the java bridge nor in the framework I feel a little out of sorts.

pl->cd: I don't know whether this is actually a framework problem, but perhaps
you know someone who might feel responsible for the code involved?
Comment 12 nassaja 2008-11-28 14:55:22 UTC
How to reproduce this error:

1. Today I use Ubuntu 8.10 x86_64 with vanilla OO 3.0 from this site, but I
can reproduce this error on Ubuntu x86 too.

2. Download the converter (jodconverter-2.2.1.zip) from
http://www.artofsolving.com/opensource/jodconverter (direct link to SourceForge:
http://sourceforge.net/project/downloading.php?group_id=91849&use_mirror=dfn&filename=jodconverter-2.2.1.zip&23833455
) into a test folder.

3. Unzip the converter.

4. Download the test file
http://www.openoffice.org/nonav/issues/showattachment.cgi/58275/original.doc to
the test folder.

5. Start OO with network support:
soffice -nofirststartwizard -norestore -accept="socket,host=127.0.0.1,port=8100;urp;"
or in headless mode without an X server:
soffice -nofirststartwizard -norestore -headless -accept="socket,host=127.0.0.1,port=8100;urp;"

6. Convert the test file to html:
java -jar ./jodconverter-2.2.1/lib/jodconverter-cli-2.2.1.jar original.doc original.html

7. Check that there is a new original.html file in the test folder.

8. Convert the html back to doc:
java -jar ./jodconverter-2.2.1/lib/jodconverter-cli-2.2.1.jar original.html test.doc

9. Baah! The OO window closes unexpectedly. The converter raises "Exception in
thread "main" com.artofsolving.jodconverter.openoffice.connection.OpenOfficeException:
conversion failed: could not save output document".

10. Restart OO as in step 5.

11. Convert html to odt (this succeeds):
java -jar ./jodconverter-2.2.1/lib/jodconverter-cli-2.2.1.jar original.html test.odt

12. Convert odt to doc (this succeeds too):
java -jar ./jodconverter-2.2.1/lib/jodconverter-cli-2.2.1.jar test.odt result.doc
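
The two working conversions in steps 11 and 12 can be chained as a workaround. The sketch below is a rough illustration only: the helper names `build_commands` and `convert_via_odt` are hypothetical, the jar path is the one used in the steps above, and it assumes OO is already listening as in step 5.

```python
import os
import subprocess

# Path of the JODConverter CLI jar as used in the steps above.
JOD_JAR = "./jodconverter-2.2.1/lib/jodconverter-cli-2.2.1.jar"


def build_commands(html_path, doc_path, jar=JOD_JAR):
    """Return the two command lines for html -> odt -> doc."""
    # The intermediate odt file is named after the final output file.
    odt_path = os.path.splitext(doc_path)[0] + ".odt"
    return [
        ["java", "-jar", jar, html_path, odt_path],
        ["java", "-jar", jar, odt_path, doc_path],
    ]


def convert_via_odt(html_path, doc_path, jar=JOD_JAR):
    """Run both conversions in order; stop at the first failure."""
    for cmd in build_commands(html_path, doc_path, jar):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)
```

For example, `convert_via_odt("original.html", "result.doc")` would write `result.odt` first and then `result.doc`.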

=====

Question: why can it convert doc to html, but not html to doc?
Comment 13 carsten.driesner 2008-11-28 15:07:13 UTC
cd->sba: Could you please check this issue? As nassaja can reproduce this
issue both with and without headless mode, it is currently not correctly assigned.
Comment 14 stefan.baltzer 2008-11-28 22:15:17 UTC
Put myself and CMC on c/c.
SBA->CMC: Any idea? 
SBA->MRU: After more info rolled in, please see if this is a filter issue, thank
you.
Reassigned to MRU.

Comment 15 michael.ruess 2008-12-01 09:27:32 UTC
OK, it was my fault in the beginning; I did not fully understand the problem.
I can now reproduce the crash in a normal OOo process. One needs to open the
attached .doc, save it as HTML, reopen the HTML, re-export to .doc and close the
document. Then the crash will happen.
The report ID is r8wvkuc.

MRU->SJ: could you please initially have a look? There is SdrModel somewhere on
top of the stack.
I also put OD and HBRINKM onto cc list.
Comment 16 kpalagin 2008-12-02 11:13:19 UTC
If I understand correctly, this is a crash, hence the priority and keyword.
Comment 17 michael.ruess 2008-12-02 11:19:18 UTC
Setting appropriate target for P2 issue.
Comment 18 sven.jacobi 2009-01-16 11:41:55 UTC
sj->mru: We fixed some table problems, so in OOO300m15 and DEV300m38 I am no
longer able to reproduce this issue (OOO300m12 still crashes for me).

Can you please check this?
Comment 19 sven.jacobi 2009-01-16 11:42:59 UTC
changed owner
Comment 20 michael.ruess 2009-01-16 14:03:56 UTC
Yes, I can confirm that this now works correctly in OOO 3.0.1 and the dev 3.1 branch.