Issue 120790 - importing .doc, only boxes/links/tables are displayed, text is missing
Summary: importing .doc, only boxes/links/tables are displayed, text is missing
Status: CONFIRMED
Alias: None
Product: Writer
Classification: Application
Component: open-import
Version: 3.4.1
Hardware: All Linux, all
Importance: P3 Normal
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-01 15:51 UTC by Franta Hanzlik
Modified: 2014-05-20 08:45 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
.doc file which OOO import without text (76.00 KB, application/msword)
2012-09-01 15:51 UTC, Franta Hanzlik
Resaved from Microsoft Word 2010 (79.00 KB, application/msword)
2012-09-01 16:10 UTC, Regina Henschel

Description Franta Hanzlik 2012-09-01 15:51:32 UTC
Created attachment 79273
.doc file which OOO import without text

When opening this MS Word .doc document, only table lines, box outlines and horizontal lines are displayed; all text fields in the document are missing.
The document statistics report 1 page, and all other items (number of tables/images/OLE objects/paragraphs/words/characters/lines) are 0.

The "file" command reports the following for this document:
Composite Document File V2 Document, Little Endian, Os: Windows, Version 5.1, Code page: 1250, Title: Likvida, Subject: Tiskova sestava BYZNYS Win/VR, Author: UNIT PLUS s.r.o., Template: Normal, Last Saved By: jsvabova, Revision Number: 2, Name of Creating Application: Microsoft Word 9.0, Create Time/Date: Mon Jun 18 22:37:00 2012, Last Saved Time/Date: Mon Jun 18 22:37:00 2012, Number of Pages: 1, Number of Words: 42, Number of Characters: 240, Security: 0
(in reality, the document contains about 150 words and roughly 1100 characters)
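
The "file" output above identifies the attachment as a little-endian Compound Document File (the OLE2 container used by binary .doc files). As a minimal sketch, assuming only the publicly documented fixed header layout of that container, the same check can be done in Python. The function name `parse_cfb_header` is hypothetical, and the code does not read any Word streams or text; it only confirms the container type and reports the header fields.

```python
import struct

# Signature found in the first 8 bytes of every Compound Document File.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def parse_cfb_header(data: bytes) -> dict:
    """Parse the fixed fields at the start of a compound file header.

    Layout assumed here: 8-byte signature, 16-byte CLSID, then five
    little-endian 16-bit fields starting at offset 24 (minor version,
    major version, byte-order mark, sector shift, mini sector shift).
    """
    if len(data) < 34:
        raise ValueError("need at least 34 header bytes")
    if data[:8] != OLE_MAGIC:
        raise ValueError("not a Compound Document File")
    minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<HHHHH", data, 24
    )
    return {
        "major_version": major,
        "minor_version": minor,
        # The mark is stored as bytes FE FF, which reads as 0xFFFE.
        "little_endian": byte_order == 0xFFFE,
        "sector_size": 1 << sector_shift,
        "mini_sector_size": 1 << mini_shift,
    }
```

For example, passing the first 512 bytes of the attached file (read in binary mode) to `parse_cfb_header` should return a dictionary of header fields rather than raise `ValueError`, if the attachment really is the container type that "file" reports.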

These documents open without problems in MS Office, and the text fields are also shown by the catdoc utility.
It may be important that the documents are software-generated, probably using some MS Office SDK.
Comment 1 Regina Henschel 2012-09-01 16:10:57 UTC
Created attachment 79274
Resaved from Microsoft Word 2010

I confirm that the text fields are missing.

I opened the file in Microsoft Word 2010 and saved it without changing anything. That is the attached file. The re-saved file opens in AOO 3.4.1 with the text fields included.

It is remarkable that the whole document contains no normal text; all content is either in text fields or is a shape.
Comment 2 Regina Henschel 2012-09-01 17:21:02 UTC
My debug build of r1378069 reports multiple times:

Error: Assertion failed
==================
FILE      :  c:/AOO_2012_07_git/trunk/main/sw/source/filter/ww8/ww8graf.cxx at line 2588
ERROR :  "Where is the Shape ?"

There are no error messages when opening the re-saved file.
Comment 3 Franta Hanzlik 2014-05-20 08:45:36 UTC
Just tried it in Apache OO 4.1.0, and (after nearly two years) there is no change: the text is still missing.
Of the other applications I tried, only Calligra Words gives some text output, and it is far from ideal. But unlike AOO, it at least prints some text, although slightly corrupted.