Issue 78498 - WW8: imported TOC empty, due to paragraphs having "outline level" property
Summary: WW8: imported TOC empty, due to paragraphs having "outline level" property
Status: CLOSED FIXED
Alias: None
Product: Writer
Classification: Application
Component: open-import
Version: OOo 2.2
Hardware: All All
Importance: P3 Normal with 2 votes
Target Milestone: 4.1.1
Assignee: Oliver-Rainer Wittmann
QA Contact:
URL:
Keywords: ms_interoperability
Depends on:
Blocks: 62681 67302 83951 93630 94525 102628 107029 108094
Reported: 2007-06-14 18:48 UTC by aulagne
Modified: 2017-05-20 08:18 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: 4.1.0
Developer Difficulty: ---
jsc: 4.1.1_release_blocker+


Attachments
sample (28.50 KB, application/msword)
2014-06-16 02:50 UTC, Clarence GUO
a semi-fixpatch (555 bytes, patch)
2014-06-16 08:39 UTC, Clarence GUO
clarence.guo.bj: review?

Description aulagne 2007-06-14 18:48:57 UTC
Hi,
I'm trying to open a .doc file (created with MS Word 2003).
Everything looks fine, except that the table of contents is empty.

It looks as if the indexes haven't been recognized (but I'm new to this
product, so maybe there is something I'm missing, and I don't know what).

I've searched the issue database but can't find a solution. Can someone help me?
Because the document is confidential, may I send it to an email address?

Thanks in advance
Comment 1 michael.ruess 2007-06-15 08:10:57 UTC
Yes, please send the document to mru@openoffice.org. We here at Sun will of
course handle such documents confidentially. Thanks for supporting us!
Comment 2 michael.ruess 2007-06-15 14:34:24 UTC
MRU->OD: The TOC of the imported doc is empty because the heading paragraphs
all use the style "Normal" ("Default" in OOo) but have an outline level
assigned via a hard paragraph attribute.
This issue is a good reminder for the "Outline level" CWS.
Comment 3 aulagne 2007-06-15 17:04:36 UTC
Thanks for your help.

I tried to correct the problem by changing the paragraph style (Default --> TOC level 1),
but without success: when I choose Update Indexes, nothing appears.
As I told you, I'm just starting to work with OOo.

Is there a way to make it work?

Thanks again
Comment 4 Clarence GUO 2014-06-16 02:50:41 UTC
Created attachment 83567 [details]
sample

The problem still exists in AOO 4.1. I created a sample that reproduces it.
The sample contains three paragraphs: the first two were assigned outline levels 1 and 2 in MS Office, the third was not.
Open the sample file in AOO and update the indexes: the TOC entries are lost. This is because the outline levels are lost during parsing. If you check the outline level in the paragraph attributes, you will find that all paragraphs are set to body text.
If you assign an outline level to a paragraph via the paragraph properties (Outline & Numbering) and then create a TOC, a paragraph that has only an outline level, with no numbering or heading settings, is inserted into the TOC. This is the same behavior as in MS Word, so the AOO core already supports it.
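
The expected behavior described above can be illustrated with a minimal Python sketch. This is not AOO code: the `Paragraph` class, the `build_toc` function, the convention that 0 means "body text", and the paragraph texts are all assumptions made for illustration only.

```python
from dataclasses import dataclass

BODY_TEXT = 0  # assumption: 0 stands for "Body text", 1..10 for outline levels


@dataclass
class Paragraph:
    text: str
    style: str = "Default"
    outline_level: int = BODY_TEXT


def build_toc(paragraphs, max_level=10):
    """Collect (level, text) for every paragraph that has an outline level.

    The paragraph style is deliberately ignored: an outline level alone is
    enough for a paragraph to appear in the TOC.
    """
    return [
        (p.outline_level, p.text)
        for p in paragraphs
        if BODY_TEXT < p.outline_level <= max_level
    ]


# The three sample paragraphs as defined in Word (texts are placeholders).
as_authored = [
    Paragraph("First", outline_level=1),
    Paragraph("Second", outline_level=2),
    Paragraph("Third"),
]

# What the import effectively produced: same paragraphs, outline levels dropped.
as_imported = [Paragraph(p.text, p.style) for p in as_authored]

toc_expected = build_toc(as_authored)
toc_after_import = build_toc(as_imported)
```

With the outline levels intact the TOC has two entries; once the import drops them, the TOC is empty even though the TOC logic itself is unchanged.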
Comment 5 Clarence GUO 2014-06-16 05:30:50 UTC
I think the priority "trivial" is too low for this defect.
I have come across many documents in which paragraphs were assigned an outline level but no heading style or numbering. Such paragraphs can be inserted into a TOC.
The AOO core also supports this scenario, but the WW8 parser fails to parse it.
Because I have seen many documents with this issue, I am raising the priority from trivial to normal.
Comment 6 Clarence GUO 2014-06-16 08:39:39 UTC
Created attachment 83568 [details]
a semi-fixpatch

The patch partially fixes the problem, but the result is not satisfactory. I am posting it here to get input and discussion from others, because I have very little knowledge in this area.
Without the fix, none of the three paragraphs in the sample gets an outline level.
With the fix, debugging showed that the code was called only twice, applying the outline level attribute to two text nodes. But when I checked the outline level in the paragraph attributes, all three paragraphs had one. I haven't found any other place that sets this attribute.
If the wrong value was introduced by my code, why are three paragraphs affected when my code sets the attribute for only two text nodes?
If it was introduced by other code, why does none of the paragraphs get an outline level without my code?
Is my fix correct? Maybe I'm wrong. Could anybody advise me on this?
Comment 7 Oliver-Rainer Wittmann 2014-06-17 08:06:21 UTC
(In reply to Clarence GUO from comment #6)

The fix is the right one.
However, Writer behaves as follows when a new paragraph is inserted: the new paragraph gets its attributes from the previous one. These attributes are generally reset when a paragraph style is applied, and the WW8 import always applies a paragraph style. But certain attributes are kept when a paragraph style is applied, and the outline level attribute is one of them.
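
The interaction described above can be modeled with a small Python sketch. It is only an illustration under stated assumptions, not Writer's actual implementation: the `Paragraph` class, `insert_after`, `import_paragraphs`, and the `KEPT_ON_STYLE_CHANGE` set are hypothetical names, and the three records stand in for the sample's paragraphs.

```python
class Paragraph:
    # Assumption: attributes that survive applying a paragraph style.
    KEPT_ON_STYLE_CHANGE = {"outline_level"}

    def __init__(self, attrs=None):
        self.style = "Default"
        self.attrs = dict(attrs or {})

    def apply_style(self, style):
        """Apply a style and reset hard attributes, except the kept ones."""
        self.style = style
        self.attrs = {
            key: value
            for key, value in self.attrs.items()
            if key in self.KEPT_ON_STYLE_CHANGE
        }


def insert_after(prev):
    """A new paragraph starts with a copy of its predecessor's attributes."""
    return Paragraph(prev.attrs)


def import_paragraphs(records):
    """records: list of (style_name, outline_level or None) in document order."""
    doc = []
    prev = Paragraph()
    for style, level in records:
        para = insert_after(prev)
        para.apply_style(style)  # the import always applies a style
        if level is not None:    # the patch sets the attribute only when present
            para.attrs["outline_level"] = level
        doc.append(para)
        prev = para
    return doc


# Two paragraphs with outline levels, one without, all of style "Normal".
doc = import_paragraphs([("Normal", 1), ("Normal", 2), ("Normal", None)])
levels = [p.attrs.get("outline_level") for p in doc]
```

In this model the attribute is set explicitly only twice, yet the third paragraph ends up with level 2 inherited from its predecessor, which matches the observation in comment 6.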

I am working on a solution.
Comment 8 Oliver-Rainer Wittmann 2014-06-17 11:16:10 UTC
The general cause of the defect is that the outline level property of paragraphs and paragraph styles is not imported.

The provided patch imports the outline level property for paragraphs. But due to an insufficiency in Writer's core model, the imported outline level is passed on to the next paragraph (see previous comment).

Solutions in progress:
- Remove the code that keeps the paragraph's outline level attribute when a paragraph style is applied. Currently, I do not see any reason why this code was introduced.
- WW8 import: Apply the outline level read for a paragraph style to that paragraph style. Currently this outline level is only used internally by the WW8 import.
- WW8 import: Tweak the outline level import code a little.
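
The first two proposed changes can be sketched in Python as follows. This is a simplified model under stated assumptions, not the committed fix: `Style`, `Paragraph`, `effective_outline_level`, and `import_paragraphs` are hypothetical names, and the fallback from a paragraph's hard attribute to its style's outline level is an assumption about how the two values combine.

```python
from dataclasses import dataclass, field


@dataclass
class Style:
    name: str
    outline_level: int = 0  # assumption: 0 stands for "Body text"


@dataclass
class Paragraph:
    style: Style
    attrs: dict = field(default_factory=dict)

    def apply_style(self, style):
        """Apply a style and reset all hard attributes, outline level included."""
        self.style = style
        self.attrs.clear()

    def effective_outline_level(self):
        """Hard attribute wins; otherwise fall back to the style's level."""
        return self.attrs.get("outline_level", self.style.outline_level)


def import_paragraphs(records, styles):
    """records: list of (style_name, outline_level or None) in document order."""
    doc = []
    prev_attrs = {}
    for style_name, level in records:
        # The new paragraph still inherits its predecessor's attributes ...
        para = Paragraph(styles["Default"], dict(prev_attrs))
        # ... but applying the style now resets them completely.
        para.apply_style(styles[style_name])
        if level is not None:
            para.attrs["outline_level"] = level
        doc.append(para)
        prev_attrs = para.attrs
    return doc


styles = {
    "Default": Style("Default"),
    "Normal": Style("Normal"),
    "Heading 1": Style("Heading 1", outline_level=1),  # level stored on the style
}
records = [("Normal", 1), ("Normal", 2), ("Normal", None), ("Heading 1", None)]
levels = [p.effective_outline_level() for p in import_paragraphs(records, styles)]
```

Here the third paragraph no longer inherits level 2 from its predecessor, and the fourth gets its level from the paragraph style alone.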
Comment 9 Clarence GUO 2014-06-18 01:53:03 UTC
Great, Oliver! Thanks for your input.
Looking forward to your solution.
Comment 10 Oliver-Rainer Wittmann 2014-06-20 07:14:32 UTC
The proposed solutions are in progress.
Further improvements and corrections regarding outline level and outline numbering in the WW8 import are also sensible and in progress. Thus, it will take some more time to address the complete area.
Comment 11 SVN Robot 2014-06-27 12:27:56 UTC
"orw" committed SVN revision 1606055 into trunk:
78498: Do not keep OutlineLevel attribute at paragraph when a Paragraph Style...
Comment 12 SVN Robot 2014-06-27 12:34:11 UTC
"orw" committed SVN revision 1606061 into trunk:
78498:  WW8 import - improvements/corrections regarding outline level & Co
Comment 13 Oliver-Rainer Wittmann 2014-06-27 12:41:16 UTC
fixed in trunk
Comment 14 SVN Robot 2014-07-01 15:20:58 UTC
"orw" committed SVN revision 1607111 into trunk:
78498:  WW8 import - some further improvements and corrections regarding outl...
Comment 15 Oliver-Rainer Wittmann 2014-07-01 15:25:06 UTC
Integrated some further improvements and corrections regarding outline levels for the WW8 import.

These changes more or less solve the following issues: 62681, 67302, 83951, 93630, 94525, 107029, 102628
Comment 16 Oliver-Rainer Wittmann 2014-07-07 09:12:32 UTC
This issue is not really a release blocker, but the changes I made contain general corrections and improvements to the import of Microsoft Word documents in the binary file format (WW8 import) in the area of outline levels. See also the list of dependent issues (62681, 67302, 83951, 93630, 94525, 102628, 107029, 108094), which are also fixed (or improved).

With some support on the QA side (testing the WW8 import in general), I would like to see these changes in our planned 4.1.1 release.
Comment 17 jsc 2014-07-07 11:20:19 UTC
grant showstopper flag
Comment 18 SVN Robot 2014-07-08 15:04:18 UTC
"orw" committed SVN revision 1608819 into branches/AOO410:
78498: Do not keep OutlineLevel attribute at paragraph when a Paragraph Style...
Comment 19 SVN Robot 2014-07-08 15:08:10 UTC
"orw" committed SVN revision 1608825 into branches/AOO410:
78498:  WW8 import - improvements/corrections regarding outline level & Co
Comment 20 SVN Robot 2014-07-08 15:16:41 UTC
"orw" committed SVN revision 1608829 into branches/AOO410:
78498:  WW8 import - some further improvements and corrections regarding outl...
Comment 21 Oliver-Rainer Wittmann 2014-07-08 15:18:12 UTC
integrated fixes on branch AOO410
Comment 22 Rekha S 2014-07-20 19:55:37 UTC
Verified fixed on AOO 4.1.1 m2 - Windows 8
I'm able to import any doc file with a table of contents into Writer, and the TOC is now shown after import.


Thanks

Rekha
Comment 23 Rekha S 2014-07-20 19:56:51 UTC
This bug can be marked verified.
Comment 24 fanyuzhen 2014-07-23 17:48:51 UTC
Thanks Rekha! Mark it as Verified / Fixed based on Rekha's check result on Windows 8 in comment 22