Issue 21116 - Numbering corrupted after export to HTML
Summary: Numbering corrupted after export to HTML
Status: ACCEPTED
Alias: None
Product: Writer
Classification: Application
Component: code
Version: OOo 1.1
Hardware: PC All
Importance: P3 Trivial
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords: oooqa
Depends on:
Blocks:
 
Reported: 2003-10-13 12:14 UTC by tuharsky
Modified: 2013-08-07 14:38 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Original document (compressed bz2) (21.79 KB, application/octet-stream)
2003-10-13 12:18 UTC, tuharsky
Exported to HTML 3.2 (10.93 KB, application/octet-stream)
2003-10-13 12:24 UTC, tuharsky
Word document with a numbering hierarchy (19.00 KB, application/octet-stream)
2003-11-03 11:39 UTC, eric.savary
Another short example for problem study (8.77 KB, application/octet-stream)
2003-11-28 07:31 UTC, tuharsky
More complex example, bz2-ed doc and html (29.81 KB, application/octet-stream)
2003-11-28 07:38 UTC, tuharsky
SXW document -another test case. (14.78 KB, application/vnd.sun.xml.writer)
2006-01-31 14:29 UTC, tuharsky
HTML export -numbering doesn't follow original document from paragraph 2 (24.60 KB, text/html)
2006-01-31 14:30 UTC, tuharsky

Description tuharsky 2003-10-13 12:14:11 UTC
If I open a DOC or RTF document and save it as HTML, the numbering is incorrect.
The original document is always displayed with correct numbering, so this is
probably an export filter error rather than an import one. I have set HTML 3.2
as the default HTML export type, and I disabled font settings to keep gray
fields from being exported. I'll post demonstration files.
Comment 1 tuharsky 2003-10-13 12:18:51 UTC
Created attachment 10259
Original document (compressed bz2)
Comment 2 tuharsky 2003-10-13 12:24:41 UTC
Created attachment 10261
Exported to HTML 3.2
Comment 3 mci 2003-10-16 14:21:41 UTC
reassigned to es
Comment 4 utomo99 2003-11-01 04:01:56 UTC
I can reproduce the problem on
OpenOffice 1.1 (default install, US), Win XP Pro SP1
(and MS Office XP SP2).
It is a real problem.

When exporting to HTML, the numbering on page 2 is all wrong.
It is an export filter problem.

Changing OS to All; reproducible on Win XP.
Comment 5 stefan.baltzer 2003-11-03 09:52:37 UTC
.
Comment 6 eric.savary 2003-11-03 11:37:57 UTC
The sample document has had wrong number formatting from the beginning.
It uses simple numbering, while the structure of the document would need
outline numbering (a defined hierarchy of numbering levels).
Indeed, one can notice that the "levels" 1, 2, 3... and a), b), c)...
are not levels of the *same* hierarchy but are each "level 1" of different
hierarchies. When saving to HTML, OOo then applies the default format of
level 1 to each corresponding level.
So the effect:
1
2
1
2
3
is pretty normal
Comment 7 eric.savary 2003-11-03 11:38:19 UTC
ES->MIB
Still, a correct hierarchy like:
1.
  a.
    i.
will be saved as:
1.
  1.
    i.
in HTML 3.2 when the source document is a *.doc or *.rtf (see
attachment below). This is wrong: HTML 3.2 knows the OL TYPE attribute.

http://www.w3.org/TR/REC-html32#ol
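
To illustrate the expected output, here is a minimal Python sketch (not OOo's
export filter code) that writes a 1. / a. / i. hierarchy as nested HTML 3.2
ordered lists, choosing the OL TYPE value from the nesting level. The function
name render_ol, the (text, children) input format, and the LEVEL_TYPES table
are assumptions made for illustration only.

```python
from html import escape

# Assumed level-to-TYPE mapping for a 1. / a. / i. hierarchy.
# HTML 3.2 defines the OL TYPE values 1, a, A, i and I.
LEVEL_TYPES = ["1", "a", "i"]


def render_ol(items, level=0):
    """Render a list of (text, children) tuples as nested HTML 3.2 lists."""
    # Levels deeper than the table reuse the last entry.
    ol_type = LEVEL_TYPES[min(level, len(LEVEL_TYPES) - 1)]
    pad = "    " * level
    lines = [f'{pad}<OL TYPE="{ol_type}">']
    for text, children in items:
        # HTML 3.2 allows the LI end tag to be omitted.
        lines.append(f"{pad}  <LI>{escape(text)}")
        if children:
            lines.append(render_ol(children, level + 1))
    lines.append(f"{pad}</OL>")
    return "\n".join(lines)


outline = [("First", [("Sub-item", [("Detail", [])])])]
markup = render_ol(outline)
print(markup)
```

With this input the second level is written as OL TYPE="a" rather than falling
back to decimal numbering, which is the behaviour this comment asks for.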
Comment 8 eric.savary 2003-11-03 11:39:51 UTC
Created attachment 10906
Word document with a numbering hierarchy
Comment 9 michael.brauer 2003-11-04 11:37:51 UTC
The numbering a, b, c, aa, bb, cc, ... is not mapped to a, b, c, though
it could be.
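
A small Python sketch of how far such a mapping would hold. It assumes that the
HTML renderer continues a TYPE="a" list after z as aa, ab, ac, ... (as common
browsers do; HTML 3.2 itself does not spell this out), and that the Writer
scheme repeats the letter (aa, bb, cc, ...). Both function names are
hypothetical.

```python
import string

LETTERS = string.ascii_lowercase


def doubled_letter_label(n):
    """Label for item n (1-based) in the a, ..., z, aa, bb, cc, ... scheme."""
    index = (n - 1) % 26
    repeat = (n - 1) // 26 + 1
    return LETTERS[index] * repeat


def alphabetic_label(n):
    """Label for item n (1-based) in the a, ..., z, aa, ab, ac, ... scheme."""
    label = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        label = LETTERS[rem] + label
    return label


# First item number at which the two schemes produce different labels.
first_mismatch = next(
    n for n in range(1, 1000)
    if doubled_letter_label(n) != alphabetic_label(n)
)
print(first_mismatch, doubled_letter_label(first_mismatch),
      alphabetic_label(first_mismatch))
```

Under these assumptions the two schemes agree up to item 27 (aa) and first
differ at item 28 (bb versus ab), so the mapping is exact for lists of ordinary
length.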
Comment 10 tuharsky 2003-11-28 07:31:41 UTC
Created attachment 11609
Another short example for problem study
Comment 11 tuharsky 2003-11-28 07:38:03 UTC
Created attachment 11610
More complex example, bz2-ed doc and html
Comment 12 tuharsky 2004-05-05 06:51:42 UTC
On OOo 1.1.2, bug still present.
Comment 13 tuharsky 2006-01-31 14:29:50 UTC
Created attachment 33737
SXW document -another test case.
Comment 14 tuharsky 2006-01-31 14:30:59 UTC
Created attachment 33738
HTML export -numbering doesn't follow original document from paragraph 2
Comment 15 ersalo 2008-07-31 23:04:46 UTC
More on this:
-- affects also current versions of OOo
-- there is a similar problem with RTF saves
-- yet, the HTML issue seems to lie on the save-export side, whereas the RTF
issue is likely on the open-import side
...so probably they should be reported separately, but I am not the expert and
this *old* issue deserves some attention...
>>> swriter, ODT source:
1. save as HTML (both 3.2 and 4.1 transitional), close, open HTML:
1.1. OOo 2.4.1: numbering misses the Before and After text (e.g. "Section 1 - "
gives "1. "). Same if opening in web browser.
1.2. OOo 3 beta (build 9328): numbering misses the Before and After text (e.g.
"Section 1 - " gives "1. "). Same if opening in web browser.
2. save as RTF, close, open RTF:
2.1. OOo 2.4.1: strange text ('Left Page;Right Page;Envelope;Endnote', etc.)
appears at the beginning of the document, and numbering misses spaces at the end
of Before text (e.g. "Section 1" reads "Section1"). None of this applies if
opening in MS WordPad.
2.2. OOo 3 beta (build 9328): strange text ('Left Page;Right
Page;Envelope;Endnote', etc.) appears at the beginning of the document, and
numbering is completely lost (a blank paragraph appears if no other text
followed). None of this applies if opening in MS WordPad.
Comment 16 michael.brauer 2008-08-04 07:50:45 UTC
HTML does not support prefixes and suffixes for numberings, so
Comment 17 ersalo 2008-08-04 11:14:35 UTC
If the numbering format cannot be kept when saving from ODT to HTML, sometimes
one would prefer keeping the numbering structure, but sometimes one would prefer
keeping the appearance (as when saving to TXT: in that case, since it is not
possible to retain the numbering internals, the choice is clear).

I am afraid there is no one 'best' choice --if you think there is, it's because
you haven't been on the other side.
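
The two options can be sketched in Python as follows. This is only an
illustration of the trade-off, not how Writer's HTML export works; the function
names and the before/after parameters (standing in for the Before and After
text of the numbering) are hypothetical.

```python
from html import escape


def export_structure(items):
    """Keep the list structure; any Before/After text is dropped."""
    body = "\n".join(f"  <LI>{escape(text)}" for text in items)
    return f"<OL>\n{body}\n</OL>"


def export_appearance(items, before="", after=""):
    """Keep the visible label as literal text; the list structure is lost."""
    return "\n".join(
        f"<P>{escape(before)}{n}{escape(after)}{escape(text)}</P>"
        for n, text in enumerate(items, start=1)
    )


headings = ["Introduction", "Scope"]
structured = export_structure(headings)
literal = export_appearance(headings, before="Section ", after=" - ")
print(structured)
print(literal)
```

The first variant yields a real ordered list that a browser numbers as "1.",
"2."; the second yields paragraphs reading "Section 1 - Introduction" and so on,
which look right but are no longer a list.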

So perhaps this raises the matter of the warning about format loss when
exporting to other document formats. Instead of a generic warning, couldn't it
be more specific and detailed? Perhaps with links to help pages with still more
info, possibly tricks to bypass some limitations, etc.