Issue 71030

Summary: Performance: The page-formatting algorithm should trigger less often and settle quickly
Product: Writer Reporter: raindrops <na1000>
Component: formatting Assignee: AOO issues mailing list <issues>
Status: CONFIRMED --- QA Contact:
Severity: Trivial    
Priority: P3 CC: issues, jian.li
Version: OOo 2.0.4   
Target Milestone: ---   
Hardware: All   
OS: All   
Issue Type: ENHANCEMENT Latest Confirmation in: ---
Developer Difficulty: ---

Description raindrops 2006-10-31 09:53:54 UTC
The page formatting algorithm starts working as soon as a saved document is
opened. It shifts some of the content to other pages.

But even then OOo does not consider the document changed! (The "save" button is still
grayed out.) 

As a result, if you reopen a large document without saving, each time the
algorithm has to work from scratch. ALL the work done by the algorithm in the
previous session is wasted.

Correct behavior:
As soon as the layout is changed, OOo should regard the document as changed.
This ensures that if you open the document in the next session, it will settle
far more quickly.
Comment 1 raindrops 2006-10-31 09:57:57 UTC
Assigned to OD (as previously arranged).
Also see Issue 70268.
Comment 2 Oliver-Rainer Wittmann 2006-10-31 11:32:31 UTC
Because OpenOffice.org stores by default its documents in the OpenDocument file
format and the OpenDocument file format doesn't contain layout information in
general, this issue will not be "fixed". BTW, documents will look different on
different machines, depending on which of the fonts used are available and
which printer is in use. Thus, the layout of an office document can't be constant.
There exist other formats (e.g. PDF) for this purpose.
Comment 3 Oliver-Rainer Wittmann 2006-10-31 11:33:01 UTC
closing
Comment 4 raindrops 2006-11-01 06:40:46 UTC
I went through the ODF specs and some other documentation:

http://en.wikipedia.org/wiki/OpenDocument_technical_specification
http://netmoc.cpe.ucf.edu/Projects/OpenDocument/TestSuite.html
http://en.wikipedia.org/wiki/OpenDocument

And I remain convinced of my earlier statement.

Let me support that with a few usability issues:

Point-1:
--------
The XML file is clearly supposed to contain all text and paragraph properties. 

So, when the file is opened in another OS (or a different PC, where OOo has
different default settings), the default settings of the second system MUST NOT
be applied to a file.

In the first place, the default settings are for creating a NEW file; not for
applying to someone else's file.

If someone makes a document in "A4 size, Landscape" format and sends it to me,
can my OOo automatically change it to "A5 size, Portrait" just because that is
MY default? 

Similarly, if someone has made a document in Times New Roman size 12, I CANNOT
change it to Arial size 10, because that is my default! 

Unless I edit the document specifically, ALL the properties must remain as per
the original author's settings.

The only exception is some rare fonts, which may not be installed in the target
system. In that case OOo may try to substitute the original font with a
near-equivalent. (MS Office does that). MS Office also offers another solution
by embedding fonts in the document itself, so that even if the target system
does not have those fonts, it would manage nicely. I don't know if OOo has this
feature; the Help does not mention it.

So, if a document is fully defined in terms of text, paragraphs, tables,
margins, etc., how does the algorithm have so much flexibility to change the
actual layout by so much?? 

And even if there are infinite possible solutions, why does it accept all of
them? That is NOT what the user wants!

This is like saying that if I drew "Mona Lisa" in OOo, it could become "Julia
Roberts" tomorrow, and "Madonna" the day after; because they share the same
attributes. The only thing I am assured of is that they will NEVER have a
mustache (or will they?).

Point-2:
--------
I do not agree that even the currently selected printer should have influence on
a document's layout.

Suppose I have printers with different capabilities (this can happen in LANs).
Assume that the default printer is a black-and-white printer. 

Why should OOo refuse to create a document in color? Would it mean I CANNOT use
colors in fonts, tables, etc? 

OK, here is a subtler example. Suppose my default printer is an envelope printer.
Would that mean that OOo will not let me open an A4 document because I cannot
print it with the default printer? 

What about the fact that I can change the printer afterwards? At this moment I
may have several printers connected on the LAN, with the flexibility to choose
any one of them.

Only print preview should adapt to the specific printer (or its mode); or give
me warnings if some features (e.g. color, margins) cannot be used with the
currently selected printer.

Point-3:
--------
The "Tools" menu of OOo provides distinct controls to update (a) fields,
(b) indexes/tables and (c) page formatting. There is a SEPARATE menu option to
"update all". It means that options (a), (b) and (c) listed above are mutually
exclusive.

In other words, when OOo updates the page formatting, it would NOT update the
other things. So when it shifts some text by 4-5 pages, many of the pointers in
the fields and indexes/tables turn wrong! (I have verified this with my TOC just
now, before posting this.)
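
A toy model of this effect, assuming each paragraph occupies a fixed number of
lines and a page holds a fixed number of lines. The function name and all
numbers are invented for illustration; this is not how Writer implements
fields or indexes.

```python
def paginate(paragraph_lines, lines_per_page):
    """Return the 1-based page on which each paragraph starts."""
    start_pages = []
    line = 0
    for count in paragraph_lines:
        start_pages.append(line // lines_per_page + 1)
        line += count
    return start_pages


LINES_PER_PAGE = 40  # hypothetical page capacity

# Layout at the moment the index was last updated.
paragraphs = [30, 30, 30, 30]
toc = paginate(paragraphs, LINES_PER_PAGE)          # snapshot stored in the index

# Page formatting runs again and the first paragraph now needs more lines.
paragraphs[0] = 55
actual = paginate(paragraphs, LINES_PER_PAGE)

stale = toc != actual                               # index no longer matches layout
toc_after_update = paginate(paragraphs, LINES_PER_PAGE)  # models "Update all"
```

In this model the index stays wrong from the moment the layout shifts until
the snapshot is taken again.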

In fact, thanks to its continuous adjustments even as the user is working, most
of the fields and indexes/tables ARE ALMOST ALWAYS WRONG, except for a few
moments when the user has manually used the "Tools>Update all" command.

If I take a print, the text may say something like "see page-xx for details",
and that will be on a different page altogether! 

The only way to avoid this mess is to update ALL, and fire a print immediately.
If I delay the printing (looking for paper, etc), I am back to square one.

And god help me if I forget to update all manually!

Is THAT what we want??
***
To sum up, the current version of OOo is suited ONLY for small/medium documents
(e.g. a letter). But it is a minefield for large documents or book
manuscripts.

(I know this discussion spans a few of my recent bugs; not this bug alone. The
solution may not lie in "saving the file" alone. That's why I have mentioned
that all these bugs need to be considered together; not in isolation.)

Reopening.
Comment 5 Oliver-Rainer Wittmann 2006-11-01 07:45:31 UTC
OD->raindrops:
Sorry, there was a misunderstanding.
The formatting attributes are stored in the OpenDocument file format and will
not change from one system to another one.
But the metrics of the fonts used will differ from one system to another,
especially if the document is formatted using different printers, because
different font metrics are then used. The formatting algorithm applies
these font metrics to the characters to calculate their size on the current
system. This size is part of the output of the formatting algorithm and can't be
stored in the OpenDocument file format. Because this size can differ from one
system to another, a line break can fall at a different word on different
systems. This will cause different layouts on different systems.
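
A minimal sketch of this effect, assuming a greedy line breaker and one fixed
width per character (both are simplifications; the function and the numbers
are hypothetical and are not Writer's formatter). The same text and the same
line width give different line breaks once the character width changes.

```python
def break_lines(words, char_width, line_width):
    """Greedy line breaking: keep adding words while they fit on the line."""
    lines, current, used = [], [], 0
    space = char_width  # assumption: a space is as wide as any other character
    for word in words:
        width = len(word) * char_width
        needed = width if not current else used + space + width
        if current and needed > line_width:
            # The word does not fit: close the line and start a new one.
            lines.append(" ".join(current))
            current, used = [word], width
        else:
            current.append(word)
            used = needed
    if current:
        lines.append(" ".join(current))
    return lines


text = "the layout of an office document depends on font metrics".split()

# Same text, same line width (200 units), two hypothetical font metrics.
lines_a = break_lines(text, char_width=10, line_width=200)
lines_b = break_lines(text, char_width=13, line_width=200)
```

With the wider metrics the first line ends one word earlier and the text needs
one more line, which is how a small metric difference can add up to a page
shift in a long document.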

Thus, this issue will still not be fixed.
Comment 6 Oliver-Rainer Wittmann 2006-11-01 07:46:04 UTC
closing again.
Comment 7 raindrops 2006-11-01 08:54:45 UTC
I think your diagnosis is not complete: 

The algorithm you are talking about applies to a very limited case; valid ONLY
IF the same document is seen with different systems and/or different printers.
But what if those factors are not changed? Then an UNEDITED document should be
adjusted only once and then it should remain unchanged.

But here, I am using the same PC with the same printer; and yet the document
changes its form by a few pages; AND KEEPS changing forever, even though I am
not editing it. What triggers it?

So the culprit is clearly a different algorithm. As you said elsewhere, several
methods are responsible for this. I think we are looking at the wrong one, then!

And if another algorithm is changing the document, will it not be able to save
it in one of the document's properties?

Therefore I am not satisfied with the logic here.

Till the offending algorithm is identified, this issue must not be closed;
because nobody will look at it again.
Comment 8 Oliver-Rainer Wittmann 2006-11-01 09:06:19 UTC
OD->raindrops:
I think you are mixing this issue with issue 71028.

In this issue you want the results of the layout algorithm to be stored in
the OpenDocument file format. This won't be done, because of the OpenDocument
file format and my statements made above.
Thus again, closing this issue as WONTFIX.
Comment 9 Oliver-Rainer Wittmann 2006-11-01 09:06:55 UTC
closing again.
Comment 10 raindrops 2006-11-01 09:36:06 UTC
I do understand your statement about not being able to save the output (because
the ODF architecture does not allow it). 

I would compare this with a hypothetical organization that has a policy of
storing only the source files in CVS; so that executables can be made whenever
required. As a matter of policy, the executables themselves may not be stored.

***
The "save results" proposal was to address two things: (a) the algorithm does
heavy work every time (lasting for a long time) and (b) It keeps triggering even
in a static document, without any apparent reason. 

We have concluded that saving the results of each iteration is not allowed. Yet
there is still no reason to close this bug entirely, because the original problems
are still there. Note that Issue 71028 does NOT talk about ANY of these issues.

The summary of the bug could be changed to drop the keyword "save".

What do you suggest?
Comment 11 Oliver-Rainer Wittmann 2006-11-01 10:01:55 UTC
OD->raindrops:
The OpenDocument file format is the open standard for office documents. It has
been standardized by OASIS and it has been accepted by ISO as a
standard. The purpose of a standard is that applications using it do *NOT* each
extend it with their own proprietary stuff. Thus, the OOo Writer
cannot store its own formatting results in the OpenDocument file format.

BTW, the OpenDocument file format is the standard mainly for the content of
office documents. Other file formats (e.g. PDF) exist for layout-preserving
documents.
Comment 12 raindrops 2006-11-01 10:29:23 UTC
raindrops -> OD
I understood that. That's why I wrote in my last post that I'll drop the
suggestion for "saving". 

Please ignore the save results suggestion, and see if the following two problems
should be addressed in a different way:
(a) the algorithm does heavy work every time (lasting for a long time) and 
(b) It keeps triggering even in a static document, without any apparent reason. 

I have changed the summary now. See if that suits better.

Thanks.
Comment 13 Oliver-Rainer Wittmann 2006-11-02 09:10:18 UTC
OD->raindrops:

Okay, now we are talking about the performance of the layout algorithm.

After the import of a document the layout has to be built up. The work for the
layout algorithm depends on the nature and the size of the content. E.g. formatting
tables that break into several parts on consecutive pages takes more time
than formatting paragraphs. This task has to be
performed every time after a document has been imported. Thus, I see no way to
reduce this work for a given document.

Currently, directly after the import the visible part of the document is
formatted and then painted on the screen. The rest of the document is formatted
in the background, when no other action is running in the application. We call
this the "idle formatting". Because there are other background tasks running, the
idle formatting is interrupted several times, even if no user action has to be
performed. This part is under investigation, and I want to improve the idle
formatting by giving it a higher priority than the other running tasks.
Thus, we will save some time, because the idle formatting will not have to be
set up again so often.
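
A rough simulation of the idea, not OOo's actual scheduler. It assumes time
advances in ticks, that a lower number means a higher priority, that a
background task of higher priority preempts the formatter for one tick, and
that the formatter must be set up again after each interruption. All names
and numbers are hypothetical.

```python
def simulate_idle_formatting(total_pages, pages_per_tick,
                             formatter_priority, background):
    """Return (setups, ticks) needed to format the whole document.

    background maps a tick number to the priority of a background task that
    becomes ready at that tick. Lower number means higher priority.
    """
    formatted = 0
    setups = 0
    running = False  # True while the formatter keeps its state between ticks
    tick = 0
    while formatted < total_pages:
        other = background.get(tick)
        if other is not None and other < formatter_priority:
            # A more important task takes this tick; the formatter loses its state.
            running = False
        else:
            if not running:
                setups += 1  # (re)build the formatter state
                running = True
            formatted += pages_per_tick
        tick += 1
    return setups, tick


# Three background tasks of priority 5 become ready at ticks 2, 5 and 8.
background = {2: 5, 5: 5, 8: 5}

low = simulate_idle_formatting(100, 10, formatter_priority=9, background=background)
high = simulate_idle_formatting(100, 10, formatter_priority=1, background=background)
```

In this model the low-priority formatter is set up four times and needs 13
ticks, while the high-priority formatter is set up once and needs 10. The
model simply drops the deferred background tasks, so it says nothing about
what the higher priority costs them.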

Does this fit your intention?
If yes, I will reopen this issue as an enhancement to improve the performance of
the layout algorithm.
Comment 14 raindrops 2006-11-02 16:48:44 UTC
Yes, that's right.

I have sent you some suggestions offline. Thanks!
Comment 15 raindrops 2007-03-27 18:08:18 UTC
As suggested by OD in the second-last post, I am reopening this issue.
Comment 16 Oliver-Rainer Wittmann 2007-03-30 07:19:12 UTC
OD->raindrops: Thank you. I forgot to reopen it.
Now I am also adjusting the issue type according to the same post. The target
also has to be set - I will figure out which target is realistic. Stay tuned.
Comment 17 Oliver-Rainer Wittmann 2007-04-02 11:21:43 UTC
I'll add this issue to my OOo 2.x issue queue.
It's sensible to work on this issue on the way to OOo 3.0, but currently I can't
give a deadline for this issue.
Comment 18 Mathias_Bauer 2007-12-03 14:48:18 UTC
according to release status meeting -> 3.x
Comment 19 michael.ruess 2009-06-25 16:31:43 UTC
*** Issue 103071 has been marked as a duplicate of this issue. ***
Comment 20 Marcus 2017-05-20 11:17:37 UTC
Reset assignee to the default "issues@openoffice.apache.org".