Issue 119341

Summary: Worse than Linear Performance Degradation with Tracked-Changes Growth
Product: performance Reporter: orcmid <orcmid>
Component: www Assignee: AOO issues mailing list <issues>
Status: CONFIRMED --- QA Contact:
Severity: Normal    
Priority: P3 CC: fanyuzhen, hdu, issues, jsc, kschenk
Version: AOO 3.4.0 Flags: jsc: 4.0.0_release_blocker-
Target Milestone: not determined   
Hardware: All   
OS: All   
Issue Type: DEFECT Latest Confirmation in: ---
Developer Difficulty: ---
Attachments:
Description | Flags
ODF Spreadsheet with timing measurements of the documents on different platforms | none
Performance Metering of AOO 3.4 opening WD03c | none
Added tests of current releases | none

Description orcmid 2012-05-12 19:24:28 UTC
Created attachment 77537 [details]
ODF Spreadsheet with timing measurements of the documents on different platforms

SUMMARY

There is a worse-than-linear degradation in document opening and saving
performance as additional tracked changes accumulate over a progression of
draft changes to an original document.

At some point, the degradation is so bad that a user is likely to assume that
the software has hung and is failing to open the document.  On machines slower
than the one the document was created on, the delay can be hours rather than
minutes.

TEST DOCUMENTS

There are five test documents, WD03a, WD03b, WD03c, WD04a, and WD03x.

They are all available here:
<http://tools.oasis-open.org/version-control/svn/oic/TestSuite/trunk/odf12/ChangeTrackingResilience/>.

If you want to know what WD03c is supposed to look like, there is a PDF
available here:
<http://www.oasis-open.org/apps/org/workgroup/office/document.php?document_id=45946>.
It is a large file, but it opens quickly in Acrobat.

To know what the 223 tracked changes are, you can also check Section 2 of the
smaller file available here:
<http://www.oasis-open.org/apps/org/workgroup/office/document.php?document_id=45936>.
It is an ODF Text (.ODT) file.

DEMONSTRATION OF THE WORSE-THAN-LINEAR DEGRADATION

The defect is demonstrated by timed opening of 4 documents that have an
increasing number of tracked changes.

 * WD03a is 476kB and has 169 changes.  It opens in around 15 seconds on a
   fast system.

 * WD03b is 746kB and the number of changes is raised to 207.  It takes a
   few minutes to open the document (roughly 16x as long as WD03a on a fast
   system).

 * WD03c is 1,132kB and has 223 changes.  It takes roughly 4x as long to open
   as WD03b.  On a slower Windows XP SP3 x86 system, it takes more than an hour
   to open the document.

 * WD04a is 1,343kB although it has no more tracked changes and was only
   updated enough to start a new working draft set.  Yet it is about 200kB
   larger and takes almost twice as long to open as WD03c.  On the slowest
   system used, it takes 2.5 hours.

 THE SPREADSHEET (attached) provides timing statistics and the different 
 configurations and software releases on which measurements were
 captured.
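
 As a rough cross-check of the "worse-than linear" claim, the figures in the
 list above can be turned into apparent growth exponents.  This is only a
 sketch: the open times for WD03b and WD03c are not measured values but are
 derived from the stated ratios (16x and 4x), the 15-second base is the "fast
 system" figure, and the names used (samples, apparent_exponent, report) are
 made up for illustration.  Linear growth would give an exponent of 1.

```python
import math

# Figures from the list above.  Times for WD03b and WD03c are ASSUMED from the
# stated ratios (16x and 4x), so they are rough estimates, not measurements.
samples = [
    # (name, size in kB, tracked changes, approx. open time in seconds)
    ("WD03a", 476, 169, 15.0),
    ("WD03b", 746, 207, 15.0 * 16),
    ("WD03c", 1132, 223, 15.0 * 16 * 4),
]


def apparent_exponent(x0, t0, x1, t1):
    """Return k such that t1 / t0 == (x1 / x0) ** k."""
    return math.log(t1 / t0) / math.log(x1 / x0)


def report(rows):
    """Compare each document with the next one in the progression."""
    out = []
    for (n0, s0, c0, t0), (n1, s1, c1, t1) in zip(rows, rows[1:]):
        k_changes = apparent_exponent(c0, t0, c1, t1)  # time vs. change count
        k_size = apparent_exponent(s0, t0, s1, t1)     # time vs. file size
        out.append((n0, n1, k_changes, k_size))
    return out


for a, b, k_changes, k_size in report(samples):
    print(f"{a} -> {b}: exponent vs. changes {k_changes:.1f}, "
          f"vs. file size {k_size:.1f}")
```

 Under these assumptions every exponent comes out well above 1, whether time
 is compared against the number of changes or against file size.  The values
 also differ from step to step, so neither measure alone fits a simple power
 law; the spreadsheet timings remain the actual data.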

NOTE: This bug appears to be related to #29842 which appears to have been around since OpenOffice.org 1.1.
Comment 1 orcmid 2012-05-12 19:25:25 UTC
 THE ODF SPREADSHEET provides more data points with regard
 to the timings and the different configurations and software releases tested.

 For Apache OpenOffice only the 3.4.0 released binaries were used in these
 timings.

 When available, I provide timing tests with OpenOffice.org 3.3.0 as well.
 This is to provide a baseline and confirm that the problem has existed since
 at least that release of OpenOffice.org.

NOTE: THE MEASURED TIMINGS ARE NOT SUITABLE FOR COMPARISONS BETWEEN PRODUCTS.
  These timings were determined manually with a stop watch.  The conditions
  were not carefully controlled and the typical variances related to
  configuration differences, background activity, and system state are too
  high.
     The sole purpose of the timings is to demonstrate that the degradation
  of performance is consistent and predictable across all OpenOffice-
  lineage software.  The variance between releases is negligible compared to
  the major source of degradation.

CONFIGURATIONS

Astraendo is a Dell XPS 9100 with Windows 7 en_US x64, 18GB RAM, and an
Intel i7-980x 3.33GHz 6-core processor.

Quadro is a Toshiba Satellite Tablet PC with Windows XP SP3, 1.5GB RAM, and
an Intel Pentium M (Celeron) 1.7GHz processor.

Win8CP64 is Windows 8 Consumer Preview en_US x64 running in VirtualBox on
Astraendo.

Zorin Core is Zorin OS (Core, Debian/Ubuntu) x64 running in VirtualBox on
Astraendo.

Zorin Edu is Zorin OS (Edu, Debian/Ubuntu) x86 running in VirtualBox on
Astraendo.

SPECIAL CASES

Close crash means that there was a crash on closing the document in the
application, with a report that the software had not closed properly and
work might have been lost.  The document was fine (it had not been touched)
but the lock file was still present in the file system.
Comment 2 orcmid 2012-05-12 20:01:06 UTC
This defect is apparently in common code inherited from OpenOffice.org in both
LibreOffice and Apache OpenOffice: 
https://bugs.freedesktop.org/show_bug.cgi?id=49848

It appears that symptoms of this problem have been identified as far back as
OpenOffice.org 1.0:
https://issues.apache.org/ooo/show_bug.cgi?id=29842
There are potentially multiple defects behind these, ones related to
change-tracking on open (as well as autosave and save but not quite as slow)
and to bloating of the .ODT file for no apparent reason.

All of the ODF Text documents, WD03a, ..., WD04a were produced with LibreOffice
3.3.2, the software being used for an ODF maintenance activity (no changes to
the software are made during the project to avoid regressions).

The original document with which editing began is the ODF Text for the OASIS
OpenDocument Format 1.1 Standard.  This document was produced by generator 
"StarOffice/8$Solaris_Sparc OpenOffice.org_project/680m5$Build-9114". (There were no tracked changes in that document.)
Comment 3 orcmid 2012-05-12 21:43:01 UTC
Created attachment 77538 [details]
Performance Metering of AOO 3.4 opening WD03c

This is a Zip containing four PNG images.

The images are of the Performance pane of the Windows 8 Consumer Preview x64 before file open started (idle), when it had just begun (starting), while it was proceeding (bursting), and as it finished (done).

These were taken with Win8CP running in a VirtualBox with 1 physical core, 4 logical cores.  On the meter, 100% means all 4 logical cores fully in use.  As you can see, most of the work was done on a single core, which was often "pegged" for sustained periods.

Of other possible interest is the change in numbers of processes running, the number of threads, and the number of handles.
Comment 4 Kay 2012-07-22 19:58:28 UTC
Memory leak issues?
Comment 5 hdu@apache.org 2012-07-23 18:43:25 UTC
No, the problem here is not a memory leak but an inefficient container and many walks over a great part of the data. Unfortunately there was no easy and risk-free fix for it, and IIRC the interfaces used were part of the set-in-stone UNO-API, which doesn't really help.
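
For illustration only, the general pattern described here (a container whose every update walks a large part of the existing data) can be sketched as follows. This is not the AOO code, and this report does not identify the actual container; the function names and the use of a plain sorted list are assumptions chosen to show why the cost grows faster than the number of entries.

```python
import bisect


def insert_with_full_walk(positions):
    """Keep a sorted list, finding each insertion point by walking from the start.

    Returns the list and the number of comparison steps taken.
    """
    table, steps = [], 0
    for pos in positions:
        i = 0
        while i < len(table) and table[i] < pos:
            i += 1
            steps += 1  # one step per existing entry walked over
        table.insert(i, pos)
    return table, steps


def insert_with_bisect(positions):
    """Same result, but each insertion point is found by binary search."""
    table = []
    for pos in positions:
        bisect.insort(table, pos)
    return table


# Doubling the number of entries roughly quadruples the walking work.
_, steps_100 = insert_with_full_walk(range(100))
_, steps_200 = insert_with_full_walk(range(200))
print(steps_100, steps_200)
```

With entries arriving in ascending order, the walking version takes n(n-1)/2 steps for n entries, so the work grows quadratically. The binary-search version produces the same list with far fewer comparisons, although a fix in the real code would also have to respect the existing interfaces.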
Comment 6 jsc 2013-07-03 08:47:40 UTC
Removing the showstopper request because I don't see that we will get a fix in time.

It requires a major refactoring that takes more time.
Comment 7 orcmid 2015-01-05 23:20:11 UTC
Created attachment 84378 [details]
Added tests of current releases

I received an automated QA ping from LibreOffice asking if the problem persists.  Of course it does.

I reconfirmed the problem with AOO 4.1.1 and LibreOffice 4.3.5.2.

The uploaded file adds the informal results of those runs.  They should not be taken as accurate, absolute timings, but they are useful for a relative determination of the non-linear behavior.