Issue 41675

Summary: memory leak while converting
Product: App Dev
Component: api
Reporter: frontline <frontline>
Assignee: AOO issues mailing list <issues>
Status: CONFIRMED
QA Contact: ---
Severity: Trivial
Priority: P3
CC: ahmed.elidrissi.attach, frank.loehmann, gbpacheco, issues, lothar.may, masaya.k, mik373, shenmux09, vinu.kumar
Version: 3.3.0 or older (OOo)
Target Milestone: ---
Hardware: All
OS: All
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---
Attachments:
- File used when testing
- complex testcase which measures memory usage on *nix
- Output of a valgrind session using memCheck (conversion is done 3 times)
- Output of the memCheck session
- Simple program to make it easier to find the leak.

Description frontline 2005-01-31 08:11:08 UTC
I have a Java program which uses OpenOffice to convert files between different
formats.
When I run it, OOo takes more and more memory with each conversion and never frees it.

I have started OpenOffice with (on Linux):
/opt/openoffice.org1.9.74/program/soffice -invisible "-accept=socket,port=8100;urp;"

The Java program is as follows:

import net.sf.joott.uno.*;
import java.io.*;

public class Test {

    public static void main(String[] a) throws Exception {
        UnoConnection conn = DocumentConverterFactory.getConnection();
        DocumentConverter conv = new DocumentConverter(conn);
        File source = new File("test.ppt");
        File dest = new File("test.pdf");
        for (int i = 0; i < 1000; i++) {
            conv.convert(source, dest, DocumentFormat.PDF_IMPRESS);
        }
    }
}

The leak happens with conversions between other formats as well.
Comment 1 stephan.wunderlich 2005-01-31 16:04:19 UTC
I just converted 100 documents from ppt to pdf, always in the sequence

1. open document
2. export document
3. close document

and I had an increase of memory usage of 10k per document. I don't think this can
be called a memory leak ... what am I missing? How much memory is kept during
your conversion? Can you provide a Java program that does all the steps you do,
so that the behaviour can be reproduced?
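For reference, those three steps look like this with the plain UNO Java API. This is a minimal sketch only (not the joott code), assuming a locally running office reachable via Bootstrap.bootstrap() and the standard "impress_pdf_Export" filter name:

import com.sun.star.beans.PropertyValue;
import com.sun.star.frame.XComponentLoader;
import com.sun.star.frame.XStorable;
import com.sun.star.lang.XComponent;
import com.sun.star.lang.XMultiComponentFactory;
import com.sun.star.uno.UnoRuntime;
import com.sun.star.uno.XComponentContext;
import com.sun.star.util.XCloseable;

public class ConvertOnce {
    public static void main(String[] args) throws Exception {
        // Bootstrap a connection to a local office instance.
        XComponentContext ctx = com.sun.star.comp.helper.Bootstrap.bootstrap();
        XMultiComponentFactory mcf = ctx.getServiceManager();
        Object desktop = mcf.createInstanceWithContext("com.sun.star.frame.Desktop", ctx);
        XComponentLoader loader = UnoRuntime.queryInterface(XComponentLoader.class, desktop);

        // 1. open document (hidden)
        PropertyValue[] loadProps = { new PropertyValue("Hidden", 0, Boolean.TRUE, null) };
        XComponent doc = loader.loadComponentFromURL("file:///tmp/test.ppt", "_blank", 0, loadProps);

        // 2. export document to PDF
        PropertyValue[] storeProps = { new PropertyValue("FilterName", 0, "impress_pdf_Export", null) };
        XStorable storable = UnoRuntime.queryInterface(XStorable.class, doc);
        storable.storeToURL("file:///tmp/test.pdf", storeProps);

        // 3. close document
        XCloseable closeable = UnoRuntime.queryInterface(XCloseable.class, doc);
        if (closeable != null) closeable.close(true);
        else doc.dispose();
    }
}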
Comment 2 frontline 2005-01-31 20:06:13 UTC
You can find jooconverter at (you only need the jar in the root of the archive):
http://sourceforge.net/project/showfiles.php?group_id=91849&package_id=114337

(jar too big to attach)

1) Compile the program found in the issue with
javac -classpath jooconverter.jar Test.java

2) Run openoffice as in the issue:
/opt/openoffice.org1.9.74/program/soffice -invisible "-accept=socket,port=8100;urp;"

3) Run the java test program with

java -cp jooconverter.jar:.:/opt/openoffice.org1.9.74/program/classes/juh.jar:
/opt/openoffice.org1.9.74/program/classes/jurt.jar:
/opt/openoffice.org1.9.74/program/classes/ridl.jar:
/opt/openoffice.org1.9.74/program/classes/sandbox.jar:
/opt/openoffice.org1.9.74/program/classes/unoil.jar Test

You need to have a file test.ppt in the current directory.

Notice that you need to exit the program with CTRL-C after the conversions are done.

After converting 1000 times, memory usage for me shows over 400 MB.

I'll attach the ppt I have used.

Thanks for your quick response.
Comment 3 frontline 2005-01-31 20:07:17 UTC
Created attachment 22049 [details]
File used when testing
Comment 4 hol_sten 2005-01-31 20:18:34 UTC
I have the same problem (memory leak) with OOo 1.1.2, 1.1.3, and 1.1.4 on
Windows XP Home, Windows XP Professional and Debian Linux.

I wrote a Java application which uses OOo through the Java UNO API. The Java
application loads an OOo Writer template, replaces several bookmarks, exports the
result as a PDF and closes the document. With every loaded document the used
memory increased and never decreased, although I'm very certain that I freed all
used resources.

My luck was that I repeatedly loaded the same documents. So I made a
modification to my program: I load all documents once, and only then start the
bookmark replacement and the PDF export. With this modification my Java
application uses a constant amount of memory for over 30000 exported PDF
documents.
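In outline, that workaround looks like the following. This is only a rough sketch; loadDocument(), replaceBookmarks(), exportToPdf() and close() are hypothetical helpers standing in for the (unattached) application code:

// Workaround sketch: load the template once, reuse it for every export.
// The helpers are hypothetical stand-ins, not the actual application code.
XComponent template = loadDocument("file:///path/to/template.odt");
for (int i = 0; i < 30000; i++) {
    replaceBookmarks(template, valuesFor(i));              // fill in the bookmarks
    exportToPdf(template, "file:///out/doc" + i + ".pdf"); // store as PDF
}
// The document is closed only once, at the very end, so the per-load
// growth in soffice.bin never accumulates.
close(template);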
Comment 5 frontline 2005-02-01 09:26:57 UTC
I'd like to clarify that I wasn't talking about a leak in my Java program, but
in OpenOffice itself.
Comment 6 stephan.wunderlich 2005-02-01 13:54:50 UTC
Created attachment 22087 [details]
complex testcase which measures memory usage on *nix
Comment 7 stephan.wunderlich 2005-02-01 14:06:18 UTC
The attached complex testcase loads a ppt document, exports it to pdf and then
closes the document again. This is done 5 times. Afterwards it measures the
memory usage with the help of "pmap".

To execute it, unzip the file memoryTest.zip and change to the resulting folder
memoryTest/complex/memCheck. Now

1. call setsolar in your shell
2. call dmake
3. start the office with the parameters 
      -invisible "-accept=socket,port=8100;urp;"
4. call dmake run

The testcase allows 20K of additional memory usage, but src680_m74 consumes
500K per conversion ... so doing 1000 conversions will lead to 500MB.

Remark:
Run the test at least twice, since during the first run the memory usage will
increase dramatically due to the fact that the Impress libraries are loaded into
memory for the first time.
Comment 8 clippka 2005-02-02 09:16:12 UTC
Sven, is that the memory leak Armin just discovered, with the autoshapes not
freeing their SdrObjects?
Comment 9 sven.jacobi 2005-02-02 09:35:43 UTC
sj->aw: I think you already fixed this issue in your CWS; can you please check
whether it is a duplicate?
Comment 10 Armin Le Grand 2005-02-02 14:23:24 UTC
AW->SW: Should be fixed with #i40944# which is in m76, please review there again.
Comment 11 stephan.wunderlich 2005-02-02 15:27:31 UTC
sw->aw: just checked on Solaris with an src680_m76 ... sadly the memory
consumption persists :-( ... so I'd say the mentioned fix doesn't solve the problem
Comment 12 Armin Le Grand 2005-02-02 18:37:26 UTC
AW->SJ: Unfortunately not a duplicate of the fixed one; there are more leaks involved.
Maybe IHA or BM have time for a boundscheck round? Ask KA for that.
Comment 13 sven.jacobi 2005-02-03 10:39:34 UTC
sj->bm: Very nice of you to agree to analyse the problem.
Comment 14 bjoern.milcke 2005-02-11 13:24:33 UTC
I did a valgrind run on Linux (m76) and haven't yet found anything that I think
could cause this leak.
Comment 15 zhou1978 2005-02-12 00:26:12 UTC
Thanks for your confirmation, but I think the bug does exist...
Or is there something wrong with my Java source?
The problem occurs when I try to modify the xsl file or convert it to a PDF file
with the UNO development package.

I find that version 1.9.7.4 may be better than 1.1.X on this issue, and it would be
fantastic if 2.0 solved this problem thoroughly...

Comment 16 bjoern.milcke 2005-02-14 10:32:04 UTC
Created attachment 22577 [details]
Output of a valgrind session using memCheck (conversion is done 3 times)
Comment 17 bjoern.milcke 2005-02-14 10:37:10 UTC
Created attachment 22578 [details]
Output of the memCheck session
Comment 18 bjoern.milcke 2005-02-14 10:49:01 UTC
I didn't find a culprit for the leaks. I attached the 12MB valgrind output
(gzipped) and the output of the memory test program running on a StarOffice
SRC680.m78 (on Linux). The only places that caught my eye were spellchecker
calls. I don't know if these appear only because we lose control over the
memory management at some place.
The output of memCheck is a bit strange insofar as the memory consumption doesn't
seem to increase, but that's probably due to the fact that valgrind is running.

->SJ: Maybe you (or someone else) can find a stack in the log file that may be
related to the conversion process.
Comment 19 bjoern.milcke 2005-02-14 10:55:59 UTC
What I forgot to add: maybe the leaks come from Java. In Java, as you know,
memory is not freed at a specific time; it depends on the garbage collector. If
the Java program keeps references to the XModel of the first document loaded,
and then loads a second document, the first XModel may still exist in memory.
Has anybody checked whether the memory that is not freed is perhaps freed
after the Java program terminates?
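That hypothesis is easy to test from the client side. A sketch of the loop body only, assuming the structure of the test programs attached to this issue; closeDocument() stands in for the XCloseable-based close shown elsewhere in this thread:

// Rule out Java-side retention: drop every reference to the loaded
// document and force a GC cycle before loading the next one.
XComponent doc = loader.loadComponentFromURL(url, "_blank", 0, loadProps);
// ... export/convert here ...
closeDocument(doc);      // close via XCloseable, as in the other snippets
doc = null;              // no lingering reference to the XModel
System.gc();             // ask the collector to release the UNO proxies
System.runFinalization();
// If soffice.bin still grows after this, the retained memory is on the
// office side, not in the Java client.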
Comment 20 ooo 2005-02-14 12:29:53 UTC
retargeted to OOo 2.0.1 after QA approval by MSC
Comment 21 ooo 2005-02-14 12:30:31 UTC
.
Comment 22 frontline 2005-02-14 21:35:01 UTC
Have you tried to run my sample program, or SW's, while you did the valgrind run?
Unfortunately I haven't used valgrind and don't know much C(++), so I can't help here.

It is definitely a leak in OOo, and the memory remains used even after I exit my
Java program.

Does OOo use Java internally for this, and if so, how can I set the Java -Xmx
heap size flags?
Maybe they are set high, and as you probably know, Java will try to eat all the
memory given to it. In that case it wouldn't be a leak, just a configuration matter.
Comment 23 frontline 2005-02-17 20:33:54 UTC
Just to be sure: the output of the memcheck run seems odd, and it reads:
/home/bm/bin/memcheck soffice.bin -invisible -accept=socket,port=8100;urp;

Shouldn't the parameters be in ""? Like:
/home/bm/bin/memcheck soffice.bin -invisible "-accept=socket,port=8100;urp;"
Comment 24 sven.jacobi 2005-05-24 11:58:00 UTC
Because of a too-heavy workload I can't fix this issue for OOo 2.0.1 -> changed
target to OOo Later.
Comment 25 rf99 2006-04-20 15:16:38 UTC
I have a similar problem in OOo Calc using OOo Basic macros
(OOo 2.0.2 / Linux: Slackware 10.2). It happens after about 50 file opens & 50 file
closes. In addition, could this cause a "Signal 11 / SIGSEGV" report from KDE on
OOo exit? Perhaps OOo is somehow trying to release the leaked memory when
quitting? Thanks.
Comment 26 bobharvey 2006-07-16 22:04:33 UTC
I can't believe that, a year and a half on and a couple of releases later, this
is still "OOo Later".
Comment 27 mnasato 2006-07-27 15:53:09 UTC
Being the author of the mentioned JOOConverter
(http://sourceforge.net/projects/joott/) open source project, I've done some
testing myself to make sure the problem was not in my Java code. I'm using OOo
2.0.3 on Linux.

The memory usage increase seems to vary greatly with the type of document
used. For example, ODT to PDF only increases memory by 2.5 KB per conversion,
which may not be a memory leak at all.

However, a PPT to PDF conversion using the document attached to this issue does
produce a significant increase, approx. 1 MB for each conversion!

Just loading and disposing of the document without exporting to PDF results in
almost the same increase, while loading and disposing of the same document in
ODP format shows hardly any increase at all.

So my guess is that most of the leak is caused by the PPT import filter.

However, converting from ODP to PDF also increases memory usage by some 200 KB
per conversion, so there may be a problem with the Impress PDF export filter as well.
Comment 28 nushi7 2008-01-31 16:40:16 UTC
The problem also occurs when you try to convert a ppt to an swf.
Comment 29 maibee 2008-12-08 15:56:01 UTC
I can confirm the issue in OOo version 3.0. Opening doc files leaves about 20 KB
in memory and ppt files about 700 KB (depending on the size of the files).
A fix would be very, very nice :)
Comment 30 mramuta 2008-12-08 17:01:49 UTC
I too have a very difficult time understanding why, after four years, this
defect doesn't seem to be important.
Comment 31 kaplan4 2009-05-21 23:32:31 UTC
I am working on a software application where we use OpenOffice in
server mode. We are running OpenOffice 3.0. We use OpenOffice to extract text
from Microsoft-formatted documents (Word, PowerPoint and Excel). After running
for about an hour (I know, not much of a metric), the OpenOffice server hangs.
By "hang" I mean: all processing stops (the CPU usage is approximately zero), but
the OpenOffice process is still alive (or at least shown in the Windows process
monitor). Our guess is that this problem is due to a memory error.

For anyone using OpenOffice in server mode, this is a pretty critical problem.
Comment 32 mnasato 2009-05-22 00:14:24 UTC
I guess the problem with this sort of bug report - and why they end up getting
stuck in "OOo Later" forever - is that they are a bit too generic.

We (as users) should follow the rule of "one problem, one issue". How can we
expect the OOo team to fix all possible memory leaks for all possible
import/export formats as part of a single issue?

I tried to focus this issue on a specific case, i.e. PPT import, which seemed to
be the biggest culprit, in my comment (some 3 years ago now). But people keep
adding generic complaints about various formats, with the only result that
this issue will never be solved, because it's impossible to solve an ill-defined
problem.

(Incidentally, JODConverter 3.0 for its part now provides a workaround: it
automatically restarts OOo - see http://jodconverter.googlecode.com if interested.)
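That restart strategy is easy to approximate in client code. A rough sketch only, not the actual JODConverter implementation; convert() and restartOffice() are hypothetical stand-ins:

// Bound the leak by recycling the office process every N conversions.
// convert() and restartOffice() are hypothetical stand-ins, not JODConverter API.
static final int MAX_CONVERSIONS_PER_PROCESS = 200;

static void convertAll(List<File> files) throws Exception {
    int done = 0;
    for (File f : files) {
        if (done > 0 && done % MAX_CONVERSIONS_PER_PROCESS == 0) {
            restartOffice();   // kill soffice.bin and start a fresh listener
        }
        convert(f);
        done++;
    }
}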
Comment 33 zeroin23 2009-09-17 02:44:40 UTC
Good job mnasato, haha, though I am using the Java OODaemon instead for
production.

Is it possible to write a unit test case for each input/output pair, so
that it will be easier for the developers to test?
Comment 34 coldwinston 2009-09-24 15:39:59 UTC
Amazing. I've been subscribed to this bug for nearly 4 years now, and I'm still
getting emails about it :)
Comment 35 groverblue 2009-10-22 22:00:10 UTC
Is there any way to monitor soffice.bin for active/unreleased objects? I just
processed about 600 documents and used memory jumped from 80 MB to 190 MB, and
it's not being released.
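There is no object-level monitor exposed by soffice.bin, but the process growth itself is easy to chart. A Linux-only sketch that reads the resident set size from /proc; finding the PID (e.g. with pgrep soffice.bin) is left to the caller:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class RssProbe {
    // Return the resident set size (VmRSS, in kB) of the given PID,
    // or -1 if the field is not found. Linux-only.
    static long residentKb(long pid) throws IOException {
        for (String line : Files.readAllLines(Paths.get("/proc/" + pid + "/status"))) {
            if (line.startsWith("VmRSS:")) {
                // The line looks like "VmRSS:   123456 kB".
                return Long.parseLong(line.replaceAll("\\D+", ""));
            }
        }
        return -1;
    }
}

Logging residentKb(pid) after every conversion shows whether the growth is per-document and whether it survives a System.gc() in the client.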
Comment 36 dpmelo 2009-10-27 12:06:48 UTC
We're experiencing a similar problem. An application converts many .doc files
to ODT format, and at a particular moment the soffice process had 5.2 GB of
resident memory on an 8 GB machine.

We'll start monitoring this problem more closely, but the age of this bug,
almost 5 years, seems really strange to me. Is this category of milestone
(OOo Later) hidden from the OpenOffice team?
Comment 37 bornmw 2010-03-07 18:16:19 UTC
My code opens a simple Calc document with four cells filled and a chart based on
them.
It exports the chart (via XFilter) and closes the document (the right way, as
described at
http://wiki.services.openoffice.org/wiki/Documentation/DevGuide/OfficeDev/Closing_Documents).

After 1000 such operations (less than 10 minutes), soffice.bin grows from 84 MB
to 330 MB.
The workaround that I'm going to use is to kill the OOo process periodically, so
it's ok to mark this issue as FIXED.
Comment 38 bornmw 2010-03-07 18:46:41 UTC
And concerning the comment from mnasato about the issue being too generic:
let's make it clear, this issue is not related to any particular kind of
import/export. OOo leaks even if you just open and close a document - just like
that. The following piece of code will make OOo grow by 100 MB per 2 minutes
(demo.ods is a spreadsheet with four cells and one chart):

import com.sun.star.beans.PropertyValue;
import com.sun.star.frame.XComponentLoader;
import com.sun.star.lang.XComponent;
import com.sun.star.lang.XMultiComponentFactory;
import com.sun.star.uno.UnoRuntime;
import com.sun.star.uno.XComponentContext;
import com.sun.star.util.CloseVetoException;
import com.sun.star.util.XCloseable;

public class LeakDemo {

  public static void main(String[] args) throws Exception {
    // Bootstrap a connection to a local office instance.
    XComponentContext xCompContext = com.sun.star.comp.helper.Bootstrap.bootstrap();
    XMultiComponentFactory xMCF = xCompContext.getServiceManager();
    Object xDesktop = xMCF.createInstanceWithContext("com.sun.star.frame.Desktop", xCompContext);
    XComponentLoader aLoader = UnoRuntime.queryInterface(XComponentLoader.class, xDesktop);
    PropertyValue[] loadProps = { new PropertyValue("Hidden", 0, true, null) };
    while (true) {
      // Load the document hidden and close it again immediately.
      XComponent xComponent = aLoader.loadComponentFromURL("file:///tmp/demo.ods", "_blank", 0, loadProps);
      close(xComponent);
    }
  }

  private static void close(XComponent xComponent) {
    // Close via XCloseable if the document supports it; otherwise dispose.
    XCloseable xCloseable = UnoRuntime.queryInterface(XCloseable.class, xComponent);
    if (xCloseable != null) {
      try {
        xCloseable.close(true);
      } catch (CloseVetoException e) {}
    } else {
      xComponent.dispose();
    }
  }
}
Comment 39 thierry_thelliez 2010-11-03 20:23:27 UTC
I just hit this bug while using a small Python-based script found at:

http://www.linuxjournal.com/content/convert-spreadsheets-csv-files-python-and-pyuno-part-1v2

When converting Excel files to CSV, memory usage slowly increases. The server
crashes after a few hundred conversions.
Comment 40 frank.loehmann 2010-12-17 08:55:59 UTC
cc myself
Comment 41 ahmed.elidrissi.attach 2011-03-02 22:57:03 UTC
Same problem for me.
The difference is that my conversion method can convert more than 43 000 documents before OOo crashes.
Hope this bug will be fixed soon....

Moulay
Comment 42 mik373 2011-03-10 18:16:33 UTC
Is there any way you can show us the bulk of the Java code (if it is Java) and what it is you are disposing so successfully that you can convert 43 000? I have 5 OO instances and I can only convert about 2300 in total before all the instances lock up or crash. The ones that do lock up use over 2 GB of memory.
 
(In reply to comment #41)
> Same problem for me.
> The difference is my conversion method can convert more than 43 000 documents
> before OOo crashes.
> Hope that bug will be fixed soon....
> 
> Moulay
Comment 43 Lothar May 2014-08-05 16:29:07 UTC
In response to the comments by Bjoern Milcke on 2005-02-14:
some leak occurs even if each relevant OpenOffice call in Java is followed by
System.gc();
System.runFinalization();
I verified this by using uno_dumpEnvironment(logFile, binaryUno_.get(), 0); in bridge.cxx; the object count in the bridge is constant between opening documents.

Also, I can confirm that a leak occurs just by creating and closing text documents in a Java loop (with gc and runFinalization) using the URP UNO bridge.

In order to find this bug, I'd suggest running valgrind not for 3 conversions (as was done in the attached valgrind logs), but for example for 100 or even more conversions. This should make it a lot easier to find the leak, as the relevant loss records would probably be repeated.

Could someone possibly do this? I have compiled AOO on Windows and do not have a corresponding tool available for Windows.
Comment 44 Lothar May 2014-08-06 10:55:48 UTC
Created attachment 83796 [details]
Simple program to make it easier to find the leak.

I'm attaching a very simple Java program to help hunt down this issue.
All it does is create and close OpenOffice text documents in a loop, calling the garbage collector/finalizer to ensure that things are properly cleaned up on the Java side.
The program takes one argument, which is the number of loops (i.e. the number of times a text document is created and closed). If you run it with argument "1000" or "10000" you will note the increase in memory usage just from creating and closing documents.
In order to run this program, you need the "Nice Office Access" jars (license: LGPL)
http://ubion.ion.ag/loesungen/004niceofficeaccess
and add them to the classpath.
Additionally, add
java_uno.jar;juh.jar;jurt.jar;officebean.jar;ridl.jar;unoil.jar;unoloader.jar
to the classpath.
You may need to adjust the OpenOffice path:
final String officeInstallDir = "C:\\Program Files (x86)\\OpenOffice 4";

I'm sorry I could not attach a complete jar file containing all the libs, because of the size restrictions. Feel free to request it via email.
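
For readers without the NOA jars, the core of the test is easy to reproduce with the plain UNO API. A minimal sketch only, not the attached program:

import com.sun.star.beans.PropertyValue;
import com.sun.star.frame.XComponentLoader;
import com.sun.star.lang.XComponent;
import com.sun.star.lang.XMultiComponentFactory;
import com.sun.star.uno.UnoRuntime;
import com.sun.star.uno.XComponentContext;
import com.sun.star.util.XCloseable;

// Minimal sketch (not the attached NOA-based program): create and close
// empty Writer documents in a loop, cleaning up the Java side each time.
public class CreateCloseLoop {
    public static void main(String[] args) throws Exception {
        int loops = Integer.parseInt(args[0]);
        XComponentContext ctx = com.sun.star.comp.helper.Bootstrap.bootstrap();
        XMultiComponentFactory mcf = ctx.getServiceManager();
        Object desktop = mcf.createInstanceWithContext("com.sun.star.frame.Desktop", ctx);
        XComponentLoader loader = UnoRuntime.queryInterface(XComponentLoader.class, desktop);
        PropertyValue[] props = { new PropertyValue("Hidden", 0, Boolean.TRUE, null) };

        for (int i = 0; i < loops; i++) {
            // "private:factory/swriter" creates a new, empty text document.
            XComponent doc = loader.loadComponentFromURL(
                    "private:factory/swriter", "_blank", 0, props);
            XCloseable closeable = UnoRuntime.queryInterface(XCloseable.class, doc);
            if (closeable != null) closeable.close(true);
            else doc.dispose();
            // Clean up the Java side so any remaining growth is in soffice.bin.
            System.gc();
            System.runFinalization();
        }
    }
}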