Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing

| Summary: | memory leak while converting | | |
|---|---|---|---|
| Product: | App Dev | Reporter: | frontline <frontline> |
| Component: | api | Assignee: | AOO issues mailing list <issues> |
| Status: | CONFIRMED | QA Contact: | |
| Severity: | Trivial | | |
| Priority: | P3 | CC: | ahmed.elidrissi.attach, frank.loehmann, gbpacheco, issues, lothar.may, masaya.k, mik373, shenmux09, vinu.kumar |
| Version: | 3.3.0 or older (OOo) | | |
| Target Milestone: | --- | | |
| Hardware: | All | | |
| OS: | All | | |
| Issue Type: | DEFECT | Latest Confirmation in: | --- |
| Developer Difficulty: | --- | | |

Attachments:
Description
frontline
2005-01-31 08:11:08 UTC
I just converted 100 documents from ppt to pdf, always in the sequence 1. open document, 2. export document, 3. close document, and I had an increase in memory usage of 10k per document.

I don't think this can be called a memory leak ... what am I missing? How much memory is kept during your conversion? Can you provide a Java program that does all the steps you do, so that the behaviour can be reproduced?

You can find jooconverter at (you need just the jar in the root of the archive): http://sourceforge.net/project/showfiles.php?group_id=91849&package_id=114337 (jar too big to attach)

1) Compile the program found in the issue with: javac -classpath jooconverter.jar Test.java
2) Run OpenOffice as in the issue: /opt/openoffice.org1.9.74/program/soffice -invisible "-accept=socket,port=8100;urp;"
3) Run the Java test program with: java -cp jooconverter.jar:.:/opt/openoffice.org1.9.74/program/classes/juh.jar:/opt/openoffice.org1.9.74/program/classes/jurt.jar:/opt/openoffice.org1.9.74/program/classes/ridl.jar:/opt/openoffice.org1.9.74/program/classes/sandbox.jar:/opt/openoffice.org1.9.74/program/classes/unoil.jar Test

You need to have a file test.ppt in the current directory. Note that you need to exit the program with CTRL-C after the conversions are done. After converting 1000 times, memory usage for me shows over 400Mb. I'll attach the ppt I have used. Thanks for your quick response.

Created attachment 22049 [details]
File used when testing
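The reproduction above boils down to a loop of load, export to PDF, and close over the UNO bridge. The sketch below shows two self-contained helpers such a loop needs: converting a local path into the `file:///` URL that `loadComponentFromURL` expects, and selecting the PDF export filter name per application module (these are the stock OOo filter names). The loop itself is only indicated in comments, since running it needs the UNO jars and a live office process; this is an illustrative outline, not the reporter's actual Test.java.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class ConvertHelpers {
    // loadComponentFromURL() accepts only URLs, never plain file paths.
    static String toFileUrl(String path) {
        Path p = Paths.get(path).toAbsolutePath();
        return p.toUri().toString(); // e.g. file:///tmp/test.ppt
    }

    // Stock OOo PDF export filter names, keyed by application module.
    static String pdfExportFilter(String module) {
        switch (module) {
            case "writer":  return "writer_pdf_Export";
            case "calc":    return "calc_pdf_Export";
            case "impress": return "impress_pdf_Export";
            case "draw":    return "draw_pdf_Export";
            default: throw new IllegalArgumentException("unknown module: " + module);
        }
    }

    // The reproduction loop itself (requires UNO jars and a running soffice):
    //   for (int i = 0; i < 1000; i++) {
    //       XComponent doc = loader.loadComponentFromURL(
    //           toFileUrl("test.ppt"), "_blank", 0, loadProps);
    //       // storeToURL(..., "FilterName" = pdfExportFilter("impress")), then close doc
    //   }
}
```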
I have the same problem (memory leak) with OOo 1.1.2, 1.1.3, and 1.1.4 on Windows XP Home, Windows XP Professional and Debian Linux. I wrote a Java application which uses OOo through the Java UNO API. The Java application loads an OOo Writer template, replaces several bookmarks, exports the result as a PDF and closes the document. With every loaded document the used memory increased and never decreased, although I'm very certain that I freed all used resources.

My luck was that I repeatedly loaded the same documents. So I made a modification to my program: I load all documents once, and only then start the bookmark replacement and the PDF export. With this modification my Java application uses a constant amount of memory for over 30000 exported PDF documents.

I'd like to clarify that I wasn't talking about a leak in my Java program, but in OpenOffice.

Created attachment 22087 [details]
complex testcase which measures memory usage on *nix
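The workaround reported above (load each template once, then reuse it for every export) can be reduced to a small cache keyed by document URL, so each URL triggers exactly one load. This is a minimal sketch of that idea, not UNO API; the `Function` loader stands in for whatever wraps the real `loadComponentFromURL` call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Caches loaded documents so each template URL is loaded only once,
// keeping the office's memory use constant instead of growing per load.
class TemplateCache<D> {
    private final Map<String, D> cache = new ConcurrentHashMap<>();
    private final Function<String, D> loader; // wraps loadComponentFromURL in real use

    TemplateCache(Function<String, D> loader) {
        this.loader = loader;
    }

    // Returns the cached document, loading it on first access only.
    D get(String url) {
        return cache.computeIfAbsent(url, loader);
    }

    int size() {
        return cache.size();
    }
}
```

With a loader that counts its invocations, repeated requests for the same URL show a single underlying load, which matches the constant-memory behaviour the commenter observed.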
The attached complex testcase loads a ppt document, exports it to pdf and then closes the document again. This is done 5 times. Afterwards it measures the memory usage with the help of "pmap". To execute it, unzip the file memoryTest.zip and change to the appearing folder memoryTest/complex/memCheck. Now:
1. call setsolar in your shell
2. call dmake
3. start the office with the parameters -invisible "-accept=socket,port=8100;urp;"
4. call dmake run

The testcase allows 20K of additional memory usage, but src680_m74 consumes 500K per conversion ... so doing 1000 conversions will lead to 500MB. Remark: run the test at least twice, since during the first run the memory usage will increase dramatically due to the fact that the impress libraries are loaded into memory for the first time.

Sven, is that the memory leak Armin just discovered, with the autoshapes not freeing their SdrObjects?

sj->aw: I think you already fixed this issue in your cws, can you please check if it is a duplicate.

AW->SW: Should be fixed with #i40944# which is in m76, please review there again.

sw->aw: just checked on Solaris with an src680_m76 ... sadly the memory consumption persists :-( ... so I'd say the mentioned fix doesn't solve the problem.

AW->SJ: Unfortunately not a duplicate of the fixed one; there are more leaks involved. Maybe IHA or BM have time for a boundscheck round? Ask KA for that.

sj->bm: Very nice of you to agree to analyse the problem. I did a valgrind on Linux (m76) and didn't find anything that I think could cause this leak yet.

Thanks for your confirmation, but I think the bug does exist... or is there something wrong with my Java source? The problem occurs when I try to modify the xsl file or compress it to a PDF file with the UNO development package. I find ver 1.9.7.4 may be better than 1.1.X on this issue, and it will be fantastic if 2.0 solves such problems thoroughly...

Created attachment 22577 [details]
Output of a valgrind session using memCheck (conversion is done 3 times)
Created attachment 22578 [details]
Output of the memCheck session
I didn't find a culprit for the leaks. I attached the 12MB valgrind output (gzipped) and the output of the memory test program running on a StarOffice SRC680.m78 (on Linux). The only places that caught my eye were spellchecker calls. I don't know if these appear only because we lose control over the memory management at some place. The output of memCheck is a bit strange insofar as the memory consumption doesn't seem to increase, but that's probably due to the fact that valgrind is running.

->SJ: Maybe you (or someone else) find(s) a stack in the log file that may be related to the conversion process.

What I forgot to add: Maybe the leaks come from Java. In Java, as you know, memory is not freed at a specific time; it depends on the garbage collector. If the Java program keeps references to the XModel of the first document loaded, and then loads a second document, the first XModel may still exist in memory. Has anybody checked whether the memory that is not freed is freed after the Java program terminates?

Retargeted to OOo 2.0.1 after QA approval by MSC.

Have you tried to run my sample program or SW's while you did the valgrind run? Unfortunately I haven't used valgrind and don't know much C(++), so I can't help here. It is definitely a leak in OOo, and the memory remains used even after I exit my Java program. Does OOo use Java internally for this, and if so, how can I set the Java -Xmx flags for heap size? Maybe they are set high, and as you probably know, Java will try to eat all memory given to it. In that case it wouldn't be a leak, just a configuration matter. Just to be sure.

The output of the memcheck run seems odd; it reads:
/home/bm/bin/memcheck soffice.bin -invisible -accept=socket,port=8100;urp;
Shouldn't the parameters be in ""? Like:
/home/bm/bin/memcheck soffice.bin -invisible "-accept=socket,port=8100;urp;"

Because of too large a workload I can't fix this issue for OOo 2.0.1 -> changed target to OOo Later.
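One way to answer the question above (is the leaked memory freed once the Java client exits?) is to read the office process's footprint from `pmap` before and after the client terminates, which is essentially what the attached memCheck test does. Below is a minimal sketch, assuming Linux; the parser only looks for pmap's trailing "total" line and tolerates both the plain `pmap` and `pmap -x` output shapes. The method name is illustrative, not part of any existing tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PmapTotal {
    // Extracts the total mapped size in kB from pmap output.
    // Handles both " total   409600K" (pmap) and "total kB  409600 ..." (pmap -x).
    static long totalKb(String pmapOutput) {
        for (String line : pmapOutput.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("total")) {
                Matcher m = Pattern.compile("(\\d+)").matcher(trimmed);
                if (m.find()) {
                    return Long.parseLong(m.group(1));
                }
            }
        }
        return -1; // no total line found
    }

    // In real use: run "pmap <pid of soffice.bin>" via ProcessBuilder, capture
    // stdout, and compare totalKb() before and after the Java client exits.
}
```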
I have a similar problem in OOo Calc using OOo Basic macros (OOo 2.0.2 / Linux: Slackware 10.2). It happens at about 50 file opens & 50 file closes. In addition, could this cause a "Signal 11 / SIGSEGV" report from KDE on OOo exit? Perhaps OOo is trying to release the leaked memory when quitting, somehow? Thanks.

I can't believe, a year and a half on and a couple of releases later, that this is still "OOo Later".

Being the author of the mentioned JOOConverter (http://sourceforge.net/projects/joott/) open source project, I've done some testing myself to make sure the problem was not with my Java code. I'm using OOo 2.0.3 on Linux. The memory usage increase seems to vary greatly based on the type of document used. For example, ODT to PDF only increases by 2.5kb per conversion, which may not be a memory leak at all. However, a PPT to PDF conversion using the document attached to this issue does produce a significant increase, approx 1mb for each conversion! Just loading and disposing the document without exporting to PDF results in almost the same increase, while loading and disposing the same document in ODP format shows hardly any increase at all. So my guess is that most of the leak is caused by the PPT import filter. However, converting from ODP to PDF also increases memory usage by some 200kb per conversion, so there may be a problem with the Impress PDF export filter as well.

The problem also occurs when you try to convert a ppt to an swf.

I can confirm the issue in OOo version 3.0. Opening doc files leaves about 20kb and ppt about 700kb in memory (depending on the size of the files). A fix would be very, very nice :)

I too have a very difficult time understanding why, after four years, this defect doesn't seem to be important. I am working on a software application where we use the OpenOffice software in server mode. We are running OpenOffice 3.0. We use OpenOffice to extract text from Microsoft formatted documents (Word, PowerPoint and Excel).
After running for about an hour (I know, not much of a metric), the OpenOffice server hangs. By hang I mean: all processing stops (the CPU usage is approximately zero), but the OpenOffice process is still live (or at least shown in the Windows process monitor). Our guess is that this problem is due to a memory error. For anyone using OpenOffice in server mode, this is a pretty critical problem.

I guess the problem with this sort of bug report - and why they end up stuck in "OOo Later" forever - is that they are a bit too generic. We (as users) should follow the rule of "one problem, one issue". How can we expect the OOo team to fix all possible memory leaks for all possible import/export formats as part of a single issue? I tried to focus this issue on a specific case, i.e. PPT import, which seemed to be the biggest culprit, in my comment (some 3 years ago now). But people keep adding generic complaints relating to various formats, with the only result that this issue will never be solved, because it's impossible to solve an ill-defined problem. (Incidentally, JODConverter 3.0 for its part now does provide a workaround: it automatically restarts OOo - see http://jodconverter.googlecode.com if interested.)

good job mnasato, haha. Though I am using the Java OODaemon instead for production. Is it possible to write a unit test case for each input and output pair so that it will be easier for the developers to test?

Amazing. I've been subscribed to this bug for nearly 4 years now, and I'm still getting emails about it :) Is there any way to monitor soffice.bin for active/unreleased objects? I just processed about 600 documents and used memory jumped from 80M to 190M, and it's not being released.

We're experiencing a similar problem. An application converts many .doc archives to odt format, and at a particular moment the soffice process had 5.2G of resident memory on an 8GB machine.
We'll start monitoring this problem more closely, but the age of this bug - almost 5 years - seems really strange to me. Is this category of milestone (OOo Later) hidden from the OpenOffice team?

My code opens a simple Calc document with four cells filled and a chart based on them. It exports the chart (via XFilter) and closes the document (the right way, as described at http://wiki.services.openoffice.org/wiki/Documentation/DevGuide/OfficeDev/Closing_Documents). After 1000 such operations (less than 10 minutes) soffice.bin grows from 84Mb to 330Mb. The workaround that I'm going to use is to kill the OO process periodically, so it's ok to mark this issue as FIXED.

And concerning the comment from mnasato about the issue being too generic, let's make it clear: this issue is not related to any sort of import/export. OO leaks even if you just open and close a document - just like that. The following piece of code will make OO grow 100Mb per 2 minutes (demo.ods is a spreadsheet with four cells and one chart):

```java
public static void main(String[] args) throws Exception {
    XComponentContext xCompContext = com.sun.star.comp.helper.Bootstrap.bootstrap();
    XMultiComponentFactory xMCF = xCompContext.getServiceManager();
    Object xDesktop = xMCF.createInstanceWithContext("com.sun.star.frame.Desktop", xCompContext);
    XComponentLoader aLoader = UnoRuntime.queryInterface(XComponentLoader.class, xDesktop);
    PropertyValue[] loadProps = new PropertyValue[] {
        new PropertyValue("Hidden", 0, true, null)
    };
    while (true) {
        XComponent xComponent =
            aLoader.loadComponentFromURL("file:///tmp/demo.ods", "_blank", 0, loadProps);
        close(xComponent);
    }
}

private static void close(XComponent xComponent) {
    XCloseable xCloseable = UnoRuntime.queryInterface(XCloseable.class, xComponent);
    if (xCloseable != null) {
        try {
            xCloseable.close(true);
        } catch (CloseVetoException e) {
            // ignored: the document is closed unconditionally in this test
        }
    } else {
        xComponent.dispose();
    }
}
```

I just hit this bug while using a small Python-based script found at:
http://www.linuxjournal.com/content/convert-spreadsheets-csv-files-python-and-pyuno-part-1v2

When converting Excel files to CSV, memory usage slowly increases. The server crashes after a few hundred conversions.

cc myself

Same problem for me. The difference is my conversion method can convert more than 43 000 documents before OOo crashes. Hope that bug will be fixed soon.... Moulay

Is there any way you can show us the bulk of the Java code (if it is Java) and what it is you are disposing so successfully that you can convert 43 000? I have % OO instances and I can only convert about 2300 total before all the instances lock up or crash. The ones that do lock up use over 2G of memory.

(In reply to comment #41)
> Same problem for me.
> The difference is my conversion method can convert more than 43 000 documents
> before OOo crashes.
> Hope that bug will be fixed soon....
>
> Moulay

In response to the comments by Bjoern Milcke on 2005-02-14: Some leak also occurs if each relevant OpenOffice call in Java is followed by System.gc(); System.runFinalization(); I verified this by using uno_dumpEnvironment(logFile, binaryUno_.get(), 0); in bridge.cxx; the object count in the bridge is constant between opening documents. Also, I can confirm that a leak occurs just by creating and closing text documents in a Java loop (with gc and runFinalization) using the URP UNO bridge.

In order to find this bug, I'd suggest running valgrind not for 3 conversions (as was done in the attached valgrind logs), but for example for 100 or even more conversions. This should make it a lot easier to find the leak, as the relevant loss records would probably be repeated. Could someone possibly do this? I have compiled AOO on Windows and do not have a corresponding tool available for Windows.

Created attachment 83796 [details] Simple program to make it easier to find the leak.

I'm attaching a very simple Java program to help hunt down this issue.
All it does is create and close OpenOffice text documents in a loop, calling the garbage collector/finalizer to ensure that things are properly cleaned up on the Java side. The program takes one argument: the number of loops (i.e. the number of times a text document is created and closed). If you run it with argument "1000" or "10000" you will notice the increase in memory usage just from creating and closing documents.

In order to run this program, you need the "Nice Office Access" jars (license LGPL) from http://ubion.ion.ag/loesungen/004niceofficeaccess and add them to the classpath. Additionally, add java_uno.jar;juh.jar;jurt.jar;officebean.jar;ridl.jar;unoil.jar;unoloader.jar to the classpath. You may need to adjust the OpenOffice path:

final String officeInstallDir = "C:\\Program Files (x86)\\OpenOffice 4";

I'm sorry I could not attach a complete jar file containing all libs because of the size restrictions. Feel free to request it via email.
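Since no fix landed in the thread, the practical mitigation mentioned earlier (JODConverter 3.0 restarting the office automatically) comes down to simple bookkeeping: after N conversions, kill and respawn soffice so the leaked memory is reclaimed by the OS. The sketch below shows only that counter logic; the actual process kill/respawn is left to the host application, and nothing here is a JODConverter API.

```java
// Counts conversions and signals when the office process should be
// recycled, bounding the leak at roughly N times the per-conversion growth.
class RestartPolicy {
    private final int maxConversionsPerProcess;
    private int conversionsSinceStart = 0;

    RestartPolicy(int maxConversionsPerProcess) {
        this.maxConversionsPerProcess = maxConversionsPerProcess;
    }

    // Call once per completed conversion; true means "restart soffice now".
    boolean recordConversion() {
        conversionsSinceStart++;
        if (conversionsSinceStart >= maxConversionsPerProcess) {
            conversionsSinceStart = 0; // fresh count for the respawned process
            return true;
        }
        return false;
    }
}
```

A threshold in the hundreds would have kept the reported 80M-to-190M growth over 600 documents well bounded, at the cost of a periodic office restart.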