This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.
On my 2 GB RAM Linux machine, I'm almost unable to handle heap dumps larger than ~1 GB (700 MB was still doable). The profiler starts "opening" the heap dump, but the indefinite (sic!) progress bar just keeps going and going, while the machine is almost unusable (I/O bound). I observed that for a 1.3 GB heap dump, the java process was allotted only 1.1 GB of RSS, while top claimed that 1.6 GB were caches. Both of these numbers are a bit too small, as the sum of the size of the external hash map and the size of the dump file is slightly bigger than either of them. So the machine was constantly paging in and out.

Thinking about it more, I would say that mmapping the dump file, while it looks clever, is a bad idea in the first place. Correct me if I'm wrong, but the dump file is either processed sequentially (as during indexing, or when looking up all the incoming references or all the instances of a given kind), or only a very small portion of it is needed (as when browsing the object graph). Such an access pattern would be served at least as well (if not better) by standard file I/O (maybe with a very limited amount of caching), without consuming precious address space.

And here is why it can perform better: if you have N memory frames available and map some N+X pages into your address space, the OS has to decide which page to evict when you touch the (N+1)st. In the case of linear access, guess which page it will evict? Likely the first one (LRU). Now if you sweep the file 10 times, you go through the worst scenario you can imagine: every page is swapped in and out 10 times without ever really being reused once.

And why would not mapping the dump, and mapping only the index, speed up opening? Simply because the (unnecessary) page-ins of the dump file evict the precious (genuinely randomly accessed) pages of the index, so index creation is much slower than it needs to be.
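The sequential access pattern described above can be served by plain file I/O with a small, reused buffer; here is a minimal sketch (class and method names are hypothetical, not the profiler's code). Memory use stays bounded by the buffer size no matter how large the dump is, so repeated sweeps cannot thrash the page cache the way a full mapping can:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SequentialScan {
    // Hypothetical sketch: scan a dump file sequentially through a small
    // reused buffer instead of mmapping the whole file. Only 64 KB of
    // address space is consumed regardless of the dump size.
    static long countNonZeroBytes(Path dump) throws IOException {
        long nonZero = 0;
        try (FileChannel ch = FileChannel.open(dump, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(64 * 1024); // 64 KB window
            while (ch.read(buf) != -1) {
                buf.flip();
                while (buf.hasRemaining()) {
                    if (buf.get() != 0) nonZero++;
                }
                buf.clear();
            }
        }
        return nonZero;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("dump", ".bin");
        Files.write(tmp, new byte[] {0, 1, 2, 0, 3});
        System.out.println(countNonZeroBytes(tmp)); // prints 3
        Files.delete(tmp);
    }
}
```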
It might even be feasible to allocate the index store directly in off-heap space, not backed by a file at all, to reduce the OS's temptation to swap those pages out (though the difference between swapping anonymous pages to swap space and writing file-backed pages back to their file might turn out to be none).
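The off-heap index idea could look roughly like this minimal sketch (the class name and fixed-slot layout are illustrative assumptions). A direct buffer lives outside the Java heap and is not backed by any file, so it adds no GC pressure, although the OS can still page anonymous memory out to swap:

```java
import java.nio.ByteBuffer;

public class OffHeapIndex {
    // Hypothetical sketch: a fixed-size table of long file offsets kept in
    // a direct (off-heap) buffer with no backing file. Direct buffers are
    // zero-initialized, so unset entries read as 0.
    private final ByteBuffer slots;

    OffHeapIndex(int entries) {
        slots = ByteBuffer.allocateDirect(entries * Long.BYTES);
    }

    void put(int entry, long fileOffset) {
        slots.putLong(entry * Long.BYTES, fileOffset);
    }

    long get(int entry) {
        return slots.getLong(entry * Long.BYTES);
    }
}
```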
Now I have realized that in my case (a 1.3 GB heap dump), the profiler fell back (via an OOME) to the file implementation of the index, which makes part of my claims inaccurate, but also makes the slowness even more pronounced: the profiler tried to build a 300 MB+ index file by randomly seeking all over the place and rewriting small chunks of data all the time. No wonder it was that slow!
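One standard way to tame the random-seek rewriting described above is to collect updates in memory and flush them in ascending file-offset order; a hypothetical sketch (not the profiler's actual index format), assuming fixed 8-byte entries:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

public class BatchedIndexWriter {
    // Hypothetical sketch: instead of seeking all over the index file for
    // every small update, buffer updates in a sorted map and flush them in
    // ascending offset order, so the disk mostly moves forward instead of
    // seeking randomly.
    private final TreeMap<Long, Long> pending = new TreeMap<>();

    void update(long fileOffset, long value) {
        pending.put(fileOffset, value);
    }

    void flush(Path indexFile) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(indexFile.toFile(), "rw")) {
            for (Map.Entry<Long, Long> e : pending.entrySet()) {
                raf.seek(e.getKey());        // offsets visited in ascending order
                raf.writeLong(e.getValue());
            }
        }
        pending.clear();
    }
}
```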
I am sorry, but this is not a bug report. The situation with memory mapping is not as simple as you think. Both implementations (memory-mapped and plain file I/O) are available, and the choice depends on the amount of available memory. According to my tests, memory mapping is faster. Opening a heap dump on Linux is known to be slower than on other OSes.
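The selection between the two implementations mentioned in the reply could look roughly like this (the class name and the 50% headroom threshold are purely illustrative assumptions, not the profiler's actual heuristic):

```java
public class DumpAccessChooser {
    // Hypothetical sketch: memory-map the dump only when it fits comfortably
    // within the memory the process can spare; otherwise fall back to plain
    // file I/O. The 50% headroom factor is an assumption for illustration.
    static boolean useMemoryMapping(long dumpBytes, long availableBytes) {
        return dumpBytes <= availableBytes / 2;
    }

    public static void main(String[] args) {
        long available = Runtime.getRuntime().maxMemory();
        long dump = 700L << 20; // a 700 MB dump
        System.out.println(useMemoryMapping(dump, available) ? "mmap" : "plain I/O");
    }
}
```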