This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 162706 - "Scanning projects..." incredibly slow in recent builds
Summary: "Scanning projects..." incredibly slow in recent builds
Status: VERIFIED FIXED
Alias: None
Product: editor
Classification: Unclassified
Component: Parsing & Indexing (show other bugs)
Version: 6.x
Hardware: PC Linux
: P2 blocker (vote)
Assignee: Vitezslav Stejskal
URL:
Keywords:
: 164107 (view as bug list)
Depends on: 161367 164745
Blocks: 162635
  Show dependency tree
 
Reported: 2009-04-15 12:31 UTC by kawazu428
Modified: 2009-05-15 10:09 UTC (History)
6 users (show)

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
thread dump (93.39 KB, text/plain)
2009-04-15 12:32 UTC, kawazu428
Details
thread dump of IDE while in "scanning projects..." (40.40 KB, text/plain)
2009-04-24 08:35 UTC, kawazu428
Details
jmap histogram (395.55 KB, text/plain)
2009-04-24 14:01 UTC, kawazu428
Details
messages.log (38.43 KB, text/plain)
2009-04-28 19:07 UTC, kawazu428
Details
Indexing times (OpenOffice spreadsheet) (20.59 KB, application/vnd.oasis.opendocument.spreadsheet)
2009-05-04 14:05 UTC, Vitezslav Stejskal
Details
Jackrabbit project scan times (OpenOffice spreadsheet) (16.91 KB, application/vnd.oasis.opendocument.spreadsheet)
2009-05-04 15:59 UTC, Vitezslav Stejskal
Details
Jackrabbit project scan times after the fix (OpenOffice spreadsheet) (15.70 KB, application/vnd.oasis.opendocument.spreadsheet)
2009-05-05 16:37 UTC, Vitezslav Stejskal
Details

Note You need to log in before you can comment on or make changes to this bug.
Description kawazu428 2009-04-15 12:31:40 UTC
Product Version: NetBeans IDE Dev (Build 200904131401)
Java: 1.6.0_13; Java HotSpot(TM) Server VM 11.3-b02
System: Linux version 2.6.28-11-generic running on i386; UTF-8; de_DE (nb)
Userdir: /home/kr/.netbeans/dev-pre7

... using various "folder-of-projects" groups (each about 50 maven2 projects), the "Scanning projects..." process after
launching the IDE and/or switching between groups is just awfully slow in recent builds. At the moment it takes 5 .. 10
minutes until this eventually comes to an end, until then leaving the IDE pretty much unusable (unresponding, Java
source packages / files not updated in the "Projects" view, ...). 

Note #1: I am not sure whether this relates to #147656 , decided to file a separate one as my situation seems slightly
different.

Note #2: Why ain't there an "ergonomy" component available to file issues against? Using "ide" seems rather, well,
"fuzzy"...

Thread dump attached, for whatever it's worth.
Comment 1 kawazu428 2009-04-15 12:32:49 UTC
Created attachment 80136 [details]
thread dump
Comment 2 Milan Kubec 2009-04-15 13:35:08 UTC
Please evaluate. Thanks.
Comment 3 kawazu428 2009-04-16 07:19:44 UTC
Adding to this, this morning I noticed that most of the time spent on "Scanning projects..." actually seems to get
"wasted" on scanning all the binary artifacts in my local .m2/repository; so maybe it's a maven2 related issue after all?
Comment 4 hlavki 2009-04-22 10:45:46 UTC
It looks like problem with memory consumption. While process "Scanning Projects..." is running heap memory is full in
couple of seconds... I created heap dump but tar.bz2 archive has 78MB. I can upload it somewhere if you want to...
I can confirm, this is not only problem with maven projects but also with netbeans projects...
Comment 5 kawazu428 2009-04-24 08:35:09 UTC
Created attachment 80819 [details]
thread dump of IDE while in "scanning projects..."
Comment 6 kawazu428 2009-04-24 08:39:03 UTC
Added another thread dump and increased priority as this seems to have gotten worse in recent builds. Currently using
04231401, I have to see that not only the "Scanning projects..." takes incredibly long but at the moment I've also been
waiting a couple of minutes with the IDE window being all "grey", obviously not repainting anymore. This is even worse
than just "scanning".  Following hlavkis comment, I also created a heap dump which I won't attach for obvious reasons:


    Date taken: Fri Apr 24 09:37:39 CEST 2009
    File: /tmp/visualvm.dat/localhost-22843/heapdump-1240558658553.hprof
    File size: 82 MB

    Total bytes: 68.102.166
    Total classes: 11.734
    Total instances: 1.077.288
    Classloaders: 619
    GC roots: 4.501


Not sure how to be of more help on this, but at the moment this is utterly annoying... :)
Comment 7 Dusan Balek 2009-04-24 09:51:55 UTC
Could you please generate a jmap histogram (it's smaller than a full heap dump) and attach it to this issue? Thanks.
Comment 8 kawazu428 2009-04-24 14:00:50 UTC
Simple jmap output to follow, histogram attached.


[kr@n428 14:56:44] ~> jmap -heap 19231
Attaching to process ID 19231, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 11.3-b02

using thread-local object allocation.
Parallel GC with 2 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 421527552 (402.0MB)
   NewSize          = 1048576 (1.0MB)
   MaxNewSize       = 4294901760 (4095.9375MB)
   OldSize          = 4194304 (4.0MB)
   NewRatio         = 3
   SurvivorRatio    = 8
   PermSize         = 100663296 (96.0MB)
   MaxPermSize      = 134217728 (128.0MB)

Heap Usage:
PS Young Generation
Eden Space:
   capacity = 100007936 (95.375MB)
   used     = 17144472 (16.350242614746094MB)
   free     = 82863464 (79.0247573852539MB)
   17.14311152266956% used
From Space:
   capacity = 2490368 (2.375MB)
   used     = 344064 (0.328125MB)
   free     = 2146304 (2.046875MB)
   13.81578947368421% used
To Space:
   capacity = 2359296 (2.25MB)
   used     = 0 (0.0MB)
   free     = 2359296 (2.25MB)
   0.0% used
PS Old Generation
   capacity = 96665600 (92.1875MB)
   used     = 67536360 (64.4076919555664MB)
   free     = 29129240 (27.779808044433594MB)
   69.86597093485169% used
PS Perm Generation
   capacity = 100663296 (96.0MB)
   used     = 52361152 (49.93548583984375MB)
   free     = 48302144 (46.06451416015625MB)
   52.01613108317057% used
Comment 9 kawazu428 2009-04-24 14:01:36 UTC
Created attachment 80853 [details]
jmap histogram
Comment 10 Dusan Balek 2009-04-24 14:17:11 UTC
Thanks for the histogram. However, when I'm looking at it, I'm afraid that we need the full heap dump. Could you please
try to upload it to http://editor.netbeans.org/servlets/ProjectDocumentAdd ?
Comment 11 kawazu428 2009-04-24 14:32:14 UTC
should be here:

http://editor.netbeans.org/files/documents/17/2466/heapdump-1240558658553.hprof

hope it has been fully uploaded.
Comment 12 Dusan Balek 2009-04-24 16:11:48 UTC
According to heap dump, more than 30% of available memory used for char[] usually held via
org.apache.maven.wagon.providers.ssh.knownhost.FileKnownHostsProvider. Other 20% are some strange images.
Reassigning to maven support for further evaluation.
Comment 13 Milos Kleint 2009-04-27 12:40:42 UTC
recently fixed issue
http://www.netbeans.org/issues/show_bug.cgi?id=162652
is fixing the problem from the thread dumps with abundant threads with Jaxws waiting code.

there is not much to do with the known hosts strings.. I've tried to reduce the number of MavenEmbedder instances that
hang around the IDE in http://hg.netbeans.org/main/rev/98137bcec208, however the projects are already loaded at the
Scanning projects phase and the only way they could be involved in the observed slowness is by occupying the memory.

as hlavki mentions the scanning issue is however not relevant to maven projects only. reassigning back to java.

I suppose we should get a generated snapshot that would reveal what time is spent where. 



Comment 14 Rastislav Komara 2009-04-27 17:07:46 UTC
@mkleint: Thank you for improvements.

@reproter: is scanning still slow? Can you provide some real information how long does it take? Please provide snapshot
for your build. Provide also heap dump again please.
Comment 15 kawazu428 2009-04-27 19:56:20 UTC
Gave it a test drive again, using recent build on an idle Ubuntu Linux machine (Core2 Duo CPU, 2 GHz, 2 GB RAM), no
other applications running at this time, didn't work with the machine except for watching the IDE start up.

Started out with an empty $HOME/.netbeans/ folder, created the project group anew, restarted the IDE. Project group
contains 56 maven2 projects (war,jar,pom), some of them hosting JAX-WS web services. Using a vanilla 6.7 nightly build,
no additional modules installed.


20:20 IDE restart

20:21 "Task scanning", first run done

20:22 "Opening projects" done, "Scanning projects..." started. 

20:42 still "Scanning projects", but something has changed, the "Tasks" window now display "TODO" tasks and such things.

20:45 "Scanning projects" finished.


As pointed out before, looking at the process while "Scanning projects" is under way, it seems there aren't any
"strange" things happening but just an outstanding amount of time gets "wasted" on scanning .jar files in jdk/jre/lib/
and 3.7GB of artifacts inside $HOME/.m2/repository/ before, eventually, any of the opened projects get scanned. Does
this really have to happen each time the IDE gets started up?




Product Version: NetBeans IDE Dev (Build 200904270201)
Java: 1.6.0_13; Java HotSpot(TM) Server VM 11.3-b02
System: Linux version 2.6.28-11-generic running on i386; UTF-8; de_DE (nb)
Userdir: /home/kr/.netbeans/dev-pre7
Comment 16 nleck 2009-04-27 22:35:41 UTC
For me the project scan is still a major issue.

We appear to have fix the initial lock up but now the database is corrupt and we can't find classes that haven't been
viewed.

I've had numerous problems with this in the daily builds ( 6.7m3 just locked up and not usable). We seem to oscillating
between always scanning and completing quickly but corrupt.
Comment 17 Milos Kleint 2009-04-28 06:35:59 UTC
kawazu: the theory with the IDE scanning the complete local repository is interesting, if it does so it's certainly
wrong. It should only scan jar files used by the 50 projects of yours (which I hope isn't gigabytes).
To prove the theory, could you try moving the local repository away and only populate it with items required by your 50
projects? and see if the scanning improves?

vstejskal, moonko: is there a way to show the logs for what is actually being parsed, scanned, compiled?
Comment 18 kawazu428 2009-04-28 08:20:29 UTC
@mkleint: Moving .m2/repository to .m2/repo_old (starting over with an empty repository) reduces time spent on project
scan to < apprx. 30 seconds. Good. 
After building all my projects (allowing maven to download all the dependencies required, along with their dependencies,
...), .m2/repository has grown to 187 megs. Restarted the IDE, opening projects / tasks scanning relatively fast again.
"Scanning projects" however is slow as before. By now I can dump the theory of too many unrelated artifacts being
scanned as there are none around at the moment. Seems it's simply takes time doing what it does... :(
Comment 19 Vitezslav Stejskal 2009-04-28 15:43:40 UTC
"is there a way to show the logs for what is actually being parsed, scanned, compiled?" - Yes. Please start the IDE with
-J-Dorg.netbeans.modules.parsing.impl.indexing.RepositoryUpdater.level=FINE switch (ie. from command line or add it to
netbeans.conf). In <ide-userdir>/var/log/messages.log there should be messages like "Scanning sources root: XYZ" and
"Scanning binary root: XYZ". There should be one root for every classpath item used by your projects (ie. source and
test folders in your projects, jars they depend on, JDK jars, etc). Please attach the log file here so that we can have
a look.

Also, would it be possible for me (a total maven ignorant so far) to set up a similar environment on my laptop. Ideally
I'd like to do the exact same think as kawazu, but if that is not possible any local repository with some interdependent
projects and libraries should do, I guess. I can then check what is really indexed vs what should be indexed.

kawazu, what is the heap size available for the IDE (it should be in the log file, look for -Xmx)? Could you try
increasing it (ie. add -J-Xmx512m to your netbeans.conf)? Scanning in general needs a lot of memory to run. When it
finishes the memory will be released, but while scanning the available amount of memory is crucial.

Comment 20 kawazu428 2009-04-28 19:05:32 UTC
@vstejskal: Configured logging and heap memory as you suggested, "scanning projects..." speed hasn't that much improved
that much, so far. Please find the log file attached. Hope it might be of some help. About reproducing the issue, I
unfortunately can't provide you with our complete source tree (not that there's too much IP of ours in that but it
contains some information about our infrastructure which I am not allowed to give out); however I think if the problem
is related to the amount of projects opened, I think this should easily be reproducible using virtually any more complex
maven2 source structure - Apache Jackrabbit? Glassfish v3?

Please let me know if there is any additional information I can provide you with, as well as thanks for taking care of
that everyone.
Comment 21 kawazu428 2009-04-28 19:07:06 UTC
Created attachment 81112 [details]
messages.log
Comment 22 kawazu428 2009-04-29 07:29:39 UTC
Just a side note, after spending some time pondering this last nite, I posted a feature request related to project
management here, eventually easing this by reducing the amount of projects "opened"?

http://www.netbeans.org/issues/show_bug.cgi?id=164004
Comment 23 Vitezslav Stejskal 2009-05-04 14:01:19 UTC
@kawazu428: Thanks for the log file. Is it from a subsequent IDE run (ie. no userdir or var/cache/index cleanup)? From
what it shows I guess it is. However, the strange thing is that most of your source roots seem to be up-to-date (ie. no
changes and no files are really indexed), but almost all your binary roots (ie. jars/zips) are not up-to-date and the
IDE reindexed them. I think that that's why scanning took so long. The question here is why the binary files were
modified or more precisely why is their timestamp (last-modified-time) different from the previous IDE session.

I'm going to attach an openoffice spreadsheet which summarizes times spent by indexing particular classpath roots. The
roots are sorted (descending) by the time spent in their indexing. The times are in milliseconds. Total time is approx.
14 minutes. When you look at the times there are bunch of roots which indexing took less than one 1 second, those are
likely the up-to-date roots. The rest are those that the IDE reindexed again for some reason and most of them are .jar
files.

I downloaded Jackrabbit project as suggested, built it and opened all projects in Netbeans. The initial scan took a
while, but subsequent starts were reasonably fast and from the log files I can confirm that none of the source or binary
roots were indexed again during the subsequent IDE session. I'll do the same analysis as before and attach the
spreadsheet here. Could you please try reproducing the issue with Jackrabbit and tell me step-by-step what you do so
that I can replicate the problem on my own? Thanks a lot for your help.
Comment 24 Vitezslav Stejskal 2009-05-04 14:05:43 UTC
Created attachment 81466 [details]
Indexing times (OpenOffice spreadsheet)
Comment 25 Vitezslav Stejskal 2009-05-04 15:59:57 UTC
Created attachment 81474 [details]
Jackrabbit project scan times (OpenOffice spreadsheet)
Comment 26 Vitezslav Stejskal 2009-05-04 16:06:39 UTC
I just attached the scanning times for Jackrabbit project on my laptop. The Netbeans/jdk/OS info from the log file:

Product Version         = NetBeans IDE Dev (Build 200905031401)
Operating System        = Linux version 2.6.22-16-generic running on i386
Java; VM; Vendor        = 1.6.0_10-ea; Java HotSpot(TM) Client VM 11.0-b11; Sun Microsystems Inc.

The total initial scan was ~125sec, total time on the subsequent start was ~59sec. The log file shows no scanning on the
subsequent start, which is good. The time itself, however, is not so good. I personally would expect the up-to-date
check to take much less time then roughly half of the full scan time. Sigh.
Comment 27 nleck 2009-05-05 00:08:10 UTC
how many files are being scanned in the Jackrabbit project ?

The main issue is that the scan starts again even after waiting for the VERY long initial scan. 

There are many examples of people having problems with long scan times on the message broads, please test a larger project. 
Comment 28 Vitezslav Stejskal 2009-05-05 16:27:45 UTC
The problem was in broken cache of SourceForBinaryQuery results. It's fixed in
http://hg.netbeans.org/jet-main/rev/e58558ba5304. I'm also attaching new measurements taken after the fix. The initial
scan took ~78sec and scanning on subsequent start took only ~5sec. Which shows ~40% improvement in initial scan time,
but more importantly approximately 10 times faster scanning on the second start.
Comment 29 Vitezslav Stejskal 2009-05-05 16:37:48 UTC
Created attachment 81576 [details]
Jackrabbit project scan times after the fix (OpenOffice spreadsheet)
Comment 30 Vitezslav Stejskal 2009-05-05 20:44:09 UTC
*** Issue 164009 has been marked as a duplicate of this issue. ***
Comment 31 David Konecny 2009-05-05 21:28:31 UTC
Good job Vita! Thanks.
Comment 32 Quality Engineering 2009-05-07 07:56:42 UTC
Integrated into 'main-golden', will be available in build *200905070201* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main-golden/rev/e58558ba5304
User: Vita Stejskal <vstejskal@netbeans.org>
Log: #162706 - removing the old GlobalSourcePath and RepositoryUpdater and replacing their usages by calls to the new infrastructure. The old half-dead, half-alive GlobalSourcePath caused major slowdown in binary-to-source root translation
Comment 33 rmichalsky 2009-05-07 13:10:24 UTC
*** Issue 164107 has been marked as a duplicate of this issue. ***
Comment 34 nleck 2009-05-07 13:43:12 UTC
Just with #164107 there are three different issues :-
 1) The scans are slow -- this maybe fixed now
 2) The repeatedly scans even when files aren't changed
 3) The scans block the navigation.

Of course if the navigation wasn't block then we wouldn't care so much about the scans. 

Should the bug #164107 really be closed ?
Comment 35 Michel Graciano 2009-05-07 15:38:50 UTC
It looks reeeeeeeally fast now. :-D
Great job!
Comment 36 Michel Graciano 2009-05-08 18:55:27 UTC
You can see some comments about last changes at this area in NetCAT mail list [1], with build after this change.

[1] http://www.netbeans.org/servlets/ReadMsg?list=netcat&msgNo=15892
Comment 37 nleck 2009-05-08 22:51:45 UTC
This is *NOT* fixed see bug#164745 and bug#161367

There is a new deadlock #164745 and while there is no longer any feedback saying scanning the navigation is blocked for
a very long time #161367. Also the symbol database gets corrupted. 

This is far far worst than 6.5 
Comment 38 nleck 2009-05-08 22:54:05 UTC
while yes the second/third time the ide the scan was incredibly fast, I couldn't find any existing classes. 
Comment 39 Vitezslav Stejskal 2009-05-11 11:01:13 UTC
"This is *NOT* fixed..." - What makes you think so? This issue described specific usecase that I was able to reproduce,
measure, profile and fix. I'm not saying that all problems in scanning are resolved now, but this particular one IMO is.
If you don't think so, could you please post here your measurements with jackrabbit project done on a recent dev build?

The issue #164745 is deadlock between ergonomics and j2ee as explained by jlahoda and will no doubt be dealt with
accordingly. The issue #161367 is still open and I'm working on it. In particular there was
http://hg.netbeans.org/main-silver/rev/9a676dc9ed8d and I'm now making the initial scan job interruptible.

I would really like to stress out that providing exact, reasonably simple, reproducible usecases is essential for me to
be able to fix something. General cries no matter how loud they are just don't help. I'm sorry.
Comment 40 nleck 2009-05-11 11:48:17 UTC
I downloaded the daily build and opened the project while waiting for the scan to complete the IDE locked up a number of
times. I then just opened the IDE and went to lunch to give the IDE time to do it's scan. Once completed I then tried
"go to Type..." but couldn't find ANY classes. I closed and opened the IDE again and I still couldn't find any classes
but the scan was fast.

I then manually navigated to the class in question and a very long scan started. This scan was without any user feedback
but I could tell that it was scanning because the navigator was blank for this class. Once this scan was completed I
could find classes in this package.

I don't have the legal rights to upload the bank's project, but I believe many people are having the same or similar
issues. I know the slow scanning is due to the large size of the project but again Eclipse handles this size project well.

The navigation now works after I deleted the .netbeans directory and started the IDE a fresh and left it to scan but
it's still very slow and frequent rescans. 

I don't know what to say, if you want to call this fixed well.... the scanning in 6.5 was awful, in 6.7 it's worst.
There will be a lot of upset developers if NB is released as is.
Comment 41 Pavel Flaska 2009-05-11 11:58:48 UTC
Could you, please, attach the log file?

Comment 42 nleck 2009-05-11 12:02:22 UTC
I've lost the logs for the lock up because I've deleted the .netbeans directory. 

Do you want logs now that I can find classes ?
Comment 43 Pavel Flaska 2009-05-11 12:11:36 UTC
No. Try to run with -J-Dorg.netbeans.modules.parsing.impl.indexing.RepositoryUpdater.level=FINER. It should log the scanning times for classpath roots. 
Once you get in trouble, look at the log and attach it, if possible. Could you, please, also describe your environment?. Type of projects, network disks?
Comment 44 kawazu428 2009-05-11 12:18:31 UTC
verified as far as I am concerned. Started and stopped a recent nightly build twice "asides" regularly working on my
machine, and things seem to have greatly improved, "Scanning projects..." by now just takes a couple of moments to
finish on my project group. Good work and thanks a bunch for tracking this down! :)
Comment 45 nleck 2009-05-11 12:27:41 UTC
I'll get the logs and attach.

The system that we use to develop is a 8 cpu 32 gigs Solaris box. The netbeans home is on a local disk but the sources
are on a NFS. There are a number of these machines which are shared by a about 50 developers in total but as we are from
around the world so there is only about 1/3 of these active at anyone time.

There are approximately 6,500 Java files in the project. The scanner picks up ~19,000 because it's looking into the jar
files etc which are within the source root ( wish I could exclude directories from the scan).

Most developers are using Eclipse so changing the project structure to suite NB is not an option at this time. We would
like the developers to move to NB because we can have a shared ANT build scripts.
Comment 46 Vitezslav Stejskal 2009-05-12 13:02:05 UTC
Folks, can we please keep the discussion related to problems experienced and reported by nleck in issue #161367? It
contains number of related data (threaddumps, logs, the description of nleck's project, environment, etc.) and as I
pointed out earlier it is still open :-). Thanks