This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 244336 - Thread deadlock due to TopLogging use by RequestProcessor starting with Java 7u25 (7u55 is current and still exhibits the problem)
Summary: Thread deadlock due to TopLogging use by RequestProcessor starting with Java ...
Status: NEW
Alias: None
Product: platform
Classification: Unclassified
Component: -- Other -- (show other bugs)
Version: 7.1.2
Hardware: PC Windows 7
: P3 normal (vote)
Assignee: Antonin Nebuzelsky
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-05-06 18:59 UTC by PineForest
Modified: 2014-05-06 19:31 UTC (History)
1 user (show)

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
Thread dump excerpt showing the three Netbeans threads contributing to the Application freezing (1.81 KB, text/plain)
2014-05-06 18:59 UTC, PineForest
Details

Note You need to log in before you can comment on or make changes to this bug.
Description PineForest 2014-05-06 18:59:12 UTC
Created attachment 147147 [details]
Thread dump excerpt showing the three Netbeans threads contributing to the Application freezing

Note
----
I am adding this to help users troubleshoot this issue because the bug which appears to be the same issue has a slightly different signature and is not well documented (175987).

Context
-------
Our application (hereafter referred to as the "Application") is built over Netbeans platform 7.1.2 and launched via Java Web Start. It freezes within 1 or 2 hours of launch when executed over Java 7u25 through 7u55 (current as of this writing). When run over Java 7u21 and launched by Java Web Start, our Application does not freeze, even after 3 days of continuous up time.

Conclusion
----------
The cause is due to an indirect thread deadlock that occurs within the Netbeans platform code. The critical component of the deadlock that affects the Application is when the QA Tool AWT event thread blocks while waiting on the RequestProcessor queue manager to complete its work. However, the RequestProcessor queue manager is in a deadlock with the logging queue flushing thread. It can be solved by removing or replacing the Netbeans TopLogging component.

Summary
-------
The Netbeans Request processor manager (“Manager”) component initiates work and locks a semaphore (I will refer to it as P – for “pool”). The Manager does some work and decides to log some information. It calls the Netbeans TopLogging class (“Logger”) to do the logging. The Logger discovers the log queue is full. It issues a request for the Netbeans flushing task (“Flusher”) to wake up to empty the queue and then waits for an opening in the logging queue. The Flusher initiates work and tries to lock semaphore P. Given P is locked by the Manager, the Flusher waits until the lock can be acquired. Enter the Application. The Application is listening for input from an AWT event thread. When it needs to do some work the AWT thread tries to lock semaphore P. When it finds it locked, it waits until it can acquire the lock. At this point the Application is frozen.

Attachment
----------
Note: This refers to the attached thread dump excerpt containing the three threads of interest at the point in time the Application has frozen.

Here is the sequence that I have concluded results in our Application freezing:
1. The Manager starts to enqueue a Task to execute.
2. The Manager requests and acquires a lock on the Processor pool stack (a Stack of Processors) at RequestProcessor.java:1869

   ATTENTION: At this point or after the Flusher thread times out waiting for a new job and wakes up and requests a lock on the Processor pool stack. It finds it has already been acquired and so enters a wait state at RequestProcessor.java:1953.

   ATTENTION: At this point or after (and before step 6) the AWT event queue thread which the QA Tool is listening on wakes up and requests a lock on the Processor pool stack. It finds it has already been acquired and so enters a wait state at RequestProcessor.java:1869.

3. The Manager requests a top level thread group to ensure the task will survive and detects that the top level thread group returned differs from the current top level thread group. It attempts to log this at RequestProcessor.java:2120. 

   NOTE: the logging of this message starts to show up in logs once we updated to Java 7u25 and on. When executing the Application under Java 7u21, it does not show up at all. This suggest to me that perhaps the check performed in bullet (4) stopped working as of Java 7u25.

4. The Logger is running in its own thread. To prevent a deadlock, it checks to see if its thread is a RequestProcessor thread at TopLogging.java:498. For some reason, this check finds that it is not a RequestProcessor thread, allowing it to continue into its critical section.
5. The Logger logs records in a FIFO manner. To accomplish this it uses a Java concurrent LinkedBlockingQueue. This has a characteristic of blocking thread execution when the queue is full and a new entry is being “put” into the queue. When the queue has room, the blocked thread is unblocked and the entry is placed in the queue. At TopLogging.java:501 the Logger tries to “offer” the Manager’s log record into the queue (this is a non-blocking call). It fails to do so because the queue is full.
6. The Logger requests that the Flusher immediately remove entries from the queue at TopLogging.java:505. The Flusher is scheduled (even though it is blocked) and this request returns success.
7. The Logger “puts” the record into the queue at TopLogging.java:505. This action causes the Logger thread (which is the Manager’s thread) to block, waiting until the queue once again has room for the record. It is expecting that the Flusher will be making this room for it momentarily. However, the Flusher is waiting for the Processor pool stack lock which the Manager still holds.
8. The AWT thread which the Application receives event notifications from (e.g. keyboard key strikes) runs as a RequestProcessor. It wakes up and requests a lock on the already held Processor pool stack object at RequestProcessor.java:1870. It enters a wait state and it will not wake up because the requested lock will never be freed by the deadlocked threads.
9. Game Over.

I concluded that if I were to remove the TopLogging installed handlers and ran the Application over Java 7u25 or 7u55 that I would not see this thread deadlock and our Application would not freeze. I tried this and the Application ran successfully for 3+ days without freezing.
Comment 1 PineForest 2014-05-06 19:15:24 UTC
Question: I looked at the 8.0 source code for TopLogging and discovered that the publish method behavior has radically changed (sometime after 7.1.2 and up through 8.0). Would someone familiar with this change please let me know if it has simply moved the deadlock potential off to another location or that the change was significant enough that the problem has been eliminated?
Comment 2 PineForest 2014-05-06 19:31:58 UTC
Note
----
I have not verified the following adequately, though I have see this when running the Application for 1/2 a day. I have been told this by fellow developers familiar with the issue here at my company.

The Application does not hang if it is launched as a local Java application and not through Java Web Start. If this is truly valid, then I speculate that the problems are not just with Java 7u25+, but also with changes in how Java Web Start launches the Application.

Also, be aware that one of the first things that I fixed regarding Java 7u25 was to ensure that all of the distribution JARs contained a MANIFEST.MF with the proper settings for the Codebase, Permissions and Trusted-Library properties and that they were signed. This includes Netbeans as well as the Application core and branding JARs.