When facing a stuck Thread problem, a common question is how can I dynamically kill the observed stuck Threads in order to quickly recover my middleware environment?
This is a question I’m getting quite often from my work colleagues and clients.
This short article will provide you with background on why this is not a good idea and not possible with current Java specifications and high level strategies to prevent stuck Threads at the first place.
Stuck Thread – What is it?
Stuck Thread problems are very common and can be hard to solve. A stuck Thread is basically a Thread which is hanging and / or has stopped its current assigned tasks for various reasons:
· Thread waiting on a blocking IO call ex: hanging socket.read() operation · Thread waiting to acquire a lock on an Object monitor (synchronized) · Thread forced to go in the wait() state ex: waiting to acquire a free JDBC Connection from a pool etc. · Thread involved in a real deadlock scenario e.g. Thread A waiting on Object monitor held by Thread B, Thread B waiting on Object monitor held by Thread A · Thread hanging on a disk IO operation · Thread hanging / paused due to excessive garbage collection going on · More scenarios…
Exactly the problem I’m facing, please tell me how I can kill these stuck Threads
The simple answer is that you cannot. Earlier Java specifications used to have Thread.stop(). This method was originally designed to destroy a thread without any cleanup:
· Any monitors it held would have remained locked. However, the method was never implemented. If it were to be implemented, it would be deadlock-prone in much the manner of suspend(). · If the target thread held a lock protecting a critical system resource when it was destroyed, no thread could ever access this resource again. · If another thread ever attempted to lock this resource, deadlock would result. Such deadlocks typically manifest themselves as "frozen" processes.
You can also consult the official Oracle documentation on this subject. http://docs.oracle.com/javase/6/docs/technotes/guides/concurrency/threadPrimitiveDeprecation.html
As you can see, because of the above risk scenarios, such Thread stop mechanism is not implemented which does not allow your middleware vendor to expose any Thread stop button / functionality to the end user / middleware administrator.
In order to terminate stuck Threads, you have to bring down the entire JVM.
Any proposed solution?
Your best strategy is to prevent stuck Threads at the first place as much as possible and to properly perform root cause analysis & Thread Dump analysis post stuck Thread related incidents. As you can see in the above scenarios, typical stuck Thread problems are just “symptoms” of other problems so solutions include:
· Proper timeout implementation to prevent forever hanging stuck Thread on blocking IO calls. Most communication API out there properly expose timeout methods allowing you to cap in seconds how long you are allowing your Threads to wait on remote resources (Web Services etc.) · Application code should be reviewed and optimized in order to eliminate wrong synchronized usage · Proper capacity planning of your environment is a must in order to prevent overload of key infrastructure components such as an Oracle DB which can trigger slow running queries and stuck Thread on middleware side · Deadlock scenarios are usually symptoms of code problem / code not Thread safe ex: re-using the same JDBC Connection object between 2..n Threads · Excessive IO / disk access can trigger stuck Threads, again symptoms of application problem(s) performing too much IO / class loading calls etc. · Lack of tuning, capacity planning and / or Java Heap memory leak may lead to excessive garbage collection and inevitably to stuck Threads; again symptoms of a bigger problem which requires proper Java Heap tuning and Heap Dump / application memory footprint analysis I hope this article has helped you understand why you cannot and should not rely on Thread.stop() to fix your stuck Thread problems and strategies to prevent these problems at the first place.
Please don’t hesitate to post any comment or question about any stuck Thread problem you are currently facing.