Bug 60123

Summary: Tomcat JMX requestProcessingTime sometimes reports a request processor with decades of runtime (since 1-1-1970)
Product: Tomcat 7 Reporter: Edward Kuns <eddie.kuns>
Component: ConnectorsAssignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: 7.0.57   
Target Milestone: ---   
Hardware: PC   
OS: All   

Description Edward Kuns 2016-09-13 01:02:16 UTC
We have an application that periodically monitors JMX "Catalina:type=RequestProcessor,worker=*,name=*" and looks at each entry returned from that wildcard, getting requestProcessingTime.  e.g., code with lines like this:

    ObjectName requestProcessorWildcard = new ObjectName("Catalina:type=RequestProcessor,worker=*,name=*");
    Set<ObjectName> mbeans = mbs.queryNames(requestProcessorWildcard, null);
    for (ObjectName name : mbeans) {
        // Get "processing time" for the current request, if any
        long currentReqProcTime = getLongValue(mbs, name, "requestProcessingTime") / 60000;

We sometimes see requestProcessingTime returning a value suggesting the request started on 1-1-1970, currently 46+ years ago.  Looking at Tomcat 7.0.57 source code (as what I have available to look at), I see this method in java/org/apache/coyote/RequestInfo.java:

    public long getRequestProcessingTime() {
        if ( getStage() == org.apache.coyote.Constants.STAGE_ENDED ) return 0;
        else return (System.currentTimeMillis() - req.getStartTime());
    }

Clearly, if req.getStartTime() == 0, this method will return a nonsensical request processing time.  This method ought to make sure the start time isn't zero before doing the subtraction.  When we see this, the request processor reports itself to be in stage 3 ... aka "STAGE_SERVICE".  Clearly the requests weren't started in 1970.  We don't know how the request is in the stage "service" but has its start time zeroed.

Note that the person in this thread http://osdir.com/ml/users-tomcat.apache.org/2016-06/msg00204.html was probably experiencing the same flaw.  If you do the math 1466499689496 msec corresponds to the time span from 1-1-70 to Tue, 21 Jun 2016 09:01:29.496 GMT ... and the EMail was posted on 21 June 2016!  I haven't followed the code through to see what can cause this to occur.
Comment 1 Mark Thomas 2016-09-20 19:31:58 UTC
I can see a few ways this can happen. I'll look at making that code more robust.
Comment 2 Mark Thomas 2016-09-20 19:42:08 UTC
This has been fixed in the following branches:
- 9.0.x for 9.0.0.M11 onwards
- 8.5.x for 8.5.6 onwards
- 8.0.x for 8.0.38 onwards
- 7.0.x for 7.0.73 onwards
- 6.0.x for 6.0.46 onwards